Domain-Specific Knowledge Bases

📝 Tutorials and Surveys

  1. Constructing Domain-specific Knowledge Graphs (AAAI 2018) [Tutorial]
  2. Domain-specific Knowledge Graphs: A survey (2020) [Paper]

📝 Research Papers

General Domain-Specific KB Construction and Refinement

  1. Towards the Completion of a Domain-Specific Knowledge Base with Emerging Query Terms [PDF] (ICDE 2019) 🌟
  2. Demonstrating Spindra: A Geographic Knowledge Graph Management System [PDF, demo] (ICDE 2019) 🌟
  3. Domain Specific Knowledge Graphs as a Service to the Public (KDD 2020, Applied Data Science Track) 🌟
  4. Probase: A Probabilistic Taxonomy for Text Understanding (SIGMOD 2012)🌟

ProBase: Microsoft Conceptual Graph

  • an iterative learning algorithm for extraction, plus a taxonomy construction algorithm
  • a probabilistic framework
  • the largest general-purpose taxonomy constructed fully automatically
  5. Semantic Enrichment of Data for AI Applications (DEEM 2021)

LLM for Domain-Specific KG Construction 🔥🔥🔥

  1. BEAR: Revolutionizing Service Domain Knowledge Graph Construction with LLM (ICSOC 2023) [Paper]

📝 Research Papers about Different Tasks on Domain-Specific KBs

Domain Specific NER

  1. Learning Named Entity Tagger using Domain-Specific Dictionary [Paper] [Notes]
  2. A Hybrid Generative/Discriminative Model for Rapid Prototyping of Domain-Specific Named Entity Recognition [Paper]
  3. CHEMNER: Fine-Grained Chemistry Named Entity Recognition with Ontology-Guided Distant Supervision (EMNLP 2021) [Paper]

Domain Specific EL

  1. SHINE+: A General Framework for Domain-Specific Entity Linking with Heterogeneous Information Networks (TKDE 2018) 🌟
  2. A Semantic Approach for Entity Linking by Diverse Knowledge Integration incorporating Role-Based Chunking (ICCIDS 2019)
  3. Towards Linking Camouflaged Descriptions to Implicit Products in E-commerce (SIGIR 2020) [Paper]
  4. Medical Entity Disambiguation using Graph Neural Networks (SIGMOD 2021) 🌟

Taxonomies of Domain Specific KBs

  1. TiFi: Taxonomy Induction for Fictional Domains (WWW 2019)

KBs in Specific Domains

🛒 Product KBs

Keynotes and Tutorials

  1. Amazon Product Graph [Slides]
  2. Self-Driving Product Understanding for Thousands of Categories (By Luna Dong, Keynote at Knowledge Graphs and E-commerce Workshop, San Diego, CA, August 2020) [Slides]
  3. Building a Broad Knowledge Graph for Products (By Luna Dong, Keynote at IEEE International Conference on Data Engineering (ICDE), Macau, China, April 2019) [Slides]

Research Papers

  1. AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types (KDD 2020, Applied Data Science Track) [Paper]🌟
  2. GoodsKG - a Product Knowledge Graph Project [GitHub]
  3. AliCoCo: Alibaba E-commerce Cognitive Concept Net (SIGMOD 2020 Industry Track) [Paper] [Github]🌟
  4. Product Knowledge Graph Embedding for E-commerce (WSDM 2020) [Paper]🌟
  5. Towards Knowledge-Based Personalized Product Description Generation in E-commerce [Paper, applied science track] (KDD 2019) 🌟
  6. TXtract: Taxonomy-aware knowledge extraction for thousands of product categories (ACL 2020)
  7. Automatic validation of textual attribute values in eCommerce Catalog by learning with limited labeled data (KDD 2020) 🌟
  8. Octet: Online catalog taxonomy enrichment with self-supervision (KDD 2020) 🌟
  9. OpenTag: Open attribute value extraction from product profiles (KDD 2018) 🌟
  10. DEXTER: Large-scale discovery and extraction of product specifications on the Web (VLDB 2016) 🌟
  11. P-Companion: A principled framework for diversified complementary product recommendation (CIKM 2020)
  12. J-Recs: Principled and scalable recommendation justification (ICDM 2020)
  13. PAM: Understanding Product Images in Cross Product Category Attribute Extraction (KDD 2021) [Paper]
  14. AliCoCo2: Commonsense Knowledge Extraction, Representation and Application in E-commerce (KDD 2021) 🌟
  15. AliCG : Alibaba Conceptual Graph for Semantic Search (KDD 2021) 🌟
  16. AliMe KG : Alibaba domain knowledge graph in E-commerce (CIKM 2020)
  17. Embedding-based Product Retrieval in Taobao Search (KDD 2021) [Paper] 🌟
  18. Weakly-Supervised Opinion Summarization by Leveraging External Information (AAAI 2020)
  19. PGE: Robust Product Graph Embedding Learning for Error Detection (arxiv 2022, Luna's team) [Paper]

Datasets

  1. Web Data Commons - Gold Standard for Product Matching and Product Feature Extraction [Link]

💊 Medical KBs

Note: Medical entity linking is also referred to as medical concept normalization (MCN)

Research Papers

  1. MedPath: Augmenting Health Risk Prediction via Medical Knowledge Paths (WWW 2021)
  • Builds a personalized KG to provide personalized prediction and explicit reasoning.
  • The major idea is borrowed from MHGRN (multi-hop graph): Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering (EMNLP 2020) [Paper] [Notes in Chinese]
  2. Medical Entity Disambiguation using Graph Neural Networks (SIGMOD 2021) 🌟
  • This work introduces ED-GNN based on three representative GNNs (GraphSAGE, R-GCN, and MAGNN) for Medical ED.
  • There are two optimization techniques: (1) a novel strategy to represent entities mentioned in text snippets as a query graph; (2) an effective negative sampling strategy.
  3. Property Graph Schema Optimization for Domain-Specific Knowledge Graphs (ICDE 2021) 🌟
  4. MEDTO: Medical Data to Ontology Matching Using Hybrid Graph Neural Networks (KDD 2021) 🌟
  5. DETERRENT: Knowledge Guided Graph Attention Network for Detecting Healthcare Misinformation (KDD 2020) 🌟 [Paper] [GitHub] Healthcare Misinformation Detection
  • Addresses the novel problem of explainable healthcare misinformation detection (from the web), leveraging a medical knowledge graph to better capture the high-order relations between entities.
  • R-GCN (with attention) for KG reasoning plus a text encoder for articles learns a representation of each article; the task is then formulated as classifying whether a news article is fake.
  • The supporting KG: KnowLife: a versatile approach for constructing a large knowledge graph for biomedical sciences [Paper] [Website]
  • Similar basic code (text+GRU+RGCN): Learning to Update Knowledge Graphs by Reading News (EMNLP 2019) [GitHub]
  6. CHEMNER: Fine-Grained Chemistry Named Entity Recognition with Ontology-Guided Distant Supervision (EMNLP 2021) [Paper]
  7. Kformer: Knowledge Injection in Transformer Feed-Forward Layers [Arxiv 2022]
  • There is a medical QA task in the experiment based on a [Medical KB].
  8. Taiyi: A Bilingual Fine-Tuned Large Language Model for Diverse Biomedical Tasks (Arxiv, Nov 2023) [Paper] 🔥
  • Datasets: a comprehensive collection of 140 existing biomedical text mining datasets (38 Chinese datasets and 102 English datasets)
  • Tasks: named entity recognition, relation extraction, text classification, question answering
  9. LLMs Accelerate Annotation for Medical Information Extraction (PMLR 2023) [Paper] 🔥

Datasets

  1. PubMed
  2. MDX [Link]
  3. MIMIC-III [Reference]
  4. Bio CDR [Reference]
  5. NCBI [Reference], NCBID [Reference]
  6. ShARe [Reference]
  7. BioCreative [Reference]
  8. Summary from NormCo [Github]
  9. Datasets provided by [MedType]: [WikiMed] and [PubMedDS]
  10. Unified Medical Language System (UMLS): 4.2 million biomedical concepts, with 127 types
  • There is a UMLS Semantic Network for concept mapping to semantic types?
  11. MedMentions [Reference]
  12. KnowLife
  13. PrimeKG [Github]
  • Precision Medicine Knowledge Graph (PrimeKG) presents a holistic view of diseases. PrimeKG integrates 20 high-quality biomedical resources to describe 17,080 diseases with 4,050,249 relationships representing ten major biological scales.

Useful tools (mainly for NER and EL to preprocess the data)

  1. Resources Collection: AwesomeBioIE [GitHub]
  2. BioBERT for NER (and RE) BioBERT: a pre-trained biomedical language representation model for biomedical text mining [Paper] [GitHub]
  3. DeepMatcher for EM: Deep Learning for Entity Matching: A Design Space Exploration (SIGMOD 2018) [PDF] [Code and Data] 🌟
  4. NCEL for EL: Neural Collective Entity Linking (COLING 2018) [Paper] [Github]
  5. SciSpacy (as neural med-linker): SciSpaCy: Fast and Robust Models for Biomedical Natural Language Processing (arxiv 2019) [GitHub]
  6. cTAKES for medical entity linker (map named entities to UMLS concepts) [Reference]
  7. Quick-UMLS for medical entity linker
  8. MetaMap for medical entity linker (map biomedical mentions in text to UMLS concepts) [Tool]
  • MetaMapLite: reimplements basic MetaMap with an additional emphasis on real-time processing and competitive performance [Tool]
  9. QuickUMLS [GitHub]
  10. MedaCy [GitHub] For NER
  • "After installing medaCy and medaCy's clinical model..." I ran into the same issue as #210 and #209; will figure it out later.
  11. An Advanced Review on Text Mining in Medicine [Website]
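
Several of the linkers above map mentions in text to entries of a concept vocabulary such as UMLS. As a rough illustration of that idea only, here is a toy sketch of approximate dictionary matching with character trigram Jaccard similarity. It is not the algorithm of any tool listed here; the function names, the threshold, and the `C-DEMO-*` identifiers are all made up for the example and are not real UMLS concept IDs.

```python
from typing import Dict, Optional, Set, Tuple


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a lowercased, '#'-padded string."""
    pad = "#" * (n - 1)
    padded = f"{pad}{text.lower()}{pad}"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets (defined as 1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def link_mention(
    mention: str,
    dictionary: Dict[str, str],
    threshold: float = 0.6,  # hypothetical cut-off, chosen for illustration
) -> Tuple[Optional[str], float]:
    """Return (concept_id, score) of the best dictionary term, or (None, score)
    when the best score falls below the threshold."""
    grams = char_ngrams(mention)
    best_id: Optional[str] = None
    best_score = 0.0
    for term, concept_id in dictionary.items():
        score = jaccard(grams, char_ngrams(term))
        if score > best_score:
            best_id, best_score = concept_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


# Toy vocabulary with made-up identifiers (NOT real UMLS CUIs).
toy_dictionary = {
    "myocardial infarction": "C-DEMO-1",
    "diabetes mellitus": "C-DEMO-2",
}

print(link_mention("Myocardial infarctions", toy_dictionary))
print(link_mention("aspirin", toy_dictionary))
```

A linear scan like this only works for tiny vocabularies; at UMLS scale (millions of concepts) an index over the n-grams would be needed, which is one reason to use the dedicated tools above.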

People

  1. Fatma Özcan [DBLP]
  2. Lei Chuan [Website]
  3. Xuan Wang [Website]

Materials

  1. The Construction and Applications of Medical KGs (in Chinese, 医疗领域图谱的构建与应用) [Link]

💰 Finance KBs

Survey and Interesting Discussion

  1. Financial Risk Analysis for SMEs with Graph-based Supply Chain Mining (IJCAI 2020, Special Track on AI in FinTech) [Paper]
  • The SME graph as well as the labeled data for supply chain mining are from Alipay.
  2. Survey | A Review of Industry Progress on GNNs for Financial Risk Control (in Chinese) [Link]

Datasets

  1. Fannie Mae Single-Family Loan Performance Data [Link 1] [Link 2]
  2. Data Set and Evaluation of Automated Construction of Financial Knowledge Graph [Link]
  3. Enterprise Knowledge Graph [Link]
  4. Financial Temporal Hypergraph Ontology (FTHO) [Link]
  5. Fund Knowledge Graph [Link]
  6. Other Chinese finance-related knowledge graph datasets [Link]

🙍 Personalized KB

Research Papers

  1. Personalized Knowledge Graph Summarization: From the Cloud to Your Pocket (ICDM 2019 Best Paper) [Paper]
  • Knapsack, submodular objective function, (1 − 1/e)-approximation algorithm
  2. What is Normal, What is Strange, and What is Missing in a Knowledge Graph: Unified Characterization via Inductive Summarization (WWW 2020) [Paper]
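
The note on submodular objectives and the (1 − 1/e) guarantee can be illustrated with the classic greedy algorithm. The sketch below is a simplification, not the method of the summarization paper: it uses a cardinality budget (pick at most `budget` items) instead of a knapsack constraint, and maximum coverage as a stand-in monotone submodular objective. For that simplified setting, greedy selection is known to achieve a (1 − 1/e) approximation. All names and the toy data are hypothetical.

```python
from typing import Dict, Hashable, List, Set


def greedy_max_coverage(
    candidates: Dict[str, Set[Hashable]], budget: int
) -> List[str]:
    """Greedily pick up to `budget` candidate sets to maximize the number of
    covered elements. Coverage is monotone submodular, so this greedy choice
    is a (1 - 1/e)-approximation under a cardinality constraint."""
    chosen: List[str] = []
    covered: Set[Hashable] = set()
    remaining = dict(candidates)
    for _ in range(budget):
        best_name, best_gain = None, 0
        # Sorted iteration makes tie-breaking deterministic (alphabetical).
        for name in sorted(remaining):
            gain = len(remaining[name] - covered)  # marginal gain
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:  # nothing adds new coverage
            break
        chosen.append(best_name)
        covered |= remaining.pop(best_name)
    return chosen


# Toy example: each candidate "covers" a set of facts a user might query.
toy_candidates = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1},
}
print(greedy_max_coverage(toy_candidates, budget=2))  # ['c', 'a']
```

With a true knapsack constraint (items of different sizes), plain greedy no longer carries the same guarantee on its own; variants that rank by gain per unit cost and combine that with partial enumeration are the usual route.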

📅 Event KGs

Research Papers

  1. Searching News Articles Using an Event Knowledge Graph Leveraged by Wikidata (WWW 2019) [Paper]
  2. NewsLink: Empowering Intuitive News Search with Knowledge Graphs (ICDE 2021)
  3. ASER: A Large-scale Eventuality Knowledge Graph (WWW 2020) [Paper] [Code]

💭 Opinion Graphs

Research Papers

  1. Constructing Explainable Opinion Graphs from Reviews (WWW 2021) [Paper] [Code]

📝 Others

  1. Knowledge-aware Assessment of Severity of Suicide Risk for Early Intervention (WWW 2019)
  2. FoodKG: A Semantics-Driven Knowledge Graph for Food Recommendation (ISWC 2019) [Github]
  3. Modern Natural Language Processing Techniques for Scientific Web Mining: Tasks, Data, and Tools (WWW 2022 tutorial)
  4. UUKG: Unified Urban Knowledge Graph Dataset for Urban Spatiotemporal Prediction (NeurIPS 2023, Datasets and Benchmarks Track) [Paper]