Introduction

  • Resource Description Framework (RDF)
    • Resource: any object identified by a URI, such as a web page, image, or video
    • Description: attributes, features and relations among resources
    • Framework: description model, language and syntax
    • Basic unit, the triple: subject -- predicate -- object
    • Hop: path from source entity to target entity
    • SPARQL: query language for RDF
      • Variables in SPARQL start with '?' or '$'
  • Compared with deep learning, a knowledge graph provides interpretability.
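
As a minimal sketch of the triple model and of SPARQL-style variables, the snippet below stores triples as plain tuples and matches a single triple pattern whose variables start with '?' or '$'. The sample data, the `ex:` names, and the `match` helper are illustrative assumptions and not part of any RDF library; a real project would run the query on a SPARQL engine.

```python
# Illustrative RDF-like triples: (subject, predicate, object).
# All names below are made-up example data.
TRIPLES = [
    ("ex:Alice", "ex:knows", "ex:Bob"),
    ("ex:Bob", "ex:knows", "ex:Carol"),
    ("ex:Alice", "ex:worksAt", "ex:Acme"),
]


def is_variable(term):
    """SPARQL variables start with '?' or '$'."""
    return term.startswith(("?", "$"))


def match(pattern, triples):
    """Return one variable binding (dict) per triple that matches the pattern."""
    results = []
    for triple in triples:
        binding = {}
        for p_term, t_term in zip(pattern, triple):
            if is_variable(p_term):
                name = p_term[1:]
                # A repeated variable must bind to the same value.
                if binding.setdefault(name, t_term) != t_term:
                    break
            elif p_term != t_term:
                break
        else:
            # No term mismatched, so the triple matches the pattern.
            results.append(binding)
    return results


# Roughly: SELECT ?who WHERE { ex:Alice ex:knows ?who }
print(match(("ex:Alice", "ex:knows", "?who"), TRIPLES))  # [{'who': 'ex:Bob'}]
```

Chaining two such patterns on a shared variable (for example `ex:Alice ex:knows ?x` followed by `?x ex:knows ?y`) corresponds to a two-hop path from the source entity to the target entity.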

Principle

  • 1. Data
    • 1.1 Structured data (graph mapping)
      • E.g., relational databases (i.e., Database2Rdf), open knowledge graphs (i.e., linked data, graph mapping)
    • 1.2 Semi-structured data (wrapper)
      • E.g., web page, web table, Wikipedia infobox
    • 1.3 Unstructured data (information extraction, often in a closed domain)
      • E.g., natural language, images, video
  • 2. Knowledge extraction from 1.3 Unstructured data
    • 2.1 Entity extraction
      • Value/number detection and recognition
      • Running example
      • Entity linking
        • Definition: find entity mentions in text and link each one to an entity in an existing knowledge graph
        • Entity disambiguation, e.g., with PageRank
        • Coreference resolution (CR)
    • 2.2 Relation extraction
      • Input: unstructured text, a group of entities. Output: a group of triplets, e.g., (First Entity, Second Entity, Relation Type)
      • Methods [1, 2]
        • Pattern / rule matching
          • Trigger word pattern
          • Dependency parsing pattern, where the verb is the trigger word. Running example
        • Supervised method
          • Running example
          • Two classifiers
            • A yes/no classifier determines whether a relation exists
            • A relation classifier determines the exact relation type
        • Semi-supervised method
          • Bootstrapping
            • Example [1].
          • Distant supervision
            • Input: unstructured text, a database containing known entity relations. Output: a set of labeled data
            • Combines bootstrapping and supervised learning. Cannot discover new relation types. Example [1].
            • Deep model: PCNN
    • 2.3 Event extraction
      • Trigger word
      • Time
      • Location
      • Event detection and tracking
  • 3. Knowledge fusion
    • Entity alignment
      • Similar description
      • Similar attribute - value
      • Similar neighbor entities
    • Schema matching
    • Instance matching
  • 4. Knowledge representation learning
    • Convert entity and relationship into vectors
    • Application: link prediction (given S and P, predict O) or relation prediction, knowledge reasoning
    • Translation-based methods
      • TransE: head + relation ≈ tail
        • Drawbacks
          • cannot model one-to-many, many-to-one, or many-to-many relations
          • cannot model symmetric relations
      • TransH
      • TransR
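
The TransE idea above (head + relation ≈ tail) and its use for link prediction can be sketched as follows. This is only an illustration under stated assumptions: the 2-d vectors are hand-picked rather than trained, and `transe_score` and `predict_tail` are hypothetical helper names, not functions from an embedding library.

```python
import math


def transe_score(head, relation, tail):
    """L2 distance ||head + relation - tail||; lower means more plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def predict_tail(head, relation, candidates):
    """Link prediction: return the candidate whose vector is closest to head + relation."""
    return min(candidates, key=lambda name: transe_score(head, relation, candidates[name]))


# Hand-picked 2-d vectors (hypothetical, not trained embeddings).
paris = [1.0, 0.0]
capital_of = [0.0, 1.0]
entities = {"France": [1.0, 1.0], "Japan": [3.0, 1.0]}

print(predict_tail(paris, capital_of, entities))  # France
```

The same scoring function makes the listed drawbacks visible. A zero score for both (h, r, t1) and (h, r, t2) requires t1 = t2 = h + r, so the tails of a one-to-many relation are pushed onto the same point. A symmetric relation would need both h + r = t and t + r = h, which forces r = 0 and h = t.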

Tools

Graph database

Application

  • Question answering
    • How to convert natural language to a query language?
  • Searching
  • Recommender system
  • Event prediction
  • Knowledge reasoning
  • Finance

Examples

Reference

[1] NLP Notes: Relation Extraction (NLP笔记-Relation Extraction)
[2] Knowledge Extraction: Entity and Relation Extraction (知识抽取-实体及关系抽取)