Matchmaker - a tool for semi-supervised label matching

sklarman/matchmaker

Matchmaker

A prototype of a tool for semi-automated (interactive) label matching.

The intended use case is matching labels extracted from text (noun phrases represented as bags of words) against the most relevant, semantically related labels of concepts in a given SKOS taxonomy. The resulting matches can be used to annotate the text with SKOS concepts and/or to extend the SKOS-based knowledge graph with new concepts originating in unstructured data.

Key components involved:

Example data includes:

This work is HEAVILY UNDER CONSTRUCTION!

A sample workflow:

  1. Supervised training of a word-matching classifier that accounts for inflectional variants and minor typos in the source and target labels, e.g.:

logic - logics (true)

logic - logica (true)

logic - logically (true)

logic - login (false)
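
The training step can be illustrated with a deliberately small sketch. It is not the classifier used in Matchmaker, whose features and learning algorithm are not described here. It assumes a single hypothetical feature (the shared-prefix ratio) and fits a one-feature decision stump to the four labelled pairs above. The names `common_prefix_ratio`, `train_threshold`, and `words_match` are illustrative only.

```python
def common_prefix_ratio(a, b):
    """Length of the shared prefix divided by the length of the shorter word."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    prefix = 0
    for char_a, char_b in zip(a, b):
        if char_a != char_b:
            break
        prefix += 1
    return prefix / shorter


def train_threshold(pairs):
    """Fit a decision stump: the midpoint between the highest-scoring
    negative pair and the lowest-scoring positive pair.
    Assumes the training pairs are separable on this one feature."""
    positives = [common_prefix_ratio(a, b) for a, b, label in pairs if label]
    negatives = [common_prefix_ratio(a, b) for a, b, label in pairs if not label]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative pair")
    return (min(positives) + max(negatives)) / 2


def words_match(a, b, threshold):
    """Classify a word pair as matching when its feature reaches the threshold."""
    return common_prefix_ratio(a, b) >= threshold


# The labelled pairs listed above.
TRAINING_PAIRS = [
    ("logic", "logics", True),
    ("logic", "logica", True),
    ("logic", "logically", True),
    ("logic", "login", False),
]

threshold = train_threshold(TRAINING_PAIRS)
for a, b, _ in TRAINING_PAIRS:
    print(a, "-", b, words_match(a, b, threshold))
```

A single prefix feature cannot recover typos near the start of a word, so a realistic classifier would likely combine several string-similarity features.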

  2. Applying the classifier to generate mappings from the words in the source and target labels to the WordNet vocabulary, e.g.:

intelligence - intelligentsia|intelligently|intelligent|intelligence
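
A rough sketch of this mapping step follows. It assumes the trained classifier is available as a callable and substitutes a hypothetical prefix-based matcher (`prefix_match`, with an arbitrary threshold of 0.8) for it. The vocabulary is a small hand-picked sample, not the real WordNet lemma list.

```python
def shared_prefix_ratio(a, b):
    """Length of the shared prefix divided by the length of the shorter word."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    prefix = 0
    for char_a, char_b in zip(a, b):
        if char_a != char_b:
            break
        prefix += 1
    return prefix / shorter


def prefix_match(a, b, threshold=0.8):
    """Stand-in for the trained word-matching classifier (an assumption)."""
    return shared_prefix_ratio(a, b) >= threshold


def map_to_vocabulary(word, vocabulary, is_match=prefix_match):
    """Return every vocabulary entry that the matcher accepts for `word`."""
    return [entry for entry in vocabulary if is_match(word, entry)]


# Hypothetical sample; a real run would iterate over WordNet lemmas instead.
VOCABULARY = [
    "integer", "intelligence", "intelligent", "intelligently",
    "intelligentsia", "interest", "logic",
]

candidates = map_to_vocabulary("intelligence", VOCABULARY)
print("intelligence - " + "|".join(candidates))
```

Passing the matcher as a parameter keeps the mapping step independent of how the classifier was trained.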

  3. Generating bags of words from noun phrases extracted from labels (here, conference names and SKOS labels), e.g.:

logic programming artificial intelligence reasoning

Logic for Programming, Artificial Intelligence, and Reasoning - 19th International Conference, LPAR-19, Stellenbosch, South Africa, December 14-19, 2013. Proceedings, http://dblp.l3s.de/d2r/page/publications/conf/lpar/2013

automated reasoning

Automated reasoning (ACM:10003794)
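
The two examples above can be approximated with a minimal sketch. It does not perform real noun-phrase extraction. It assumes that the relevant phrase has already been isolated (here, the leading part of the conference title) and simply lowercases, tokenizes, and removes a small hypothetical stopword list and any parenthesised identifier.

```python
import re

# Hypothetical stopword list, chosen only for this example.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def bag_of_words(label):
    """Turn a label into an ordered list of distinct content words."""
    # Drop parenthesised identifiers such as "(ACM:10003794)".
    label = re.sub(r"\([^)]*\)", " ", label)
    tokens = re.findall(r"[a-z]+", label.lower())
    bag = []
    for token in tokens:
        if token not in STOPWORDS and token not in bag:
            bag.append(token)
    return bag


title = "Logic for Programming, Artificial Intelligence, and Reasoning"
print(" ".join(bag_of_words(title)))
print(" ".join(bag_of_words("Automated reasoning (ACM:10003794)")))
```

Handling the full proceedings title (dates, venue, ordinal numbers) would require an actual noun-phrase chunker, which this sketch leaves out.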

  4. Generating similarity matrices between source and target bags of words, e.g.:
+------------+--------------------+--------------------+--------------------+
|(NULL)      |automated           |reasoning           |(NULL)              |
+------------+--------------------+--------------------+--------------------+
|logic       |0.013245033112582781|0.043010752688172046|0.043010752688172046|
+------------+--------------------+--------------------+--------------------+
|programming |0.19444444444444445 |0.04395604395604396 |0.19444444444444445 |
+------------+--------------------+--------------------+--------------------+
|artificial  |0.013513513513513514|0.015625            |0.015625            |
+------------+--------------------+--------------------+--------------------+
|intelligence|0.053763440860215055|0.5                 |0.5                 |
+------------+--------------------+--------------------+--------------------+
|reasoning   |0.013333333333333334|1.0                 |1.0                 |
+------------+--------------------+--------------------+--------------------+
|(NULL)      |0.19444444444444445 |1.0                 |null                |
+------------+--------------------+--------------------+--------------------+
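
Judging from the numbers, the `(NULL)` column and row in the table above appear to hold the row-wise and column-wise maxima. This reading is an inference, not something the text states. Under that assumption, the matrix can be sketched as follows. The function `similarity_matrix` is illustrative, and the word-level scores are copied from the table as a stand-in for whatever similarity measure the tool actually uses.

```python
def similarity_matrix(source_bag, target_bag, similarity):
    """Build the pairwise score matrix plus its row and column maxima."""
    matrix = [[similarity(s, t) for t in target_bag] for s in source_bag]
    row_max = [max(row) for row in matrix]          # best target word per source word
    col_max = [max(col) for col in zip(*matrix)]    # best source word per target word
    return matrix, row_max, col_max


# Scores copied from the table above (not computed here).
TABLE_SCORES = {
    ("logic", "automated"): 0.013245033112582781,
    ("logic", "reasoning"): 0.043010752688172046,
    ("programming", "automated"): 0.19444444444444445,
    ("programming", "reasoning"): 0.04395604395604396,
    ("artificial", "automated"): 0.013513513513513514,
    ("artificial", "reasoning"): 0.015625,
    ("intelligence", "automated"): 0.053763440860215055,
    ("intelligence", "reasoning"): 0.5,
    ("reasoning", "automated"): 0.013333333333333334,
    ("reasoning", "reasoning"): 1.0,
}


def lookup_similarity(source_word, target_word):
    """Stand-in similarity function backed by the table values."""
    return TABLE_SCORES[(source_word, target_word)]


source = ["logic", "programming", "artificial", "intelligence", "reasoning"]
target = ["automated", "reasoning"]
matrix, row_max, col_max = similarity_matrix(source, target, lookup_similarity)
print(row_max)
print(col_max)
```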

  5. Propagating the matching score information to the neighboring SKOS concepts.

  6. Training a label-matching classifier using users' accept/reject responses to the successively proposed matches.

  7. Generating mappings by means of the classifier.
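
The last three steps are described only briefly, so the following is a speculative sketch rather than the tool's method. It assumes one-hop score propagation with a fixed decay over a toy concept graph, and a perceptron that is updated from accept/reject responses. The concept identifiers, the decay factor, the two features, and all function names are hypothetical.

```python
def propagate_scores(scores, neighbors, decay=0.5):
    """One propagation round: each concept keeps its own score or takes
    the decayed best score among its neighbors, whichever is larger."""
    propagated = {}
    for concept, score in scores.items():
        neighbor_scores = [scores.get(n, 0.0) for n in neighbors.get(concept, [])]
        best_neighbor = max(neighbor_scores, default=0.0)
        propagated[concept] = max(score, decay * best_neighbor)
    return propagated


def predict(weights, bias, features):
    """Return 1 (propose or accept the match) or 0 (reject)."""
    activation = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if activation > 0 else 0


def update(weights, bias, features, accepted, learning_rate=0.5):
    """Perceptron update from a single accept/reject response."""
    error = (1 if accepted else 0) - predict(weights, bias, features)
    new_weights = [w + learning_rate * error * x for w, x in zip(weights, features)]
    return new_weights, bias + learning_rate * error


# Toy graph: A - B - C, e.g. linked by broader/narrower relations.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
direct = {"A": 1.0, "B": 0.0, "C": 0.2}
propagated = propagate_scores(direct, neighbors)

# Simulated user responses: (direct score, propagated score) -> accepted?
responses = [([1.0, 0.5], True), ([0.04, 0.02], False)]
weights, bias = [0.0, 0.0], 0.0
for _ in range(10):                       # a few passes over the responses
    for features, accepted in responses:
        weights, bias = update(weights, bias, features, accepted)

print(propagated)
print(predict(weights, bias, [1.0, 0.5]), predict(weights, bias, [0.04, 0.02]))
```

In an interactive setting the update would run after each user response, and the current weights would decide which match to propose next.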
