Linked Open Data-based knowledge panel built during a seminar at Karlsruhe Institute of Technology

Linked Open Data Seminar 2016 - Knowledge Panel

Description

This project was created for the Linked Open Data (LOD) Seminar at AIFB at the Karlsruhe Institute of Technology. The goal was to integrate multiple LOD sources (as a first step, only DBPedia and Yago) and to build a knowledge panel or fact box on top of them, similar to those known from Google or Wikipedia. A major challenge was determining which properties of an entity, e.g. dbp:Karlsruhe, are relevant and meaningful enough to display to the user and which are not. This required a ranking of properties for specific entities or classes (rdf:type) of entities that also works across multiple, distinct sources. While [1] already presents a good solution based on supervised machine learning (although it works for only one dataset, namely DBPedia), our approach is based on rather naive statistical metrics such as TF-IDF. Our evaluation is based on rank-biased overlap (RBO), as described in [2].

[1] Dessi, A., & Atzori, M. (2016). A machine-learning approach to ranking RDF properties. Future Generation Computer Systems, 54, 366–377. http://doi.org/10.1016/j.future.2015.04.018

[2] Webber, W., Moffat, A., & Zobel, J. (2010). A similarity measure for indefinite rankings. ACM Transactions on Information Systems, 28(4), 1–38. http://doi.org/10.1145/1852102.1852106
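
To make the TF-IDF idea more concrete, here is a minimal Python sketch of how properties could be ranked per class. It is not the code from this repository. It assumes that a "document" is a class (rdf:type), a "term" is a property, TF is the share of a class's property usages that fall on that property, and IDF is the logarithm of the number of classes divided by the number of classes using the property. The function name `rank_properties`, the input layout and the counts are made up for illustration.

```python
import math
from collections import defaultdict


def rank_properties(usage):
    """Rank properties per class by a simple TF-IDF score.

    usage: {class_name: {property_name: usage_count}} (hypothetical layout)
    returns: {class_name: [(property_name, score), ...]}, best first
    """
    n_classes = len(usage)

    # Document frequency: in how many classes does each property occur?
    df = defaultdict(int)
    for props in usage.values():
        for prop in props:
            df[prop] += 1

    rankings = {}
    for cls, props in usage.items():
        total = sum(props.values())
        scores = {
            # TF: relative frequency within the class; IDF: rarity across classes
            prop: (count / total) * math.log(n_classes / df[prop])
            for prop, count in props.items()
        }
        # Sort by descending score, break ties alphabetically
        rankings[cls] = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return rankings


# Toy counts, invented for this example only
usage = {
    "City": {"populationTotal": 90, "country": 95, "label": 100},
    "Person": {"birthDate": 80, "label": 100},
}
ranking = rank_properties(usage)
city_order = [prop for prop, _ in ranking["City"]]
```

In this toy input, `label` occurs in every class, so its IDF is zero and it ends up last, while class-specific properties such as `country` move to the top. That is the effect a knowledge panel wants: generic properties are demoted in favor of those that characterize the class.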

Implementation

The project consists of four software components.

  • Preprocessing scripts: Extract statistics from the LOD graphs and calculate TF and IDF values from them.
  • Backend: Computes the entity-specific, multi-source property ranking at runtime and, based on it, constructs a combined RDF graph from DBPedia and Yago, serialized as JSON-LD. Exposed as a RESTful web service.
  • Frontend: Single-page app serving as the user interface. It queries the backend based on user input and renders a knowledge panel from the RDF graph in the response.
  • Evaluation: Scripts that facilitate "manual" computation of RBO metrics for specific entities.
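
To illustrate the kind of computation the evaluation scripts support, the following Python sketch computes a truncated form of rank-biased overlap for two rankings. It is a simplified illustration and not the code in the evaluation directory. It sums the weighted agreement only up to the length of the shorter list and does not implement the extrapolated variant from [2]. The function name `rbo_truncated`, the default `p = 0.9` and the example property lists are assumptions.

```python
def rbo_truncated(s, t, p=0.9):
    """Truncated rank-biased overlap of two rankings without duplicates.

    Computes (1 - p) * sum_{d=1..k} p^(d-1) * A_d, where A_d is the share of
    items that the top-d prefixes of both lists have in common and k is the
    length of the shorter list. The persistence p (0 < p < 1) controls how
    strongly the top ranks are weighted.
    """
    k = min(len(s), len(t))
    total = 0.0
    for d in range(1, k + 1):
        # Agreement at depth d: size of the prefix intersection divided by d
        agreement = len(set(s[:d]) & set(t[:d])) / d
        total += p ** (d - 1) * agreement
    return (1 - p) * total


# Hypothetical rankings: one computed, one hand-made reference
computed = ["country", "populationTotal", "label"]
reference = ["country", "populationTotal", "label"]
score_same = rbo_truncated(computed, reference, p=0.5)
score_swapped = rbo_truncated(["a", "b"], ["b", "a"], p=0.5)
```

Because the sum is cut off at depth k, two identical lists of length k score 1 - p^k rather than 1. This matters when comparing scores across entities with different numbers of properties.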

UML component diagram

UML sequence diagram

Team

License

MIT