LOD-Community-Detection

This is the Java source code for our ISWC 2018 paper, "Detecting Erroneous Identity Links on the Web using Network Metrics".

Goal of the experiments:

In this work, we show how network metrics, such as the community structure of the owl:sameAs graph, can be used to detect possibly erroneous identity statements. To detect the community structure inside each equality set, we use the Louvain algorithm. Using the resulting communities, we assign an error degree to each owl:sameAs link. This error degree is a value between 0.0 (possibly correct link) and 1.0 (possibly erroneous link).
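To make the idea of an error degree concrete, here is a minimal sketch that scores links from a given community assignment. It is a simplified, density-based illustration and not the formula used in the paper: the class `ErrorDegreeSketch`, the method `score`, and the scoring rule itself are assumptions, and the community assignment is taken as input rather than computed with Louvain. It also assumes there are no self-links.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: a simplified density-based score,
// NOT the error degree formula defined in the paper.
final class ErrorDegreeSketch {

    /**
     * Scores each undirected link {a, b} with a value in [0.0, 1.0].
     * 'community' maps every term to a community id; it is assumed to come
     * from a community detection run (e.g. Louvain), which is not implemented here.
     * Self-links (a equal to b) are assumed not to occur.
     */
    static double[] score(List<String[]> links, Map<String, Integer> community) {
        // Number of terms in each community.
        Map<Integer, Integer> size = new HashMap<>();
        for (int c : community.values()) {
            size.merge(c, 1, Integer::sum);
        }

        // Number of links for each unordered pair of communities.
        Map<String, Integer> linkCount = new HashMap<>();
        for (String[] l : links) {
            linkCount.merge(key(community.get(l[0]), community.get(l[1])), 1, Integer::sum);
        }

        double[] result = new double[links.size()];
        for (int i = 0; i < links.size(); i++) {
            int ca = community.get(links.get(i)[0]);
            int cb = community.get(links.get(i)[1]);
            // Maximum number of undirected links inside one community,
            // or between two different communities.
            double possible = (ca == cb)
                    ? size.get(ca) * (size.get(ca) - 1) / 2.0
                    : (double) size.get(ca) * size.get(cb);
            double density = linkCount.get(key(ca, cb)) / possible;
            // Sparse connections yield a score close to 1.0 (more suspicious).
            result[i] = 1.0 - Math.min(1.0, density);
        }
        return result;
    }

    private static String key(int x, int y) {
        return Math.min(x, y) + ":" + Math.max(x, y);
    }
}
```

Under this simplified rule, a link inside a fully connected community scores 0.0, while a lone link bridging two otherwise unconnected communities scores close to 1.0.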

Replicating the experiments in the paper requires two external resources:

  1. Download the sameAs.cc dataset.

This dataset contains 558.9 million owl:sameAs links collected from the 2015 LOD Laundromat crawl of over 650K data documents from the Web. It is published as a single 5 GB HDT file and is publicly accessible via a Linked Data Fragments (LDF) interface.

  2. Download the equivalence classes.

This dataset of equivalence classes results from the closure of all 558 million owl:sameAs links in the sameAs.cc dataset.
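As an illustration of what such a closure computes, the sketch below groups terms into equivalence classes with a union-find structure, treating owl:sameAs links as undirected edges. This is a minimal in-memory sketch, not the procedure used to build the published dataset; the class `SameAsClosureSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch: equivalence classes as connected components
// of the owl:sameAs graph, computed with union-find.
final class SameAsClosureSketch {
    private final Map<String, String> parent = new HashMap<>();

    /** Returns the representative of x's class, registering x on first sight. */
    String find(String x) {
        parent.putIfAbsent(x, x);
        String root = x;
        while (!parent.get(root).equals(root)) {
            root = parent.get(root);
        }
        // Path compression: point every visited term directly at the root.
        while (!parent.get(x).equals(root)) {
            String next = parent.get(x);
            parent.put(x, root);
            x = next;
        }
        return root;
    }

    /** Records one owl:sameAs statement by merging the classes of s and o. */
    void addSameAs(String s, String o) {
        String rs = find(s);
        String ro = find(o);
        if (!rs.equals(ro)) {
            parent.put(rs, ro);
        }
    }

    /** Returns all equivalence classes, keyed by their representative term. */
    Map<String, Set<String>> classes() {
        Map<String, Set<String>> out = new HashMap<>();
        for (String term : new ArrayList<>(parent.keySet())) {
            out.computeIfAbsent(find(term), k -> new TreeSet<>()).add(term);
        }
        return out;
    }
}
```

For example, adding the links `ex:a`/`ex:b` and `ex:b`/`ex:c` places all three terms in one class, even though `ex:a` and `ex:c` are never linked directly. At the scale of hundreds of millions of links, a string-keyed in-memory map like this one would likely be too costly, so integer identifiers or a disk-backed structure would be a more realistic choice.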

All necessary resources and results are also available in our sameAs.cc Identity Web service.
