Table dataset for disambiguation
This data set was compiled from the online version of Wikipedia in June 2013. The table columns have been manually annotated with dcterms:subject and rdf:type annotations.

Format of the data:
- Each file contains one table
- Each line in the file represents a row in the table
- The first row corresponds to the ground-truth annotations of rdf:type types
- The second row corresponds to the ground-truth annotations of dcterms:subject types
- Table columns are separated by a semicolon (";")
- Multiple entries in one cell are separated by a comma (",")
- The last line contains the corresponding column headers

Related publications: under submission

Contact person:
Media Computer Science
University of Passau
stefan.zwicklbauer-at-uni-passau.de

Number of tables: 50

Column statistics:
min: 1
max: 5
mean: 2.64
total columns: 132

Row statistics:
min: 10
max: 232
mean: 54.14
total rows: 2707

Annotation statistics:
total rdf:type annotations: 169
total dcterms:subject annotations: 160
mean rdf:type annotations: 1.29
mean dcterms:subject annotations: 1.21
total annotations: 329
mean annotations: 2.49
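
The file format described above can be read with a few lines of Python. The following is a minimal sketch, not part of the data set itself: the names `AnnotatedTable`, `parse_table`, `split_row`, and `split_entries` are hypothetical, and the sample labels are made up for illustration. It assumes that cells contain no escaped semicolons, that blank lines can be ignored, and that every line between the two annotation rows and the final header line is a data row.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AnnotatedTable:
    rdf_types: List[List[str]]  # per column: rdf:type annotations (first line)
    subjects: List[List[str]]   # per column: dcterms:subject annotations (second line)
    headers: List[str]          # column headers (last line)
    rows: List[List[str]]       # data rows (everything in between)


def split_row(line: str) -> List[str]:
    """Split one line into cells on the semicolon column separator."""
    return [cell.strip() for cell in line.split(";")]


def split_entries(cell: str) -> List[str]:
    """Split one cell into its comma-separated entries, dropping empty ones."""
    return [entry.strip() for entry in cell.split(",") if entry.strip()]


def parse_table(text: str) -> AnnotatedTable:
    """Parse the contents of one table file (assumption: no escaping is used)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 3:
        raise ValueError("expected two annotation lines and a header line")
    return AnnotatedTable(
        rdf_types=[split_entries(c) for c in split_row(lines[0])],
        subjects=[split_entries(c) for c in split_row(lines[1])],
        headers=split_row(lines[-1]),
        rows=[split_row(ln) for ln in lines[2:-1]],
    )


# Made-up sample in the described layout; the labels are illustrative only.
SAMPLE = (
    "ex:City;ex:Country\n"
    "ex:Capitals,ex:Cities;ex:Countries\n"
    "Berlin;Germany\n"
    "Paris;France\n"
    "City;Country\n"
)

if __name__ == "__main__":
    table = parse_table(SAMPLE)
    print(table.headers)
    print(table.rdf_types)
    print(table.rows)
```

Data cells are left unsplit here because a comma inside a cell may be either a list separator or part of a value; call `split_entries` on individual cells where multiple entries are expected.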