quhfus/table-disambiguation-corpus

This data set was compiled from the online version of Wikipedia in June 2013. The table columns have been manually annotated with dcterms:subject and rdf:type annotations.

Format of the data:
- Each file contains one table
- Each line in the file represents a row in the table
- The first line contains the ground-truth rdf:type annotations for each column
- The second line contains the ground-truth dcterms:subject annotations for each column
- Table columns are separated by a semicolon (";")
- Multiple entries in one cell are separated by a comma (",")
- The last line contains the corresponding column headers
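
As an illustration of the layout described above, here is a minimal parsing sketch in Python. It is not part of the data set. The names `AnnotatedTable`, `parse_table`, and `SAMPLE` are hypothetical, and the sample values are invented to show the shape of a file, not copied from the corpus. The sketch assumes that cells contain no escaped semicolons, that blank lines can be ignored, and that data cells are kept as raw strings while only the two annotation lines are split on commas.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AnnotatedTable:
    rdf_types: List[List[str]]  # line 1: rdf:type annotations, one list per column
    subjects: List[List[str]]   # line 2: dcterms:subject annotations, one list per column
    rows: List[List[str]]       # data rows between the annotations and the headers
    headers: List[str]          # last line: column headers


def split_cell(cell: str) -> List[str]:
    """Split a multi-entry cell on commas and drop empty entries."""
    return [part.strip() for part in cell.split(",") if part.strip()]


def parse_table(text: str) -> AnnotatedTable:
    """Parse one table file, assuming the line order described above."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < 3:
        raise ValueError("expected two annotation lines and a header line")
    grid = [line.split(";") for line in lines]
    return AnnotatedTable(
        rdf_types=[split_cell(cell) for cell in grid[0]],
        subjects=[split_cell(cell) for cell in grid[1]],
        rows=[[cell.strip() for cell in row] for row in grid[2:-1]],
        headers=[cell.strip() for cell in grid[-1]],
    )


# Invented example content (assumption: real annotation values may look different).
SAMPLE = """
dbo:City;dbo:Country
Category:Capitals,Category:Cities;Category:Countries
Berlin;Germany
Paris;France
City;Country
"""

table = parse_table(SAMPLE)
print(table.headers)
print(table.rdf_types)
```

If data cells in a particular file also hold several comma-separated entries, `split_cell` can be applied to them in the same way.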

Related publications:
under submission

Contact person:
Media Computer Science
University of Passau
stefan.zwicklbauer-at-uni-passau.de



Number of tables: 50

Column statistics:
min: 1
max: 5
mean: 2.64
total columns: 132

Row statistics:
min: 10
max: 232
mean: 54.14
total rows: 2707
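
The means above are the totals divided by the number of tables: 132 / 50 = 2.64 columns and 2707 / 50 = 54.14 rows. A small sketch of how such a summary could be recomputed from per-table counts is shown here. The helper `summarize` and the example counts are hypothetical. Whether the reported row counts include the annotation and header lines is not stated, so that choice is left to the caller.

```python
from statistics import mean
from typing import Dict, List


def summarize(counts: List[int]) -> Dict[str, float]:
    """Return min, max, mean (rounded to two decimals), and total of per-table counts."""
    if not counts:
        raise ValueError("need at least one table")
    return {
        "min": min(counts),
        "max": max(counts),
        "mean": round(mean(counts), 2),
        "total": sum(counts),
    }


# Invented column counts for three tables, for illustration only.
example_column_counts = [1, 5, 2]
print(summarize(example_column_counts))
```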


Annotation statistics:
total rdf:type annotations: 169
total dcterms:subject annotations: 160
mean rdf:type annotations: 1.28
mean dcterms:subject annotations: 1.21
total annotations: 329
mean annotations: 2.49
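
The annotation means appear to be per column rather than per table, since 160 / 132 ≈ 1.21 and 329 / 132 ≈ 2.49. Under that assumption, the counts for a single table could be derived from its two annotation lines roughly as follows. The function `annotation_stats` and the example values are hypothetical, and each argument is expected to hold one list of annotations per column.

```python
from typing import Dict, List


def annotation_stats(
    rdf_types: List[List[str]], subjects: List[List[str]]
) -> Dict[str, float]:
    """Count annotations and compute per-column means for one table."""
    n_columns = len(rdf_types)
    if n_columns == 0 or len(subjects) != n_columns:
        raise ValueError("both annotation lines must cover the same columns")
    total_types = sum(len(column) for column in rdf_types)
    total_subjects = sum(len(column) for column in subjects)
    total = total_types + total_subjects
    return {
        "total_rdf_type": total_types,
        "total_dcterms_subject": total_subjects,
        "total": total,
        "mean_rdf_type": round(total_types / n_columns, 2),
        "mean_dcterms_subject": round(total_subjects / n_columns, 2),
        "mean": round(total / n_columns, 2),
    }


# Invented annotations for a two-column table, for illustration only.
example_types = [["dbo:City"], ["dbo:Country", "dbo:Place"]]
example_subjects = [["Category:Cities"], []]
print(annotation_stats(example_types, example_subjects))
```

Summing these per-table counts over all files and dividing by the total number of columns would give corpus-level figures of the kind listed above.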

About

Table dataset for disambiguation
