Add ConceptNet #15

cthoyt · 2020-06-27T12:57:32Z

Issue #2 brought attention to ConceptNet as a possible dataset to include with PyKEEN. They provide a tab-separated dump of the database here. The dump is not pre-stratified into training/testing/evaluation sets.

Because this file has additional columns besides head, relation, and tail, its inclusion will also require an update to the SingleTabbedDataset so that the usecols keyword argument can be specified in the dataset's __init__().
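A minimal sketch of the column selection described above, using pandas directly rather than PyKEEN's own loader. The sample rows, the `load_triples` helper, and the assumed five-column layout (edge URI, relation, start node, end node, JSON metadata) are illustrative assumptions, not taken from the PyKEEN code base:

```python
import csv
import io

import pandas as pd

# Hypothetical rows imitating a five-column tab-separated dump.
# Assumed layout: edge URI, relation, start node, end node, JSON metadata.
SAMPLE = (
    '/a/1\t/r/IsA\t/c/en/cat\t/c/en/animal\t{"weight": 1.0}\n'
    '/a/2\t/r/PartOf\t/c/en/wheel\t/c/en/car\t{"weight": 2.0}\n'
)


def load_triples(buffer, usecols=(2, 1, 3)):
    """Read only the head, relation, and tail columns of a TSV file.

    ``usecols`` lists the column positions in (head, relation, tail) order.
    """
    df = pd.read_csv(
        buffer,
        sep="\t",
        header=None,
        usecols=list(usecols),
        dtype=str,
        quoting=csv.QUOTE_NONE,  # treat quotes in the JSON column literally
    )
    # pandas keeps the selected columns in file order, so reorder explicitly.
    return df[list(usecols)].to_numpy()


triples = load_triples(io.StringIO(SAMPLE))
```

Passing the positions through a keyword argument keeps the loader generic: a dataset whose file has only three columns can leave the default untouched, while one with extra columns only needs to override `usecols`.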

Blocked by #196 because the splitting algorithm is currently too slow for big datasets (more than ~5 million triples).
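For illustration only, a naive random split can be sketched with NumPy as below. This is not the splitting algorithm discussed in #196; the `random_split` helper and the 80/10/10 ratios are assumptions, and the sketch makes no attempt to guarantee that every entity and relation also appears in the training set:

```python
import numpy as np


def random_split(triples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the rows of ``triples`` and cut them into len(ratios) parts."""
    rng = np.random.default_rng(seed)
    n = len(triples)
    # A single permutation of row indices keeps the shuffle O(n).
    idx = rng.permutation(n)
    # Turn the ratios into cumulative cut points, e.g. 80 and 90 for n = 100.
    bounds = np.round(np.cumsum(ratios)[:-1] * n).astype(int)
    return [triples[part] for part in np.split(idx, bounds)]


# Toy stand-in for an (n, 3) array of head/relation/tail identifiers.
toy_triples = np.arange(300).reshape(100, 3)
training, testing, validation = random_split(toy_triples)
```

Because the whole split is one permutation plus slicing, its cost grows linearly with the number of triples.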



cthoyt added the 💾 Dataset label Jun 27, 2020

cthoyt self-assigned this Jun 27, 2020

cthoyt added a commit that referenced this issue Nov 21, 2020

Add conceptnet

1e48146

Closes #15

cthoyt mentioned this issue Nov 21, 2020

Add ConceptNet #160

Merged

5 tasks

cthoyt mentioned this issue Dec 5, 2020

Splitting algorithm is too slow for large datasets #196

Closed

cthoyt closed this as completed in #160 Dec 10, 2020

cthoyt added a commit that referenced this issue Dec 10, 2020

Add ConceptNet (#160)

f25c011

Closes #15

cthoyt added the 💎 New Component label May 22, 2021
