Support for enumerations of datasets #26

Open
Aklakan opened this issue Mar 24, 2015 · 1 comment

Comments

@Aklakan

Aklakan commented Mar 24, 2015

In many cases it would be useful to be able to state that a certain dataset is composed of a set of other datasets.

Consider DBpedia:
There are several files under http://downloads.dbpedia.org/2014/en, each of which can be considered a distribution of a dataset that corresponds to its own content.

Now we may want to state that the dataset that corresponds to the data in the official DBpedia endpoint is comprised of a set of datasets that correspond to the individual files that were loaded into the endpoint.

If I understood correctly, currently this would be accomplished by using void:subset.
However, this would not allow stating multiple alternative enumerations of the datasets that make up a specific larger dataset.
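
To illustrate the limitation, here is a minimal Python sketch that models void:subset statements as plain tuples. The dataset names (`:types`, `:labels`, `:types-and-labels`) and the two groupings are hypothetical, made up for this example. It only shows that once groupings are flattened into void:subset links, the information about which subsets belong together is gone.

```python
# Hypothetical groupings: two different ways to assemble the same target dataset.
TARGET = ":official-dbpedia"
official = [":types", ":labels"]       # individual files
premerged = [":types-and-labels"]      # a third-party pre-merged file


def to_void_subset(target, members):
    """Express a grouping using only (target, void:subset, member) triples."""
    return {(target, "void:subset", member) for member in members}


# Both groupings end up in one flat set of void:subset statements.
flat = to_void_subset(TARGET, official) | to_void_subset(TARGET, premerged)

# All that can be recovered is the list of subsets, not the groupings:
# nothing says that :types and :labels together already cover the target.
subsets = sorted(obj for (_, _, obj) in flat)
print(subsets)
```

From `subsets` alone there is no way to tell that two of the three entries form one complete enumeration and the third forms another.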

Instead, I propose to introduce a DatasetEnumeration:
Such an enumeration explicitly states that a certain targetDataset is composed of exactly a given set of source datasets (i.e. the set of triples obtained by merging the source datasets must equal that of the target dataset).
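
The equality condition can be sketched in a few lines of Python, treating each dataset as a set of (subject, predicate, object) tuples. The function name `is_exact_enumeration` and the sample triples are assumptions for illustration only. The sketch also treats the merge as a plain set union, which ignores the extra care blank nodes would need in a real RDF merge.

```python
def is_exact_enumeration(target, sources):
    """Return True if the union of all source datasets equals the target.

    Each dataset is an iterable of (subject, predicate, object) tuples.
    Assumption: no blank nodes, so a merge is just a set union.
    """
    merged = set().union(*sources)
    return merged == set(target)


# Hypothetical sample data standing in for the real DBpedia files.
types = {(":Berlin", "rdf:type", ":City")}
labels = {(":Berlin", "rdfs:label", "Berlin")}
endpoint = types | labels

print(is_exact_enumeration(endpoint, [types, labels]))  # both sources: exact
print(is_exact_enumeration(endpoint, [types]))          # subset only: not exact
```

The second call shows the difference from a plain subset relation: `types` is contained in `endpoint`, but on its own it does not enumerate it.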

This would also allow stating multiple enumerations for a single dataset, such as a mixture of official enumerations and custom ones (such as pre-merged files (e.g. types, geo, properties + labels) of DBpedia hosted on a third party server).

Also, metadata can then be attached to the enumerations themselves, such as who created them and when.

:foo
    a dataid:DatasetEnumeration ;
    dcterms:modified "someDate"^^xsd:dateTime ;
    dcterms:contributor :user1235 ;
    skos:theme :official ; # Tag the enumeration as an official one so we can filter by that tag
    rdfs:comment "Datasets included in the official DBpedia"@en ;
    dataid:targetDataset :official-dbpedia-sparql-endpoint ;
    dataid:sourceDatasets (
        :dbpedia-types-2014
        :dbpedia-mapped-properties-2014
        :dbpedia-labels-en-2014
        ...
    ) # Don't actually use blank nodes - it is just written like this for convenience.
    .
@chile12
Contributor

chile12 commented Jan 20, 2016

void:subset should do the trick, in my opinion
