DCAT support for all datasets #1138

Closed
pietercolpaert opened this issue Jul 29, 2013 · 8 comments

@pietercolpaert

feature request

Allow people and machines to download a data dump that describes all datasets in a CKAN instance using the DCAT vocabulary. It should be possible to license this dump in accordance with the Open Knowledge Definition.

why

  • Open metadata!
  • This way people will be able to test generic tools on a lot of data

I know that we can already request a single dataset's metadata as RDF, for instance via http://datahub.io/dataset/thesesfr.rdf. To cover all datasets, we could query the API for the list and then make one request per dataset to fetch its RDF. On data.gov, though, the fun ends quickly: http://catalog.data.gov/api/3/action/package_list times out (see #1137).
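For illustration, the list-then-fetch workaround described above could be sketched roughly as follows. This is a minimal sketch, not an existing tool: the function names (`http_get`, `list_dataset_names`, `rdf_url`, `dump_all_rdf`) are hypothetical, the URL patterns are the ones quoted in this comment, and it assumes the `package_list` action returns the usual CKAN action API JSON with `success` and `result` keys.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def http_get(url, timeout=30):
    """Fetch a URL and return the response body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def list_dataset_names(base_url, fetch=http_get):
    """Return the dataset names reported by the package_list action."""
    body = fetch(base_url.rstrip("/") + "/api/3/action/package_list")
    payload = json.loads(body)
    # Assumption: the action API wraps its answer as {"success": ..., "result": [...]}.
    if not payload.get("success"):
        raise RuntimeError("package_list call failed")
    return payload["result"]


def rdf_url(base_url, name):
    """Build the per-dataset RDF URL: <base>/dataset/<name>.rdf."""
    return "{}/dataset/{}.rdf".format(base_url.rstrip("/"), quote(name))


def dump_all_rdf(base_url, fetch=http_get):
    """Yield (name, rdf_bytes) pairs, issuing one request per dataset."""
    for name in list_dataset_names(base_url, fetch):
        yield name, fetch(rdf_url(base_url, name))
```

Iterating over `dump_all_rdf("http://datahub.io")` would issue one request for the list plus one per dataset, which is exactly why this approach scales poorly on a large catalogue and fails outright when `package_list` itself times out.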

@rossjones
Contributor

You can schedule your own RDF dump using the rdf-export command-line tool (I suggest running it daily). The code for it is at https://github.com/okfn/ckan/blob/master/ckan/lib/cli.py#L488

The required command-line arguments are documented at the link above.

@pietercolpaert
Author

@rossjones But isn't this a tool for CKAN administrators to run on the server itself? I can't use it to get a dump from data.gov, right?

@rossjones
Contributor

You're correct, you'd have to get the admins there to do it, but it would be a much more efficient way of getting the data than tens of thousands of API calls. At least until there is VoID support.

@pietercolpaert
Author

Is this something that might get implemented quickly, or is this a wontfix?

@rossjones
Contributor

We have an open ticket for it on data.gov.uk, and after that it depends on whether it gets accepted into core CKAN. It is likely to take weeks rather than days, and it won't be real-time (we have 9.5k datasets; the US has a lot more). Again, the administrator would need to set up the background task.

I think the easiest approach at present is to ask the data.gov team to enable the RDF dumps.

@pietercolpaert
Author

Can you link us to that issue?

@rossjones
Contributor

I'm afraid it isn't on GitHub or otherwise openly accessible.

@kindly
Contributor

kindly commented Sep 2, 2013

https://github.com/okfn/ckanext-dcat

This is being addressed here.

@kindly kindly closed this as completed Sep 2, 2013