Export data (resources) and metadata (datapackage.json) in zip file #52
Comments
I too was surprised to learn that CSVs are replaced by URLs. Perhaps offering a choice would be helpful.
Exporting a ZIP file that includes the data as well as the descriptor was one of the first things discussed when this extension was started. You can follow some of the discussion with @vitorbaptista here: #30. Basically, this is something that is relatively straightforward on paper but hard to implement in a way that is safe for all CKAN instances.

Generating the zip file on demand is dangerous (imagine exporting a dataset like this), so it's almost a given that the zipped data package should be generated asynchronously in a queue and stored at a given location. Starting from CKAN 2.7 we have really nice support for background jobs (which can also be enabled on 2.6). But this presents further problems: you need to trigger a rebuild of the cached zip file whenever the dataset or its resources are updated, so there might be a period when the versions are out of sync.

Then there is the issue of whether to run the async creation on demand (i.e. when someone wants to export the data package) or to pregenerate all data packages in the background. The UI for the former is difficult to implement (what happens after the user clicks, and while the data package is being created?), but the latter basically implies keeping a duplicate of all the uploaded files (minus the compression rate) for all datasets, for a feature that might not be heavily used (until data packages take over the world, of course). Not all maintainers might be keen on this.

Perhaps the implementation I would be more keen on would be:
Even this one is not trivial to implement, so if we decide to go for it I'd spec it more thoroughly. A ballpark estimate for this feature is 4-7 days.
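To make the trade-offs above a bit more concrete, here is a minimal sketch of the "build asynchronously, cache, and rebuild when stale" idea. It is not the extension's actual code. It uses only the Python standard library, and every name in it (`build_datapackage_zip`, `ZipCache`, `export_job`, and the use of a modification timestamp as the cache version) is a hypothetical choice made for illustration. In a real CKAN instance `export_job` would presumably be run through the background job queue and the cache would live on disk or in a storage backend rather than in memory.

```python
import io
import json
import zipfile


def build_datapackage_zip(descriptor, resource_files):
    """Return the bytes of a zip holding datapackage.json plus the data files.

    descriptor: dict in Data Package form (hypothetical minimal example).
    resource_files: mapping of path inside the zip -> file content as bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("datapackage.json", json.dumps(descriptor, indent=2))
        for path, content in resource_files.items():
            archive.writestr(path, content)
    return buffer.getvalue()


class ZipCache:
    """Toy in-memory cache: one zip per dataset, tagged with the version it was built from."""

    def __init__(self):
        self._entries = {}

    def get(self, dataset_id, modified):
        entry = self._entries.get(dataset_id)
        if entry is not None and entry[0] == modified:
            return entry[1]
        return None  # nothing cached, or the dataset changed since the zip was built

    def put(self, dataset_id, modified, data):
        self._entries[dataset_id] = (modified, data)


def export_job(cache, dataset_id, modified, descriptor, resource_files):
    """Body of the (assumed) background job: rebuild only when the cached zip is stale."""
    cached = cache.get(dataset_id, modified)
    if cached is not None:
        return cached
    data = build_datapackage_zip(descriptor, resource_files)
    cache.put(dataset_id, modified, data)
    return data


if __name__ == "__main__":
    # Resource paths point at files inside the zip rather than at URLs.
    descriptor = {
        "name": "demo",
        "resources": [{"name": "cities", "path": "data/cities.csv"}],
    }
    files = {"data/cities.csv": b"id,name\n1,Lima\n"}
    cache = ZipCache()
    payload = export_job(cache, "demo", "2017-01-01T00:00:00", descriptor, files)
    print(zipfile.ZipFile(io.BytesIO(payload)).namelist())
```

Comparing a stored version tag against the dataset's current one is what keeps the out-of-sync window bounded: a stale zip is never served once the job has run again, but between the dataset update and the rebuild there is still a gap, which is the UI problem described above.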