Raw data available #4
Comments
What would be the use case? Would Amazon Glacier work, or does this need short retrieval times?
This would want live retrieval, so Glacier is out, but reduced redundancy storage would be fine. But that's a detail.
@pudo does the 25k ETL spit out nice CSVs? Could they be auto-pushed to S3?
OK, http://data.openspending.org has a nice index page :-)
Is this linked from anywhere?
@mk270 not yet ...
rufuspollock referenced this issue in openspending/osep, Jun 11, 2013
Taken from http://lists.okfn.org/pipermail/openspending-dev/2013-April/000739.html but much extended.
Closing because:
Establish a raw data S3 bucket with cleaned OS data in it. It has the following structure
Questions
bucket name / location?
Propose data.openspending.org
Nice index
Put in a directory index - https://github.com/rgrp/s3-bucket-listing
How do we get this out of OS at the moment?
Can we do this at the DB level (even just using Postgres COPY)? Going via the API is impossible for large datasets: I imagine we can't stream 3 GB of data down through the web app ...
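A minimal sketch of what the DB-level export could look like, assuming a psycopg2 connection to the OS database. The function names and any table name passed in are hypothetical and not taken from the OpenSpending schema; the only library call relied on is psycopg2's `cursor.copy_expert(sql, file)`, which streams the result of a `COPY ... TO STDOUT` into a file-like object.

```python
def build_copy_sql(table: str) -> str:
    """Return a COPY statement that emits `table` as CSV with a header row."""
    # Quote the identifier so an unusual table name cannot break the statement.
    # Note: this treats `table` as a single unqualified name (no schema prefix).
    quoted = '"' + table.replace('"', '""') + '"'
    return f"COPY {quoted} TO STDOUT WITH (FORMAT csv, HEADER true)"


def dump_table_to_csv(conn, table: str, out_file) -> None:
    """Stream one table into `out_file` (assumes `conn` is a psycopg2 connection)."""
    with conn.cursor() as cur:
        # copy_expert writes the COPY output to out_file in chunks,
        # so the full dataset never has to fit in memory.
        cur.copy_expert(build_copy_sql(table), out_file)
```

Calling `dump_table_to_csv(conn, "some_table", open("some_table.csv", "w"))` would then write the dump to a local file, which could be pushed to the bucket in a separate step. Because the data goes straight from Postgres to a file, it sidesteps the web app entirely.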
Why
I want to run analyses and queries on OS data that the API does not support, or that are too "costly" for it, cf. #3 (e.g. what are the top recipients of UK government spending ...). To do this I need the raw CSV so I can load it into my local Postgres / Hadoop / Bigtable ...
Aside: this could in fact be the import format. These could be the cleaned files we load into OS (which would move most of the ETL out of OS core, but that's a completely separate discussion ...)
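To illustrate the kind of query the raw CSV would make possible, here is a rough sketch of the "top recipients" analysis done locally in plain Python. The column names `recipient` and `amount` are assumptions for illustration only; the real cleaned files may use different headers.

```python
import csv
from collections import defaultdict
from decimal import Decimal


def top_recipients(csv_file, n=10, recipient_col="recipient", amount_col="amount"):
    """Sum amounts per recipient and return the n largest as (name, total) pairs."""
    totals = defaultdict(Decimal)  # missing keys start at Decimal("0")
    for row in csv.DictReader(csv_file):
        # Decimal avoids floating-point drift when summing money values.
        totals[row[recipient_col]] += Decimal(row[amount_col])
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

With a downloaded file this would be called as `top_recipients(open("dataset.csv", newline=""))`. For datasets in the multi-gigabyte range, loading the CSV into Postgres and running a `GROUP BY` there is likely the better fit; the sketch only shows that the question becomes straightforward once the raw data is available.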