Add electoral campaign donations datasets #169

Merged
merged 2 commits into from
Dec 4, 2017


cuducos
Collaborator

@cuducos cuducos commented Dec 4, 2017

What is the purpose of this Pull Request?
Add the electoral campaign donation datasets to the toolbox downloader.

What was done to achieve this purpose?
Outside the repo, I uploaded the .xz files to S3; in this PR, I added their filenames to the LATEST constant.

How to test if it really works?

from serenata_toolbox.datasets import fetch, fetch_latest_backup
files = (
    '2017-11-30-donations-candidates.xz',
    '2017-11-30-donations-committees.xz',
    '2017-11-30-donations-parties.xz'
)
for filename in files:
    fetch(filename, 'data/')

Then check that the files were downloaded successfully ;)
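One way to do that check is sketched below. It is not part of the toolbox: `check_downloads` is a hypothetical helper, and it only assumes that `fetch` saves each file under its original name inside the target directory. It confirms that each file exists and starts with the six-byte `.xz` signature, which catches missing files and truncated or HTML error responses, but does not validate the full archive.

```python
from pathlib import Path

# First six bytes of every .xz stream (from the XZ file format).
XZ_MAGIC = b"\xfd7zXZ\x00"


def check_downloads(filenames, directory="data/"):
    """Map each filename to True if it exists and starts with the .xz signature.

    Hypothetical helper: not part of serenata_toolbox.
    """
    results = {}
    for name in filenames:
        path = Path(directory) / name
        if not path.is_file():
            # Nothing was saved under this name.
            results[name] = False
            continue
        with path.open("rb") as handle:
            results[name] = handle.read(len(XZ_MAGIC)) == XZ_MAGIC
    return results


if __name__ == "__main__":
    files = (
        "2017-11-30-donations-candidates.xz",
        "2017-11-30-donations-committees.xz",
        "2017-11-30-donations-parties.xz",
    )
    for name, ok in check_downloads(files).items():
        print(f"{name}: {'ok' if ok else 'missing or not an .xz file'}")
```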

Who can help reviewing it?
@anaschwendler

@cuducos
Collaborator Author

cuducos commented Dec 4, 2017

BTW:

Fix #165

And maybe it's useful to test the fetch_latest_backup function too:

from serenata_toolbox.datasets import fetch_latest_backup
fetch_latest_backup('data/')

That way we test whether these new datasets are downloaded by default in a standard Serenata installation ;)
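A rough way to confirm this after `fetch_latest_backup('data/')` finishes is sketched here. `missing_donation_datasets` is a hypothetical helper, and the `*-donations-<kind>.xz` pattern is an assumption based on the three filenames in this PR; it presumes the backup is written to the given directory with those names.

```python
from pathlib import Path

# Dataset kinds taken from the three filenames added in this PR.
EXPECTED_KINDS = ("candidates", "committees", "parties")


def missing_donation_datasets(directory="data/"):
    """Return the donation dataset kinds with no matching .xz file in directory.

    Hypothetical helper: not part of serenata_toolbox.
    """
    base = Path(directory)
    return [
        kind
        for kind in EXPECTED_KINDS
        # Match any date prefix, e.g. 2017-11-30-donations-parties.xz
        if not any(base.glob(f"*-donations-{kind}.xz"))
    ]


if __name__ == "__main__":
    missing = missing_donation_datasets("data/")
    print("all donation datasets present" if not missing else f"missing: {missing}")
```

An empty list means all three donation datasets were found in the directory.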

@anaschwendler
Collaborator

🎉

What I did to test this PR:

  1. Cloned the project:
$ git clone git@github.com:datasciencebr/serenata-toolbox.git
  2. Changed to its folder:
$ cd serenata-toolbox
  3. Changed to @cuducos’ branch:
$ git fetch origin
$ git checkout -b cuducos-donation-data origin/cuducos-donation-data
$ git merge master
  4. Ran the Python fetch script:
>>> from serenata_toolbox.datasets import fetch, fetch_latest_backup
>>> files = (
    '2017-11-30-donations-candidates.xz',
    '2017-11-30-donations-committees.xz',
    '2017-11-30-donations-parties.xz'
)
>>> for filename in files:
    fetch(filename, 'data/')

The result:

Downloading 2017-11-30-donations-candidates.xz: 100%|█| 239M/239M [02:34<00:00, 1.54Mb/s]
Downloading 2017-11-30-donations-committees.xz: 100%|█████████████████████████████████████████████████| 5.64M/5.64M [00:03<00:00, 1.69Mb/s]
Downloading 2017-11-30-donations-parties.xz: 100%|████████████████████████████████████████████████████| 6.47M/6.47M [00:03<00:00, 1.72Mb/s]

And for the fetch_latest_backup function:

>>> from serenata_toolbox.datasets import fetch_latest_backup
>>> fetch_latest_backup('data/')

Good! 🎉

@anaschwendler anaschwendler merged commit f98f244 into master Dec 4, 2017
@anaschwendler anaschwendler deleted the cuducos-donation-data branch December 4, 2017 14:19