Merge branch 'master' into official-missions-dataset
cuducos committed Jun 17, 2017
2 parents 60687b0 + bced7b3 commit 44b2f34
Showing 13 changed files with 221 additions and 81 deletions.
85 changes: 53 additions & 32 deletions README.rst
@@ -27,23 +27,7 @@ Installation

::

$ pip install git+https://github.com/datasciencebr/serenata-toolbox.git#egg=serenata-toolbox

Development
------------

Clone the repo and use it within your virtualenv.

::

$ git clone https://github.com/datasciencebr/serenata-toolbox.git
$ python setup.py develop

We use `Elm's philosophy <https://github.com/elm-lang/elm-package#version-rules>`_ for version bumping:

* MICRO: the API is the same, no risk of breaking code
* MINOR: values have been added, existing values are unchanged
* MAJOR: existing values have been changed or removed
$ pip install git+https://github.com/datasciencebr/serenata-toolbox.git#egg=serenata-toolbox

Usage
-----
@@ -60,14 +44,14 @@ We have `plenty of them <https://github.com/datasciencebr/serenata-de-amor/blob/
from serenata_toolbox.datasets import Datasets
datasets = Datasets('/tmp/serenata-data/')
# now lets see what datasets are available
for dataset in datasets.remote.all:
# now lets see what are the latest datasets available
for dataset in datasets.downloader.LATEST:
print(dataset) # and you'll see a long list of datasets!
# now let's download one of them
# and let's download one of them
datasets.downloader.download('2016-12-06-reibursements.xz') # yay, you've just downloaded this dataset to /tmp/serenata-data/
# You can also get the most recent version of all datasets:
# you can also get the most recent version of all datasets:
latest = list(dataset.downloader.LATEST)
datasets.downloader.download(latest)
@@ -89,15 +73,15 @@ If you ever wonder how did we generated these datasets, this toolbox can help yo

.. code:: python
from serenata_toolbox.federal_senate.federal_senate_dataset import FederalSenateDataset
from serenata_toolbox.chamber_of_deputies.chamber_of_deputies_dataset import ChamberOfDeputiesDataset
from serenata_toolbox.federal_senate.dataset import Dataset
from serenata_toolbox.chamber_of_deputies.dataset import Dataset
senate = FederalSenateDataset('/tmp/serenata-data/')
senate = Dataset('/tmp/serenata-data/')
senate.fetch()
senate.translate()
senate.clean()
chamber = ChamberOfDeputiesDataset('/tmp/serenata-data/')
chamber = Dataset('/tmp/serenata-data/')
chamber.fetch()
chamber.translate()
chamber.clean()
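
Review note: after this rename both classes are called `Dataset`, so two `from ... import Dataset` lines in the same file bind the same name and the second replaces the first. The sketch below shows that effect and one possible way around it (aliasing with `as`). It uses stand-in classes, not the real package, so it runs without `serenata_toolbox` installed; `make_dataset_class`, `SenateDataset` and `ChamberDataset` are hypothetical names chosen for illustration.

```python
from types import SimpleNamespace


def make_dataset_class(label):
    """Build a stand-in class named Dataset, tagged with where it 'lives'."""

    class Dataset:
        origin = label

        def __init__(self, path):
            self.path = path

    return Dataset


# Stand-ins for serenata_toolbox.federal_senate.dataset and
# serenata_toolbox.chamber_of_deputies.dataset (assumption: not the real modules).
federal_senate = SimpleNamespace(Dataset=make_dataset_class('federal_senate'))
chamber_of_deputies = SimpleNamespace(Dataset=make_dataset_class('chamber_of_deputies'))

# Equivalent of two `from ... import Dataset` lines: the second binding wins.
Dataset = federal_senate.Dataset
Dataset = chamber_of_deputies.Dataset
shadowed = Dataset('/tmp/serenata-data/')
shadowed_origin = shadowed.origin  # the chamber class, not the senate one

# Equivalent of `from ... import Dataset as SenateDataset` (hypothetical aliases).
SenateDataset = federal_senate.Dataset
ChamberDataset = chamber_of_deputies.Dataset
senate = SenateDataset('/tmp/serenata-data/')
chamber = ChamberDataset('/tmp/serenata-data/')
```

With aliases, `senate` and `chamber` each come from the intended class regardless of import order.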
@@ -111,17 +95,54 @@ The `full documentation <https://serenata_toolbox.readthedocs.io>`_ is still a w

$ cd docs
$ make clean;make rst;rm source/modules.rst;make html
Run Unit Test suite
-------------------

Contributing
------------

Within your `virtualenv <https://virtualenv.pypa.io/en/stable/>`_:

::

$ git clone https://github.com/datasciencebr/serenata-toolbox.git
$ python setup.py develop

Always add tests to your contribution — if you want to test it locally before opening the PR:

::

$ python -m unittest discover tests

Source Code
-----------
When the tests are passing, also check for coverage of the modules you edited or added — if you want to check it before opening the PR:

::

$ pip install coverage
$ coverage run -m unittest discover tests
$ coverage html
$ open htmlcov/index.html

Follow `PEP8 <https://www.python.org/dev/peps/pep-0008/>`_ and best practices implemented by `Landscape <https://landscape.io>`_ in the `veryhigh` strictness level — if you want to check them locally before opening the PR:

::

$ pip install prospector
$ prospector -s veryhigh serenata_toolbox

If this report includes issues related to `import` section of your files, `isort <https://github.com/timothycrosley/isort>`_ can help you:

Feel free to fork, evaluate and contribute to this project.
::

$ pip install isort
$ isort **/*.py --diff

Always suggest a version bump. We use `Elm's philosophy <https://github.com/elm-lang/elm-package#version-rules>`_ for version bumping:

* MICRO: the API is the same, no risk of breaking code
* MINOR: values have been added, existing values are unchanged
* MAJOR: existing values have been changed or removed
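
Review note: the three bump levels listed above can be expressed as a small helper. This is a sketch, not code from this repository: the `bump` function is hypothetical, and resetting the lower components to zero on a MAJOR or MINOR bump is an assumption that the README does not state.

```python
def bump(version, level):
    """Return a 'MAJOR.MINOR.MICRO' version string bumped at the given level."""
    major, minor, micro = (int(part) for part in version.split('.'))
    if level == 'major':
        # existing values changed or removed
        return '{}.0.0'.format(major + 1)
    if level == 'minor':
        # values added, existing values unchanged
        return '{}.{}.0'.format(major, minor + 1)
    if level == 'micro':
        # same API, no risk of breaking code
        return '{}.{}.{}'.format(major, minor, micro + 1)
    raise ValueError('unknown level: {!r}'.format(level))
```

For example, `bump('10.1.0', 'micro')` gives `'10.1.1'` under these assumptions.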

And finally take *The Zen of Python* into account:

::

Source: https://github.com/datasciencebr/serenata-toolbox/
$ python -m this
7 changes: 7 additions & 0 deletions docs/source/modules.rst
@@ -0,0 +1,7 @@
serenata_toolbox
================

.. toctree::
:maxdepth: 4

serenata_toolbox
62 changes: 62 additions & 0 deletions docs/source/serenata_toolbox.chamber_of_deputies.rst
@@ -0,0 +1,62 @@
serenata_toolbox.chamber_of_deputies package
============================================

Submodules
----------

serenata_toolbox.chamber_of_deputies.dataset module
---------------------------------------------------

.. automodule:: serenata_toolbox.chamber_of_deputies.dataset
:members:
:undoc-members:
:show-inheritance:

serenata_toolbox.chamber_of_deputies.deputies_dataset module
------------------------------------------------------------

.. automodule:: serenata_toolbox.chamber_of_deputies.deputies_dataset
:members:
:undoc-members:
:show-inheritance:

serenata_toolbox.chamber_of_deputies.presences_dataset module
-------------------------------------------------------------

.. automodule:: serenata_toolbox.chamber_of_deputies.presences_dataset
:members:
:undoc-members:
:show-inheritance:

serenata_toolbox.chamber_of_deputies.reimbursements module
----------------------------------------------------------

.. automodule:: serenata_toolbox.chamber_of_deputies.reimbursements
:members:
:undoc-members:
:show-inheritance:

serenata_toolbox.chamber_of_deputies.session_start_times_dataset module
-----------------------------------------------------------------------

.. automodule:: serenata_toolbox.chamber_of_deputies.session_start_times_dataset
:members:
:undoc-members:
:show-inheritance:

serenata_toolbox.chamber_of_deputies.speeches_dataset module
------------------------------------------------------------

.. automodule:: serenata_toolbox.chamber_of_deputies.speeches_dataset
:members:
:undoc-members:
:show-inheritance:


Module contents
---------------

.. automodule:: serenata_toolbox.chamber_of_deputies
:members:
:undoc-members:
:show-inheritance:
54 changes: 54 additions & 0 deletions docs/source/serenata_toolbox.datasets.rst
@@ -0,0 +1,54 @@
serenata_toolbox.datasets package
=================================

Submodules
----------

serenata_toolbox.datasets.contextmanager module
-----------------------------------------------

.. automodule:: serenata_toolbox.datasets.contextmanager
:members:
:undoc-members:
:show-inheritance:

serenata_toolbox.datasets.downloader module
-------------------------------------------

.. automodule:: serenata_toolbox.datasets.downloader
:members:
:undoc-members:
:show-inheritance:

serenata_toolbox.datasets.helpers module
----------------------------------------

.. automodule:: serenata_toolbox.datasets.helpers
:members:
:undoc-members:
:show-inheritance:

serenata_toolbox.datasets.local module
--------------------------------------

.. automodule:: serenata_toolbox.datasets.local
:members:
:undoc-members:
:show-inheritance:

serenata_toolbox.datasets.remote module
---------------------------------------

.. automodule:: serenata_toolbox.datasets.remote
:members:
:undoc-members:
:show-inheritance:


Module contents
---------------

.. automodule:: serenata_toolbox.datasets
:members:
:undoc-members:
:show-inheritance:
22 changes: 22 additions & 0 deletions docs/source/serenata_toolbox.federal_senate.rst
@@ -0,0 +1,22 @@
serenata_toolbox.federal_senate package
=======================================

Submodules
----------

serenata_toolbox.federal_senate.dataset module
----------------------------------------------

.. automodule:: serenata_toolbox.federal_senate.dataset
:members:
:undoc-members:
:show-inheritance:


Module contents
---------------

.. automodule:: serenata_toolbox.federal_senate
:members:
:undoc-members:
:show-inheritance:
39 changes: 6 additions & 33 deletions docs/source/serenata_toolbox.rst
@@ -1,41 +1,14 @@
serenata_toolbox package
========================

Submodules
----------
Subpackages
-----------

serenata_toolbox.chamber_of_deputies_dataset module
------------------------------------

.. automodule:: serenata_toolbox.chamber_of_deputies_dataset
:members:
:undoc-members:
:show-inheritance:

serenata_toolbox.datasets module
--------------------------------

.. automodule:: serenata_toolbox.datasets
:members:
:undoc-members:
:show-inheritance:

serenata_toolbox.reimbursements module
--------------------------------------

.. automodule:: serenata_toolbox.reimbursements
:members:
:undoc-members:
:show-inheritance:

serenata_toolbox.xml2csv module
-------------------------------

.. automodule:: serenata_toolbox.xml2csv
:members:
:undoc-members:
:show-inheritance:
.. toctree::

serenata_toolbox.chamber_of_deputies
serenata_toolbox.datasets
serenata_toolbox.federal_senate

Module contents
---------------
@@ -7,7 +7,7 @@
 import pandas as pd
 from .reimbursements import Reimbursements

-class ChamberOfDeputiesDataset:
+class Dataset:

     YEARS = [n for n in range(2009, date.today().year+1)]

3 changes: 2 additions & 1 deletion serenata_toolbox/datasets/downloader.py
@@ -40,7 +40,8 @@ class Downloader:
         '2017-05-29-deputies.xz',
         '2017-05-29-presences.xz',
         '2017-05-29-sessions.xz',
-        '2017-05-29-speeches.xz'
+        '2017-05-29-speeches.xz',
+        '2017-06-11-congresspeople-social-accounts.xz',
     )

     def __init__(self, target, **kwargs):
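
Review note on the trailing comma added in this hunk: Python concatenates adjacent string literals, so an entry without a separating comma silently merges with the one after it. A minimal sketch using the two file names from this hunk (the variable names are illustrative only):

```python
# Without a comma, adjacent string literals are joined into a single string,
# and the parentheses alone do not make a tuple.
without_comma = (
    '2017-05-29-speeches.xz'
    '2017-06-11-congresspeople-social-accounts.xz'
)

# With a comma after every entry, each file name stays its own tuple item.
with_comma = (
    '2017-05-29-speeches.xz',
    '2017-06-11-congresspeople-social-accounts.xz',
)
```

Ending every entry with a comma, as the new lines do, keeps later additions from triggering this concatenation.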
@@ -5,7 +5,7 @@
 import pandas as pd


-class FederalSenateDataset:
+class Dataset:
     URL = 'http://www.senado.gov.br/transparencia/LAI/verba/{}.csv'

     LAST_YEAR = date.today().year + 1
2 changes: 1 addition & 1 deletion setup.py
@@ -33,5 +33,5 @@
         'serenata_toolbox.datasets'
     ],
     url=REPO_URL,
-    version='10.1.0'
+    version='11.1.1'
 )
4 changes: 2 additions & 2 deletions tests/journey/test_chamber_of_deputies_dataset.py
@@ -6,14 +6,14 @@
 from shutil import rmtree
 from unittest import main, TestCase

-from serenata_toolbox.chamber_of_deputies.chamber_of_deputies_dataset import ChamberOfDeputiesDataset
+from serenata_toolbox.chamber_of_deputies.dataset import Dataset

 class TestChamberOfDeputiesDataset(TestCase):

     def setUp(self):
         self.path = mkdtemp(prefix='serenata-')
         print(self.path)
-        self.subject = ChamberOfDeputiesDataset(self.path)
+        self.subject = Dataset(self.path)
         self.years = [n for n in range(2009, date.today().year + 1)]


4 changes: 2 additions & 2 deletions tests/journey/test_federal_senate_dataset.py
@@ -2,13 +2,13 @@
 from tempfile import gettempdir
 from unittest import TestCase

-from serenata_toolbox.federal_senate.federal_senate_dataset import FederalSenateDataset
+from serenata_toolbox.federal_senate.dataset import Dataset


 class TestJourneyFederalSenateDataset(TestCase):
     def setUp(self):
         self.path = gettempdir()
-        self.subject = FederalSenateDataset(self.path)
+        self.subject = Dataset(self.path)

     def test_journey_federal_senate_dataset(self):
         # fetch_saves_raw_files