Cannot use ChamberDataset() to download data #206

Closed
mnunes opened this issue Dec 26, 2018 · 7 comments


mnunes commented Dec 26, 2018

I was trying to download chamber of deputies reimbursements data using the following code:

import numpy
from serenata_toolbox.chamber_of_deputies.reimbursements import Reimbursements as ChamberDataset

years = numpy.arange(2009, 2019, 1)

for j in years:
    chamber = ChamberDataset(j, 'data_camara/')
    chamber()

However, this is the error I get when I run my code:

Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "/usr/local/lib/python3.7/site-packages/serenata_toolbox/chamber_of_deputies/reimbursements.py", line 28, in __call__
    self.fetch()
  File "/usr/local/lib/python3.7/site-packages/serenata_toolbox/chamber_of_deputies/reimbursements.py", line 35, in fetch
    urlretrieve(URL.format(self.year), file_path)
  File "/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python/3.7.1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

Since I am getting a 404 error, I opened reimbursements.py to see what is happening, and it seems there is a problem on the https://www.camara.leg.br/ website. Do you have any idea what is going on and how it can be fixed?

Senate data can be downloaded without any issue.
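
As a stopgap, a loop like the one above could record the years whose file is missing instead of aborting on the first 404. The sketch below is a minimal illustration, assuming only what the traceback shows: calling a `Reimbursements` instance ends in `urllib.request.urlretrieve`, so a missing file surfaces as `urllib.error.HTTPError`. The helper name `download_years` and its `make_dataset` parameter are hypothetical, introduced only for this example.

```python
from urllib.error import HTTPError


def download_years(years, make_dataset):
    """Run one downloader per year and collect the years that fail.

    `make_dataset` (hypothetical parameter) is any callable that takes a
    year and returns a callable downloader. Returns {year: HTTP status}
    for every year whose download raised HTTPError.
    """
    failed = {}
    for year in years:
        year = int(year)  # also accepts numpy integers from numpy.arange
        dataset = make_dataset(year)
        try:
            dataset()
        except HTTPError as error:
            # Remember the status code and move on to the next year.
            failed[year] = error.code
    return failed


# Possible usage with the toolbox installed (same import and arguments
# as in the snippet above):
#
#   from serenata_toolbox.chamber_of_deputies.reimbursements import (
#       Reimbursements as ChamberDataset,
#   )
#   failed = download_years(
#       range(2009, 2019), lambda y: ChamberDataset(y, 'data_camara/')
#   )
```

Other exceptions (network timeouts, disk errors) are deliberately not caught here, so they still stop the loop.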


cuducos commented Dec 27, 2018

> it seems there is a problem on the https://www.camara.leg.br/ website.

Yep, that's the case. Have they changed the URL or was the service temporarily down?


mnunes commented Dec 27, 2018

> Have they changed the URL or was the service temporarily down?

It seems some data is not available anymore. I did some research and if you try the following URLs, you get a 404 error:

https://www.camara.leg.br/cotas/Ano-2009.csv.zip
https://www.camara.leg.br/cotas/Ano-2010.csv.zip
https://www.camara.leg.br/cotas/Ano-2011.csv.zip
https://www.camara.leg.br/cotas/Ano-2012.csv.zip

However, you can download data for the last 6 years:

https://www.camara.leg.br/cotas/Ano-2013.csv.zip
https://www.camara.leg.br/cotas/Ano-2014.csv.zip
https://www.camara.leg.br/cotas/Ano-2015.csv.zip
https://www.camara.leg.br/cotas/Ano-2016.csv.zip
https://www.camara.leg.br/cotas/Ano-2017.csv.zip
https://www.camara.leg.br/cotas/Ano-2018.csv.zip

I think someone deleted the reimbursement data from 2009 to 2012. It does not look like a URL change, as you can infer from the links above.
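
A check like this can be scripted rather than done by hand. The following is a rough sketch, not part of the toolbox: it builds each year's URL from the pattern in the links above and records whether the server answers with status 200. It assumes the server responds to `HEAD` requests the same way it does to downloads. The names `http_status` and `availability` are hypothetical.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# URL pattern taken from the links listed above.
URL = "https://www.camara.leg.br/cotas/Ano-{}.csv.zip"


def http_status(url, opener=urlopen):
    """Return the HTTP status code of a HEAD request to `url`."""
    request = Request(url, method="HEAD")
    try:
        with opener(request, timeout=30) as response:
            return response.status
    except HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions by urllib.
        return error.code


def availability(years, status_of=http_status):
    """Map each year to True when its archive URL answers with 200."""
    return {year: status_of(URL.format(year)) == 200 for year in years}


# Possible usage (performs real network requests):
#
#   for year, ok in availability(range(2009, 2019)).items():
#       print(year, "available" if ok else "missing")
```

The `opener` and `status_of` parameters exist only so the functions can be exercised without network access.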


cuducos commented Dec 27, 2018

That's interesting: I'm tweeting and tagging them, but feel free to send an official request and share what you find here ; )

jedibruno commented Dec 28, 2018

Yesterday (27/12/2018) I submitted a formal request for information about this issue to the Chamber of Deputies. Legally, they have roughly until the end of January 2019 to give an official answer. As soon as I receive something, I'll update here.

Just for the record, the text of the formal request (originally in Portuguese, translated here) is below. If anything else is missing, just let me know and I can make a new request.

--- FOIA REQUEST ---

On 27/12/2018, an attempt was made to access the public datasets of the Cota de Exercício da Atividade Parlamentar (CEAP, the quota for the exercise of parliamentary activity) for the years 2009, 2010, 2011 and 2012, made available by the Chamber of Deputies at the following URLs:

https://www.camara.leg.br/cotas/Ano-2009.csv.zip
https://www.camara.leg.br/cotas/Ano-2010.csv.zip
https://www.camara.leg.br/cotas/Ano-2011.csv.zip
https://www.camara.leg.br/cotas/Ano-2012.csv.zip

However, none of the public datasets in question could be accessed. We therefore request access to the information listed below. To make the information provided easier to understand, we ask that each item be answered separately, indicating the item it refers to:
1 – For what reasons, in fact and in law, are the datasets for the years in question unavailable?
1.1 – What is the approximate or estimated time frame for restoring the data in question?
1.2 – If the datasets in question will no longer be made available for a permanent reason:
1.2.1 – For what reasons, in fact and in law, is this the case?
1.2.2 – Which public authority authorized the removal of the public data in question? What is their name and position?
1.2.3 – We request access to the full digitized text of the administrative act, and the corresponding opinion, that authorized the removal of the public data in question.

Note: the technical description of the missing data can be found at this URL (in English): #206



mnunes commented Dec 28, 2018

I just checked and the data are back. This issue can be closed now.


jedibruno commented Dec 28, 2018 via email


cuducos commented Dec 29, 2018

Closed as requested.

cuducos closed this as completed Dec 29, 2018