Include Brazilian breakdown per state #92

Open
felipequintella opened this issue Apr 14, 2020 · 7 comments

Comments

@felipequintella

Brazil is a big country with many different hotspots. Including a breakdown per state (as per US, Canada, Australia) would help visualize what's happening. This should be fairly easy from the Health Ministry data (https://covid.saude.gov.br/)

@rpkoller
Contributor

rpkoller commented Apr 27, 2020

The problem is that they don't provide a direct download link, at least not on the linked page. If you click the CSV download there, you get a generated temporary download link. :/

@felipequintella
Author

felipequintella commented Apr 27, 2020

I noticed that too... I've actually been trying to scrape their website for that CSV link for the past week, and I think I finally managed it, at least until they change something again...
The scraper and the final data are here:
https://github.com/felipequintella/covid19-brazil-scraper
https://raw.githubusercontent.com/felipequintella/covid19-brazil-scraper/master/brazil.csv

I've also forked covidtrends and included the breakdown there as well. If you think it is worth it, let me know and I can try a pull request.
https://github.com/felipequintella/covidtrends
Final product here:
https://covid19.felipequintella.com/

Edit: of course, scraping their data also means the final data may be considered not as official, accurate and reliable as one might want for the project. Let me know what you guys think ;)
"We don’t want to become a repository of many datasets, as it’s difficult for us to vouch for their accuracy and reliability."

@rpkoller
Contributor

Hm, in the end it is basically @aatishb's choice how to handle this. On the one hand, it's cool that you make it possible to scrape the existing data from the government page. But IMHO I would nevertheless be careful about any further data aggregation step for any country. I searched Google for "covid saude.gov.br csv", which led me here: https://brasil.io/dataset/covid19/boletim/ Had you found that already?

I don't speak the language (Portuguese, I guess, in Brazil), so I'm unsure whether I understand everything correctly; I've only used translate.google.com a little. But "Leia a documentação dessa tabela" ("Read the documentation for this table") led to the following repo: https://github.com/turicas/covid19-br/blob/master/api.md#boletim . I suppose a query string for their API could be crafted from there? I guess the complete download at https://data.brasil.io/dataset/covid19/caso.csv.gz would be a little too extensive ;))) It looks like the data goes all the way down to the city level. The CSV is 2.2 MB in size. ;))) It might be easier for a native speaker to find their way around there.

@rpkoller
Contributor

But I tried the examples from the GitHub repo in Paw and got a 301 when querying https://brasil.io/api/dataset/covid19/caso/data?is_last=True&state=AL :/
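
For context, a 301 is a permanent redirect: the server reports that the resource now lives at a different URL, which it names in the `Location` response header. A client that is not set to follow redirects will simply show the 301. Below is a minimal Python sketch of building the same query string; `build_query_url` is a hypothetical helper for illustration, not part of the brasil.io API, and the endpoint is copied from the comment above without checking that it still responds.

```python
from urllib.parse import urlencode

# Endpoint copied from the comment above; not verified to be current.
BASE_URL = "https://brasil.io/api/dataset/covid19/caso/data"


def build_query_url(base_url: str, **filters: str) -> str:
    """Append the given filters to base_url as a URL-encoded query string."""
    return f"{base_url}?{urlencode(filters)}"


# Same filters as in the query that returned the 301.
url = build_query_url(BASE_URL, is_last="True", state="AL")
print(url)
```

Passing the resulting URL to `urllib.request.urlopen` should follow a 301 on a GET request automatically, which is one way to see where the redirect leads.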

@rpkoller
Contributor

I guess caso.csv.gz is the smallest yet most complete file available. It is also listed here alongside other versions, all with SHA-512 checksums provided: https://data.brasil.io/dataset/covid19/_meta/list.html

@felipequintella the one thing I have trouble understanding with just Google Translate: is that dataset aggregated by Brazilian officials/government employees? In other words, is it an official data source, or is the aggregation based on volunteer work?
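
Since checksums are published, a download could be verified before it is used. This is a rough Python sketch under assumptions: `sha512_of_stream` and `matches_checksum` are hypothetical helper names, and the expected digest would have to be copied by hand from the list page.

```python
import hashlib
import io


def sha512_of_stream(stream, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a binary stream, read in chunks."""
    digest = hashlib.sha512()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(stream, expected_hex: str) -> bool:
    """Compare the stream's digest with a published hex checksum."""
    return sha512_of_stream(stream) == expected_hex.strip().lower()


# Demo on in-memory bytes; a real check would pass open(path, "rb") instead.
demo_digest = sha512_of_stream(io.BytesIO(b"example bytes"))
print(matches_checksum(io.BytesIO(b"example bytes"), demo_digest))
```

Reading in chunks keeps memory use flat, which matters little for a 2.2 MB file but costs nothing.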

@felipequintella
Author

Hey @rpkoller, I'll take a look at these and see how they are collecting, compiling, and verifying the data. I had not found it before; it looks good though! I'll get back to you later.

@stees

stees commented May 16, 2020

@rpkoller as per this link, they say all the data comes from each state health department, so yes, I would say it's aggregated by the authorities and brasil.io just compiles them in one big csv. This is their report on that.

Filtering the caso_full.csv by place_type == state would yield the result Felipe is talking about.
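
A minimal Python sketch of that filter, using only the standard library. The `place_type` column and the `state` value come from the comment above; the other column names and all sample rows are made up for illustration and are not real figures.

```python
import csv
import io

# Made-up sample rows for illustration only; not real data.
# Columns other than place_type are assumptions about the file layout.
SAMPLE_CSV = """date,state,city,place_type,confirmed
2020-05-15,AL,,state,100
2020-05-15,AL,Maceió,city,60
2020-05-15,SP,,state,500
2020-05-15,SP,São Paulo,city,300
"""


def state_rows(lines):
    """Return only the rows whose place_type column equals 'state'."""
    reader = csv.DictReader(lines)
    return [row for row in reader if row["place_type"] == "state"]


rows = state_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["state"], row["confirmed"])
```

For a downloaded gzip file, the same function should accept `gzip.open(path, "rt", encoding="utf-8", newline="")` in place of the `StringIO` object, so the file never needs to be unpacked on disk.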
