Data for México is wrong after 2020-04-16 #66
Comments
Yes, that's right: we are having trouble with our automated scripts. They replace "missing" with zero, but we are going to use blank spaces to indicate the absence of data. We are fixing this now, and we already have a program that detects errors (#59). Another important point is that countries are not reporting recovered cases at the resolution we want, so we recommend using only the "Confirmed" and "Deaths" data. In the case of Mexico, this data has not been retrieved since March; we will try to parse it from the recently released open data (Datos Abiertos México). So far, 12.61% of the Confirmed entries contain errors; details are in Errors.csv. The data is still dirty, and we need more collaborators to keep it clean. Our script to detect errors: https://bit.ly/2RNkNZc |
On Sat, Apr 18, 2020 at 09:06:50AM -0700, ZurMaD wrote:
Yes, that's right: we are having trouble with our automated scripts. They replace "missing" with zero, but we are going to use blank spaces to indicate the absence of data. We are fixing this now, and we already have a program that detects errors (#59).
I'd prefer 'missing' or a blank space over zero. I saw some
discussion on the site about how to handle missing values, but I don't
recall whether any agreement was reached. Meanwhile, I'll just skip
inconsistent (decreasing) data.
Is it safe to use the lack of a date as a consistent indicator of missing data?
Another important point is that countries are not reporting recovered cases at the resolution we want, so we recommend using only the "Confirmed" and "Deaths" data.
OK. Thanks for the tip. And thanks for finding, harvesting and making
available the data.
Best regards,
Luis
--
o
W. Luis Mochán, | tel:(52)(777)329-1734 /<(*)
Instituto de Ciencias Físicas, UNAM | fax:(52)(777)317-5388 `>/ /\
Apdo. Postal 48-3, 62251 | (*)/\/ \
Cuernavaca, Morelos, México | mochan@fis.unam.mx /\_/\__/
GPG: 791EB9EB, C949 3F81 6D9B 1191 9A16 C2DF 5F0A C52B 791E B9EB
|
It would be better to use decreasing values as your guide. Thanks; we're also looking for extra data to do a correlation study like the one done for China (https://github.com/DataScienceResearchPeru/covid-19_latinoamerica_extra) |
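For anyone consuming the files in the meantime, here is a minimal sketch of the "skip decreasing data" idea discussed above. It assumes the counts are cumulative, so any value lower than the last accepted one is suspect. The function name `flag_decreasing` and the use of `None` for blank cells are illustrative assumptions, not part of the repository's scripts.

```python
def flag_decreasing(values):
    """Return the indices of entries that fall below the last accepted value.

    `values` is a list of cumulative counts in date order; None stands for a
    blank (missing) cell and is skipped without being flagged.
    """
    flagged = []
    last_valid = None  # most recent value that was accepted as consistent
    for i, value in enumerate(values):
        if value is None:
            continue  # missing data is not an inconsistency
        if last_valid is not None and value < last_valid:
            flagged.append(i)  # a cumulative count should never decrease
        else:
            last_valid = value
    return flagged


# Example: the zeros after 5 and the 6 after 7 are flagged.
suspect = flag_decreasing([1, 5, 0, 0, None, 7, 6])
```

Flagged entries are compared against the last accepted value rather than the previous row, so a run of bad zeros does not make the next correct value look like a valid baseline.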
I'm maintaining Mexico's data. I've been a bit busy making some major changes in my own repo, in which I scrape the data out of the official PDFs and (soon!) the open data. |
I've just pushed the data to the repository, and I've also replaced all the "missing" strings with blank spaces in Mexico's info. (8cff184) |
Yes, I'm creating a script to change 0 to '' (blank), coming soon |
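A rough sketch of what such a cleanup could look like, using only the standard `csv` module. This is not the maintainers' script: the function name `blank_placeholders`, the placeholder set, and the column names in the example are all assumptions made for illustration.

```python
import csv
import io

# Assumption: these cell values stand for "no data" rather than a real count.
PLACEHOLDERS = {"missing", "0"}


def blank_placeholders(csv_text, count_columns):
    """Return csv_text with placeholder cells in count_columns replaced by ''."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for col in count_columns:
            cell = (row.get(col) or "").strip().lower()
            if cell in PLACEHOLDERS:
                row[col] = ""  # blank means "not reported"
        writer.writerow(row)
    return out.getvalue()


# Hypothetical column names, for illustration only.
sample = "Country,Confirmed,Deaths\nMexico,0,missing\nPeru,10,2\n"
cleaned = blank_placeholders(sample, ["Confirmed", "Deaths"])
```

Note that blanking every 0 also erases genuine zeros (for example, deaths before the first reported death), so a real script would probably need to restrict this to dates where a nonzero value was already reported.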
Thank you!
On Sat, Apr 18, 2020 at 04:15:13PM -0700, Gabriel Alfonso Carranco-Sapiéns wrote:
I've just pushed the data to the repository, and I've also replaced all the "missing" strings with blank spaces in Mexico's info. (4c5c44e)
Unless there's something additional, I think we can close this issue.
|
All data for México seems wrong in files 2020-04-16.csv, 2020-04-17.csv and 2020-04-18.csv
A typical row reads
The date is missing and there are zeroes for all entries. Yesterday some data was 'missing' but not '0'.
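The symptom described here (no date, every count zero) can be checked mechanically. The sketch below is only an illustration: the function name `looks_unreported` and the column names `Date`, `Confirmed`, `Deaths` and `Recovered` are assumptions and may not match the actual headers in the daily files.

```python
def looks_unreported(row, date_column="Date",
                     count_columns=("Confirmed", "Deaths", "Recovered")):
    """Return True when a row has no date and all of its counts are zero.

    `row` is a dict mapping column names to string cells, e.g. as produced
    by csv.DictReader. The column names are assumed, not taken from the repo.
    """
    date_missing = not (row.get(date_column) or "").strip()
    all_zero = all(
        (row.get(col) or "").strip() in {"0", "0.0"} for col in count_columns
    )
    return date_missing and all_zero


# A row with the reported symptom, and a normal-looking one for contrast.
bad_row = {"Date": "", "Confirmed": "0", "Deaths": "0", "Recovered": "0"}
good_row = {"Date": "2020-04-15", "Confirmed": "120", "Deaths": "3", "Recovered": "0"}
```

Rows matching this check are better treated as missing than as true zeros.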