update · CSSEGISandData/COVID-19@0cea9b2

aatishb · 2020-03-11T01:38:02Z

Hi,

I think there are some issues with this update:

Republic of Korea and South Korea are the same country
Iran and Iran (Islamic Republic of) are the same country
Hong Kong and Hong Kong (SAR) are the same country
Many entries for 3/10/20 have missing data

eugene-yang · 2020-03-11T02:43:06Z

There is also a duplicate entry for Taiwan.
The updated version includes an entry "Taiwan, Taipei and environs" which is inconsistent with the previous records, which were using "Taiwan, Taiwan".

vladchel · 2020-03-11T02:56:29Z

The country rename produced bad data. We have two records:

,South Korea,36.0,128.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,2,6,8,10,12,13,13,16,17,28,28,35,35,42,44,50,53,
,Republic of Korea,36.0,128.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,54

The row under the old name is missing the latest value, and the row under the new name has all zeros except the latest value. The same applies to the other renamed records.
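A minimal sketch of how the split rows could be stitched back together. It assumes each row is laid out as province, country, latitude, longitude and then one cumulative count per date, and that a zero or blank in one row is only a placeholder for a value present in the other. The `ALIASES` table and the `merge_renamed_rows` helper are hypothetical names, not part of the repository, and the alias spellings are taken from this thread rather than checked against the files.

```python
# Hypothetical alias table (new name -> old name), using the spellings
# quoted in this thread; the exact strings in the CSV files may differ.
ALIASES = {
    "Republic of Korea": "South Korea",
    "Iran (Islamic Republic of)": "Iran",
    "Hong Kong (SAR)": "Hong Kong",
}


def merge_renamed_rows(rows, aliases=ALIASES):
    """Merge rows that describe the same place under an old and a new name.

    Each row is [province, country, lat, long, count, count, ...] with all
    fields as strings. Blank counts are treated as 0, and rows that collapse
    onto the same key are combined with an element-wise maximum. That is
    only valid under the assumption that the zeros are placeholders.
    """
    merged = {}
    order = []
    for province, country, lat, lon, *counts in rows:
        key = (province, aliases.get(country, country))
        values = [int(c) if c.strip() else 0 for c in counts]
        if key not in merged:
            merged[key] = [lat, lon, values]
            order.append(key)
            continue
        existing = merged[key][2]
        # Pad the shorter series so both cover the same number of dates.
        width = max(len(existing), len(values))
        existing += [0] * (width - len(existing))
        values += [0] * (width - len(values))
        merged[key][2] = [max(a, b) for a, b in zip(existing, values)]
    return [[k[0], k[1], merged[k][0], merged[k][1], *merged[k][2]] for k in order]


# Toy input: only the last three date columns of the two records above.
toy = [
    ["", "South Korea", "36.0", "128.0", "50", "53", ""],
    ["", "Republic of Korea", "36.0", "128.0", "0", "0", "54"],
]
print(merge_renamed_rows(toy))
# [['', 'South Korea', '36.0', '128.0', 50, 53, 54]]
```

Taking the maximum rather than the sum avoids double counting if both rows ever carry a real value for the same date, but it would hide a genuine downward correction, so treat the result as a stopgap.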

halfvector · 2020-03-11T02:56:52Z

Thank you for adding State-level US data.
To add to what @aatishb said: all US cities are missing data for 3/10/20.

eugene-yang · 2020-03-11T03:00:07Z

This inconsistency might be coming from the latest daily report. They might be changing sources.
https://github.com/CSSEGISandData/COVID-19/blob/d417797d85170de9eb10b34186cf112ef2536426/csse_covid_19_data/csse_covid_19_daily_reports/03-10-2020.csv

aatishb · 2020-03-11T03:29:25Z

There's an issue open for this: #405

Moelf · 2020-03-11T05:11:08Z

US is still 605......

sjmackenzie · 2020-03-11T07:43:56Z

I'm assuming the Hong Kong rename has obliterated Hong Kong from the "Total Confirmed" section of the website?

JBrooks137 · 2020-03-11T10:18:55Z

Hi,
It looks like there's some inconsistency between state-level and county-level data in the US. Take Washington as an example: the new 'Washington' state classification shows 267 cases for 10-Mar, but summing the county-level data I only get to 162.
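For anyone who wants to reproduce this kind of comparison, here is a rough sketch. It assumes county rows are named like "Some County, WA" in the province/state column and that state rows carry the plain state name. The `STATE_ABBREVIATIONS` lookup and the `state_vs_county_totals` helper are hypothetical, and the test numbers in the comment are made up, not taken from the files.

```python
from collections import defaultdict

# Hypothetical lookup; only the state from the example above is filled in.
STATE_ABBREVIATIONS = {"WA": "Washington"}


def state_vs_county_totals(rows, abbreviations=STATE_ABBREVIATIONS):
    """Compare each state-level count with the sum of its county rows.

    rows is an iterable of (province_state, latest_count) pairs for US
    entries. Returns {state: (state_level_count, summed_county_count)}.
    County rows with an unknown abbreviation are ignored.
    """
    state_level = {}
    county_sum = defaultdict(int)
    for name, count in rows:
        if "," in name:
            # "King County, WA" -> "WA"
            abbreviation = name.rsplit(",", 1)[1].strip()
            state = abbreviations.get(abbreviation)
            if state is not None:
                county_sum[state] += count
        else:
            state_level[name] = count
    return {s: (state_level[s], county_sum.get(s, 0)) for s in state_level}


# Made-up numbers, only to show the shape of the result:
# state_vs_county_totals([("Washington", 10), ("King County, WA", 4)])
# gives {"Washington": (10, 4)}
```

A gap between the two numbers does not say which one is wrong; it only flags the states worth checking by hand.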

Tweeb123 · 2020-03-11T19:40:19Z

In addition to @JBrooks137's comments, having duplicate data at both the state and the city/county level is confusing. No other country has its data doubled up like this. Please review.

danslee · 2020-03-12T05:21:30Z

First, let me say that I really really appreciate this data set. I'm sure I represent a large number of academics and software/data savvy people when I say that this has been an invaluable resource.

That said, I think it behooves you to maintain as much forwards and backwards compatibility as possible. If the structure of the data is going to change, it's much better to give some advance warning and preserve backwards compatibility in existing files. If a format change is absolutely necessary, you can mark the old file deprecated and create a new file with a new name and new notation. The current time series files, COVID-19/csse_covid_19_data/csse_covid_19_time_series/time_series_19*.csv, are severely broken going both backwards and forwards.

Please don't break the who_covid_19_situation_reports/who_covid_19_sit_rep_time_series/who_covid_19_sit_rep_time_series.csv without considering some of the steps I've outlined above. Thanks!

eugene-yang · 2020-03-12T05:28:33Z

@danslee You have a really great point. But given that the team working on this might not have a strong CS/Data Science background, I'm not sure whether they would have the capacity to maintain this repo with that kind of compatibility.

Having said that, I think a better way is to either help them with this repository or create a fork/separate repo to support better usability.
I am not downplaying their contribution. I do think they have provided us a great resource, but people have different priorities, and I would really like them to focus on the correctness and speedy update of the numbers.

Just my two cents.

danslee · 2020-03-12T07:03:01Z

@eugene-yang I'd love to help out, but the only data I can see is what they have checked into the repo. I am working on some scripts which will unify the time-series csv files with unified naming schemes and such, but have run into what are clearly some doubly entered data points around 2020-03-10 and 03-11 which brings the integrity of the entire file into question. Hopefully, the morning will bring some order to the data chaos.
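As a first integrity check, something along these lines could list the places that show up more than once in a time series file. It is only a sketch: it assumes the first two columns are province/state and country/region, and `find_duplicate_keys` and its optional `aliases` mapping are hypothetical names introduced here for illustration.

```python
from collections import Counter


def find_duplicate_keys(rows, aliases=None):
    """Return the (province, country) keys that occur more than once.

    rows is an iterable of CSV rows (lists of strings) without the header.
    aliases optionally maps alternative country names to one canonical
    name, so that renamed entries count as the same place.
    """
    aliases = aliases or {}

    def key(row):
        province, country = row[0].strip(), row[1].strip()
        return (province, aliases.get(country, country))

    counts = Counter(key(row) for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)
```

Run once without `aliases` to catch literal duplicates, then again with a rename table to catch entries that were split across an old and a new name.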
