New York City Borough Level Data now live #3084

Open
CSSEGISandData opened this issue Aug 31, 2020 · 3 comments

Comments

Owner

@CSSEGISandData CSSEGISandData commented Aug 31, 2020

Hello all. After receiving a lot of feedback on this decision, we are moving to providing borough-level data for New York City. This issue explains our methods for filling in historical data in our time series.

Consistent with our approach to collecting cases, historical cases have been sourced directly from the state of New York website.

Accessing the historical deaths has been slightly trickier. Consistent with other locations and the recommendations of the CDC, our reported deaths are the sum of confirmed and probable deaths.

Confirmed deaths at the borough level are available in the GitHub repository managed by the NYC Department of Health, specifically in the following file. Of note, this file appears to be revised often, so our data is current as of today (August 31), but these numbers are subject to change.

Probable deaths at the borough level are available through daily commits to this file. Within these commits, probable deaths in the “unknown” or “data pending” categories are being placed in the “Unassigned, New York” entry. Unfortunately, probable deaths at this granularity are only available beginning May 18th. To generate probable deaths prior to this date, we used the time series of total deaths in all of New York City presented on their GitHub in this file, together with the following equation:

Probable deaths = (Total deaths reported in New York City) - (Confirmed deaths at the borough level via the NYC Health GitHub)

These probable deaths have been placed into the “Unassigned, New York” category. Of note, on the day where the logic switches from relying on the two New York City time series files to relying on one of the time series files plus the daily commits, we see a jump of 462 deaths. We believe this is due to historical revisions to the borough-level time series that have not been made to the city total time series; the prior logic, by definition, produces total deaths equal to those reported in the city time series for that date. The strongest evidence for this is that the sum of deaths reported on May 19 using the second approach is greater than the number of deaths reported in the city time series file until May 26. That the totals converge the closer one gets to the present strongly suggests that this is due to deaths being redistributed back in time. As far as we can tell, this is the only way to obtain borough-level data for the entirety of the pandemic.
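As a rough illustration of the pre-May 18 equation, the residual for a single date could be computed as below. This is a minimal sketch, not the maintainers' actual code: the function name `probable_deaths_unassigned` is hypothetical and the counts are made-up numbers chosen only to show the arithmetic.

```python
def probable_deaths_unassigned(city_total, borough_confirmed):
    """Probable deaths for one date: city-wide total deaths minus the
    sum of confirmed deaths across boroughs (the equation above)."""
    return city_total - sum(borough_confirmed.values())


# Hypothetical counts for a single date, for illustration only.
borough_confirmed = {
    "Bronx": 100,
    "Kings": 150,
    "New York": 80,
    "Queens": 160,
    "Richmond": 30,
}
city_total = 600

# The residual is what would be placed under "Unassigned, New York".
unassigned = probable_deaths_unassigned(city_total, borough_confirmed)
print(unassigned)  # 80
```

Because the borough-level file is revised after the fact, a residual computed this way depends on which snapshot of each file is used.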

pull bot pushed a commit to bencefoldi-cicero/COVID-19 that referenced this issue Aug 31, 2020

@youyanggu youyanggu commented Sep 1, 2020

There is a small discrepancy in the naming, if you guys have time to fix (created Issue #3088):

time_series_covid19_confirmed_US.csv:

Admin2 = New York
Combined_Key = New York, New York, US

time_series_covid19_deaths_US.csv:

Admin2 = New York
Combined_Key = New York City, New York, US
Population = 8,336,817 (should be ~1,629,000)
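A discrepancy like this one can be spotted by comparing `Combined_Key` for the same `Admin2` across the two files. The sketch below is only an illustration: the two inline CSV snippets are reduced to the columns quoted above, with one row each taken from this comment, and the helper names are hypothetical. Keying on `Admin2` alone is a simplification; the full files would need a key that is unique across states.

```python
import csv
import io

# One-row stand-ins for the two files, using only the values quoted above.
CONFIRMED_CSV = """Admin2,Combined_Key
New York,"New York, New York, US"
"""
DEATHS_CSV = """Admin2,Combined_Key,Population
New York,"New York City, New York, US",8336817
"""


def keys_by_admin2(text):
    """Map each Admin2 value to its Combined_Key."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["Admin2"]: row["Combined_Key"] for row in reader}


def combined_key_mismatches(confirmed_text, deaths_text):
    """Return {Admin2: (confirmed_key, deaths_key)} where the keys differ."""
    confirmed = keys_by_admin2(confirmed_text)
    deaths = keys_by_admin2(deaths_text)
    shared = confirmed.keys() & deaths.keys()
    return {
        name: (confirmed[name], deaths[name])
        for name in shared
        if confirmed[name] != deaths[name]
    }


mismatches = combined_key_mismatches(CONFIRMED_CSV, DEATHS_CSV)
for name, (in_confirmed, in_deaths) in mismatches.items():
    print(f"{name}: {in_confirmed!r} vs {in_deaths!r}")
```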

Lucas-Czarnecki added a commit to Lucas-Czarnecki/COVID-19-CLEANED-JHUCSSE that referenced this issue Sep 2, 2020
Adjust data according to JHU's shift to reporting borough level data. See: CSSEGISandData/COVID-19#3084

@Rational-IM Rational-IM commented Sep 2, 2020

For those trying to find the boroughs of New York City, here they are (the official county names in brackets):

[Image: New York City boroughs with their official county names]
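For reference, the standard borough-to-county correspondence can be written as a small lookup table. This is general knowledge rather than a transcription of the image, and using the bare county name as the value to search for in the `Admin2` column is an assumption; `county_for` is a hypothetical helper.

```python
# Borough name -> official county name (standard correspondence).
BOROUGH_TO_COUNTY = {
    "Manhattan": "New York",
    "Brooklyn": "Kings",
    "Queens": "Queens",
    "Bronx": "Bronx",
    "Staten Island": "Richmond",
}


def county_for(borough):
    """Return the county name for a borough; raises KeyError if unknown."""
    return BOROUGH_TO_COUNTY[borough]


print(county_for("Brooklyn"))  # Kings
```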


@rgreasons rgreasons commented Sep 2, 2020

Do you anticipate providing historical Active or Recovered numbers for each Borough?

Lucas-Czarnecki added a commit to Lucas-Czarnecki/COVID-19-CLEANED-JHUCSSE that referenced this issue Sep 3, 2020
This commit addresses past daily reports that are missing values such as FIPS codes and geographical coordinates due to outdated Admin2 names; including, JHU's new approach to reporting New York cases at the borough-level (see CSSEGISandData/COVID-19#3084) as well as counties such as Williamson, Tennessee.