Changes in U.S. Reporting #382

Open
CSSEGISandData opened this issue Mar 10, 2020 · 59 comments

Comments

Owner

@CSSEGISandData CSSEGISandData commented Mar 10, 2020

In light of the increasing rate of cases being reported domestically in the U.S., and in order to retain timeliness and accuracy, we have switched from reporting at the county level to state level.

@mw32 mw32 commented Mar 10, 2020

Can't this be made dynamic? Zoom level would be one way to do this, as discussed in #363.

@MuffleKerfuffle MuffleKerfuffle commented Mar 10, 2020

Granularity might trump accuracy, IMHO. If the bulk of the pings on this site are gen-pop related, I doubt they are just wondering about their state. For instance, TX is larger than some countries, so total cases in TX isn’t really helpful when assessing risk levels.
Maybe the prep changes for gen pop will change slightly if immediate local cases are filed.
For instance, I might not travel to Austin if I know there are 30 cases in the vicinity. The detail also gives ammunition for reasoning as it relates to events and business meetings.

Owner Author

@CSSEGISandData CSSEGISandData commented Mar 10, 2020

Please stand by as the team is discussing this topic and the steps moving forward.

@JHEASTON JHEASTON commented Mar 10, 2020

County, city, and any level of granularity are very important for this issue. Perhaps use a category of "pending" while the location is being determined, then, once it is determined, list the case under the appropriate county/city.

@sethdeckard sethdeckard commented Mar 10, 2020

Thank you for reconsidering this, the granularity would be much appreciated for those of us in larger states.

@BrandonRCopeland BrandonRCopeland commented Mar 10, 2020

Very much agreed that granularity is critical here for the US. As stated before, states like CA and TX are larger than many countries, and knowing the granular cases by county is of the utmost importance. It's worth a slight delay so that we could have the granular county data as well for the US.

@jwa5426 jwa5426 commented Mar 10, 2020

I’m here to voice my support for what others have said. Displaying cases by state is not helpful to the vast majority of the public who aren’t flying or otherwise traveling across state borders. By allowing finer granularity, members of the public can make informed decisions regarding their risk level during their day-to-day life - going to work, shopping, visiting friends, etc.

@Korywon Korywon commented Mar 10, 2020

I agree. Harris County alone has a massive population compared to the rest of Texas. As an example, it would take you literal days to drive across Texas. A lot of people that I know in Harris County rely on that specific information to make decisions. I personally don't see how that contributes to accuracy as it just becomes part of a bigger number. It makes it really difficult to discern where the cases actually lie.

@myanaros myanaros commented Mar 10, 2020

Dallas County Texas here, was disappointed to see the granularity disappear hours after seeing the first case in my area pop up.

Thank you to this whole team for the awesome work being done to help keep people in the know.

@pnisita pnisita commented Mar 10, 2020

There are some states that have counties that are larger than other entire states, and not having that local data makes the data virtually useless. For instance, I live on Long Island, and if our results are grouped with Albany or Buffalo, it does us no good at all.

@xtronaltic xtronaltic commented Mar 10, 2020

For the people of America, please change it back.

@rdfedor rdfedor commented Mar 10, 2020

This change will hurt many efforts by individuals to stay informed about areas they should avoid. Instead of having the granularity to see which counties have reports, the map now shows which states to avoid, even though many parts of those states or countries are unaffected. I have a daughter with Cystic Fibrosis who will be visiting me this summer, and because of the loss of granularity, even though I live in Texas, which has a small number of cases thus far, the map makes the case that the entire state is a hazard to be avoided rather than giving me the information I need to avoid the areas that have active outbreaks.

As others have stated, there are ways to aggregate the data at particular zoom levels to avoid the performance degradation of showing every point, and then show the county level once you pass a particular zoom. This would be a better implementation, resolving the performance impact without losing visibility into the useful information that many people come to this map for.
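
To make the zoom-based idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the record layout (`state`, `county`, `confirmed`), the function name `points_for_zoom`, and the threshold value are all hypothetical and are not taken from the dashboard or its API.

```python
from collections import defaultdict

# Hypothetical cutoff: at or above this zoom level, show individual counties.
COUNTY_ZOOM_THRESHOLD = 6


def points_for_zoom(records, zoom, threshold=COUNTY_ZOOM_THRESHOLD):
    """Return the map points to draw for a given zoom level.

    `records` is a list of dicts with 'state', 'county' and 'confirmed'
    keys (an assumed layout, chosen for this sketch only).
    """
    if zoom >= threshold:
        # Zoomed in: one point per county.
        return [
            {"label": f"{r['county']}, {r['state']}", "confirmed": r["confirmed"]}
            for r in records
        ]
    # Zoomed out: collapse the county records into one point per state.
    totals = defaultdict(int)
    for r in records:
        totals[r["state"]] += r["confirmed"]
    return [{"label": s, "confirmed": n} for s, n in sorted(totals.items())]
```

The county records stay the single source of truth, and the state totals are derived on demand, so nothing is lost when the view is zoomed out.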

@aleksandar-jovicic aleksandar-jovicic commented Mar 10, 2020

Does finer granularity increase the effort spent collecting data, or is the change just about visual representation? Or, let me rephrase the question: are you still collecting data at the county level and just deciding to aggregate it for presentational purposes, or does your source only provide data at the state level?

Is it possible to provide another feed containing data at a fine-grained level and do dynamic aggregation for the presentation level in the existing API call?

@CorwinOA CorwinOA commented Mar 10, 2020

Wanted you all to know how much we all value and respect your work, especially given how this started and grew and the manual nature of some of your work.

I’ll admit that state-by-state data doesn’t have the same impact on me, both as someone with a public safety history and as an observer. I hope you’re able to bring back county level reporting or find a way to crowd source some of the labor that was causing you to shift towards state level reporting.

If it’s only going to be state from here on out, I suggest you change the map to state shading and color coding to represent volume. As an example, your change now makes it appear that my county accounts for my entire state’s caseload. Needless to say that was a shock when I flipped it open this morning.

Seriously, you all are awesome for doing this and sharing it!

@BickieSmalls BickieSmalls commented Mar 10, 2020

Can we please include the county data available in the repo even if the main dashboard does not reflect county data anymore?

@SchlittDataSci SchlittDataSci commented Mar 10, 2020

Chiming in here, this has been an amazing resource, but in the long run people are going to need to treat this like the weather for granular, local, risk based decision making. For example, "if I'm in a high risk group, are there sufficient cases in such county that I should reconsider activities there"

The state-wide data is going to be a bit less useful to modelers and analysts, but vastly less useful to regular individuals trying to plan their days safely in the coming months as this saturates the map.

@MickMickle MickMickle commented Mar 10, 2020

I fully agree with the need to display cases down to at least county level in the U.S. I was very disappointed yesterday when that information was consolidated into single dots in each state, seemingly randomly placed within the states, even though the size of the dots still did reflect the magnitude of cases. I was hoping that the loss of granularity was just a symptom of the site being overloaded. So I was instantly motivated to find another tracking site. Let's face it: yours is the best, but this state-level only display is a significant degradation.

Collecting, entering, and displaying all of the case location and status data must be an enormous and laborious undertaking -- I can't even fathom it, and I can't fully express how appreciative I am -- Thank you! But if you are actually still entering the county location, please continue to have the map display those locations by county. The people of America do need that information for all the reasons already given by others.

If you are concerned that the map display becomes too complex for the viewer if it has too many dots and circles, don't worry about that. Users of your dashboard will figure it out -- it's very intuitive. If it's just too labor intensive, tell us what we can do to help. Donations? Publicity?

@jawz101 jawz101 commented Mar 10, 2020

All I know is I would like to at least have the wiki open to list any official data sources for any geographically formatted data feeds people find for anywhere in the world (i.e., not copies of copies or user-maintained feeds).

@jocooper43016 jocooper43016 commented Mar 11, 2020

Please stand by as the team is discussing this topic and the steps moving forward.

Please please please bring back at least county granularity

@JeremyIglehart JeremyIglehart commented Mar 11, 2020

Pennsylvania here - I appreciate all of the hard work you are doing and understand the decision you have made. I must admit, however, that - I will no longer be using this to check what is important to me. I am willing to donate money to your cause if it meant being able to go back to the county level.

Although Pennsylvania is not as big as Texas, it's still quite a large territory, taking several hours to drive from east to west or vice versa. It's much more important to me to know what is going on more directly around me. Much of my family lives in Pennsylvania, but everyone is in a different county. I now have no idea how to check on how things are going at a level of detail that actually helps me. As a result, I will most likely stop using this tool :(

My county has a detailed map also using Esri - you may be able to use links like this to help achieve state-level or even more granular than county level for the counties that have gone to the effort.

In summary, please consider:

  • Adding county level back.
  • Taking donations.
  • Adding links to sites like my county has provided where it makes sense to.

@jheasley322 jheasley322 commented Mar 11, 2020

Particularly in light of the geographic distance between cases (Southern and Northern California being hundreds of miles apart, as one example), the ability to cluster based on county data is crucial. I am happy to provide some ETL resources if needed. If not in the main file, perhaps a supplemental consolidated file could give us the info we need. Thanks!

@matthewrj matthewrj commented Mar 13, 2020

Where does the county level data come from? I can't find it in any of the listed data sources.

@rajrao rajrao commented Mar 13, 2020

Please bring back county level info for the US. State level just is not granular enough.

@cwacht cwacht commented Mar 13, 2020

It seems like the county/city data has been coming from the individual state websites.

Click the states on the map on the CDC website
https://www.cdc.gov/coronavirus/2019-ncov/cases-in-us.html#reporting-cases

Then navigate the state website to find where they are reporting county/city level data.

For example
New York: https://health.ny.gov/diseases/communicable/coronavirus/
Washington: https://www.doh.wa.gov/Emergencies/Coronavirus

California seems to require you to get the data from the individual county/city websites:
San Francisco: https://www.sfdph.org/dph/alerts/coronavirus.asp
Alameda: http://www.acphd.org/2019-ncov.aspx

This could explain why collecting county level data on a daily basis is no longer viable. Maybe we can split up the work?

@piccolbo piccolbo commented Mar 13, 2020

The way the county level was included led to incorrect summaries. It has to be crystal clear which entries are sums of other entries, if any. If you want to have state level and county level in the same file, you have to have a county field and a state field, or a name field and an entity-type field. You can't expect people to apply regex trickery, look for commas or whatnot, and get it right all the time.
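
A minimal Python sketch of that layout, for illustration only: the column names (`state`, `county`, `level`, `confirmed`) and the numbers are made up for this example and do not come from the repository's files. With an explicit entity-type column, a total can be computed from one level at a time, so state rows are never added on top of the county rows they already contain.

```python
import csv
import io

# Made-up sample: each row says explicitly whether it is a state or county row.
SAMPLE = """\
state,county,level,confirmed
Texas,,state,5
Texas,Harris,county,3
Texas,Dallas,county,2
Ohio,,state,4
"""


def total_confirmed(rows, level):
    """Sum confirmed cases over rows of a single entity type only."""
    return sum(int(r["confirmed"]) for r in rows if r["level"] == level)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
national_total = total_confirmed(rows, "state")    # state rows only
county_reported = total_confirmed(rows, "county")  # county rows only
```

Summing every row regardless of `level` would double count Texas, which is the kind of incorrect summary described above.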

@wldflwr wldflwr commented Mar 13, 2020

Thank you for all your hard work, and dedication towards keeping us informed! I am troubled by the most recent change...
I’d like to see the USA grand total return to the left window pane; selecting the country in the window pane was more user friendly for keeping track of overall confirmed, deaths, recovered and active cases in the USA. I did also appreciate the county-wide mapping, but understandably, as the virus spreads, it would become challenging to select the button for county info on the map. I think a single dot on each state is fine, but when the red dot is selected, perhaps an additional window could appear in the right window pane detailing individual counties. At this time no data appears in the left window pane anymore; I'm hoping you return the USA total to the left window as before.
Thank you again for all that you are doing.

@becare-rocket becare-rocket commented Mar 13, 2020

My hack is to use the larger of the state based data or the county/city based data so as to have a long yet continuous time series.
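
One way this hack could look in Python, as a rough sketch: it assumes the county/city rows have already been summed per state and per day, and that both series are plain dicts keyed by ISO date strings. The function name `merge_series` is hypothetical.

```python
def merge_series(state_series, county_series):
    """Combine two date -> count series by taking the larger value per day.

    Both inputs are dicts keyed by ISO date strings (YYYY-MM-DD), which
    sort chronologically as plain strings. A date missing from one series
    counts as 0 there, so the other series fills the gap.
    """
    dates = sorted(set(state_series) | set(county_series))
    return {
        d: max(state_series.get(d, 0), county_series.get(d, 0))
        for d in dates
    }
```

Taking the maximum bridges the switch from county-level to state-level rows, but it is a heuristic: it does not reconcile days on which the two sources genuinely disagree.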

@DavidGeeraerts DavidGeeraerts commented Mar 13, 2020

If the state websites (e.g. Washington State) would just use TABLE HTML tags, it would be very easy to use HTML tools to automate getting county level data. I've put in a request to WDOH to fix their website, but I've not heard back. Before they changed the website layout, they were using TABLE tags, which made it really easy to automate getting the data.
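
To show why TABLE markup makes automation easy, here is a minimal sketch using only Python's standard-library `html.parser`. The HTML snippet, the county figures, and the names `TableParser` and `parse_table` are invented for illustration; this is not the WDOH page or anyone's actual scraper.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return table rows as lists of cell strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Invented example markup, not real data.
SAMPLE_HTML = (
    "<table>"
    "<tr><th>County</th><th>Cases</th></tr>"
    "<tr><td>King</td><td>10</td></tr>"
    "<tr><td>Pierce</td><td>2</td></tr>"
    "</table>"
)
table_rows = parse_table(SAMPLE_HTML)
```

When the same numbers are laid out with generic containers instead of table tags, a scraper has to rely on class names or text patterns, which tend to break whenever the page layout changes.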

@mpfriesen mpfriesen commented Mar 13, 2020

If the state websites (e.g. Washington State) would just use TABLE HTML tags, it would be very easy to use HTML tools to automate getting county level data. I've put in a request to WDOH to fix their website, but I've not heard back. Before they changed the website layout, they were using TABLE tags, which made it really easy to automate getting the data.

Yes, I was trying to figure out yesterday why my Python scraper stopped working there. Thanks for bugging them.

@longsyntax longsyntax commented Mar 13, 2020

@DavidGeeraerts @mpfriesen Are you guys already working on scraping county-level data? We're trying to consolidate efforts for a community-maintained repo that has county level data over at #558

@DavidGeeraerts DavidGeeraerts commented Mar 13, 2020

@longsyntax The Washington Counties are centrally reporting to Washington Department of Health, so all the County data is available from WDoH. I'm maintaining my own dashboard COVID-19-Dashboard. If there's a distributed effort to maintain County level data for the States, I'm game for doing it for Washington State.

@mpfriesen mpfriesen commented Mar 13, 2020

@longsyntax I'm maintaining a page here for The Oregonian: https://projects.oregonlive.com/coronavirus/. I can do county-level for Oregon.

@MickMickle MickMickle commented Mar 14, 2020

The FAQ dated March 13 for the map explains that they plan to go back to county level sometime:

Why does the map report only state-level data in the United States instead of county-level data? In light of the increasing rate of cases being reported in the United States and worldwide, and in order to retain timeliness and accuracy, the map switched from reporting at the county level to the state level on March 10. The team expects to return to county-level reporting once it feels confident the platform can provide the most accurate, timely reports from local jurisdictions as the virus rapidly advances.

Why is a point on the map located on my city or neighborhood?
All points shown on the map are based on geographic centroids, and are not representative of a specific address, building or any location at a spatial scale finer than a city. Click on each point on the map to obtain information associated with each reported case. When the map is reporting state-specific data, the points are located in the center of each state. When the map is reporting county-specific data, the points are placed precisely at the geographic center for those jurisdictions.

@MarkMMullin MarkMMullin commented Mar 15, 2020

I've read the FAQs and appreciate the issue - with increasing spread comes a gut busting # of lines to represent detail data within the states. That said, unless JHU is calling that level of data suspect, I hope this level of detail returns. Pure state level modeling is a little grainy.

@reyemtm reyemtm commented Mar 15, 2020

For those wanting county level data, the University of Virginia is posting county level data; it just needs to be parsed out. http://nssac.bii.virginia.edu/covid-19/dashboard/ I believe the county numbers are just for confirmed cases. The counties are also missing Census codes or lat/lngs. For example:

name,Region,Last Update,Confirmed,Deaths,Recovered
Ohio,USA,2020-03-15 03:00:00 * CTY: Cuyahoga 11; Butler 4; Stark 3; Summit 2; Trumbull 2; Belmont 2; Lucas  1; Franklin 1; Lorain 1; Tuscarawas 1,28,0,0

If anyone creates a JSON-parsed version of this data, please post a link or conversion function.
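
A possible conversion function, sketched in Python with only the standard library. It assumes every row follows the same shape as the Ohio example above, with the county list embedded in the `Last Update` field after a `* CTY:` marker and entries separated by semicolons; rows that deviate from that shape would need extra handling. The function name `parse_row` and the output key names are choices made for this sketch.

```python
import csv
import io
import json

# Header and row copied from the example above.
SAMPLE = (
    "name,Region,Last Update,Confirmed,Deaths,Recovered\n"
    "Ohio,USA,2020-03-15 03:00:00 * CTY: Cuyahoga 11; Butler 4; Stark 3; "
    "Summit 2; Trumbull 2; Belmont 2; Lucas  1; Franklin 1; Lorain 1; "
    "Tuscarawas 1,28,0,0\n"
)


def parse_row(row):
    """Split the embedded county list out of the 'Last Update' field."""
    last_update, _, county_part = row["Last Update"].partition("* CTY:")
    counties = {}
    for item in county_part.split(";"):
        item = item.strip()
        if not item:
            continue
        # The count is the last whitespace-separated token; the rest is
        # the county name (which may itself contain spaces).
        name, count = item.rsplit(None, 1)
        counties[name] = int(count)
    return {
        "name": row["name"],
        "region": row["Region"],
        "last_update": last_update.strip(),
        "confirmed": int(row["Confirmed"]),
        "deaths": int(row["Deaths"]),
        "recovered": int(row["Recovered"]),
        "counties": counties,
    }


parsed = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
as_json = json.dumps(parsed, indent=2)
```

For the Ohio row, the parsed county counts add up to the state's `Confirmed` value, which makes a handy sanity check when running this over a full file.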

@DavidGeeraerts DavidGeeraerts commented Mar 15, 2020

See ticket #558
The data csv files will be posted here:
http://blog.lazd.net/coronadatascraper/

@JimBudde JimBudde commented Mar 16, 2020

I haven't read all the comments here, but here's my 2c on keeping stats at the state level:

  1. I completely appreciate the amount of work needed to capture all this information, so adding county level detail without some level of automation seems daunting.
  2. I think it is more important to get as accurate a number as possible in as timely a fashion as possible.
  3. I believe there are public health risks in announcing stats at too granular a level at this point in time. We need containment, and if people think there are zero or very few cases in their area of interest, it invites complacency.
  4. I can't imagine state public health agencies aren't tracking things at their own county/city level so they can manage resources, etc., so adding them here is really just for "us".

@reyemtm reyemtm commented Mar 17, 2020

Point 4 is a dangerous sentiment; public transparency should be the goal here. Also, we already have mass complacency, so it seems doubtful anything could make it worse. And as far as accuracy, this data is valid in terms of reported, symptomatic cases, but hardly accurate due to extreme lack of testing of asymptomatic carriers. For personal use I may just start tracking my own state from their daily reports and use this data as a sort of reference.

@JimBudde JimBudde commented Mar 17, 2020

Re: point 4. I agree with transparency, but I have been in the middle of major events, and while you want to get as much info as possible out to the public/communications group, your main goal is to return to normal operations as quickly as possible, not to keep a website up to date with data. Hence, groups from around the globe have stepped in.

And as far as accuracy, this data is valid in terms of reported, symptomatic cases, but hardly accurate due to extreme lack of testing of asymptomatic carriers

I don't know what 'accurate' means. There will always be an unknown number of infected people; most of us with a regular cold, or even the flu, just stay home, nurse ourselves back to health, and are never tested.

@reyemtm reyemtm commented Mar 17, 2020

A more accurate count of infections will give higher confidence in the death rate. This can only come about if asymptomatic people or people with mild symptoms are tested. That is what I mean by accurate. But in terms of the discussion at hand, if validity and expediency are our main concerns, I can see the argument for omitting the county data.

@JHEASTON JHEASTON commented Mar 23, 2020

Thank you Thank you Thank you!

@JeremyIglehart JeremyIglehart commented Mar 23, 2020

It's so good to see county level data back today, THANK YOU!!!
