articles/data #59
Comments
Hello, very good job! We will use it soon for pedagogic and research purposes. By the way, I think the French data are not up to date: as of 13/05/2020, many restrictions (schools, transport, ...) have been lifted or relaxed in France for the past 3 days. Please have a look at: http://www.leparisien.fr/societe/coronavirus-dernier-jour-de-confinement-en-france-suivez-notre-direct-10-05-2020-8313930.php Best regards |
Hello, many thanks for your feedback! For policy measures, we are using the data from the "Oxford Covid-19 Government Response Tracker". Thanks! |
Hi Team, thanks for these data sources, they are really useful. I have one question regarding one of the sources: https://storage.covid19datahub.io/data-2.csv The number of tests is 0 for all states, and for Alabama the tests column seems to be correct until 31 January 2021 but incorrect from 1 February onwards. Could you please look into this? |
Hi @aman091291, thank you for your message! We are using the data from the U.S. Department of Health & Human Services. It seems that access to the data has changed and was returning only the first 1000 rows. I fixed this and now tests and hospitalizations should be back and up to date for all states (please allow about 1 hour for the workflow to complete). Please note that we stopped updating the pre-processed data. Switch to the raw data if you are still using the pre-processed files. E.g.: Thanks! |
Hi @eguidotti, I can see that the data are now updated in the raw file, but I am currently using the covid19dh package in Python with the statement below, which is still not picking up the correct data: df = covid19("USA", level = 2, start = date(2020,1,1), verbose = False). Could you please check whether the Python package refers to the correct raw data file? Thanks! |
Are you using covid19dh version 2.3.0? |
@eguidotti I am using version 1.0.0, as version 2.3.0 returns a tuple when loading the data: df = covid19("USA", level = 2, start = date(2020,1,1), verbose = False) Could you please suggest a way to load these data in case we are to use version 2.3.0? |
Yes the tuple returns both the data and the data sources. Please have a look at the description here https://pypi.org/project/covid19dh/ The following should work in v2.3.0: df, src = covid19("USA", level = 2, start = date(2020,1,1), verbose = False) Hope this helps! |
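For anyone whose code has to run against both package versions discussed above, here is a minimal sketch of one way to normalise the return value. The helper name `split_result` is hypothetical and not part of covid19dh; it only assumes what this thread states, namely that v2.3.0 returns a (data, sources) tuple while v1.0.0 returns the data alone.

```python
def split_result(result):
    """Normalise the return value of covid19() across package versions.

    Hypothetical helper, not part of covid19dh.
    """
    if isinstance(result, tuple):
        # v2.x behaviour described in the thread: (data, sources)
        data, sources = result
        return data, sources
    # v1.x behaviour described in the thread: data only, no sources
    return result, None


# Intended use (needs the covid19dh package and network access):
#   from datetime import date
#   from covid19dh import covid19
#   df, src = split_result(
#       covid19("USA", level=2, start=date(2020, 1, 1), verbose=False)
#   )
```

With v2.3.0 alone, the direct unpacking `df, src = covid19(...)` shown above is simpler; the helper is only useful when the installed version is not known in advance.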
Hi @MalteKurz, thanks for your feedback and for taking the time to double-check the vintage data. Unfortunately, some vintage containers are missing or incomplete due to technical errors that may have happened in the past. We do our best to promptly fix the pipeline in case of issues. However, it is not always possible to spot the problem and fix it within 24 hours. In those cases, the vintage data for the day are lost. We don't want to fill them retroactively, as the vintage data are guaranteed to be frozen snapshots of the data taken on that day. We should have coverage of around 99% of the days since we started, but unfortunately not 100%. |
Hello, thank you for your effort! It is very helpful for my essay. I want to confirm one thing: if I want to download the latest data for the US, should I download the raw data for administrative area level 2? |
Hello, thanks for your message! Yes you can download the raw data for US level 2 (state-wise data). Level 1 is for national-wide data, while level 3 is at county level. If you need to ensure reproducibility for your work you should instead download the latest available vintage file. This is a frozen snapshot of the data that will not change in the future. |
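As a rough illustration of the three levels described above, the sketch below builds a download URL per level. Only the level 3 URL (https://storage.covid19datahub.io/level/3.csv.gz) is quoted later in this thread; that levels 1 and 2 follow the same pattern is an assumption here, and `level_csv_url` is a hypothetical helper name. The level descriptions are the ones given for the US.

```python
# Base path taken from the level 3 URL quoted in this thread.
BASE_URL = "https://storage.covid19datahub.io/level"

# Level meanings as described for the US in the reply above.
LEVELS = {1: "national", 2: "state", 3: "county"}


def level_csv_url(level):
    """Return the assumed URL of the gzipped CSV for an administrative level."""
    if level not in LEVELS:
        raise ValueError(f"level must be one of {sorted(LEVELS)}, got {level!r}")
    # Assumption: levels 1 and 2 use the same naming as the quoted level 3 file.
    return f"{BASE_URL}/{level}.csv.gz"


if __name__ == "__main__":
    for lvl, label in LEVELS.items():
        print(lvl, label, level_csv_url(lvl))
```

These files change over time; for reproducible work the vintage files remain the better choice, as noted above.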
Thank you so much!
Best wishes
Zhihui
|
Hello,
I would like to know whether vaccination data for the U.S. could be made available from December 14, 2020 😭😭? The values are empty for that period.
Best wishes
Zhihui Yang
|
Hi @Zhihui123-123, I'm afraid it's not possible or, at least, I'm not aware of data providers for the number of vaccines since December 14, 2020. Have you seen these data somewhere else? |
Hello,
Thank you. No, I have not seen these data anywhere else.
Best wishes
Zhihui
|
Hi,
Is there any chance you could fix this problem? Also, I think your vintage datasets after 2021-11-14 have a one-day lag. That is, for example, the vintage dataset 2021-11-30 contains the 2021-11-29 data but not 2021-11-30. I think this is a matter of choice and nothing problematic. Can you confirm this? Best, |
Hi @yasin-simsek, thanks for your message.
Unfortunately there are (very) few vintage datasets that are missing due to technical issues and it is not possible to backfill them. Please see #59 (comment)
Yes, this is due to reporting delays. After 14 November 2021, the date of the vintage dataset is the date when the snapshot was taken. Typically, the data available on day [...] Before 14 November 2021, the vintage data were generated with a delay of 48 hours to make sure all the observations are complete (so that the dataset at time [...]) |
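The dating convention after 14 November 2021 can be sketched as below. This covers only what the thread states: a vintage dated after the cutoff carries the snapshot date, and its latest observations are typically one day older (the 2021-11-30 vintage containing data up to 2021-11-29). The rule for earlier vintages is not fully recoverable from the thread, so the sketch returns None for them. The function and constant names are hypothetical.

```python
from datetime import date, timedelta

# Cutoff quoted in the thread: vintages dated after it carry the snapshot date.
SNAPSHOT_DATING_START = date(2021, 11, 14)


def typical_last_observation(vintage_date):
    """Return the typical last observation date in a vintage dataset.

    Returns None for vintages on or before the cutoff, because the exact
    rule for those (a 48-hour generation delay) is not spelled out here.
    """
    if vintage_date <= SNAPSHOT_DATING_START:
        return None
    # Reporting delays: the snapshot typically ends one day before its date.
    return vintage_date - timedelta(days=1)
```

This is a rule of thumb only ("typically"), so checking the maximum date in the downloaded file is still the reliable test.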
Hi, @eguidotti, thanks to you and the Team for a great dataset! This is my first time using the dataset, and I thought I followed the instructions correctly: I downloaded the “LatestData” .gzip, unzipped it to a .csv file and then read it into R with data.table. Reading the structure of the data indicated there were approximately 2.6 million records. I then tried to construct an epicurve using the dataset based on the date_confirmed variable but unexpectedly found that the last confirmed_date data point was in June 2020. Is that correct? My impression was that the data project has been collecting public data points on diagnosed cases from the start of the pandemic until now (2022). Should I be using another variable in the dataset or try downloading another dataset file? I appreciate any assistance you can provide. Thank you :-) |
Hi @SugarRayLua, thanks for giving the dataset a try! Which data file are you using? Because as of today (28 April 2022) I see that:
Anyway, I suggest using the package directly. You can install the package:
install.packages("COVID19")
And import the data:
library(COVID19)
x <- covid19(level = 1) # -> national level. Use 2 or 3 for finer-grained data
That's correct. Typing max(x$date) I get |
Thank you very much, @eguidotti. I did get a more complete range of data points when I loaded the dataset with the COVID19 package as you suggested. It is also helpful that the COVID19 package converts the date columns (which seem to be strings in the raw dataset) to date format, which means one can feed the dataset directly into the incidence or incidence2 R packages. The only thing I preferred about the raw dataset is that it gave me data in line-listing format; it seems that the COVID19 package instead gives me summary data for each day. Is that correct? |
No, it's not correct. The two formats should be the same (except for type conversion upon import). Can you provide the link to the "raw data" you are using? I don't understand which file it is |
@eguidotti, from this website where I’m posting the comments: |
(The COVID19 package is not giving me that granular information. One caveat that might be key: I wasn't able to download the level = 3 data with the COVID19 package [memory error?], whereas I could when I used data.table directly to read in the raw data file. Perhaps the level = 3 data contain the line-listing information.) |
There is no "line listing" information and there are no symptom dates in the database. There is only one date per day with the corresponding variables. I can't quite understand which file you are using (?) With the following code, you can read the level 3 data (highlighted in the picture) in R:
library(data.table)
x <- fread("https://storage.covid19datahub.io/level/3.csv.gz")
Maybe you are trying to unzip and read the file https://storage.covid19datahub.io/latest.db.gz ? This is a SQLite file. It must be read with SQLite. It is not a CSV file that can be read with data.table. Hope this helps! |
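To tell the two kinds of download apart before parsing, one option is to check the file header: a SQLite database starts with the 16-byte string "SQLite format 3" followed by a NUL byte. The sketch below (in Python rather than R, with the hypothetical helper name `is_gzipped_sqlite`) assumes a locally downloaded .gz file.

```python
import gzip

# Every SQLite 3 database file begins with this 16-byte header.
SQLITE_MAGIC = b"SQLite format 3\x00"


def is_gzipped_sqlite(path):
    """Return True if the gzip file at `path` decompresses to a SQLite database."""
    with gzip.open(path, "rb") as fh:
        # Only the first 16 decompressed bytes are needed for the check.
        return fh.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
```

If the check returns True, the file should be decompressed to disk and opened with a SQLite client (for example Python's `sqlite3.connect`) rather than a CSV reader.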
Thanks, @eguidotti, for all your help, and sorry for the confusion. I'll look into this more tonight and get back to you. I was also reviewing a Swiss COVID19 dataset, so I may have inadvertently mislabeled the datasets I was reading. |
@eguidotti, I figured it out: I got the COVID-19 Data Hub confused with the data from the Open COVID-19 Data Working Group, which is described at the following site: https://github.com/beoutbreakprepared/nCoV2019 Their group has line list data from the COVID-19 pandemic available for download, and their download file is labeled “latestdata.csv”, which I confused with the dataset I downloaded from this COVID-19 Data Hub site. It was actually the Open COVID-19 Data Working Group data site that didn't seem to have updated data and to whom I should have directed my initial question. As you can see from the structure of that database, the Open COVID-19 Data Working Group has variables which store symptom onset: 'data.frame': 2676311 obs. of 33 variables: I apologize for confusing the two datasets. Have a good day and upcoming weekend :-) |
Thanks for the update @SugarRayLua! Best, Emanuele |
Any sources for a shapefile for the boundaries of the admin level 2(county) shapes? |
You can download shapefiles from GADM and merge with this database via the key_gadm variable. A short description of key_gadm is provided in the online documentation. More details can be found in the paper https://www.nature.com/articles/s41597-022-01245-1 Hope this helps! |
Thank you!
Have a good week.
Sincerely,
Mike
|
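For the GADM merge suggested above, a minimal sketch of the join step follows. It assumes the shapefile's attribute table has already been read into a list of dictionaries and that one of its columns holds the GADM identifier matching key_gadm; the column name `GID` and the helper `attach_by_key_gadm` are hypothetical, and the sample values are made-up placeholders rather than real identifiers.

```python
def attach_by_key_gadm(rows, gadm_records, gadm_id_field="GID"):
    """Left-join Data Hub rows to GADM attribute records on key_gadm.

    `gadm_id_field` is a hypothetical column name; use whichever GADM
    identifier column matches the administrative level of the rows.
    """
    # Index the GADM records by identifier for constant-time lookup.
    index = {rec[gadm_id_field]: rec for rec in gadm_records}
    joined = []
    for row in rows:
        # Rows without a matching (or any) key_gadm get gadm=None.
        match = index.get(row.get("key_gadm"))
        joined.append({**row, "gadm": match})
    return joined


if __name__ == "__main__":
    # Placeholder values for illustration only.
    rows = [{"id": "a", "key_gadm": "AAA.1_1"}, {"id": "b", "key_gadm": None}]
    shapes = [{"GID": "AAA.1_1", "NAME": "Example region"}]
    for item in attach_by_key_gadm(rows, shapes):
        print(item["id"], item["gadm"])
```

A left join is used so that rows without a matching shape are kept and can be inspected instead of silently dropped.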
Thank you for this great work. How can we download Iraqi COVID-19 data at the governorate level (level 2)? |
There are no data for Iraq at level 2 in the database. Do you know where to find them? I would be interested in adding them |
Thanks for the data. I use them in my research. |
@doctorprog55 Thank you for your message! This is fixed now |
Data • COVID-19 Data Hub
https://covid19datahub.io/articles/data.html