graph with information day-by-day #314
Comments
|
I can't write programs, but I really appreciate those who make graphics that help people understand trends. Could you do something like the ones in my link, but with a more realistic population than just 4000 people? See: https://www.washingtonpost.com/graphics/2020/health/coronavirus-how-epidemics-spread-and-end/ |
|
Hi @aster94 I was looking for this and it's great! I can even just copy it into an online python compiler like https://repl.it/languages/python3 and it works out of the box. Thanks a lot! |
|
Hi
I commented the line
Another remark: I tried to run the code for 'US' (many lines) and it failed;
maybe you have to groupby your data? On my side, I use the other folder, called "csse_covid_19_time_series"; perhaps it could give a quicker result? Good night |
|
@wingstonruballos these are very nice graphs/animations, but "realistic" is a very hard word in this context. Making a computer model of the spread of COVID-19 is beyond my ability; it would be like making a computer model of global warming (not so easy). @bigbenhur I am happy you found it useful; if you have any suggestions, please write here. @JiPiBi it is a problem with the Python module. I have reported it: porimol/countryinfo#6, and I am using the solution they proposed.
I think this could be due to the fact that
In my opinion the result would be similar; I chose to use the folder |
|
Coming back to the US issue: I now think it is in fact linked to the comma being used both inside the first field and as the separator for the whole CSV line (I got the same issue trying to import into Access). Example with US: ['"Travis', ' CA (From Diamond Princess)"', 'US', '2020-02-24T23:33:02', '0', '0', '0', '38.2721', '-121.9399'], i.e. 9 elements, 1 more than for other countries. Example with 'Italy': ['', 'Italy', '2020-03-08T18:03:04', '7375', '366', '622', '43.0000', '12.0000'], only 8. PS: I read your link about countryinfo and also applied the patch proposed for open, and it also worked, thanks |
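A minimal sketch of the comma problem described above, assuming the raw line looked like the one reconstructed here from the split list (the exact raw text is not shown in the thread). It compares a naive `split(",")` with Python's standard `csv` module, which respects the double quotes around the first field.

```python
import csv
import io

# Raw line reconstructed from the split example above (an assumption).
raw_line = (
    '"Travis, CA (From Diamond Princess)",US,2020-02-24T23:33:02,'
    "0,0,0,38.2721,-121.9399"
)

# Naive splitting cuts the quoted first field in two.
naive_fields = raw_line.split(",")

# csv.reader honours the quotes and keeps the first field whole.
parsed_fields = next(csv.reader(io.StringIO(raw_line)))

print(len(naive_fields))   # 9
print(len(parsed_fields))  # 8
print(parsed_fields[0])    # Travis, CA (From Diamond Princess)
```

Reading the file through `csv.reader` avoids special-casing any state or country name.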
|
@JiPiBi I think you're right about the comma in the 'Province/State' column. Also, 'US', 'Canada', 'Mainland China' are not unique entries in the 'Country/Region' column |
|
@davidacollins I had no issue using pandas with "csse_covid_19_time_series" in this way
and afterwards I used groupby to sum over the second field |
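The pandas snippet from this comment is not preserved. As a rough illustration of the idea only (summing all provincial rows that share a country), here is a pure-Python stand-in for a groupby-and-sum. The rows and their values are made up for the example, and `sum_by_country` is a hypothetical helper name.

```python
from collections import defaultdict

# Made-up rows for illustration: (province_state, country_region, confirmed).
rows = [
    ("Hubei", "Mainland China", 100),
    ("Guangdong", "Mainland China", 20),
    ("", "Italy", 50),
    ("Washington", "US", 7),
    ("New York", "US", 3),
]


def sum_by_country(rows):
    """Sum the confirmed count over all rows that share a country name."""
    totals = defaultdict(int)
    for _province, country, confirmed in rows:
        totals[country] += confirmed
    return dict(totals)


totals = sum_by_country(rows)
print(totals)
```

Countries with a single row (such as Italy here) pass through the same code path, so no special case is needed for them.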
|
Another remark about countryinfo: US is not recognized by Wikipedia; you must use United States. The same goes for Mainland China and China, so perhaps you should use a dictionary to translate the country names used in the CSV files into names that Wikipedia recognizes |
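A small sketch of the dictionary idea, assuming a plain mapping from the CSV name to the name the lookup expects. `CSV_TO_LOOKUP_NAME` and `lookup_name` are hypothetical names, and only the two pairs mentioned in this comment are listed.

```python
# Hypothetical translation table: CSV name -> name the lookup expects.
CSV_TO_LOOKUP_NAME = {
    "US": "United States",
    "Mainland China": "China",
}


def lookup_name(csv_name):
    """Return the translated name, or the CSV name unchanged if no entry exists."""
    return CSV_TO_LOOKUP_NAME.get(csv_name, csv_name)


print(lookup_name("US"))     # United States
print(lookup_name("Italy"))  # Italy
```

New mismatches can then be handled by adding one entry to the table instead of another `if`.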
|
I am without a computer right now, but you should be able to make a few fixes in the code. Try adding these lines. Fix for the comma: add this after data.append |
|
Below is the link to my code for processing the raw data into this format:
Type can be Infected, Recovered, Deaths or Current. Feel free to try it out. https://gist.github.com/qtangs/fd3948ac33a7c1ea6620d941269a04c8 |
|
Also see this to get latest day by day graph for all infected countries: https://www.coronavirusstats.live/ |
|
@aster94 In my opinion, it would be better to use a dictionary to deal with the countryinfo name issue, because it would avoid multiple ifs and could be changed as needed. For the same reason, for the comma issue, I think you could be less specific than filtering some states, and use a regular expression or test the length after splitting, e.g. if it equals 9, join the first two. But it is your code and you remain the boss of it :-) |
|
@aster94 To fix the comma in the field I used:
which combined the split columns and removed the unnecessary column, but left me a |
|
As suggested above, you could test the length after splitting on commas, and join the first two strings if it equals 9? |
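One way the length test could look, assuming a normal row has 8 fields as in the Italy example earlier in the thread. `repair_split_row` is a hypothetical helper, not code from the repo.

```python
EXPECTED_FIELDS = 8  # field count of a normal row in the earlier examples


def repair_split_row(fields):
    """Re-join the first two fields when a comma inside the first field split it."""
    if len(fields) == EXPECTED_FIELDS + 1:
        # Glue the two halves back together and drop the surrounding quotes.
        merged = (fields[0] + "," + fields[1]).strip('"')
        return [merged] + fields[2:]
    return fields


line = (
    '"Travis, CA (From Diamond Princess)",US,2020-02-24T23:33:02,'
    "0,0,0,38.2721,-121.9399"
)
repaired = repair_split_row(line.split(","))
print(len(repaired))  # 8
print(repaired[0])    # Travis, CA (From Diamond Princess)
```

This only covers a single extra comma in the first field; a row with two or more would still come out wrong, which is one reason a CSV parser may be the safer route.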
|
I tried that:
and it works for US with a dictionary, but it is not completely sufficient.
I think that in my code with pandas, as I keep the 'Province/State' column, I do not have that issue and I can filter by 'Province/State' (I also copy the 'Country/Region' into the 'Province/State' when the cell is empty, i.e. for the majority of countries for the moment). I will try |
|
@aster94 I like the idea of scoring how well a country is doing, and I will try to integrate something like that into my project ASAP. If you want, you can check out my repo; maybe it could be inspiring in some way for what you are doing too. My project is basically made of:
Please check out my repo here: And this is a link to the live webpage presenting all the meaningful aggregations for today: Please note that, the core file being a Jupyter Notebook, I also put a link on top of it allowing you to run the code live in Google Colab. Just click the icon on top of the Notebook and a Google Colab window will pop up, ready to execute |
|
Good day! Please check it and tell me if you run into any problems; I will try to solve them.
@JiPiBi I did as you proposed; can you check my solution and see if you have a better idea?
You were completely right, now I am filtering based only on the number of columns (if 7 or 9, it will join the first two). @r-lomba I am checking your repo, very interesting! It is not very nice to see how things are proceeding from the Italian point of view |
|
@aster94
That was the reason I proposed this strange r = [r[0][1:]+','+r[1][:-1]] +r[2:] to get rid of unnecessary characters, but if you have no trouble with it
As in pandas I sum even if there is only one line, I tried to simplify your code a bit:
It seems to work too |
|
Yes @JiPiBi, you are right, I need to write that down somewhere, otherwise it could mislead some people! |
|
@aster94 If possible, I would be interested to read your script to understand that possibility |
|
All the countries that are split into provinces don't seem to have accurate dates in the graph. |
|
@JiPiBi have a look at my repo, I published the Python script that does it (it needs very little human intervention). @PaFiK1999 I think that this is up to the maintainers of this repo |
|
@aster94 But as I'm not so familiar with GitHub's possibilities, and even if it is a stupid question, I dare to ask: do you run the code from your desktop or laptop and commit the result to GitHub, or do you run it directly on GitHub? |
|
Yes, I just run the file from my laptop; it creates the graph and pushes all the data to GitHub, making a commit |
|
@aster94 in the end I worked on the idea of "country scoring", and decided that I didn't want to calculate an arbitrary score using some formula or other. There are too many concurrent variables here and the result would very likely be a flawed score, in my opinion. But I did something else: starting from the collected samples, I implemented polynomial fitting on the data points. This is, in practice, a least-squares regression that finds the polynomial closest to the data samples. From this polynomial, I extract and draw its second derivative. This allows me to further calculate:
Inflection points especially are very interesting, and we know that they happen when the second derivative crosses zero. If it crosses zero heading upwards, the "original" polynomial trend is "increasing". If it crosses zero heading downwards, the trend is "decreasing". I have seen that this correctly captures trends that are not visible to the naked eye. Of course, nothing guarantees that these small trends are stable, and they could vary the next day, but they would still be captured correctly even "tomorrow", and above all they are a fact, not an arbitrary prediction. An example of such "advanced" charts that you can now easily produce using my code would be the following (here the trend is obvious and my approach is of course useless, but for many countries these days we are in a less obvious situation): As you told me you would be interested in my work on the scoring aspect of the problem, I wanted to send you an update :) My repo is here: |
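A rough sketch of the approach described above: fit a polynomial to the samples, take its second derivative, and report where that derivative changes sign. The notebook's actual code is not shown in the thread, so everything here is an assumption: the pure-Python least-squares fit (via the normal equations), the function names, and the synthetic input series, which is a known cubic rather than real case data.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ..., c_degree] of c0 + c1*x + c2*x^2 + ...
    """
    n = degree + 1
    # Normal equations A c = b: A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def second_derivative(coeffs):
    """Coefficients (ascending powers) of the polynomial's second derivative."""
    return [k * (k - 1) * c for k, c in enumerate(coeffs)][2:]


def evaluate(coeffs, x):
    """Evaluate a polynomial given in ascending-power order."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def inflection_days(coeffs, days):
    """Return (left_day, right_day, direction) where the 2nd derivative changes sign."""
    d2 = second_derivative(coeffs)
    crossings = []
    for left, right in zip(days, days[1:]):
        before, after = evaluate(d2, left), evaluate(d2, right)
        if before < 0 <= after:
            crossings.append((left, right, "upwards"))
        elif before >= 0 > after:
            crossings.append((left, right, "downwards"))
    return crossings


days = list(range(10))
# Synthetic cubic series (not real case counts); its 2nd derivative is 6*d - 15.
counts = [d ** 3 - 7.5 * d ** 2 + 30 * d + 5 for d in days]
coeffs = fit_polynomial(days, counts, degree=3)
print(inflection_days(coeffs, days))
```

For the synthetic series the second derivative crosses zero at day 2.5, so the sketch reports one upward crossing between day 2 and day 3. With real, noisy data the choice of polynomial degree matters a lot, and the normal equations become ill-conditioned for high degrees, so a library fitting routine would be the more robust choice.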
|
Thanks for sharing your approach @r-lomba, it is indeed very interesting and the trends seem to reflect the real world. Unfortunately, since this repo does not seem to follow a consistent practice for country names and data updates, making graphs out of it is becoming more difficult day after day |



Hello,
I was curious about the day-by-day COVID situation in my country, so I wrote a simple Python script that uses this repo to make a multi-bar graph. For every day there are 3 bars: blue for new cases, green for recovered and red for deaths.
Here is the output of the program; you may find it interesting:
I also made a rough attempt to create a "score" to measure how well a country is responding to the COVID emergency (the higher the better):


Basically the score is:
(daily recovered / all positives) / ((daily confirmed / (population of the country / 10000)) * (daily deaths / all positives))
Code and updated graph moved to a repo: https://github.com/aster94/COVID-19
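For readers who want to try the formula, here is a direct transcription into a function. The function and parameter names are hypothetical, and returning `None` when a denominator is zero (for example on a day with no deaths) is an assumption of this sketch; the thread does not say how the repo handles that case.

```python
def country_score(daily_recovered, daily_confirmed, daily_deaths,
                  all_positives, population):
    """Score from the formula above; higher is better.

    Returns None when the formula is undefined (a zero denominator).
    That choice is an assumption of this sketch, not behaviour from the repo.
    """
    if all_positives == 0 or population == 0:
        return None
    recovered_share = daily_recovered / all_positives
    confirmed_per_10k = daily_confirmed / (population / 10000)
    death_share = daily_deaths / all_positives
    denominator = confirmed_per_10k * death_share
    if denominator == 0:
        return None
    return recovered_share / denominator


# Made-up numbers, only to show the call.
print(country_score(daily_recovered=10, daily_confirmed=100, daily_deaths=5,
                    all_positives=1000, population=1_000_000))
```

Note that `all positives` cancels between numerator and denominator, so the score reduces to daily recovered / (daily confirmed per 10000 inhabitants * daily deaths).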