# China's COVID-19 situation
*March 29, 2022*

China has been dealing with a surge in COVID-19 cases, and Shanghai is facing another lockdown. Let's do some visualizations for stories on this. First, import pandas and read in Johns Hopkins data.

In [21]:
import pandas as pd

raw = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv")

raw.head()

,Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,...,3/25/22,3/26/22,3/27/22,3/28/22,3/29/22,3/30/22,3/31/22,4/1/22,4/2/22,4/3/22
0,,Afghanistan,33.93911,67.709953,0,0,0,0,0,0,...,177321,177321,177520,177602,177658,177716,177747,177782,177803,177827
1,,Albania,41.1533,20.1683,0,0,0,0,0,0,...,273318,273387,273432,273432,273529,273608,273677,273759,273823,273870
2,,Algeria,28.0339,1.6596,0,0,0,0,0,0,...,265612,265621,265629,265641,265651,265662,265671,265679,265684,265691
3,,Andorra,42.5063,1.5218,0,0,0,0,0,0,...,39713,39713,39713,39713,39713,40024,40024,40024,40024,40024
4,,Angola,-11.2027,17.8739,0,0,0,0,0,0,...,99102,99106,99115,99115,99138,99138,99169,99194,99194,99194


### China situation

We'll start by filtering the table down to what we're interested in here: the rows for China, with the province name and the two most recent days on record.

In [22]:
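# .copy() gives us an independent table, so adding a column later doesn't trigger a SettingWithCopyWarning
china = raw.loc[raw["Country/Region"] == "China", ["Province/State", "4/2/22", "4/3/22"]].copy()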

Because the JHU data counts cumulative confirmed cases, we subtract the previous day's total from the latest day's total to get new cases, then keep just the province name and that new column.

In [23]:
china["new_cases"] = china["4/3/22"] - china["4/2/22"]

# Assign the selection back so the clipboard copy only contains these two columns
china = china.loc[:, ["Province/State", "new_cases"]]

china.to_clipboard()

china

,Province/State,new_cases
59,Anhui,4
60,Beijing,4
61,Chongqing,0
62,Fujian,19
63,Gansu,0
64,Guangdong,13
65,Guangxi,1
66,Guizhou,2
67,Hainan,6
68,Hebei,3


I can take this and map it using Datawrapper (check out the result [here](https://www.datawrapper.de/_/E9WZb/)).
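
The column names above are hardcoded to specific dates, so the cell has to be edited each time the data updates. Here is a minimal sketch of the same subtraction that picks the last two date columns automatically. The table `toy` and its numbers are made up for illustration; only the layout (one row per province, one column per date) mirrors the JHU file.

```python
import pandas as pd

# Made-up cumulative counts in the same wide layout as the JHU file.
# These numbers are illustrative only, not real data.
toy = pd.DataFrame({
    "Province/State": ["A", "B", "C"],
    "4/1/22": [10, 5, 0],
    "4/2/22": [12, 5, 3],
    "4/3/22": [15, 9, 3],
})

# Everything except the label column is a date; the last two columns
# are the most recent day and the day before it.
date_cols = [c for c in toy.columns if c != "Province/State"]
previous_day, latest_day = date_cols[-2], date_cols[-1]

# New cases = latest cumulative total minus the previous day's total.
new_cases = toy[["Province/State"]].copy()
new_cases["new_cases"] = toy[latest_day] - toy[previous_day]

print(new_cases)
```

This assumes the date columns are already in chronological order from left to right, which is how they appear in the `raw.head()` output above.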

### Shanghai situation

Now let's check out the situation in Shanghai over time. We'll start by transposing the table so that dates are the index. We also drop the non-date rows (province, country, latitude and longitude) left over from the transpose.

In [24]:
shanghai = (raw
    .loc[raw["Province/State"] == "Shanghai", :]
    .transpose()
    .drop(["Province/State", "Country/Region", "Lat", "Long"])
    )

shanghai

,84
1/22/20,9
1/23/20,16
1/24/20,20
1/25/20,33
1/26/20,40
...,...
3/30/22,6089
3/31/22,6454
4/1/22,6716
4/2/22,7160


Now we convert the index to datetime.

In [25]:
shanghai.index = pd.to_datetime(shanghai.index)
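
The conversion above lets pandas infer the date format. As an optional variation, the format can be spelled out so there is no guessing about month/day order. This is a small sketch on a few hand-typed labels in the same style as the JHU column names; the variable names `labels` and `dates` are just for illustration.

```python
import pandas as pd

# A few date labels in the JHU style: month/day/two-digit year, no zero padding.
labels = pd.Index(["1/22/20", "10/1/21", "4/2/22"])

# Passing the format explicitly avoids any ambiguity between day and month.
dates = pd.to_datetime(labels, format="%m/%d/%y")

print(dates)
```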

Next we filter by date to keep data from October 2021 onward, then calculate the day-to-day difference (i.e., new cases rather than cumulative cases).

In [26]:
shanghai = (shanghai
    .loc[shanghai.index >= "2021-10-01", :]
    .diff()
    .dropna()
)

shanghai

,84
2021-10-02,7
2021-10-03,3
2021-10-04,6
2021-10-05,4
2021-10-06,4
...,...
2022-03-30,358
2022-03-31,365
2022-04-01,262
2022-04-02,444
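
Notice that the output starts on Oct. 2, not Oct. 1. That comes from the order of operations: filtering before `.diff()` leaves the first kept day with nothing to subtract from. The sketch below shows both orders on made-up cumulative totals (the numbers and the name `cumulative` are illustrative, not from the JHU data).

```python
import pandas as pd

# Made-up cumulative totals around the cutoff date.
cumulative = pd.Series(
    [100, 103, 110, 114],
    index=pd.to_datetime(["2021-09-30", "2021-10-01", "2021-10-02", "2021-10-03"]),
)

# Filter first, then diff: Oct. 1 has no earlier row left to subtract from,
# so its difference is NaN and dropna() removes it.
filter_then_diff = cumulative.loc[cumulative.index >= "2021-10-01"].diff().dropna()

# Diff first, then filter: Oct. 1 keeps a value because Sept. 30 is still present
# when the subtraction happens.
diff_then_filter = cumulative.diff().loc[cumulative.index >= "2021-10-01"]

print(filter_then_diff)
print(diff_then_filter)
```

Either order is fine for a chart that starts in October; the second one simply keeps one extra day.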


I also want to add a "rolling" column with the 7-day average, because it smooths out day-to-day swings and graphs much more nicely.

In [27]:
shanghai["cases_rolling"] = (shanghai
    .rolling(7).mean()
)

shanghai

,84,cases_rolling
2021-10-02,7,
2021-10-03,3,
2021-10-04,6,
2021-10-05,4,
2021-10-06,4,
...,...,...
2022-03-30,358,142.142857
2022-03-31,365,188.428571
2022-04-01,262,219.142857
2022-04-02,444,275.000000
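
To see what `.rolling(7).mean()` does at the start of a series, here is a small sketch on made-up daily counts (the values 1 through 10, chosen so the averages are easy to check by hand). The `min_periods=1` variant at the end is an optional alternative, not something this notebook uses.

```python
import pandas as pd

# Ten made-up daily counts: 1, 2, ..., 10.
daily = pd.Series(range(1, 11), dtype="float64")

# Default behavior: a value appears only once a full 7-day window is available.
rolling = daily.rolling(7).mean()
leading_nans = int(rolling.isna().sum())

# Optional alternative: average over however many days exist so far.
rolling_partial = daily.rolling(7, min_periods=1).mean()

print(rolling)
print("Empty rows at the start:", leading_nans)
```

With a window of 7, the first average lands on the seventh row, so the rows before it are empty.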


Let's also change the column names to be more descriptive.

In [28]:
shanghai.columns = ["Daily new cases", "7-day average"]

When you use `.rolling(7)`, the first 6 rows of the average are NaN because they don't yet have a full 7-day window, so we drop them.

In [29]:
shanghai.dropna().to_clipboard()

And that's it! We [copy the data over to Datawrapper for graphing](https://www.datawrapper.de/_/Elfyq/).

\-30\-