# "Covid19 exploration"
> "An EDA of covid19 data using the UK government's Python API and Altair for graphics"
- toc: true
- branch: master
- badges: true
- comments: false
- author: Ifan Johnston
- categories: [covid, eda]

To grab the covid data from the API, we need to define the filters and the structure of the request. For now we will just grab the data for all of the nations.

In [73]:
from uk_covid19 import Cov19API
import altair as alt

all_nations=[
    "areaType=nation"
]

cases_and_deaths = {
    "date": "date",
    "areaName": "areaName",
    "areaCode": "areaCode",
    "newCasesByPublishDate": "newCasesByPublishDate",
    "cumCasesByPublishDate": "cumCasesByPublishDate",
    "newDeathsByDeathDate": "newDeathsByDeathDate",
    "cumDeathsByDeathDate": "cumDeathsByDeathDate"
}

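uk_cases = Cov19API(filters=all_nations, structure=cases_and_deaths).get_dataframe().fillna(0)
# Rows come back newest first, so shift(-1) pairs each day with the previous day's count.
# Shifting within each nation stops one nation's oldest row being compared with the next nation's newest row.
uk_cases['dailyChange'] = uk_cases.newCasesByPublishDate - uk_cases.groupby('areaName')['newCasesByPublishDate'].shift(-1)
uk_cases.head()

| | date | areaName | areaCode | newCasesByPublishDate | cumCasesByPublishDate | newDeathsByDeathDate | cumDeathsByDeathDate | dailyChange |
|---|---|---|---|---|---|---|---|---|
| 0 | 2020-11-29 | England | E92000001 | 10054 | 1390923.0 | 0.0 | 0.0 | -3269.0 |
| 1 | 2020-11-28 | England | E92000001 | 13323 | 1380869.0 | 66.0 | 61507.0 | -234.0 |
| 2 | 2020-11-27 | England | E92000001 | 13557 | 1367546.0 | 236.0 | 61441.0 | -1080.0 |
| 3 | 2020-11-26 | England | E92000001 | 14637 | 1355272.0 | 323.0 | 61205.0 | -1256.0 |
| 4 | 2020-11-25 | England | E92000001 | 15893 | 1340635.0 | 413.0 | 60882.0 | 6039.0 |

Note that we added an extra column which holds the daily change in new cases. Because the rows are ordered newest first, shifting the column up by one (with `.shift(-1)`) lines each day up with the previous day's count. The shift leaves a `NaN` in the oldest row of each nation, which has no earlier day to compare against.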
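To make the shift-and-subtract step easier to follow, here is a minimal sketch on a toy frame. The numbers are made up for illustration and `toy` is a hypothetical name. Only the column names match the real data. The shift is done per nation, so the oldest row of one nation is never compared with the newest row of the next.

```python
import pandas as pd

# Made-up numbers for illustration only: two nations, newest day first,
# mirroring the row order of the real data.
toy = pd.DataFrame({
    "date": ["2020-01-03", "2020-01-02", "2020-01-01"] * 2,
    "areaName": ["England"] * 3 + ["Wales"] * 3,
    "newCasesByPublishDate": [120, 100, 90, 30, 45, 40],
})

# Previous day's count for each row, looked up within the same nation.
previous_day = toy.groupby("areaName")["newCasesByPublishDate"].shift(-1)

# Positive means more cases than the day before, negative means fewer.
# The oldest row of each nation has no previous day, so it stays NaN.
toy["dailyChange"] = toy["newCasesByPublishDate"] - previous_day
print(toy)
```

With the rows sorted this way, `groupby(...).diff(-1)` should give the same column in one call.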

First we have a graph which shows the daily change in the number of new cases for each country. This number jumps up and down a great deal, which is likely due to delays in reporting new cases over the weekend. It also looks like the daily cases in Wales had a much shorter period of calm over the summer (calm in the sense of daily cases not jumping up and down).

In [72]:
#collapse
alt.Chart(uk_cases).mark_line().encode(
    x="monthdate(date)",
    y="dailyChange",
    tooltip='dailyChange',
    color=alt.condition(
        alt.datum.dailyChange > 0,
        alt.value("orange"),  # The positive color
        alt.value("blue")  # The negative color
    ),
    row='areaName'
).properties(width=800).resolve_scale(y='independent').interactive()
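If the jumps really are a weekend reporting effect, a 7-day rolling mean should flatten them, since every window then contains each weekday exactly once. The sketch below is one possible way to check that, not something the charts in this post use. It runs on a synthetic series: the nation name `Exampleland`, the numbers and the `rolling7` column are all assumptions made for illustration.

```python
import pandas as pd

# Synthetic data: a steady 100 cases a day in a hypothetical nation, with
# weekend reports under-counted and the backlog published on Monday.
dates = pd.date_range("2020-06-01", periods=21, freq="D")  # 2020-06-01 is a Monday
reported = [
    200 if d.dayofweek == 0 else 50 if d.dayofweek >= 5 else 100
    for d in dates
]
toy = pd.DataFrame({
    "date": dates,
    "areaName": "Exampleland",
    "newCasesByPublishDate": reported,
})

# Rolling windows need oldest-first order, so sort before rolling
# (the real data arrives newest first).
toy = toy.sort_values(["areaName", "date"])

# 7-day mean computed separately for each nation.
toy["rolling7"] = (
    toy.groupby("areaName")["newCasesByPublishDate"]
       .transform(lambda s: s.rolling(7).mean())
)
print(toy.tail(7))
```

The first six rows of `rolling7` are `NaN` because a full week of data is not yet available for them.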

Next is a bar chart of the number of new cases in each country, where the blue bars are days when the number of new cases dropped from the previous day. Again we see that Wales saw a longer period of rising and falling cases compared to the other countries.

Notice that the most recent bars (on the right of the graph) for England and Northern Ireland are fairly consistently blue (which means the cases are dropping), while we don't see that same pattern in Wales and Scotland.

In [70]:
#collapse
alt.Chart(uk_cases).mark_bar().encode(
    x="monthdate(date)",
    y="newCasesByPublishDate",
    tooltip='newCasesByPublishDate',
    color=alt.condition(
        alt.datum.dailyChange > 0,
        alt.value("orange"),  # The positive color
        alt.value("blue")  # The negative color
    ),
    row='areaName'
).properties(width=800).resolve_scale(y='independent').interactive()
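The "fairly consistently blue" observation can also be checked numerically by computing, for each nation, the share of recent days on which new cases fell. This is a rough sketch on made-up numbers. The nation labels `A` and `B`, the window `recent_days` and the name `share_falling` are hypothetical and not part of the original analysis.

```python
import pandas as pd

# Made-up daily changes for two hypothetical nations, newest day first.
toy = pd.DataFrame({
    "areaName": ["A"] * 5 + ["B"] * 5,
    "dailyChange": [-10, -5, -8, 3, -2, 4, -1, 6, 2, -3],
})

recent_days = 4  # hypothetical window; the post does not fix one

# head() keeps the newest rows because each nation is sorted newest first.
recent = toy.groupby("areaName").head(recent_days)

# Fraction of recent days on which new cases fell.
share_falling = (recent["dailyChange"] < 0).groupby(recent["areaName"]).mean()
print(share_falling)
```

Note that this counts only strictly negative changes, whereas the chart's colour condition also paints unchanged days blue.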