In [6]:
import datetime as dt
import numpy
import math
import pandas as pd

# Covid Data

The data used for the COVID analysis is downloaded from [Our World in Data, provided by Oxford](https://ourworldindata.org/covid-google-mobility-trends). <br>
To tackle the coronavirus pandemic, countries across the world have implemented a range of stringent policies, including stay-at-home 'lockdowns', school and workplace closures, cancellation of events and public gatherings, and restrictions on public transport.<br>
These measures were implemented to slow the spread of the virus by enforcing physical distance between people. How effective have these policies been in reducing human movement? What impact have they had on how people across the world work and live, and on where they visit?
<br><br>
This dataset from Google measures visitor numbers to specific categories of location (e.g. grocery stores, parks, train stations) every day and expresses them as a change relative to baseline days before the pandemic outbreak. A baseline day represents a normal value for that day of the week, given as the median value over the five-week period from January 3rd to February 6th 2020. Measuring the change relative to a normal value for that day of the week is helpful because people often have different routines on weekends and weekdays.
<br><br>
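The baseline comparison described above can be illustrated with a minimal sketch. The function names and the visitor counts are hypothetical and are not part of Google's or Our World in Data's code; the sketch only assumes that the baseline is the per-weekday median over January 3rd to February 6th 2020 and that the reported value is a percentage change against it.

```python
import datetime as dt
from statistics import median

BASELINE_START = dt.date(2020, 1, 3)
BASELINE_END = dt.date(2020, 2, 6)

def weekday_baselines(counts):
    """Median count per weekday (0 = Monday) over the baseline window."""
    by_weekday = {}
    for day, count in counts.items():
        if BASELINE_START <= day <= BASELINE_END:
            by_weekday.setdefault(day.weekday(), []).append(count)
    return {wd: median(vals) for wd, vals in by_weekday.items()}

def percent_change(day, count, baselines):
    """Change of `count` relative to the baseline for the same weekday."""
    base = baselines[day.weekday()]
    return 100.0 * (count - base) / base

# Hypothetical counts: 100 visitors on weekdays, 150 on weekends
counts = {}
day = BASELINE_START
while day <= BASELINE_END:
    counts[day] = 150 if day.weekday() >= 5 else 100
    day += dt.timedelta(days=1)

baselines = weekday_baselines(counts)
saturday_change = percent_change(dt.date(2020, 3, 28), 75, baselines)
monday_change = percent_change(dt.date(2020, 3, 30), 75, baselines)
```

The same count of 75 visitors yields a different change on a Saturday than on a Monday, which is why the comparison is made per weekday.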
On [Google’s website](https://www.google.com/covid19/mobility/) the data is only visualized in PDFs, one for each country. Our World in Data presents Google’s data in [interactive charts](https://ourworldindata.org/covid-google-mobility-trends) to make it easier to see changes over time in a given country, and how specific policies may (or may not) have affected behavior across communities. <br><br>

The amount of day-to-day variability in the raw data can make it difficult to understand how overall movements are changing over time. To make this easier to understand, Our World in Data has converted the raw data into a rolling seven-day average.<br><br>
The file 'changes-visitors-covid.csv' has been downloaded from this page.
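As an illustration of the seven-day smoothing, here is a minimal sketch of a trailing rolling mean in plain Python. Whether the published series uses a trailing or a centered window is not stated here, so the trailing window is an assumption, and `rolling_mean` is a hypothetical helper name.

```python
def rolling_mean(values, window=7):
    """Trailing rolling mean; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out

# Hypothetical raw daily values
raw = [0, 7, 14, 21, 28, 35, 42, 49]
smoothed = rolling_mean(raw)
```

With pandas, `Series.rolling(7).mean()` computes the same trailing average and marks the first six positions as NaN.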

In [9]:
DATA_PATH="covidoxford/data/"
OUTPUT_PATH="covidoxford/data/"

Each date is assigned a week name so that this data can be compared with other files. A week is defined here as 8 days long, and every month has 4 weeks.

In [11]:
week_length=8
n_weeks_per_month=4

## Making a dictionary with the week as key and the dates falling in that week as value<br>
The code loops over 3 years (2019, 2020, 2021), over the 12 months (1 to 12) and over 4 weeks (numbered from 1), starting from the 1st day of each month (i.e. from 2019-01-01).<br><br>It builds a dictionary (named week_dates) with the week as key and the days falling in that week as value. The start date of each week is obtained by stepping 8 days at a time, 4 times per month (say 2019-01-01, 2019-01-09, 2019-01-17, 2019-01-25).<br><br>The innermost loop runs 7 times (i from 0 to 6) because the first date of the week is already in the list, so each start date is followed by up to 7 consecutive dates. The one constraint is that all dates must belong to the same month: as soon as the month changes the innermost loop breaks, so the fourth week (starting on the 25th) always holds fewer than 8 dates.<br><br> Finally the dictionary looks like this: Key = 1.2019.week1 and its value is [2019-01-01, 2019-01-02, 2019-01-03, 2019-01-04, 2019-01-05, 2019-01-06, 2019-01-07, 2019-01-08]

In [None]:
week_dates={}
months=[i+1 for i in range(12)]
years=[2019,2020,2021]
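for year in years:
    for month in months:
        for week in range(1,n_weeks_per_month+1,1):
            # First day of this week: the 1st, 9th, 17th or 25th of the month
            date=dt.date(year,month,1)+dt.timedelta(days=week_length*(week-1))
            dates=[date]
            # Append up to 7 following days, stopping when the month changes
            for i in range(week_length-1):
                new_date=date+dt.timedelta(days=1)
                if new_date.month==date.month:
                    dates.append(new_date)
                    date=new_date
                else:
                    break
            # Key format: <month>.<year>.week<n>, e.g. 1.2019.week1
            week_dates[str(month)+"."+str(year)+"."+"week"+str(week)]=dates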
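For the reverse direction, a week key can also be computed directly from a date. The sketch below is an illustration only: `week_label` and `week_span` are hypothetical helpers that are not used elsewhere in this notebook, and they assume the same 8-day weeks and the same key format as the dictionary above.

```python
import datetime as dt

WEEK_LENGTH = 8  # same week length as in the notebook

def week_label(date, week_length=WEEK_LENGTH):
    """Return the key '<month>.<year>.week<n>' of the week containing `date`."""
    week = (date.day - 1) // week_length + 1  # days 1-8 -> 1, 9-16 -> 2, ...
    return f"{date.month}.{date.year}.week{week}"

def week_span(year, month, week, week_length=WEEK_LENGTH):
    """Dates of one week, cut off at the end of the month."""
    start = dt.date(year, month, 1) + dt.timedelta(days=week_length * (week - 1))
    dates = []
    for offset in range(week_length):
        day = start + dt.timedelta(days=offset)
        if day.month != start.month:
            break
        dates.append(day)
    return dates

first_week = week_span(2019, 1, 1)
last_week_feb_2020 = week_span(2020, 2, 4)
```

Because no month has more than 31 days, the integer division never yields a week number above 4, so no extra clamping is needed.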

Prepare 2 csv files named covid_trends_daily.csv and covid_trends_8days.csv. The daily file has metric, date and value as fields; the 8-day file has metric, week and value.

In [None]:
fp_daily=open(OUTPUT_PATH+"covid_trends_daily.csv","w")
fp_daily.write(",".join(["metric","date","value"])+"\n")
fp_8days=open(OUTPUT_PATH+"covid_trends_8days.csv","w")
fp_8days.write(",".join(["metric","week","value"])+"\n")
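As an alternative to joining strings by hand, the same rows could be written with the standard `csv` module, which takes care of quoting if a field ever contains a comma. This is only a sketch: `write_rows` is a hypothetical helper, and the rows go to an in-memory buffer instead of the real output files.

```python
import csv
import io

def write_rows(handle, header, rows):
    """Write a header line and data rows as comma-separated text."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)

# In-memory buffer standing in for covid_trends_daily.csv
with io.StringIO() as buffer:
    write_rows(
        buffer,
        ["metric", "date", "value"],
        [("retail_and_recreation", "17.2.2020", 0.667)],
    )
    daily_text = buffer.getvalue()
```

Opening the real files in a `with` block would also close them automatically, even if a later cell raises an error.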

The data is extracted for India only. For every required metric in the metrics list, the metric_daily dictionary is built with the date as key and the metric value on that date as value. The date strings read from the csv are converted to date objects.<br> <br>

To store the data in daily form, run a for loop over the metric_daily dictionary and write (example) retail_and_recreation as metric, 17.2.2020 as date and 0.667 as metric value. <br><br>

To store the data in 8-day form, run a for loop over the week_dates dictionary, take the dates of a particular week (i.e. up to 8 days) and take the metric value for the latest date that is present in both week_dates and metric_daily. The row written holds the metric, the week key (such as 2.2020.week3) and that value.
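The selection of one value per 8-day week can be sketched in isolation as follows. The helper names and the metric values are hypothetical; the logic mirrors the description above, where later dates overwrite earlier ones so that the latest available value of the week wins.

```python
import datetime as dt

def parse_iso(date_str):
    """Turn a 'YYYY-MM-DD' string into a datetime.date."""
    year, month, day = (int(part) for part in date_str.split("-"))
    return dt.date(year, month, day)

def last_value_in_week(week_days, metric_daily):
    """Value for the latest day of the week that has data, else None."""
    value = None
    for day in week_days:  # week_days must be in ascending order
        if day in metric_daily:
            value = metric_daily[day]
    return value

# Third 8-day week of February 2020: the 17th to the 24th
week_days = [dt.date(2020, 2, 17) + dt.timedelta(days=i) for i in range(8)]

# Hypothetical values for two days of that week
metric_daily = {parse_iso("2020-02-17"): 0.667, parse_iso("2020-02-20"): 1.5}

picked = last_value_in_week(week_days, metric_daily)
missing = last_value_in_week(week_days, {})
```

A week with no data at all yields `None`, which is why such weeks produce no row in the 8-day file.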


In [None]:
df=pd.read_csv(DATA_PATH+"changes-visitors-covid.csv")
metrics=["retail_and_recreation","grocery_and_pharmacy","residential","transit_stations","parks","workplaces"]
def get_daily_metric(df,metric):
    # Keep India's rows with a non-null value and map each "Day" string to the metric value
    temp=df[(df["Entity"]=="India")&(~df[metric].isnull())].set_index("Day")[metric].to_dict()
    metric_daily={}
    for date_str in temp:
        # Convert the "YYYY-MM-DD" string into a datetime.date key
        values=[int(i) for i in date_str.split("-")]
        date=dt.date(values[0],values[1],values[2])
        metric_daily[date]=temp[date_str]
    return metric_daily
for metric in metrics:
    metric_daily=get_daily_metric(df,metric)
    
    for date in metric_daily:
        date_str=str(int(date.day))+"."+str(int(date.month))+"."+str(int(date.year))
        fp_daily.write(metric+","+date_str+","+str(metric_daily[date])+"\n")
    
    for week in week_dates:
        value=None
        for date in week_dates[week]:
            if date in metric_daily:
                value=metric_daily[date]
        if value is not None:
            fp_8days.write(metric+","+week+","+str(value)+"\n")

A few more metrics on COVID cases, deaths and the stringency index are added to both files (daily and 8-day) using the same procedure as described above. The data has been taken from 'covid_daily_data.csv', downloaded from the [deaths csv file](https://ourworldindata.org/covid-deaths).

In [None]:
df=pd.read_csv(DATA_PATH+"covid_daily_data.csv")
metrics=["new_cases_smoothed_per_million","new_deaths_smoothed_per_million","stringency_index"]
def get_daily_metric(df,metric):
    temp=df[(df["location"]=="India")&(~df[metric].isnull())].set_index("date")[metric].to_dict()
    metric_daily={}
    for date_str in temp:
        values=[int(i) for i in date_str.split("-")]
        date=dt.date(values[0],values[1],values[2])
        metric_daily[date]=temp[date_str]
    return metric_daily
for metric in metrics:
    metric_daily=get_daily_metric(df,metric)
    
    for date in metric_daily:
        date_str=str(int(date.day))+"."+str(int(date.month))+"."+str(int(date.year))
        fp_daily.write(metric+","+date_str+","+str(metric_daily[date])+"\n")
    
    for week in week_dates:
        value=None
        for date in week_dates[week]:
            if date in metric_daily:
                value=metric_daily[date]
        if value is not None:
            fp_8days.write(metric+","+week+","+str(value)+"\n")

In [None]:
fp_daily.close()
fp_8days.close()