In [None]:
!pip install plotly==4.12.0 -q
!pip install geopandas -q
!pip install shapely -q

In [None]:
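from google.cloud import bigquery
from google.cloud import storage
import plotly.express as px
import pandas as pd

import warnings

# Newer pandas releases expose SettingWithCopyWarning in pandas.errors;
# fall back to that location if the older import path is unavailable.
try:
    from pandas.core.common import SettingWithCopyWarning
except ImportError:
    from pandas.errors import SettingWithCopyWarning

warnings.simplefilter(action="ignore", category=SettingWithCopyWarning)

# Set your own project id here
PROJECT_ID = 'stunning-object-277601'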
bigquery_client = bigquery.Client(project=PROJECT_ID)

## Summary

This notebook summarizes my analysis of the short-term impacts of COVID-19 on KPI measurements. In my opinion, based on the data provided, excluding KPI measurements for 2020 will allow a clearer assessment of the true impact of existing environmental policies or services. The visualizations below illustrate the following:

* COVID-19 impacts on KPIs related to emission reduction programs for air travel
* COVID-19 impacts on KPIs related to emission reduction programs for road transport
  

### COVID-19 impacts on KPIs related to emission reduction programs for air travel

#### Overview

It is estimated that approximately 2.4% of global CO2 emissions are directly attributable to air travel, and the industry as a whole is responsible for 5% of global warming. Although this contribution may seem minor, only a very small percentage of the world's population flies frequently. Even in more developed countries such as the UK and the US, only around half of the population flies in any given year, and just 12-15% are frequent fliers. Though there is no exact data, Dan Rutherford, shipping and aviation director at the International Council on Clean Transportation (ICCT), a US-based non-profit, estimates that just 3% of the global population takes regular flights. In fact, if everyone in the world took just one long-haul flight per year, aircraft emissions would far exceed the US’s entire CO2 emissions, according to ICCT analysis [Reference](https://www.bbc.com/future/article/20200218-climate-change-how-to-cut-your-carbon-emissions-when-flying).

While improving fuel efficiency is gradually reducing the emissions per passenger, it is not keeping up with the rapid increase in total passenger numbers, which were projected to double in the next 20 years [Reference](https://www.iata.org/en/pressroom/pr/2018-10-24-02/). 

Considering these factors, programs targeting GHG reductions from aviation-related sources aim to substitute non-essential air travel with other modes of transportation in the short term, and to continue innovation in efficiency and sustainable aviation in the long term.


#### Analysis

The data for this analysis was obtained from a COVID-19 related mobility analysis study conducted by GEOTAB. More details about the analysis and data can be found here (https://data.geotab.com/covid-19-mobility-impact). 

##### Understanding the visualization
* The diagram below illustrates the change in airport traffic volume relative to a baseline period (February 1 to March 15, 2020) for 28 high-traffic airports in the US, Canada, Chile and Australia.
* The data is aggregated by month, so each box represents the distribution, across airports, of traffic relative to the baseline period.
* The visualization is a box-and-whisker plot, where each box represents the region of central tendency and the whiskers represent the extremes of deviation from the baseline period.
* A box is generated for each month from March to October.
* Points outside the whiskers are considered outliers.
* The dotted line represents the baseline activity level (100%).
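
To make the reading of the boxes concrete, here is a minimal sketch of how the summary statistics behind a single box can be computed. It assumes the common 1.5 × IQR rule for whiskers and outliers; the function name `box_summary` and the sample values are hypothetical, and the exact quartile convention Plotly applies may differ slightly from the pandas default used here.

```python
import pandas as pd

def box_summary(values: pd.Series) -> dict:
    """Quartiles, whisker ends and outliers for one box (1.5 * IQR rule)."""
    q1, median, q3 = values.quantile([0.25, 0.5, 0.75])
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers extend to the most extreme points that are still inside the fences
    inside = values[(values >= lower_fence) & (values <= upper_fence)]
    outliers = values[(values < lower_fence) | (values > upper_fence)]
    return {
        "q1": q1, "median": median, "q3": q3,
        "whisker_low": inside.min(), "whisker_high": inside.max(),
        "outliers": outliers.tolist(),
    }

# Hypothetical monthly averages (percent of baseline) for a handful of airports
april = pd.Series([20.0, 22.0, 25.0, 27.0, 30.0, 31.0, 33.0, 90.0])
summary = box_summary(april)
print(summary)
```

In this made-up sample, the single airport at 90% falls outside the upper fence and would be drawn as an individual point rather than as part of the whisker.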

**The data indicates that while airport traffic, a proxy for passenger demand, has increased from its lowest levels in April and May, it is still significantly lower than it was prior to the pandemic.**


In [None]:
PROJECT_ID = 'bigquery-public-data'
DATASET_ID = 'covid19_geotab_mobility_impact'
TABLE_ID = 'airport_traffic'

dataset_ref = bigquery_client.dataset(DATASET_ID, 
                             project=PROJECT_ID)
dataset = bigquery_client.get_dataset(dataset_ref)
table = bigquery_client.get_table(f"{PROJECT_ID}.{DATASET_ID}.{TABLE_ID}")

airport_traffic = bigquery_client.list_rows(table).to_dataframe()

airport_traffic['month'] = airport_traffic['date'].apply(lambda x: x.strftime("%b"))
airport_traffic['month_int']  = airport_traffic['date'].apply(lambda x: x.month)

airport_traffic_montly= airport_traffic.groupby(['airport_name', 'city', 'state_region', 'country_name', 'month', 'month_int'], as_index=False)['percent_of_baseline'].mean()
airport_traffic_montly = airport_traffic_montly.sort_values(['airport_name',  'month_int'])

fig = px.box(airport_traffic_montly, 
             y='percent_of_baseline', 
             x='month',
             hover_name="airport_name", 
             hover_data=['airport_name', 'city', 'state_region', 'country_name'],
             points = "outliers",
             labels={
                 "percent_of_baseline": "Percentage of Baseline",
                 'airport_name': "Airport", 
                 'city': "City", 
                 'state_region': "State", 
                 'country_name': "Country",
                 'month': "Month (2020)"
                    })

fig.update_yaxes(tickvals=list(range(-100,110,10)),ticksuffix="%")
fig.add_hline(y=100, line_dash="dot",
              annotation_text="Baseline Activity Level", 
              annotation_position="top right")

fig.update_traces(marker_color='crimson')
fig.show()

#### Recommendations

Due to the COVID-19 pandemic, passenger demand is not expected to return to the 2019 level for at least 4 to 9 years [Reference](https://www.thestreet.com/mishtalk/economics/how-long-will-it-take-for-the-airline-industry-to-recover). Therefore, I recommend that greater emphasis be placed on KPIs related to increasing fuel efficiency and innovation in the short term, with an emphasis on alternative travel modes when passenger demand returns.

### COVID-19 impacts on KPIs related to emission reduction programs for road transport

#### Overview

Transport emissions — which primarily involve road, rail, air and marine transportation — accounted for over 24% of global CO2 emissions in 2016. They're also expected to grow at a faster rate than that from any other sector, posing a major challenge to efforts to reduce emissions in line with the Paris Agreement and other global goals. In terms of transport modes, 72% of global transport emissions come from road vehicles, which accounted for 80% of the rise in emissions from 1970-2010 [Reference](https://www.wri.org/blog/2019/10/everything-you-need-know-about-fastest-growing-source-global-emissions-transport#:~:text=In%20terms%20of%20transport%20modes,and%20international%20and%20coastal%20shipping.).

To address this, cities across the world have implemented a variety of policies and programs to reduce road transport-related emissions. Here are a few examples [Reference](https://www.theguardian.com/environment/2016/may/17/how-are-cities-around-the-world-tackling-air-pollution):
* Paris bans cars in many historic central districts at weekends, imposes odd-even bans on vehicles, makes public transport free during major pollution events and encourages car- and bike-sharing programmes.
* Delhi bans all new large diesel cars and SUVs with engines of more than 2,000 cc and plans to phase out tens of thousands of diesel taxis. The city has experimented with  and is now encouraging Uber-style minibuses on demand.
* Oslo plans to halve its climate emissions by 2020 and proposes a large no-car zone, the building of 40 miles of new bike lanes, steep congestion charges, a rush-hour fee for motorists, and the removal of many parking spaces.
* Zurich has capped the number of parking spaces in the city, only allows a certain number of cars into the city at any one time, and is building more car-free areas, plazas, tram lines and pedestrianised streets. 

Given the nature of these programs, KPIs that directly measure the reduction of road traffic in the short term, and that measure the implementation of environmentally sustainable alternatives in the long term, would serve as effective measures of success.

#### Analysis

The data for this analysis was also obtained from a COVID-19 related mobility analysis study conducted by GEOTAB. More details about the analysis and data can be found here (https://data.geotab.com/covid-19-mobility-impact). 



The first diagram below illustrates the overall change in city congestion levels relative to a baseline period. The city congestion metric is calculated by first averaging four separate intersection metrics across an entire city. These individual intersection metrics include: number of trips, average speed, average total time stopped and average number of stops through each intersection. These metrics are then individually normalized using the following equation:


(Current Hour Value - Minimum Value) / (Maximum Value - Minimum Value)


The maximum and minimum values are derived from data between February 1st 2020 and the current date. The individual normalized metrics are combined via equal-weighted averaging to yield a single congestion metric per city per hour. The measurements are grouped by city and month, so each box represents the distribution of congestion values for a city in a given month.
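
As an illustration only, the normalization and equal-weighted averaging described above could be sketched as follows. The column names and values are hypothetical, the minimum and maximum are taken from the sample frame rather than from the full period since February 1st, and the sketch follows the description literally: it does not invert any metric (for example, average speed), since the text does not say whether the published metric does.

```python
import pandas as pd

def min_max_normalize(series: pd.Series) -> pd.Series:
    """Scale a metric to [0, 1] using (value - min) / (max - min)."""
    span = series.max() - series.min()
    if span == 0:
        # A constant metric carries no information; map it to 0 to avoid dividing by zero
        return pd.Series(0.0, index=series.index)
    return (series - series.min()) / span

# Hypothetical city-level metrics for three consecutive hours
hourly = pd.DataFrame({
    "trips": [100, 250, 400],
    "avg_speed": [50.0, 40.0, 30.0],
    "avg_time_stopped": [5.0, 10.0, 25.0],
    "avg_stops": [1.0, 2.0, 3.0],
})

# Normalize each metric independently, then average the four with equal weights
normalized = hourly.apply(min_max_normalize)
hourly["congestion"] = normalized.mean(axis=1)
print(hourly)
```

Because every metric is rescaled to the same 0 to 1 range before averaging, no single metric dominates the combined value simply because of its units.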



**The data indicates that COVID-19 has had a lasting effect on city congestion: even in October, congestion levels are still significantly lower than before the pandemic.**

In [None]:
PROJECT_ID = 'bigquery-public-data'
DATASET_ID = 'covid19_geotab_mobility_impact'
TABLE_ID = 'city_congestion'

dataset_ref = bigquery_client.dataset(DATASET_ID, 
                             project=PROJECT_ID)
dataset = bigquery_client.get_dataset(dataset_ref)
table = bigquery_client.get_table(f"{PROJECT_ID}.{DATASET_ID}.{TABLE_ID}")

city_congestion = bigquery_client.list_rows(table).to_dataframe()
city_congestion['month'] = city_congestion['date_time'].apply(lambda x: x.strftime("%b"))
city_congestion['month_int']  = city_congestion['date_time'].apply(lambda x: x.month)
city_congestion = city_congestion.sort_values(['city_name', 'date_time'])

fig = px.box(city_congestion[city_congestion['month_int']>=2], 
             y='percent_congestion', 
             x='month',
             color = 'city_name',
             hover_name="city_name", 
             hover_data=['city_name', 'date_time'],
             labels = {
                 "percent_congestion": "Percentage of Baseline City Congestion",
                 'date_time': "Date", 
                 'city_name': "City", 
                 'month': "Month (2020)"
                    })
fig.add_hline(y=100, line_dash="dot",
              annotation_text="Baseline Congestion Level", 
              annotation_position="bottom right")
fig.show()

The second diagram below illustrates the change in the daily volume of commercial activity (at local time) from March 16 onwards, as measured by the number of trips taken. Volume of activity is calculated on a relative basis, using data from February 1 to March 15, 2020 as a benchmark and controlling for day of week. The data is further broken down by type of business activity and is available for each state/province in the US, Canada and Mexico (see https://data.geotab.com/covid-19-mobility-impact/commercial-vehicle-traffic-analysis).
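
The published dataset already contains the relative values, so the following is only a sketch of one plausible way a day-of-week-controlled "percent of baseline" could be derived from raw daily trip counts. The function name, column names and trip counts are hypothetical and are not taken from the source data or from Geotab's methodology.

```python
import pandas as pd

def percent_of_baseline(daily: pd.DataFrame, baseline_start: str, baseline_end: str) -> pd.DataFrame:
    """Express daily trips as a percentage of the baseline mean for the same weekday."""
    daily = daily.copy()
    daily["weekday"] = daily["date"].dt.dayofweek
    in_baseline = daily["date"].between(pd.Timestamp(baseline_start), pd.Timestamp(baseline_end))
    # One baseline value per weekday, so Mondays are compared with Mondays, and so on
    weekday_mean = daily[in_baseline].groupby("weekday")["trips"].mean()
    daily["percent_of_baseline"] = 100 * daily["trips"] / daily["weekday"].map(weekday_mean)
    # Report only the days after the baseline period
    return daily[~in_baseline]

# Hypothetical trip counts: weekdays are busier than weekends, and activity drops after March 15
dates = pd.date_range("2020-02-01", "2020-03-22", freq="D")
is_weekend = dates.dayofweek >= 5
is_baseline = dates <= pd.Timestamp("2020-03-15")
trips = [(50 if w else 100) if b else (40 if w else 60) for w, b in zip(is_weekend, is_baseline)]

daily = pd.DataFrame({"date": dates, "trips": trips})
result = percent_of_baseline(daily, "2020-02-01", "2020-03-15")
print(result[["date", "trips", "percent_of_baseline"]])
```

Comparing each day with the baseline for the same weekday keeps the normal weekday/weekend cycle from being mistaken for a pandemic effect.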


**The data indicates that although overall commercial activity was significantly impacted by COVID-19 from March through May, it has since recovered to pre-COVID levels. The recovery, however, has not been uniform: warehouse activity has increased significantly, which may be attributed to the growth in online and e-commerce activity.**

In [None]:
PROJECT_ID = 'bigquery-public-data'
DATASET_ID = 'covid19_geotab_mobility_impact'
TABLE_ID = 'lookup_region'

dataset_ref = bigquery_client.dataset(DATASET_ID, 
                             project=PROJECT_ID)
dataset = bigquery_client.get_dataset(dataset_ref)
table = bigquery_client.get_table(f"{PROJECT_ID}.{DATASET_ID}.{TABLE_ID}")

lookup_region = bigquery_client.list_rows(table).to_dataframe()

PROJECT_ID = 'stunning-object-277601'
DATASET_ID = 'cdp_unlocking_climate_solutions_questions_analysis'
TABLE_ID = 'geotab_mobility_impact__commercial_traffic'

dataset_ref = bigquery_client.dataset(DATASET_ID, 
                             project=PROJECT_ID)
dataset = bigquery_client.get_dataset(dataset_ref)
table = bigquery_client.get_table(f"{PROJECT_ID}.{DATASET_ID}.{TABLE_ID}")

commercial_traffic = bigquery_client.list_rows(table).to_dataframe()

commercial_traffic = commercial_traffic.merge(lookup_region, on='country_iso_code_2', how = "left")

# Drop rows without a country code; copy so the column assignments below do not act on a view
commercial_traffic_province = commercial_traffic.dropna(axis = 0, subset = ['country_iso_code_2']).copy()
commercial_traffic_province['month'] = commercial_traffic_province['date'].apply(lambda x: x.strftime("%b"))
commercial_traffic_province['month_int']  = commercial_traffic_province['date'].apply(lambda x: x.month)
commercial_traffic_province = commercial_traffic_province.groupby(['country_iso_code_2', 'month', 'month_int'], as_index=False)[['percent_of_baseline_activity',
                                                                                                                                 'percent_of_baseline_commercial', 
                                                                                                                                 'percent_of_baseline_industrial',
                                                                                                                                 'percent_of_baseline_warehouse',
                                                                                                                                 'percent_of_baseline_grocery_store',
                                                                                                                                 'percent_of_baseline_other_retail']].agg(lambda x: x.mean(skipna=False))
commercial_traffic_province_month = pd.wide_to_long(commercial_traffic_province,
                                              stubnames='percent_of_baseline',
                                              i=['country_iso_code_2', 'month', 'month_int'], j='activity_type',
                                              sep='_', suffix=r'\w+'
                                             ).reset_index()

commercial_traffic_province_month.sort_values(['country_iso_code_2', 'month_int'], inplace = True)

fig = px.box(commercial_traffic_province_month, 
             y='percent_of_baseline', 
             x='month',
             color = 'activity_type',
             hover_name="country_iso_code_2", 
             hover_data=['country_iso_code_2'],
             labels = {
                 "percent_of_baseline": "Percentage of Baseline Activity",
                 'activity_type': "Activity Type", 
                 'country_iso_code_2': "Country", 
                 'month': "Month (2020)"
                    })
fig.add_hline(y=100, line_dash="dot",
              annotation_text="Baseline Activity Level", 
              annotation_position="right")
fig.show()

The third diagram below uses the same data source and measure as the previous one, but illustrates the impact on commercial activity broken down by industry for the US and Canada.

The data indicates the following:

* **The reduction of commercial activity during the pandemic is more pronounced in the US than in Canada, with the construction, public sector and hospitality industries being affected the most.**
* **The reduction of commercial activity is not uniform across industries, with customer services, healthcare and hospitality being affected the most.**


In [None]:
PROJECT_ID = 'bigquery-public-data'
DATASET_ID = 'covid19_geotab_mobility_impact'
TABLE_ID = 'commercial_traffic_by_industry'

dataset_ref = bigquery_client.dataset(DATASET_ID, 
                             project=PROJECT_ID)
dataset = bigquery_client.get_dataset(dataset_ref)
table = bigquery_client.get_table(f"{PROJECT_ID}.{DATASET_ID}.{TABLE_ID}")

commercial_traffic_by_industry = bigquery_client.list_rows(table).to_dataframe()

commercial_traffic_by_industry_country = commercial_traffic_by_industry.dropna(axis = 0, subset = ['alpha_code_3']).copy()
commercial_traffic_by_industry_country['month'] = commercial_traffic_by_industry_country['date'].apply(lambda x: x.strftime("%b"))
commercial_traffic_by_industry_country['month_int']  = commercial_traffic_by_industry_country['date'].apply(lambda x: x.month)
commercial_traffic_by_industry_country_month = commercial_traffic_by_industry_country.groupby(['alpha_code_3','industry' ,"month",'month_int'], as_index=False)['percent_of_baseline'].agg(lambda x: x.mean(skipna=False))
commercial_traffic_by_industry_country_month = commercial_traffic_by_industry_country_month.sort_values(['alpha_code_3','industry' ,"month_int"])


fig = px.bar(commercial_traffic_by_industry_country_month[~commercial_traffic_by_industry_country_month['industry'].isin(["Software", 'Organizations','Holding Companies & Conglomerates', 'Insurance', 'Media & Internet'])], 
             x="month", y="percent_of_baseline", color="industry", facet_col="industry", facet_row="alpha_code_3",
             labels = {
                 "percent_of_baseline": "% of Baseline Activity",
                 'industry': "Industry Type", 
                 'alpha_code_3': "Country", 
                 'month': "Month"
                    },
            height = 1000)

def clean_faucet_labels(label):
    if "Country" in label:
        return label.split("=")[-1] 
    else:
        return ""

fig.for_each_annotation(lambda a: a.update(text=clean_faucet_labels(a.text)))


fig.update_layout(legend=dict(
    orientation="h",
    yanchor="bottom",
    y=-.3,
    xanchor="left",
    x=0

))

fig.show()



#### Recommendations

A multitude of factors related to the COVID-19 pandemic have resulted in decreased levels of domestic and commercial transportation activity. Although it is reasonable to expect that current levels of mobility will eventually increase, the increase will not be uniform, and certain policy changes (e.g. flexible work arrangements) may have lasting impacts on future traffic levels. Therefore, I recommend that greater emphasis be placed on KPIs related to increasing the adoption of environmentally sustainable modes of transportation and improving vehicle efficiency.