<div align="center">
    <h2>San Francisco International Airport Passenger and Cargo Traffic Study</h2>
    <img src="https://user-images.githubusercontent.com/48846576/117231743-7101ee00-ade5-11eb-8b3a-a026881c4ecd.jpg"  width="700" height="200">
    <br>Photo by <a href="https://unsplash.com/@starocker?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Ken Yam</a> on <a href="https://unsplash.com/?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>

</div>
<br>
<div align="left"> 
San Francisco International Airport (SFO) in California, USA, is one of the busiest airports in the United States and an important transpacific gateway for international passenger and cargo travel. This notebook explores passenger and cargo traffic data for SFO and the impact of COVID-19 on air travel to and from the airport.
</div><br>
<div align="left"> 
    <h3>Motivation</h3>
    As the world is still trying to recover from the COVID-19 pandemic, the transportation industry remains one of the most heavily impacted sectors. Exploring a dataset from this industry is a good learning experience. Since SFO is one of the major hubs in the USA, its statistics may serve as a good sample for understanding how the industry is doing.<br><br>

The idea of a dark theme for the plots and a specific set of color palettes is inspired by the work of <a href='https://www.kaggle.com/subinium'>Subin An</a> and <a href='https://www.kaggle.com/ruchi798'>Ruchi Bhatia</a>. Thanks to them for sharing their great work! <br><br>
    
Work in progress. <i>If you like my work, please upvote and leave comments with suggestions for improvement. Much appreciated!</i> 
</div><br>
<div style="background-color:#fdb913; font-size:120%;  font-family:sans-serif; text-align:center"><b>Table of Contents</b></div>


<a id='0.0'></a>
* [Load Data](#loaddata)
* [Passenger Traffic](#passenger-1)
    * [Passenger travels 2019 vs 2020](#passenger-1-overview)
    * [Year over year passenger count](#passenger-1_yoy_count)
    * [Year over year % change in passenger count](#passenger-1_yoy)
    * [Passenger trend in 2019 vs 2020](#passenger-1_2019_202)
    * [Airlines servicing SFO](#passenger-1_airlines)
    * [Year over year passenger count by geo region](#passenger-1_yoy_geo_region)
    * [Year over year passenger count by ticket fare](#passenger-1_yoy_ticket_fare)
* [Cargo Traffic](#cargo-1)
    * [Cargo trend in 2019 vs 2020](#cargo-1_yoy_2019_2020)
    * [Year over year cargo trend](#cargo-1_yoy_trend)
    * [Year over year % change in cargo shipment](#cargo-1_yoy_percent_change)
    * [Year over year trend by cargo type](#cargo-1_yoy_cargo_type)
    * [Year over year cargo trend by region](#cargo-1_yoy_cargo_geo)
    * [Year over year cargo trend by aircraft type](#cargo-1_yoy_cargo_aircraft)


<a id='loaddata'></a>
## <p style="background-color:#fdb913; font-family:Computer Modern;src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf'); font-size:100%; text-align:center"> Load Data </p>
[back to top](#0.0)

In [None]:
import gc
import os

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
from plotly.offline import iplot, init_notebook_mode
from plotly.subplots import make_subplots

%matplotlib inline
# Enable Plotly rendering inside the notebook
init_notebook_mode(connected=True)
# Use plotly_white as the default template for all visualizations
pio.templates.default = "plotly_white"

from colorama import Fore, Back, Style

y_ = Fore.YELLOW
r_ = Fore.RED
g_ = Fore.GREEN
b_ = Fore.BLUE
m_ = Fore.MAGENTA
c_ = Fore.CYAN
res = Style.RESET_ALL

import warnings
warnings.filterwarnings('ignore')
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

In [None]:
passenger = pd.read_csv('/kaggle/input/sfo-air-traffic-passenger-and-cargo-statistics/Air_Traffic_Passenger_Statistics.csv',index_col=None)
cargo = pd.read_csv('/kaggle/input/sfo-air-traffic-passenger-and-cargo-statistics/Air_Traffic_Cargo_Statistics.csv',index_col=None)
print(f"{y_}Passenger traffic data shape - {passenger.shape}{res}\n{m_}Cargo traffic data shape - {cargo.shape}{res}")

In [None]:
passenger

In [None]:
cargo

In [None]:
passenger.info()

In [None]:
cargo.info()

In [None]:
sns.heatmap(passenger.isna())

> There are a few missing values in the airline IATA fields of the passenger data. These will be dealt with later.

In [None]:
sns.heatmap(cargo.isna())

> No missing values in cargo data.
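
The same check can be expressed as a per-column count, which is easier to read than a heatmap when only a few values are missing. This is a minimal sketch on a made-up frame; the column names and values are illustrative assumptions, not taken from the dataset:

```python
import pandas as pd

# Hypothetical toy data standing in for the passenger table
toy = pd.DataFrame({
    "operating_airline": ["Airline A", "Airline B", "Airline C"],
    "operating_airline_iata_code": ["AA", None, "CC"],
    "passenger_count": [100, 250, 75],
})

# Count missing values per column and keep only the columns that have any
missing_per_column = toy.isna().sum()
missing_per_column = missing_per_column[missing_per_column > 0]
print(missing_per_column)
```

Running `passenger.isna().sum()` and `cargo.isna().sum()` in the same way should list the affected columns directly.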

#### Color Palette

In [None]:
colors1 = ['#FC6238', '#FFD872','#F2D4CC','#E77577','#0065A2','#74737A']
colors2 = ['#3E7DCC', '#8F9CB3','#00C8C8','#F9D84A','#8CC0FF','#4D525A']
colors3 = ['#B29476', '#E3D6C9','#1F5C70','#FBA01D','#FCBC49','#393B45']
sns.palplot(sns.color_palette(colors1),size=0.9)
sns.palplot(sns.color_palette(colors2),size=0.9)
sns.palplot(sns.color_palette(colors3),size=0.9)
axis_color='#000000'
plot_background = colors3[-1]
color_2020 = colors1[0]

<a id='passenger-1'></a>
## <p style="background-color:#fdb913; font-family:Computer Modern;src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf'); font-size:100%; text-align:center"> Passenger Traffic</p>
[back to top](#0.0)

<a id='passenger-1-overview'></a>
### Passenger travels 2019 vs 2020 
[back to top](#0.0)



In [None]:
passenger['date'] = pd.to_datetime(passenger['activity_period'], format='%Y%m')

def populate_total(row):
    """Return the passenger count for a row, doubled for 'Thru / Transit' rows."""
    if row['activity_type'] == 'Thru / Transit':
        return row['passenger_count']*2
    else:
        return row['passenger_count']

def plot_yearly_passenger_trend(years, df, title, annotate=True):   
    fig = go.Figure()
    # Assemble the data
    for year in years:
        yearly = df.loc[df['date'].dt.year == year].reset_index(drop=True)
        yearly = yearly.groupby(['date','activity_type'])['passenger_count'].sum().reset_index()
        yearly['combined_type'] = 'All'
        yearly['total_count'] = yearly.apply(populate_total, axis=1)
        yearly = yearly.groupby(['date','combined_type'])['total_count'].sum().reset_index()

        if year == 2020:
            line_color = colors1[0]
        else:
            line_color = colors2[2]
        # Draw the chart
        fig.add_trace(go.Scatter(x=yearly['date'].dt.strftime('%b'), 
                        y=yearly['total_count'],
                                 mode='lines+markers',
                                 name = str(year),
                                 line=dict(color=line_color, width = 3)
                                )
                     )

    fig.update_layout(xaxis=dict(tickmode = 'array',
                                 title='Month of the year',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color),
                      yaxis=dict(title='Total passengers',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color), 
                      title = dict(text = title,
                                   font_color = axis_color,
                                   xref = 'paper',
                                  ),
                      #legend = dict(bgcolor = colors3[-1]),
                     plot_bgcolor=plot_background
                     )
    if annotate:
        fig.add_annotation(
                x='1.5',
                y=2000000,
                xref="x",
                yref="y",
                text="<b>Beginning of COVID-19 era</b>",
                showarrow=True,
                font=dict(
                    #family="Computer Modern",
                    size=12,
                    color="#fafafa"
                    ),
                align="center",
                ax=20,
                ay=-30,
                bordercolor="#000000",
                borderwidth=2,
                borderpad=4,
                bgcolor="#000000",
                opacity=0.8
                )    
    fig.show()
    
def plot_2020_vs_rest_average_trend():   
    fig = go.Figure()
    # Assemble the average passenger count from 2006 till 2019
    years = list(passenger['date'].dt.year.unique())
    years.remove(2005)
    years.remove(2020)
    dfs = []
    for year in years:
        yearly = passenger.loc[passenger['date'].dt.year == year].reset_index(drop=True)
        yearly = yearly.groupby(['date','activity_type'])['passenger_count'].sum().reset_index()
        yearly['combined_type'] = 'All'
        yearly['total_count'] = yearly.apply(populate_total, axis=1)
        yearly = yearly.groupby(['date','combined_type'])['total_count'].sum().reset_index()
        dfs.append(yearly)
    all_years = pd.concat(dfs).reset_index(drop=True)    

    all_years['month'] = all_years['date'].dt.strftime('%b')
    all_years = all_years.groupby(['month'])['total_count'].mean().reset_index()
    all_years["mon"] = pd.to_datetime(all_years.month, format='%b', errors='coerce').dt.month
    all_years = all_years.sort_values(by="mon")

    #Assemble 2020 year data
    yearly = passenger.loc[passenger['date'].dt.year == 2020].reset_index(drop=True)
    yearly = yearly.groupby(['date','activity_type'])['passenger_count'].sum().reset_index()
    yearly['combined_type'] = 'All'
    yearly['total_count'] = yearly.apply(populate_total, axis=1)
    yearly = yearly.groupby(['date','combined_type'])['total_count'].sum().reset_index()
    
    # Draw the chart
    fig.add_trace(go.Scatter(x=all_years['month'], 
                        y=all_years['total_count'],
                                 mode='lines+markers',
                                 name = 'Average 2006-2019',
                                 line=dict(color=colors2[0], width = 3)
                                )
                     )
    fig.add_trace(go.Scatter(x=yearly['date'].dt.strftime('%b'), 
                        y=yearly['total_count'],
                                 mode='lines+markers',
                                 name = '2020',
                                 line=dict(color=color_2020, width = 3)
                                )
                     )
    
    fig.update_layout(xaxis=dict(tickmode = 'array',
                                 title='Month of the year',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color,
                                categoryorder='array', categoryarray=yearly['date'].dt.strftime('%b')),
                      yaxis=dict(title='Total passengers',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color), 
                      title = dict(text = 'SFO Passenger travels 2020 vs Average between 2006 - 2019',
                                   font_color = axis_color,
                                   xref = 'paper',
                                  ),
                      #legend = dict(bgcolor = colors3[-1]),
                     plot_bgcolor=plot_background
                     ) 
    fig.add_annotation(
            x='1.5',
            y=2000000,
            xref="x",
            yref="y",
            text="<b>Beginning of COVID-19 era</b>",
            showarrow=True,
            font=dict(
                #family="Computer Modern",
                size=12,
                color="#fafafa"
                ),
            align="center",
            ax=20,
            ay=-30,
            bordercolor="#000000",
            borderwidth=2,
            borderpad=4,
            bgcolor="#000000",
            opacity=0.8
            )    
    fig.show()    

In [None]:
plot_yearly_passenger_trend([2019,2020], passenger, 'SFO Passenger travels 2019 vs 2020')

> As the chart above shows, passenger counts declined steeply from the beginning of February 2020 through April 2020. The mid-February to March 2020 timeframe is when COVID-19 entered the USA. Traffic recovered slightly towards the end of the year, during the holiday travel season.

![sfo_passenger_2020](https://user-images.githubusercontent.com/48846576/117524035-53698b80-af81-11eb-84ff-760d2dc22403.png)


![SFO_Passengers_2019](https://user-images.githubusercontent.com/48846576/117524038-55334f00-af81-11eb-916c-f0acae6bd216.png)
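
To put numbers on the visual comparison, each 2020 month can be expressed as a percentage change against the same month of 2019. The sketch below uses made-up monthly totals (the `total_count` values are assumptions for illustration only); with the real data, the input would be the per-month totals assembled inside `plot_yearly_passenger_trend`.

```python
import pandas as pd

# Hypothetical monthly totals; the real values come from the passenger table
monthly = pd.DataFrame({
    "date": pd.to_datetime(["2019-02-01", "2019-03-01", "2019-04-01",
                            "2020-02-01", "2020-03-01", "2020-04-01"]),
    "total_count": [4000, 5000, 5000, 3600, 2000, 250],
})
monthly["year"] = monthly["date"].dt.year
monthly["month"] = monthly["date"].dt.month

# One row per month, one column per year
by_month = monthly.pivot(index="month", columns="year", values="total_count")

# Percentage change of 2020 relative to the same month in 2019
by_month["pct_change"] = (by_month[2020] - by_month[2019]) / by_month[2019] * 100
print(by_month)
```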

In [None]:
plot_2020_vs_rest_average_trend()

<a id='passenger-1_yoy_count'></a>
### Year over year passenger count
[back to top](#0.0)

In [None]:
passenger['year'] = passenger['date'].dt.year
passenger['passenger_count_new'] = passenger.apply(populate_total, axis=1)

geo_summary = passenger.groupby(['year','geo_summary'])['passenger_count_new'].sum().reset_index()
domestic = geo_summary.loc[geo_summary['geo_summary'] == 'Domestic']
international = geo_summary.loc[geo_summary['geo_summary'] == 'International']

fig = go.Figure()
bar_colors1 = [colors2[2],] * domestic['year'].count()
#bar_colors1[domestic['year'].count()-1] = colors1[0]
bar_colors2 = [colors2[3],] * domestic['year'].count()
#bar_colors2[domestic['year'].count()-1] = colors1[0]

fig.add_trace(go.Bar(
    x=domestic['year'],
    y=domestic['passenger_count_new'],
    name='Domestic',
    marker_color = bar_colors1,
    text = domestic['passenger_count_new'],
    texttemplate='%{text:.2s}', 
    textposition='auto'
))
fig.add_trace(go.Bar(
    x=international['year'],
    y=international['passenger_count_new'],
    name='International',
    marker_color=bar_colors2,
    text = international['passenger_count_new'],
    texttemplate='%{text:.2s}', 
    textposition='auto'
))

fig.add_annotation(
            x='2020',
            y=13000000,
            xref="x",
            yref="y",
            text="COVID-19 Impact",
            showarrow=True,
                arrowhead=1,
            arrowsize=1.5,
            arrowwidth=2,
            arrowcolor=color_2020,    
            font=dict(
                size=12,
                color="#fafafa"
                ),
            align="center",
            #ax=20,
            #ay=-30,
            bordercolor="#000000",
            borderwidth=2,
            borderpad=4,
            bgcolor="#000000",
            opacity=0.8
            )    
fig.update_layout(barmode='group',
                  xaxis=dict(
                                tickmode = 'array',
                                 title='Year',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color,
                                categoryorder='array', 
                                categoryarray=list(domestic['year'])
                            ),
                      yaxis=dict(title='Total passengers',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color), 
                      title = dict(text = 'SFO Domestic & International Passengers count year-over-year',
                                   font_color = axis_color,
                                   xref = 'paper',
                                  ),
                      #legend = dict(bgcolor = colors3[-1]),
                     plot_bgcolor=plot_background,
                      bargap=0.15, 
                    bargroupgap=0.1,
                     ) 
fig.show()

<a id='passenger-1_yoy'></a>
### Year over year % change in passenger count
[back to top](#0.0)

In [None]:
yearly = passenger.groupby('year')['passenger_count_new'].sum().reset_index()
yearly['pct_passnger'] = yearly['passenger_count_new'].pct_change()
yearly = yearly.loc[yearly['year'] != 2005]
yearly['pct_passnger'] = yearly['pct_passnger'] * 100

color = colors1[0]

fig = go.Figure()

fig.add_trace(go.Bar(
    x=yearly['year'],
    y=yearly['pct_passnger'],
    name='Year-over-year % change',
    marker_color = color,
    text = yearly['pct_passnger'],
    texttemplate='%{text:.2s}%', 
    textposition='auto',
    marker_line_width=2.5, opacity=0.8,
    marker_line_color = colors2[3]        
))

fig.update_layout(xaxis=dict(
                                tickmode = 'array',
                                 title='Year',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color,
                                categoryorder='array', 
                                categoryarray=list(domestic['year'])
                            ),
                      yaxis=dict(title='YOY % change',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color), 
                      title = dict(text = 'Year-over-year % change',
                                   font_color = axis_color,
                                   xref = 'paper',
                                  ),
                      #legend = dict(bgcolor = colors3[-1]),
                     plot_bgcolor=plot_background,
                      bargap=0.15, 
                    bargroupgap=0.1,
                     ) 
fig.show()

> Passenger numbers declined by 71% from 2019 to 2020, a phenomenal drop due to COVID-19! The 93% increase in 2006 is an artifact: the data for 2005 covers only a few months.
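
The partial-year effect can be checked explicitly by counting how many months each year reports before trusting its year-over-year figure. The following is a minimal sketch on synthetic records (the counts are invented, and `months_reported` and `is_partial` are names introduced here for illustration):

```python
import pandas as pd

# Hypothetical records: 2005 covers only 6 months, 2006 covers all 12
dates = (list(pd.date_range("2005-07-01", periods=6, freq="MS"))
         + list(pd.date_range("2006-01-01", periods=12, freq="MS")))
records = pd.DataFrame({"date": dates, "passenger_count_new": [100] * 18})
records["year"] = records["date"].dt.year
records["month"] = records["date"].dt.month

summary = records.groupby("year").agg(
    months_reported=("month", "nunique"),
    total=("passenger_count_new", "sum"),
)
# A year with fewer than 12 reported months inflates the next year's % change
summary["is_partial"] = summary["months_reported"] < 12
summary["pct_change"] = summary["total"].pct_change() * 100
print(summary)
```

In this toy case the 2006 change comes out as a large increase purely because 2005 is incomplete, which mirrors the jump seen in the chart.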

<a id='passenger-1_2019_202'></a>
### Passenger trend in 2019 vs 2020
[back to top](#0.0)

In [None]:
plot_yearly_passenger_trend([2019,2020], 
                            passenger.loc[passenger['geo_summary'] == 'Domestic'].reset_index(drop=True),
                           'SFO Domestic Passenger Travels 2019 vs 2020', annotate=True)

In [None]:
plot_yearly_passenger_trend([2019,2020], 
                            passenger.loc[passenger['geo_summary'] == 'International'].reset_index(drop=True),
                           'SFO International Passenger Travels 2019 vs 2020', annotate=False)

<a id='passenger-1_airlines'></a>
### Airlines servicing SFO
[back to top](#0.0)

In [None]:
from PIL import Image
import requests
airplane_mask = np.array(Image.open(requests.get('https://i.dlpng.com/static/png/6564609_preview.png', stream=True).raw))
np.save('airplane_mask',airplane_mask)

from wordcloud import WordCloud, STOPWORDS , ImageColorGenerator
all_airlines = passenger.groupby(['operating_airline'])['passenger_count'].sum().reset_index()
tuples = [tuple(x) for x in all_airlines.values]
def generate_wc(wc_dict,
                fig_size=(20,12), 
                 stop_words = None, 
                 max_font_size = 250,
                 max_words = 250,
                 background_color = 'white',
                 color_map = 'inferno',
                 inter_polation = 'bilinear'
                ):
    fig, ax = plt.subplots(1, 1, figsize  = fig_size
                          )
    wordcloud_ALL = WordCloud(max_font_size=max_font_size,
                              min_font_size = 4,
                              #font_step = 10,
                              collocations=False,
                              max_words=max_words, 
                              background_color=background_color,
                              colormap=color_map,
                             mask=airplane_mask,
                              random_state=42,
                             contour_width=3, contour_color=colors3[2]).generate_from_frequencies(dict(wc_dict))
    # Display the generated image:
    ax.imshow(wordcloud_ALL, interpolation=inter_polation)
    ax.axis('off')

generate_wc(tuples,background_color = 'white',
                 color_map = 'seismic',inter_polation="bilinear")

<a id='passenger-1_yoy_geo_region'></a>
### Year over year passenger count by geo region
[back to top](#0.0)  

In [None]:
geo_region = passenger.groupby(['year','geo_region'])['passenger_count_new'].sum().reset_index()
regions = list(geo_region.geo_region.unique())
regions.remove('US')
regions.insert(0,'US')

fig = go.Figure()
bar_colors = [colors2[0],colors2[1],colors2[2],colors2[3],colors2[4],
         colors1[0],colors1[1],colors1[2],colors1[3],colors1[4],
         colors3[0]]
for region, color in zip(regions,bar_colors):
    df = geo_region.loc[geo_region['geo_region'] == region]
    fig.add_trace(go.Bar(
        x=df['year'],
        y=df['passenger_count_new'],
        name=region,
        marker_color = color,
        #text = df['passenger_count_new'],
        #texttemplate='%{text:.2s}', 
        #textposition='auto'
        marker_line_width=2.5, opacity=0.8,
        marker_line_color = color        
    ))
fig.add_annotation(
            x='2020',
            y=18000000,
            xref="x",
            yref="y",
            text="COVID-19 Impact",
            showarrow=True,
                arrowhead=1,
            arrowsize=1.5,
            arrowwidth=2,
            arrowcolor=color_2020,    
            font=dict(
                size=12,
                color="#fafafa"
                ),
            align="center",
            #ax=20,
            #ay=-30,
            bordercolor="#000000",
            borderwidth=2,
            borderpad=4,
            bgcolor="#000000",
            opacity=0.8
            )       
fig.update_layout(barmode='stack',
                  xaxis=dict(
                                tickmode = 'array',
                                 title='Year',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color,
                                categoryorder='array', 
                                categoryarray=list(domestic['year'])
                            ),
                      yaxis=dict(title='Total passengers',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color), 
                      title = dict(text = 'SFO Passengers count year-over-year by geo region',
                                   font_color = axis_color,
                                   xref = 'paper',
                                  ),
                     plot_bgcolor=plot_background,
                      bargap=0.15, 
                    bargroupgap=0.1,
                     ) 
fig.show()

<a id='passenger-1_yoy_ticket_fare'></a>
### Year over year passenger count by ticket fare
[back to top](#0.0)  

In [None]:
price_category = passenger.groupby(['year','price_category'])['passenger_count_new'].sum().reset_index()
categories = list(price_category.price_category.unique())

fig = go.Figure()
bar_colors = [colors1[3],colors1[4]]
for category, color in zip(categories,bar_colors):
    df = price_category.loc[price_category['price_category'] == category]
    fig.add_trace(go.Bar(
        x=df['year'],
        y=df['passenger_count_new'],
        name=category,
        marker_color = color,
        #text = df['passenger_count_new'],
        #texttemplate='%{text:.2s}', 
        #textposition='auto'
        marker_line_width=1.5, opacity=0.8,
        marker_line_color = axis_color
    ))
fig.add_annotation(
            x='2020',
            y=18000000,
            xref="x",
            yref="y",
            text="COVID-19 Impact",
            showarrow=True,
                arrowhead=1,
            arrowsize=1.5,
            arrowwidth=2,
            arrowcolor=color_2020,    
            font=dict(
                size=12,
                color="#fafafa"
                ),
            align="center",
            #ax=20,
            #ay=-30,
            bordercolor="#000000",
            borderwidth=2,
            borderpad=4,
            bgcolor="#000000",
            opacity=0.8
            )       
fig.update_layout(barmode='stack',
                  xaxis=dict(
                                tickmode = 'array',
                                 title='Year',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color,
                                categoryorder='array', 
                                categoryarray=list(price_category['year'])
                            ),
                      yaxis=dict(title='Total passengers',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color), 
                      title = dict(text = 'SFO Passengers count by ticket fare - year-over-year',
                                   font_color = axis_color,
                                   xref = 'paper',
                                  ),
                     plot_bgcolor=plot_background,
                     ) 
fig.show()

<a id='cargo-1'></a>
## <p style="background-color:#fdb913; font-family:Computer Modern;src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf'); font-size:100%; text-align:center"> Cargo Traffic</p>
[back to top](#0.0)

In [None]:
cargo['date'] = pd.to_datetime(cargo['activity_period'], format='%Y%m')

def plot_yearly_cargo_trend(years, df, title, annotate=True):   
    fig = go.Figure()
    # Assemble the data
    for year in years:
        yearly = df.loc[df['date'].dt.year == year].reset_index(drop=True)
        yearly = yearly.groupby(['date'])['cargo_metric_tons'].sum().reset_index()

        if year == 2020:
            line_color = colors1[0]
        else:
            line_color = colors2[2]
        # Draw the chart
        fig.add_trace(go.Scatter(x=yearly['date'].dt.strftime('%b'), 
                        y=yearly['cargo_metric_tons'],
                                 mode='lines+markers',
                                 name = str(year),
                                 line=dict(color=line_color, width = 3)
                                )
                     )

    fig.update_layout(xaxis=dict(tickmode = 'array',
                                 title='Month of the year',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color),
                      yaxis=dict(title='Total cargo (metric tons)',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color), 
                      title = dict(text = title,
                                   font_color = axis_color,
                                   xref = 'paper',
                                  ),
                      #legend = dict(bgcolor = colors3[-1]),
                     plot_bgcolor=plot_background
                     )
    if annotate:
        fig.add_annotation(
                x='1.5',
                y=30000,
                xref="x",
                yref="y",
                text="<b>Beginning of COVID-19 era</b>",
                showarrow=True,
                font=dict(
                    #family="Computer Modern",
                    size=12,
                    color="#fafafa"
                    ),
                align="center",
                ax=20,
                ay=-30,
                bordercolor="#000000",
                borderwidth=2,
                borderpad=4,
                bgcolor="#000000",
                opacity=0.8
                )    
    fig.show()

<a id='cargo-1_yoy_2019_2020'></a>
### Cargo trend in 2019 vs 2020
[back to top](#0.0)  

In [None]:
plot_yearly_cargo_trend([2019,2020], cargo, 'SFO Cargo Traffic 2019 vs 2020')

<a id='cargo-1_yoy_trend'></a>
### Year over year cargo trend
[back to top](#0.0)  

In [None]:
cargo['year'] = cargo['date'].dt.year

geo_summary = cargo.groupby(['year','geo_summary'])['cargo_metric_tons'].sum().reset_index()
domestic = geo_summary.loc[geo_summary['geo_summary'] == 'Domestic']
international = geo_summary.loc[geo_summary['geo_summary'] == 'International']

fig = go.Figure()
bar_colors1 = [colors2[2],] * domestic['year'].count()
#bar_colors1[domestic['year'].count()-1] = colors1[0]
bar_colors2 = [colors2[3],] * domestic['year'].count()
#bar_colors2[domestic['year'].count()-1] = colors1[0]

fig.add_trace(go.Bar(
    x=domestic['year'],
    y=domestic['cargo_metric_tons'],
    name='Domestic',
    marker_color = bar_colors1,
    text = domestic['cargo_metric_tons'],
    texttemplate='%{text:.3s}', 
    textposition='auto'
))
fig.add_trace(go.Bar(
    x=international['year'],
    y=international['cargo_metric_tons'],
    name='International',
    marker_color=bar_colors2,
    text = international['cargo_metric_tons'],
    texttemplate='%{text:.2s}', 
    textposition='auto'
))

fig.update_layout(barmode='group',
                  xaxis=dict(
                                tickmode = 'array',
                                 title='Year',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color,
                                categoryorder='array', 
                                categoryarray=list(domestic['year'])
                            ),
                      yaxis=dict(title='Total cargo (metric tons)',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color), 
                      title = dict(text = 'SFO Domestic & International year-over-year cargo shipment',
                                   font_color = axis_color,
                                   xref = 'paper',
                                  ),
                      #legend = dict(bgcolor = colors3[-1]),
                     plot_bgcolor=plot_background,
                      bargap=0.15, 
                    bargroupgap=0.1,
                     ) 
fig.show()

> As expected, cargo shipments were not affected much by COVID-19. In fact, domestic cargo shipments went up in 2020 (by about 6k metric tons) compared to 2019. International shipments, however, did go down.
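
A difference like the one quoted above can be read off a year-by-segment table instead of the bar labels. This sketch uses invented tonnages (not the real SFO figures) to show the idea with `pivot` and `diff`:

```python
import pandas as pd

# Hypothetical yearly totals in metric tons, not the actual SFO values
totals = pd.DataFrame({
    "year": [2019, 2019, 2020, 2020],
    "geo_summary": ["Domestic", "International", "Domestic", "International"],
    "cargo_metric_tons": [100.0, 300.0, 106.0, 220.0],
})

# Rows are years, columns are the Domestic / International segments
wide = totals.pivot(index="year", columns="geo_summary", values="cargo_metric_tons")

# Absolute change from the previous year, per segment
change = wide.diff().loc[2020]
print(change)
```

Applied to the `geo_summary` frame built in the cell above, the same two lines would give the 2020 change for each segment.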

<a id='cargo-1_yoy_percent_change'></a>
### Year over year % change in cargo shipment
[back to top](#0.0)  


In [None]:
yearly = cargo.groupby('year')['cargo_metric_tons'].sum().reset_index()
yearly['pct_cargo'] = yearly['cargo_metric_tons'].pct_change()
yearly = yearly.loc[(yearly['year'] != 2005) & (yearly['year'] != 2006)]
yearly['pct_cargo'] = yearly['pct_cargo'] * 100

color = colors1[4]

fig = go.Figure()

fig.add_trace(go.Bar(
    x=yearly['year'],
    y=yearly['pct_cargo'],
    name='Year-over-year % change',
    marker_color = color,
    text = yearly['pct_cargo'],
    texttemplate='%{text:.2s}%', 
    textposition='auto',
    marker_line_width=2.5, opacity=0.8,
    marker_line_color = colors2[3]        
))

fig.update_layout(xaxis=dict(
                                tickmode = 'array',
                                 title='Year',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color,
                                categoryorder='array', 
                                categoryarray=list(yearly['year'])
                            ),
                      yaxis=dict(title='YOY % change',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color), 
                      title = dict(text = 'SFO Cargo shipment year-over-year % change',
                                   font_color = axis_color,
                                   xref = 'paper',
                                  ),
                      #legend = dict(bgcolor = colors3[-1]),
                     plot_bgcolor=plot_background,
                      bargap=0.15, 
                    bargroupgap=0.1,
                     ) 
fig.show()

> As the chart above shows, COVID-19 did affect cargo shipment in 2020: overall cargo was down 20% compared to 2019. However, the previous plots show that the decline is mainly attributable to international shipment. Also, cargo shipment was not hit as drastically as commercial passenger travel.
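
One way to check how much of the overall change comes from each segment is to express each segment's change in tons as percentage points of the prior-year total. The sketch below uses made-up numbers, not the SFO data, and assumes a hypothetical `geo_summary` column that separates domestic from international cargo; the real column name may differ.

```python
import pandas as pd

# Illustrative numbers only, not the SFO data.
# `geo_summary` is an assumed column name for the domestic/international split.
toy = pd.DataFrame({
    "year": [2019, 2019, 2020, 2020],
    "geo_summary": ["Domestic", "International", "Domestic", "International"],
    "cargo_metric_tons": [200.0, 300.0, 210.0, 190.0],
})

# One row per year, one column per segment.
by_segment = toy.pivot_table(index="year", columns="geo_summary",
                             values="cargo_metric_tons", aggfunc="sum")

# Change in tons per segment, as percentage points of the prior-year total.
prior_total = by_segment.loc[2019].sum()
contribution = (by_segment.loc[2020] - by_segment.loc[2019]) / prior_total * 100

# The contributions add up to the overall year-over-year % change.
total_change = contribution.sum()
print(contribution.round(1))
print(round(total_change, 1))
```

With these toy values the domestic segment adds 2 points and the international segment subtracts 22, which together give the overall change of -20%.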

<a id='cargo-1_yoy_cargo_type'></a>
### Year over year trend by cargo type
[back to top](#0.0)  


In [None]:
cargo_type = cargo.groupby(['year','cargo_type'])['cargo_metric_tons'].sum().reset_index()

cargotypes = list(cargo_type.cargo_type.unique())
fig = go.Figure()
bar_colors = [colors2[0],colors3[3],colors1[0]]

for cargotype, color in zip(cargotypes,bar_colors):
    df = cargo_type.loc[cargo_type['cargo_type'] == cargotype]
    fig.add_trace(go.Bar(
        x=df['year'],
        y=df['cargo_metric_tons'],
        name=cargotype,
        marker_color = color,
        #text = df['passenger_count_new'],
        #texttemplate='%{text:.2s}', 
        #textposition='auto'
        marker_line_width=2.5, opacity=0.8,
        marker_line_color = color        
    ))
    

fig.update_layout(barmode='stack',
                  xaxis=dict(
                                tickmode = 'array',
                                 title='Year',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color,
                                categoryorder='array', 
                                categoryarray=list(cargo_type['year'].unique())
                            ),
                      yaxis=dict(title='Total cargo (metric tons)',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color), 
                      title = dict(text = 'SFO Cargo shipment year-over-year by cargo type',
                                   font_color = axis_color,
                                   xref = 'paper',
                                  ),
                      #legend = dict(bgcolor = colors3[-1]),
                     plot_bgcolor=plot_background,
                      bargap=0.15, 
                    bargroupgap=0.1,
                     ) 
fig.show()

<a id='cargo-1_yoy_cargo_geo'></a>
### Year over year cargo trend by region
[back to top](#0.0)  

In [None]:
geo_region = cargo.groupby(['year','geo_region'])['cargo_metric_tons'].sum().reset_index()
regions = list(geo_region.geo_region.unique())
regions.remove('US')
regions.insert(0,'US')

fig = go.Figure()
bar_colors = [colors2[0],colors2[1],colors2[2],colors2[3],colors2[4],
         colors1[0],colors1[1],colors1[2],colors1[3],colors1[4],
         colors3[0]]
for region, color in zip(regions,bar_colors):
    df = geo_region.loc[geo_region['geo_region'] == region]
    fig.add_trace(go.Bar(
        x=df['year'],
        y=df['cargo_metric_tons'],
        name=region,
        marker_color = color,
        marker_line_width=2.5, opacity=0.8,
        marker_line_color = color        
    ))
  
fig.update_layout(barmode='stack',
                  xaxis=dict(
                                tickmode = 'array',
                                 title='Year',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color,
                                categoryorder='array', 
                                categoryarray=list(geo_region['year'])
                            ),
                      yaxis=dict(title='Total cargo (metric tons)',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color), 
                      title = dict(text = 'SFO Cargo shipment year-over-year by geo region',
                                   font_color = axis_color,
                                   xref = 'paper',
                                  ),
                     plot_bgcolor=plot_background,
                      bargap=0.15, 
                    bargroupgap=0.1,
                     ) 
fig.show()

> Some interesting stats from the chart above: Asia has the greatest share of cargo shipment from/to SFO International Airport compared to other regions, in most cases even greater than domestic shipment.
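
The share comparison can also be made explicit by dividing each region's tonnage by that year's total. This is a minimal sketch on made-up numbers; the column names mirror the ones used in the cell above, but the values and the short region list are purely illustrative.

```python
import pandas as pd

# Illustrative numbers only, not the SFO data.
toy = pd.DataFrame({
    "year": [2019, 2019, 2019, 2020, 2020, 2020],
    "geo_region": ["US", "Asia", "Europe", "US", "Asia", "Europe"],
    "cargo_metric_tons": [30.0, 50.0, 20.0, 40.0, 40.0, 20.0],
})

region_totals = toy.groupby(["year", "geo_region"])["cargo_metric_tons"].sum()

# Divide each region's tonnage by its year's total to get a percentage share.
year_totals = region_totals.groupby(level="year").transform("sum")
share = (region_totals / year_totals * 100).unstack("geo_region")

# Years in which Asia's share is strictly greater than the domestic (US) share.
asia_leads = share[share["Asia"] > share["US"]].index.tolist()
print(share.round(1))
print(asia_leads)
```

Applied to the real `cargo` frame, the same steps would list the years in which Asia out-ships the domestic segment, provided the `US` and `Asia` labels exist in `geo_region`.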

<a id='cargo-1_yoy_cargo_aircraft'></a>
### Year over year cargo trend by aircraft type
[back to top](#0.0)  

In [None]:
cargo_aircraft_type = cargo.groupby(['year','cargo_aircraft_type'])['cargo_metric_tons'].sum().reset_index()
aircrafts = list(cargo_aircraft_type.cargo_aircraft_type.unique())

fig = go.Figure()
bar_colors = [colors3[3],colors1[4],colors3[0]]
for aircraft, color in zip(aircrafts,bar_colors):
    df = cargo_aircraft_type.loc[cargo_aircraft_type['cargo_aircraft_type'] == aircraft]
    fig.add_trace(go.Bar(
        x=df['year'],
        y=df['cargo_metric_tons'],
        name=aircraft,
        marker_color = color,
        marker_line_width=2.5, opacity=0.8,
        marker_line_color = color        
    ))
  
fig.update_layout(barmode='stack',
                  xaxis=dict(
                                tickmode = 'array',
                                 title='Year',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color,
                                categoryorder='array', 
                                categoryarray=list(cargo_aircraft_type['year'].unique())
                            ),
                      yaxis=dict(title='Total cargo (metric tons)',
                                 showgrid=False,
                                 zeroline=False,
                                color=axis_color), 
                      title = dict(text = 'SFO Cargo shipment year-over-year by aircraft type',
                                   font_color = axis_color,
                                   xref = 'paper',
                                  ),
                     plot_bgcolor=plot_background,
                      bargap=0.15, 
                    bargroupgap=0.1,
                     ) 
fig.show()

Aircraft types for cargo shipment:
* Passenger – Air cargo enplaned/deplaned from the belly of passenger aircraft
* Freighter – Air cargo enplaned/deplaned from cargo-only aircraft
* Combi – Air cargo enplaned/deplaned from combination aircraft, which are designed to carry both passengers and cargo on the main deck of the airplane

**Work in progress**