# Day 3 - Comparisons: Historical

Tourism is one of the largest industries on Guam, with most tourists traveling from East Asia to Guam for vacation. Several world events and larger economic trends have changed Guam's tourist flow significantly in the last two decades.

## Data

Visitor data is collected from customs forms and consolidated by the Guam Visitors Bureau. Although there is no single source for the data, a combined view of monthly visitors and their home countries can be compiled from the Guam [Statistical Yearbooks](https://bsp.guam.gov/guam-statistical-yearbook-2/) and the monthly [Preliminary Arrival Summaries](https://www.guamvisitorsbureau.com/research/statistics/visitor-arrival-statistics).

Monthly tourist arrivals categorized by country of origin from January 2005 until December 2021 have been collected in a CSV file located in this folder.


In [143]:
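import pandas as pd

# Load the monthly arrivals, using the 'Month' column as a datetime index.
df = pd.read_csv('guam_arrivals_monthly_2005_2020.csv',
                 index_col='Month', parse_dates=['Month'])

print(df)

# Keep the precomputed totals separately so df holds only per-origin columns.
monthly_totals = df['Total']
del df['Total']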

             Total  Japan  United States  Pacific Islands  Taiwan  \
Month                                                               
2005-01-01  113632  88977           4296             2492    1890   
2005-02-01  102129  80955           4078             2253    3198   
2005-03-01  102024  85283           4215             2716    1489   
2005-04-01   86521  68197           4141             2349    1252   
2005-05-01   90342  71277           3707             2562    1895   
...            ...    ...            ...              ...     ...   
2021-08-01    8675    325           5294              903     868   
2021-09-01    5735    358           3510              546      18   
2021-10-01    6416    315           3961              526      28   
2021-11-01    9615    345           4162              631      35   
2021-12-01    8764    521           5158              815      19   

            Philippines  Korea  Hong Kong  Other  
Month                                             


## Plotting

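Plotting the full monthly dataset as a stacked area chart shows how much each location of origin contributes to total arrivals.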

In [144]:
from bokeh.io import output_notebook
output_notebook()

In [145]:
from bokeh.palettes import brewer
from bokeh.plotting import figure, show
from bokeh.models import NumeralTickFormatter, DatetimeTicker, HoverTool

p = figure(title="Guam Monthly Visitors By Location of Origin",
           width=800,
           height=800,
           x_axis_type='datetime')
p.grid.minor_grid_line_color = '#eeeeee'
p.yaxis.formatter = NumeralTickFormatter(format="0,0")
p.xaxis.ticker = DatetimeTicker(num_minor_ticks = 4)

column_names = list(df.columns)
N = len(column_names)
p.varea_stack(stackers=column_names,
              x='Month',
              color=brewer['Spectral'][N],
              legend_label=column_names,
              source=df)

p.xaxis.axis_label = "Date"
p.yaxis.axis_label = "Monthly Visitors"

p.legend.orientation = "horizontal"
p.legend.background_fill_color = "#fafafa"

show(p)

In [146]:
df2 = df.groupby(pd.Grouper(freq='1Y')).sum()

p = figure(title="Guam Annual Visitors By Location of Origin",
           width=800,
           height=800,
           x_axis_type='datetime')
p.grid.minor_grid_line_color = '#eeeeee'
p.yaxis.formatter = NumeralTickFormatter(format="0,0")
p.xaxis.ticker = DatetimeTicker(num_minor_ticks = 4)

column_names = list(df.columns)
N = len(column_names)
p.varea_stack(stackers=column_names,
              x='Month',
              color=brewer['Spectral'][N],
              legend_label=column_names,
              source=df2)

p.xaxis.axis_label = "Date"
p.yaxis.axis_label = "Annual Visitors"

p.legend.orientation = "horizontal"
p.legend.background_fill_color = "#fafafa"

show(p)



## Coronavirus Pandemic

An enormous drop-off in tourism (>99%) occurred at the beginning of 2020 as the Coronavirus pandemic spread across the world. The primary origin of visitors to Guam shifted away from East Asia as the proportion of visitors from Hawaii and the United States mainland increased. This was mostly due to relaxed travel restrictions on tourists traveling from and staying within the United States and its territories.
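
A drop-off figure like the one above is a relative change between two periods. The sketch below shows the arithmetic only. The function name `percent_drop` and the two arrival counts are hypothetical placeholders, not values taken from the Guam dataset.

```python
def percent_drop(before, after):
    """Return the percentage decrease from `before` to `after`."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return 100 * (before - after) / before


# Hypothetical monthly arrival counts, for illustration only.
arrivals_before = 100_000
arrivals_after = 500

drop = percent_drop(arrivals_before, arrivals_after)
print(f"{drop:.1f}% drop")
```

With the real data, the same calculation could be applied to two entries of `monthly_totals`, for example a month before the pandemic and the corresponding month after it began.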

In [147]:
start_date = '2020-04-01'

p = figure(title="Guam Monthly Visitors By Location of Origin",
           width=800,
           height=800,
           x_axis_type='datetime')
p.grid.minor_grid_line_color = '#eeeeee'
p.yaxis.formatter = NumeralTickFormatter(format="0,0")
p.xaxis.ticker = DatetimeTicker(num_minor_ticks = 4)

column_names = list(df.columns)
N = len(column_names)
p.varea_stack(stackers=column_names,
              x='Month',
              color=brewer['Spectral'][N],
              legend_label=column_names,
              source=df[start_date:])

p.xaxis.axis_label = "Date"
p.yaxis.axis_label = "Monthly Visitors"

p.legend.orientation = "horizontal"
p.legend.background_fill_color = "#fafafa"

show(p)

## Shifting From Japan to Korea

Over the past two decades, the average number of annual visitors has increased and the composition of visitors has shifted. Both the number of Japanese visitors and their percentage of total annual visitors have declined. South Korean visitors have done the opposite, eventually overtaking Japanese visitors as the largest share before the Coronavirus pandemic.
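
One way to express each origin's share of annual visitors is to sum the monthly arrivals per calendar year and then divide each row by its own total. The sketch below does this on a small frame of made-up numbers (the values and the reduced column set are assumptions for illustration, not the Guam data). It differs slightly from averaging monthly percentages over a year, because busy months carry more weight in the annual sums.

```python
import pandas as pd

# Hypothetical monthly arrivals, NOT values from the Guam dataset.
monthly = pd.DataFrame(
    {
        "Japan": [80, 70, 40, 30],
        "Korea": [10, 20, 50, 60],
        "Other": [10, 10, 10, 10],
    },
    index=pd.to_datetime(
        ["2005-01-01", "2005-02-01", "2006-01-01", "2006-02-01"]
    ),
)

# Sum the months within each calendar year.
annual = monthly.groupby(monthly.index.year).sum()

# Divide each row by its own total so every year's shares sum to 1.
annual_share = annual.div(annual.sum(axis=1), axis=0)
print(annual_share)
```

Grouping on `monthly.index.year` yields an integer year index rather than year-end timestamps, so a plot built from this result would use a plain numeric x-axis.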

In [148]:
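# Compute the row totals once, before adding the share columns,
# so the new columns are not included in later sums.
origin_totals = df.sum(axis=1)
df['Japan_p'] = df['Japan'] / origin_totals
df['Korea_p'] = df['Korea'] / origin_totals
df['Other_p'] = 1 - df['Japan_p'] - df['Korea_p']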

df = df.groupby(pd.Grouper(freq='1Y')).mean()

end_date = '2021-12-01'

p = figure(title="Guam Annual Visitors By Location of Origin",
           width=800,
           height=800,
           x_axis_type='datetime')
p.grid.minor_grid_line_color = '#eeeeee'
p.yaxis.formatter = NumeralTickFormatter(format="0%")
p.xaxis.ticker = DatetimeTicker(num_minor_ticks = 4)

p.xaxis.axis_label = "Date"
p.yaxis.axis_label = "Percent of Total Annual Visitors"

column_names = ['Japan_p', 'Korea_p', 'Other_p']
N = len(column_names)
p.line(x='Month',
       y='Japan_p',
       source=df[:end_date],
       legend_label="Japan",
       line_width=3,
       color="#e84d60")
p.line(x='Month',
       y='Korea_p',
       source=df[:end_date],
       legend_label="Korea",
       line_width=3,
       color="#718dbf")
p.line(x='Month',
       y='Other_p',
       source=df[:end_date],
       legend_label="Other",
       line_width=3,
       color="#c9d9d3")

p.legend.orientation = "horizontal"
p.legend.background_fill_color = "#fafafa"

show(p)