Zach Tretter

June 2020

--------

In [1]:
import pandas as pd
import numpy as np

# Step 03B - Monthly Visit Data

* [Read in NPS Data to Create Dataframe](#Read-in-Source-Data-from-NPS)
* [Export to CSV](#Export-this-File)

Data source: NPS Stats report for Glacier National Park (GLAC), https://irma.nps.gov/STATS/Reports/Park/GLAC

![nps_visits_data_source.PNG](attachment:nps_visits_data_source.PNG)

### Read in Source Data from NPS

In [2]:
df = pd.read_csv('../data/raw_source_data_CSVs/raw_visitors_by_month_NPS_STATS.csv')

### Process File
* Set `Year` as the index
* Drop the years before 2000 (1979 through 1999) and the year 2020
* Drop the `Textbox5` column

In [3]:
df = df.set_index('Year')
df = df.drop(labels=np.arange(1979,2000,1))
df = df.drop(labels=2020)
df.drop(columns='Textbox5',inplace=True)
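`DataFrame.drop` raises a `KeyError` when a requested label is missing, so the cell above depends on the file containing every year from 1979 through 2020. A minimal sketch of a more tolerant variant, using a boolean mask on the index, is shown below. The toy frame and its values are made up for illustration and are not the NPS data.

```python
import pandas as pd

# Toy stand-in for the NPS export; the values are invented for illustration.
toy = pd.DataFrame({
    "Year": [2020, 2019, 2000, 1999, 1979],
    "JAN": ["1", "2", "3", "4", "5"],
    "Textbox5": ["x"] * 5,
})

toy = toy.set_index("Year")

# Keep 2000 through 2019, regardless of which other years are present.
kept = toy[(toy.index >= 2000) & (toy.index <= 2019)]

# errors="ignore" avoids a KeyError if the column is absent.
kept = kept.drop(columns="Textbox5", errors="ignore")

kept_years = kept.index.tolist()
```

The mask keeps the rows of interest rather than naming the rows to discard, so it should still work if the source file later gains or loses years.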

### Make Months Rows, Not Columns

* Build a list of dataframes where each dataframe is the data for one month for all years

In [4]:
to_concat = []

for i in df.columns:
    interim = pd.DataFrame(df[i])
    interim['month']=i
    interim.rename(columns={i:'visits'},inplace=True)
    to_concat.append(interim)
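The loop above builds one dataframe per month and concatenates them later. As an alternative, the same wide-to-long reshape can be sketched with `DataFrame.melt`. The column names `JAN` and `FEB` and the values below are assumptions for illustration; the real file may differ.

```python
import pandas as pd

# Toy wide table: one row per year, one column per month (illustrative values).
wide = pd.DataFrame(
    {"JAN": ["13,581", "12,222"], "FEB": ["10,000", "11,000"]},
    index=pd.Index([2019, 2018], name="Year"),
)

# Move the year out of the index, then stack the month columns into rows.
long_df = wide.reset_index().melt(
    id_vars="Year", var_name="month", value_name="visits"
)
long_df = long_df.rename(columns={"Year": "year"})
```

Here `melt` keeps the year as an ordinary column, so it does not need to be copied out of the index afterwards.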

### Concatenate said dataframes into a single dataframe

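* Capitalize only the first letter of the month text
* Create a `key_year_month` column (for example `2019_Jan`) and make it the index
* Create a `year` column so the year is not lost when the index changes
* Strip the thousands separators from `visits` and convert it to float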


In [5]:
df_monthly = pd.concat(to_concat)

In [6]:
df_monthly['month'] = df_monthly['month'].apply(lambda x: x.capitalize())

In [7]:
df_monthly['key_year_month'] = df_monthly.index.astype(str) + "_" + df_monthly['month']

In [8]:
df_monthly['year']=df_monthly.index

In [9]:
df_monthly['visits']=df_monthly['visits'].apply(lambda x: x.replace(",","")).astype(float)
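The visit counts arrive as text with thousands separators, which is why the commas are removed before the cast. A short sketch of two other ways to do this: the vectorized `.str.replace` accessor, and the `thousands` parameter of `pd.read_csv`, which parses the numbers at read time. The inline CSV text is made up for illustration.

```python
import io
import pandas as pd

# Option 1: vectorized string replacement instead of apply + lambda.
visits_text = pd.Series(["13,581", "12,222"])
visits = visits_text.str.replace(",", "", regex=False).astype(float)

# Option 2: let read_csv handle the thousands separator while parsing.
csv_text = 'Year,JAN,FEB\n2019,"13,581","15,000"\n2018,"12,222",900\n'
parsed = pd.read_csv(io.StringIO(csv_text), thousands=",")
```

With option 2 the month columns are numeric from the start, so no later conversion step is needed.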

In [10]:
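# set_index returns a new dataframe, so assign the result to keep the new index
df_monthly = df_monthly.set_index('key_year_month')
df_monthly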

| key_year_month | visits | month | year |
|---|---|---|---|
| 2019_Jan | 13581.0 | Jan | 2019 |
| 2018_Jan | 12222.0 | Jan | 2018 |
| 2017_Jan | 14690.0 | Jan | 2017 |
| 2016_Jan | 15674.0 | Jan | 2016 |
| 2015_Jan | 12087.0 | Jan | 2015 |
| ... | ... | ... | ... |
| 2004_Dec | 10174.0 | Dec | 2004 |
| 2003_Dec | 10073.0 | Dec | 2003 |
| 2002_Dec | 8334.0 | Dec | 2002 |
| 2001_Dec | 3387.0 | Dec | 2001 |
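Note that `set_index` returns a new dataframe and leaves the original unchanged unless the result is assigned (or `inplace=True` is passed). The small sketch below, on made-up rows, shows the difference; it matters here because the export step writes whatever index `df_monthly` has at that point.

```python
import pandas as pd

demo = pd.DataFrame({
    "key_year_month": ["2019_Jan", "2018_Jan"],
    "visits": [13581.0, 12222.0],
})

# The result is displayed in a notebook but then discarded.
demo.set_index("key_year_month")
unchanged = "key_year_month" in demo.columns  # still an ordinary column

# Assigning the result makes the change stick.
demo = demo.set_index("key_year_month")
index_name = demo.index.name
```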


### Export this File

In [11]:
df_monthly.to_csv('../data/03b_monthly_visits_clean.csv')
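`to_csv` writes the index as the first column by default. A quick round-trip check can confirm that the file reads back with the expected index. The sketch below uses an in-memory buffer and a single made-up row rather than the real output path.

```python
import io
import pandas as pd

out = pd.DataFrame(
    {"visits": [13581.0], "month": ["Jan"], "year": [2019]},
    index=pd.Index(["2019_Jan"], name="key_year_month"),
)

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
out.to_csv(buffer)
buffer.seek(0)

# index_col restores the key column as the index when reading back.
back = pd.read_csv(buffer, index_col="key_year_month")
```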