# Airbnb Impact Analysis #

This is a simple exercise in estimating the economic impact of Airbnb in Monterey. The output below draws on two sources: the BEA Travel and Tourism Satellite Accounts (TTSAs) and listing data scraped from Airbnb with Scrapy.


## The Data ##

The following R code takes in two datasets:

1. BEA Travel and Tourism Satellite Accounts: http://www.bea.gov/industry/tourism_data.htm
2. Scrapy output (Airbnb listings scraped for Monterey)

BEA Data:
The TTSAs present a detailed picture of travel and tourism activity and its role in the U.S. economy. These accounts present estimates of expenditures by tourists, or visitors, on 24 types of goods and services. The accounts also present estimates of the income generated by travel and tourism and estimates of output and employment generated by travel and tourism-related industries.

Scrapy Data:
All Airbnb rooms available in Monterey are scraped and processed. I use each room's number of reviews, availability (minimum stay), and price to calculate the total annual expenditure of Airbnb visitors in Monterey.


## The Process ##

I use the Occupancy Model described on page 30 of the Marqusee memo, "Airbnb and San Francisco: Descriptive Statistics and Academic Research": http://www.sfbos.org/Modules/ShowDocument.aspx?documentid=52601. Note that the author and the Planning Department released an updated version of this research in May 2015, but it is not yet available online.

After applying the Occupancy Model, we can calculate the total expenditure on accommodations by travelers using Airbnb. This figure is combined with the BEA TTSA Commodity Output data to estimate total traveler expenditure by industry in Monterey. Finally, we take the industry expenditure and apply the Commodity Output Multipliers to estimate the economic impact by industry for Monterey.
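The three steps can be summarized compactly. The notation below is introduced here for illustration only (it does not come from the BEA or the memo): $p_i$ and $n_i$ are the price and estimated nights per year of listing $i$, $c$ is the CPI factor that converts 2016 dollars to 2009 dollars, $D_k$ and $m_k$ are the direct tourism output and output multiplier of commodity $k$, and $O$ is the total tourism-related output from the "Total" row.

$$
\begin{aligned}
S &= \frac{1}{c}\sum_i p_i\, n_i && \text{(Airbnb accommodation spend, 2009 dollars)}\\
s_k &= \frac{D_k}{O}, \qquad T = \frac{S}{s_{\mathrm{acc}}} && \text{(output shares and scaled-up total)}\\
E_k &= c\, s_k\, T && \text{(spend on commodity } k\text{, 2016 dollars)}\\
I_k &= m_k\, E_k, \qquad I = \sum_k I_k && \text{(economic impact by commodity and in total)}
\end{aligned}
$$

This is a sketch of what the R cells compute, under the assumption that the code shown in this notebook is the method of record.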


## Caveat Emptor ##

These BEA estimates are national averages. A look at the top five industries in Monterey County makes it clear that Monterey has a different economic profile than the nation as a whole. However, it is the view of this project that these figures are conservative estimates: the hospitality industry makes up a large portion of Monterey County's economy. In addition, policy will be shaped around short-term rentals (STRs) in general, and that market is much larger than what is represented here. Deliberate conservatism was applied at every assumption, to the best of the analyst's ability.

In [115]:
#############################################
# Libraries needed
# library(ggplot2)

#############################################
# Pull in datasets

data <- read.csv("output.csv")
# Real Output
tourism_RO  <- read.csv("tour2014 - RO.csv", stringsAsFactors = FALSE)
tourism_RO$Chain_type_price_index <- NULL
# Commodity Output
tourism_CO  <- read.csv("tour2014 - CO.csv", stringsAsFactors = FALSE)
# Employment Output
tourism_EMP <- read.csv("tour2014 - EMP.csv", stringsAsFactors = FALSE)

##############################################
# Subset Columns

col_vars <- c("R_Listname","R_Value","A_Availability","S_Accommodates","R_Reviews", "R_Hostname")
data_v1 <- data[,col_vars]


In [93]:
head(data_v1, 10)
head(tourism_CO, 10)

| | R_Listname | R_Value | A_Availability | S_Accommodates | R_Reviews | R_Hostname |
|---|---|---|---|---|---|---|
| 1 | Private Entry No Shared Space Suite | $130 | Fridays and Saturdays,2 nights | 2 | 55 | Aaron |
| 2 | Victorian Studio | $109 | 2 nights | 4 | 161 | Joy |
| 3 | Coastal Hideaway | $145 | 1 night | 4 | 120 | Gary |
| 4 | Sunny bedroom in downtown Monterey | $85 | 1 night | 1 | 9 | Ksenia |
| 5 | Monterey cabin | $192 | 2 nights | 2 | 55 | Kirk |
| 6 | Ocean view, pretty & friendly place | $95 | 1 night | 3 | 17 | Vita |
| 7 | Just a 3 minute walk to the beach! | $144 | 1 night | 2 | 100 | Sean |
| 8 | Crow's Nest w/ 2Aquarium passes | $155 | 2 nights | 3 | 57 | Janet |
| 9 | Tranquil "Clementine Cottage" | $110 | 1 night | 2 | 59 | Cathryn & Robert |
| 10 | Four Blocks From Downtown Monterey! | $120 | 1 night | 2 | 24 | Nicole |


| | Commodity | Direct_tourism_output | Total_commodity_output_multiplier | Total_tourism_related_output |
|---|---|---|---|---|
| 1 | Traveler accommodations | 168704 | 1.58 | 266688 |
| 2 | Food services and drinking places | 130712 | 1.83 | 238970 |
| 3 | Domestic passenger air transportation services | 98550 | 1.74 | 171724 |
| 4 | International passenger air transportation services | 51202 | 1.74 | 89221 |
| 5 | Passenger rail transportation services | 2256 | 1.75 | 3938 |
| 6 | Passenger water transportation services | 12761 | 2.05 | 26170 |
| 7 | Interurban bus transportation | 1472 | 1.72 | 2525 |
| 8 | Interurban charter bus transportation | 1885 | 1.72 | 3234 |
| 9 | Urban transit systems and other transportation services | 4462 | 1.72 | 7653 |
| 10 | Taxi service | 4398 | 1.72 | 7544 |


## Occupancy Model ##

Here we implement the San Francisco Occupancy Model: the number of reviews is multiplied by the estimated bookings percentage to compute the total estimated bookings. Each listing's minimum stay is used as its length of stay; when no minimum stay is listed, we assume an average stay of three nights. The occupancy rate is capped at 70%. At the bottom you can see the Total_airbnb_spend number, which represents the total annual spend of Airbnb users in Monterey in 2009 dollars.
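As a minimal worked sketch of the model, the snippet below runs the same arithmetic on two listings. The first uses the reviews, minimum stay, and price from row 1 of the scraped sample; the second is hypothetical and was chosen only to trigger the 70% cap. The names `toy`, `occCap`, and `toy_spend_2009` are illustrative and do not appear in the notebook.

```r
# Constants as used in the notebook; occCap is an illustrative name for the 70% cap
airbnbEst        <- 0.4
occCap           <- 0.7
BEAcpi_2016_2009 <- 1.22064193461303

# Row 1 of the sample (55 reviews, 2 nights, $130) plus one hypothetical listing
toy <- data.frame(
  R_Reviews      = c(55, 400),
  A_Availability = c(2, 2),      # nights per stay
  R_Value        = c(130, 100)   # price per night, USD
)

# Estimated bookings and nights per year
toy$R_estimatedBookings <- toy$R_Reviews * airbnbEst
toy$R_nightsYear        <- toy$R_estimatedBookings * toy$A_Availability

# Occupancy rate, capped at 70%, then nights re-derived from the capped rate
toy$R_occRate    <- pmin(toy$R_nightsYear / 365, occCap)
toy$R_nightsYear <- toy$R_occRate * 365

# Annual spend per listing, then the total deflated to 2009 dollars
toy$R_annSpend <- toy$R_nightsYear * toy$R_Value
toy_spend_2009 <- sum(toy$R_annSpend) / BEAcpi_2016_2009

print(toy)
```

The first listing works out to 22 bookings and 44 nights a year, well under the cap. The hypothetical listing would imply 320 nights, so its occupancy is cut to 70% (255.5 nights) before spend is computed.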



In [116]:

##############################################
# Constants:
# airbnbEst: Airbnb estimated bookings (an exogenous figure used to estimate occupancy rate)
# airbnbLength: assumed average stay in nights when a listing has no usable minimum stay
# BEAcpi_2016_2009: BEA CPI factor; dividing 2016 dollars by it gives 2009 dollars
airbnbEst <- 0.4
airbnbLength <- 3
BEAcpi_2016_2009 <- 1.22064193461303


##############################################
# Munge the fields into appropriate formats and clean the availability field, *consider doing this in a pipeline*
# Strip "$" and thousands separators so prices such as "$1,200" convert cleanly
data_v1$R_Value <- gsub("[$,]", "", data_v1$R_Value)
data_v1$R_Value <- as.numeric(data_v1$R_Value)
data_v1$A_Availability <- as.character(data_v1$A_Availability)
data_v1$A_Availability <- gsub("nights", "", data_v1$A_Availability)
data_v1$A_Availability <- gsub("night", "", data_v1$A_Availability)
data_v1$A_Availability <- gsub("varies", "", data_v1$A_Availability)
data_v1$A_Availability <- gsub("Fridays and Saturdays", "", data_v1$A_Availability)

data_v1$A_Availability <- gsub(" ", "", data_v1$A_Availability)
data_v1$A_Availability <- gsub(",", "", data_v1$A_Availability)
# Listings with a minimum stay use it as the length of stay; all others assume three nights
data_v1$A_Availability[nchar(data_v1$A_Availability) == 0] <- airbnbLength
# Any value longer than two characters is reduced to its first character
long_avail <- nchar(data_v1$A_Availability) > 2
data_v1$A_Availability[long_avail] <- substr(data_v1$A_Availability[long_avail], 1, 1)

data_v1$A_Availability <- as.numeric(data_v1$A_Availability)

#If someone doesn't have a review, give them exactly one so as to not exclude them from the analysis, their impacts are minimal
data_v1$R_Reviews[is.na(data_v1$R_Reviews)] <- 1

##############################################
# Calculate Columns of interest
# Estimated Bookings
# Nights per year
# Occupancy Rate
# Cap occupancy rate at 70%
# Re assign nights per year after adjusting for capped occupancy rate
# Total dollar value of airbnb travel stay in a year

data_v1$R_estimatedBookings <- data_v1$R_Reviews * airbnbEst
data_v1$R_nightsYear        <- data_v1$R_estimatedBookings * data_v1$A_Availability

data_v1$R_occRate           <- data_v1$R_nightsYear/365
data_v1$R_occRate[data_v1$R_occRate > .7] <- .7

data_v1$R_nightsYear        <- data_v1$R_occRate * 365

data_v1$R_annSpend          <- data_v1$R_nightsYear * data_v1$R_Value

Total_airbnb_spend <- sum(data_v1$R_annSpend)
Total_airbnb_spend <- Total_airbnb_spend/BEAcpi_2016_2009

# Format as whole dollars with thousands separators and no space after the "$"
Total_airbnb_spend_f <- paste0('$', formatC(Total_airbnb_spend, big.mark = ',', format = 'f', digits = 0))
Total_airbnb_spend_f

## Total Economic Contributions ##

With annual spend in hand, we turn to the output shares of the 24 industries, which represent the total output consumed by visitors. These figures reflect national travel and tourism expenditure. Looking at the shares, you will see that Traveler accommodations is the largest category, with an Output_shares value of about 10.7%. Using the industry shares, we scale the total Airbnb accommodation expenditure up to what travelers spent across all categories, based on the national data. Airbnb visitors spent an estimated $11.2 million in 2016.
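To make the scaling step concrete, here is a small sketch that reuses figures printed in this notebook's output table. Note the assumption: `Total_airbnb_spend` is back-calculated here from the accommodations row of that table rather than from the scraped data, and the names `share_accom`, `share_food`, `accom_spend_2016`, and `food_spend_2016` are illustrative.

```r
# CPI factor used in the notebook (2016 dollars divided by it gives 2009 dollars)
BEAcpi_2016_2009 <- 1.22064193461303

# Values copied from the notebook's output table
share_accom      <- 0.1070623    # Output_shares, Traveler accommodations
share_food       <- 0.08295198   # Output_shares, Food services and drinking places
accom_spend_2016 <- 2034506      # Airbnb_output_totals, Traveler accommodations

# Airbnb accommodation spend in 2009 dollars (back-calculated, see the note above)
Total_airbnb_spend <- accom_spend_2016 / BEAcpi_2016_2009

# Scale up: accommodation spend divided by the accommodations share
economic_total <- Total_airbnb_spend / share_accom

# Implied spend on food services, converted back to 2016 dollars
food_spend_2016 <- share_food * economic_total * BEAcpi_2016_2009
print(food_spend_2016)
```

The result should land very close to the Airbnb_output_totals value the notebook reports for Food services and drinking places, since every other category is simply the accommodations figure scaled by the ratio of the two shares.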

In [119]:
##############################################
# Now that we have the total spend of Airbnb travelers in the same
# dollars as our satellite accounts, let's calculate the total amount
# spent during travel


tourism_CO$Commodity <- as.character(tourism_CO$Commodity)
# Convert strings to numeric and remove commas
tourism_CO$Direct_tourism_output <- as.numeric(gsub(",","",tourism_CO$Direct_tourism_output))
tourism_CO$Total_tourism_related_output <- as.numeric(gsub(",","",tourism_CO$Total_tourism_related_output))

total_output <- tourism_CO[tourism_CO$Commodity == "Total", c("Total_tourism_related_output")]

tourism_CO$Output_shares <- tourism_CO$Direct_tourism_output/total_output


# Drop the "Total" row (safe even if no row matches), then list commodities by descending share
tourism_CO_viz <- tourism_CO[!(tourism_CO$Commodity %in% "Total"), ]
tourism_CO_viz[order(-tourism_CO_viz$Output_shares),]

traveler_accomodations <- subset(tourism_CO, Commodity == "Traveler accommodations", select = c(Total_tourism_related_output, Output_shares))

# Total economic contributions
economic_total <- Total_airbnb_spend/traveler_accomodations[,c("Output_shares")]




| | Commodity | Direct_tourism_output | Total_commodity_output_multiplier | Total_tourism_related_output | Output_shares | Airbnb_output_totals | Economic_impact |
|---|---|---|---|---|---|---|---|
| 1 | Traveler accommodations | 168704 | 1.58 | 266688 | 0.1070623 | 2034506.0 | 3214520.0 |
| 24 | Nondurable PCE commodities other than gasoline | 132858 | 2.05 | 272180 | 0.08431387 | 1602217.0 | 3284545.0 |
| 2 | Food services and drinking places | 130712 | 1.83 | 238970 | 0.08295198 | 1576337.0 | 2884697.0 |
| 23 | Gasoline | 101373 | 1.5 | 151896 | 0.06433297 | 1222520.0 | 1833780.0 |
| 3 | Domestic passenger air transportation services | 98550 | 1.74 | 171724 | 0.06254145 | 1188476.0 | 2067948.0 |
| 4 | International passenger air transportation services | 51202 | 1.74 | 89221 | 0.03249363 | 617476.8 | 1074410.0 |
| 17 | Travel arrangement and reservation services | 47370 | 1.53 | 72318 | 0.03006178 | 571264.3 | 874034.4 |
| 21 | Gambling | 46400 | 1.71 | 79210 | 0.0294462 | 559566.5 | 956858.7 |
| 12 | Automotive rental | 34338 | 1.6 | 55112 | 0.02179146 | 414103.3 | 662565.3 |
| 18 | Motion pictures and performing arts | 26325 | 1.77 | 46503 | 0.01670628 | 317469.6 | 561921.1 |


## Economic Impact ##

Applying the commodity output multipliers from the BEA dataset gives the total economic impact by industry, shown below. The estimated total economic impact of Airbnb in Monterey for 2016 is about $19 million.
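As a quick check of the multiplier step, the sketch below recomputes Economic_impact for three rows copied from the output table in the previous section. The data frame name `impact` is illustrative; the column names follow the notebook.

```r
# Three rows copied from the notebook's output table (Airbnb output in 2016 dollars)
impact <- data.frame(
  Commodity = c("Traveler accommodations",
                "Food services and drinking places",
                "Gasoline"),
  Airbnb_output_totals = c(2034506, 1576337, 1222520),
  Total_commodity_output_multiplier = c(1.58, 1.83, 1.50)
)

# Economic impact = direct Airbnb-related output times the commodity output multiplier
impact$Economic_impact <- impact$Airbnb_output_totals *
  impact$Total_commodity_output_multiplier

print(impact[, c("Commodity", "Economic_impact")])
```

Up to rounding in the printed inputs, these products match the Economic_impact column the notebook reports for the same rows.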


In [124]:
#Column for economic contributions by industry for 2016
tourism_CO$Airbnb_output_totals <- (tourism_CO$Output_shares*economic_total*BEAcpi_2016_2009)

tourism_CO$Economic_impact <- tourism_CO$Airbnb_output_totals * tourism_CO$Total_commodity_output_multiplier

economic_impact <- subset(tourism_CO, select = c("Commodity", "Airbnb_output_totals", "Economic_impact"))

# The "Total" row is nonsensical here, so drop it (safe even if no row matches)
economic_impact_viz <- economic_impact[!(economic_impact$Commodity %in% "Total"), ]

economic_impact_viz

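# Sum only the commodity rows; the "Total" row was dropped above
Total_economic_impact <- sum(economic_impact_viz$Economic_impact)

# Format as whole dollars with thousands separators and no space after the "$"
Total_economic_impact_f <- paste0('$', formatC(Total_economic_impact, big.mark = ',', format = 'f', digits = 0))
Total_economic_impact_f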

| | Commodity | Airbnb_output_totals | Economic_impact |
|---|---|---|---|
| 1 | Traveler accommodations | 2034506.0 | 3214520.0 |
| 2 | Food services and drinking places | 1576337.0 | 2884697.0 |
| 3 | Domestic passenger air transportation services | 1188476.0 | 2067948.0 |
| 4 | International passenger air transportation services | 617476.8 | 1074410.0 |
| 5 | Passenger rail transportation services | 27206.51 | 47611.39 |
| 6 | Passenger water transportation services | 153892.8 | 315480.3 |
| 7 | Interurban bus transportation | 17751.76 | 30533.03 |
| 8 | Interurban charter bus transportation | 22732.39 | 39099.71 |
| 9 | Urban transit systems and other transportation services | 53810.03 | 92553.26 |
| 10 | Taxi service | 53038.22 | 91225.74 |
