# Impact of Covid-19 on tourism: Predictive model

The tourism industry has been massively affected by the Covid-19 pandemic. One indicator of travel activity is the number of Airbnb reviews, which is treated here as a proxy for demand.

This notebook builds a predictive model to compare the actual Airbnb demand with the demand forecast for a scenario in which Covid-19 had not occurred.

**Input:**
Bristol_reviews.csv 
This dataset is the reviews file downloaded from http://insideairbnb.com/get-the-data.html. It contains all the reviews up to the last scraped date.

**Output:**
Bristol_forecast_components.csv
Bristol_reviews_prediction.csv
The first output contains the forecast components, namely the overall trend and the yearly seasonality. The second output contains the actual preprocessed data, the training data and the forecast data.

**Steps**
1. Preprocess the data and aggregate it by month.
2. Build a predictive model using FB Prophet and plot the data.
3. Export the data to create a dashboard.



In [None]:
# Install and import all the dependencies

!pip install html5lib
!pip install wget
!pip install fbprophet

# Standard library
import glob
import gzip
import io
import os
import shutil
import zipfile
from datetime import datetime
from os import listdir
from os.path import isfile, join

# Third-party packages
import fbprophet
import html5lib
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
import pandas as pd
import requests
import wget
from bs4 import BeautifulSoup
from tqdm import tqdm

## Preprocess the data 

In [None]:
# Load the reviews of a given city
city_reviews = pd.read_csv("Bristol_reviews.csv")

# Format the dates
city_reviews.date = pd.to_datetime(city_reviews.date, format="%Y-%m-%d")
city_reviews.date = city_reviews.date.apply(lambda x: x.strftime('%Y-%m'))

# Count the number of reviews per month
city_reviews['review_count'] = 0
city_demand = city_reviews.groupby('date').agg({
    'review_count':'count'
}).reset_index()

# Print the results
city_demand.head()
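One thing the aggregation above does not do is create rows for months that have no reviews at all. The following is a minimal sketch of an alternative, assuming a tiny made-up sample in place of the real reviews file; the names `sample_reviews` and `sample_demand` are hypothetical and only the `date` column is taken from the notebook.

```python
import pandas as pd

# Hypothetical sample standing in for the reviews file; only `date` is used.
sample_reviews = pd.DataFrame({
    "date": [
        "2019-01-03", "2019-01-20", "2019-02-11",
        "2019-04-02", "2019-04-15", "2019-04-30",
    ]
})

# Parse the dates and truncate each one to its calendar month.
months = pd.to_datetime(sample_reviews["date"], format="%Y-%m-%d").dt.to_period("M")

# Count the reviews per month.
monthly_counts = months.value_counts().sort_index()

# Reindex over every month between the first and last review so that
# months without reviews appear with a count of 0 instead of being skipped.
full_range = pd.period_range(months.min(), months.max(), freq="M")
monthly_counts = monthly_counts.reindex(full_range, fill_value=0)

sample_demand = pd.DataFrame({
    "date": full_range.strftime("%Y-%m"),
    "review_count": monthly_counts.to_numpy(),
})
print(sample_demand)
```

Whether empty months matter depends on the city: for a large listing base every month is likely to have reviews, so the simpler `groupby` may be sufficient.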


## Predict number of reviews using FB Prophet

In [None]:
# Rename the date and target columns to the names expected by Facebook Prophet
city_demand.rename(columns={
    'date': 'ds', 
    'review_count': 'y'
}, inplace=True)

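# Create a training set that holds out the last 5 months;
# .copy() avoids a SettingWithCopyWarning when the dates are reformatted later
city_demand_train = city_demand[:-5].copy()
city_demand_train.tail()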

# Initialise the Facebook Prophet model
model = fbprophet.Prophet(
    daily_seasonality=False,
    weekly_seasonality=False,
    yearly_seasonality=False,
    changepoint_prior_scale=0.05,
    interval_width=0.95,
    mcmc_samples=300)
model.add_seasonality(name='yearly', period=365.25, fourier_order=5)

# Fit the training data to the model
model.fit(city_demand_train)
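To give some intuition for `period=365.25` and `fourier_order=5`: as far as I understand Prophet's design, a seasonality is modelled as a truncated Fourier series, so an order of 5 corresponds to five sine and five cosine terms per time point. The sketch below builds such terms with NumPy. It is an illustration only, not Prophet's own code, and the helper name `fourier_features` is hypothetical.

```python
import numpy as np


def fourier_features(t_days, period=365.25, order=5):
    """Return an array of shape (len(t_days), 2 * order) of sine/cosine terms."""
    t = np.asarray(t_days, dtype=float)
    columns = []
    for n in range(1, order + 1):
        # The n-th harmonic completes n full cycles per period.
        angle = 2.0 * np.pi * n * t / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)


# Roughly monthly time steps (in days) covering two years.
t_days = np.arange(0, 731, 30.4375)
features = fourier_features(t_days)
print(features.shape)
```

A higher order lets the seasonal curve change more quickly within the year, at the cost of a greater risk of fitting noise in a short monthly series.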


In [None]:
# Use the trained model to predict the number of reviews for every month,
# including the held-out months at the end of the series

forecast = model.predict(city_demand)
forecast[['ds', 'yhat']].head()
model.plot_components(forecast)
# savefig does not accept a fontsize argument, so it is omitted here
plt.savefig('trends.png')
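Because the model is configured with `interval_width=0.95`, the forecast also carries `yhat_lower` and `yhat_upper` columns. A minimal sketch of how they could be used to flag months whose actual demand falls below the uncertainty interval is shown here. All numbers are made up for illustration, and the `sample` frame is a hypothetical stand-in for the forecast joined with the actual counts.

```python
import pandas as pd

# Hypothetical values standing in for a Prophet forecast joined with actuals.
sample = pd.DataFrame({
    "ds": pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01"]),
    "y": [950, 900, 400, 120],
    "yhat": [1000.0, 980.0, 1010.0, 1050.0],
    "yhat_lower": [850.0, 830.0, 860.0, 900.0],
    "yhat_upper": [1150.0, 1130.0, 1160.0, 1200.0],
})

# A month is flagged when the actual count is below the lower interval bound.
sample["below_interval"] = sample["y"] < sample["yhat_lower"]
print(sample[["ds", "y", "yhat_lower", "below_interval"]])
```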

In [None]:
# Convert the 'YYYY-MM' strings to date objects for plotting

city_demand['ds'] = pd.to_datetime(city_demand['ds'], format="%Y-%m").dt.date
city_demand_train['ds'] = pd.to_datetime(city_demand_train['ds'], format="%Y-%m").dt.date

In [None]:
# Plot the results

plt.close()
fig, ax = plt.subplots(figsize=(15, 5))

ax.plot(city_demand_train['ds'],city_demand_train['y'], c='blue', marker='o',ms=5, linestyle='None', label = 'Train Data')
ax.plot(city_demand['ds'],city_demand['y'], c='g', label = 'Actual Data')

ax.plot(forecast['ds'],forecast['yhat'], c='r', marker='o', ms=5, linestyle='None', label='Forecast', alpha=0.5)
plt.xticks(rotation=20,fontsize=14)
plt.yticks(fontsize=14)
ax.legend(fontsize=14)

plt.show()
plt.close()
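The plot shows the gap between actual and forecast demand visually. A small sketch of how that gap could also be quantified follows, assuming made-up numbers and a hypothetical `comparison` frame whose column names mirror those used in the export step.

```python
import pandas as pd

# Hypothetical actual and forecast review counts for three months.
comparison = pd.DataFrame({
    "ds": ["2020-03", "2020-04", "2020-05"],
    "Actual_Data": [400, 120, 90],
    "Forecast": [1000.0, 1200.0, 900.0],
})

# Absolute gap: negative values mean fewer reviews than forecast.
comparison["gap"] = comparison["Actual_Data"] - comparison["Forecast"]

# Relative gap as a percentage of the forecast.
comparison["gap_pct"] = 100.0 * comparison["gap"] / comparison["Forecast"]
print(comparison)
```

A relative gap is easier to compare across cities of different sizes than the absolute difference in review counts.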

## Export the data

In [None]:

# Save the predictions
city_data = pd.DataFrame(city_demand['ds'])
city_data['Actual_Data'] = city_demand['y']
city_data['Forecast'] = forecast['yhat']
city_data['Train_Data'] = city_demand_train['y']
city_data['city'] = 'Bristol'
city_data['country'] = 'United Kingdom'
city_data['ds'] = pd.to_datetime(city_data['ds']).dt.strftime('%Y-%m')
# Bracket assignment is required to create a new column;
# attribute assignment (city_data.long = ...) would not add one
city_data['long'] = -2.587910
city_data.to_csv('Bristol_reviews_prediction.csv')

# Save the model components
forecast['city'] = 'Bristol'
forecast['country'] = 'United Kingdom'
forecast['ds'] = forecast['ds'].dt.strftime('%Y-%m')
forecast.to_csv('Bristol_forecast_components.csv')
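By default, `to_csv` also writes the DataFrame index as an unnamed first column, which a dashboard tool may then show as an extra field. The sketch below compares the default with `index=False` using an in-memory buffer and a hypothetical `sample_export` frame. It is one possible option, not a requirement of the export.

```python
import io

import pandas as pd

# Hypothetical rows standing in for the exported predictions.
sample_export = pd.DataFrame({
    "ds": ["2020-01", "2020-02"],
    "Forecast": [1000.0, 980.0],
    "city": ["Bristol", "Bristol"],
})

# Default behaviour: the header starts with an empty name for the index column.
header_with_index = sample_export.to_csv().splitlines()[0]

# With index=False only the real columns are written.
buffer = io.StringIO()
sample_export.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)

print(header_with_index)
print(list(restored.columns))
```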

**Authors:**
<br>Maria Ivanciu is an AI Developer at R<sup>2</sup> Data Labs, Rolls-Royce.</br>
<br>Vincent Nelis is a Senior Data Scientist with the IBM Data Science & AI Elite team.</br>