## Assignment 1: Rolling Average

1. Plot a line chart of the raw `PowerConsumption_Zone1` series, which represents electricity consumption in kWh.
2. Then, calculate a 1-week (7 * 24 periods) moving average and plot it.
3. Finally, calculate a 30-day (30 * 24 periods) moving average and plot it.
4. Are there any unusual or unexpected patterns in the data?

In [0]:
import pandas as pd
import seaborn as sns

electricity_df = pd.read_csv(
    "../Data/powerconsumption.csv",
    usecols=["PowerConsumption_Zone1", "Datetime"],
    index_col=["Datetime"],
    parse_dates=["Datetime"]
).resample("H").mean()

electricity_df.head()

In [0]:
electricity_df[:168].plot(ylabel="Consumption (kWh)", title="Electricity Use 2017-01-01 to 2017-01-07")

sns.despine();

In [0]:
electricity_df.rolling(24 * 7).mean().plot();

In [0]:
electricity_df.rolling(24 * 30).mean().plot();
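
To make the effect of `rolling(...).mean()` concrete, here is a minimal sketch on a made-up six-point series. The values and the window of 3 are illustrative only and are not taken from the power data. The first `window - 1` results are `NaN` because a full window is not yet available, which is why the smoothed lines start later than the raw series.

```python
import pandas as pd

# Toy series, purely illustrative (not the power consumption data)
toy = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Trailing 3-period moving average: each value averages the current
# observation and the two before it
smoothed = toy.rolling(3).mean()

# min_periods=1 averages whatever is available, so there are no leading NaNs
smoothed_partial = toy.rolling(3, min_periods=1).mean()

print(smoothed.tolist())
print(smoothed_partial.tolist())
```

A longer window (such as 30 * 24) smooths more aggressively but also loses more rows at the start.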

In [0]:
electricity_df.resample("M").mean().plot()
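
Resampling and rolling both average over time, but they return differently shaped results. The sketch below uses two days of synthetic hourly values (arbitrary numbers, not the real dataset) to show that `resample` produces one row per bucket while `rolling` keeps one row per original timestamp.

```python
import numpy as np
import pandas as pd

# Two days of synthetic hourly readings (values are arbitrary)
idx = pd.date_range("2017-01-01", periods=48, freq="h")
hourly = pd.Series(np.arange(48, dtype=float), index=idx)

# Resampling collapses each calendar day into a single average
daily = hourly.resample("D").mean()

# A rolling mean keeps one value per original timestamp
rolled = hourly.rolling(24).mean()

print(len(daily), len(rolled))
```

The same distinction applies to the monthly resample above versus the 30-day rolling average: the first gives one point per calendar month, the second a continuously updated average.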

## Assignment 2: Decomposition

1. Plot the entire Madrid weather dataset, then try decomposing it. Next, filter the data down to the first 168 rows (1 week) and review the results. Plot an ACF chart of the hourly data to assess seasonal correlations.


2. Then, decompose the monthly average temperature series, `monthly_weather`. How does it differ from the decomposition of the hourly data? Finally, plot an ACF chart of the monthly data to assess seasonal correlations.

In [0]:
import pandas as pd

hourly_weather = (
    pd.read_csv(
        "../Data/madrid_weather.csv", 
        usecols=["time", "temperature"],
        parse_dates=["time"],
        index_col="time")
)

hourly_weather.head()

In [0]:
hourly_weather.plot()

In [0]:
from statsmodels.tsa.seasonal import seasonal_decompose

result = seasonal_decompose(hourly_weather[:168])

result.plot();
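
As a rough illustration of what an additive decomposition computes, the sketch below builds the three components by hand on synthetic data with a 24-hour cycle. It follows the textbook classical method (a centered moving average for the trend and per-hour averages for the seasonal part). It is not statsmodels' own implementation, and the series, its length, and the variable names are all made up for the example.

```python
import numpy as np
import pandas as pd

period = 24  # assumed daily cycle in hourly data
idx = pd.date_range("2019-01-01", periods=period * 7, freq="h")
t = np.arange(len(idx))

# Synthetic series: slow linear drift plus a repeating daily wave
series = pd.Series(10 + 0.01 * t + 3 * np.sin(2 * np.pi * t / period), index=idx)

# Trend: centered 2 x 24 moving average (24-period mean, then a 2-period
# mean, shifted back half a period so it is centered on each timestamp)
trend = series.rolling(period).mean().rolling(2).mean().shift(-(period // 2))

# Seasonal: average detrended value for each hour of day, centered on zero
detrended = series - trend
hourly_means = detrended.groupby(detrended.index.hour).mean()
hourly_means = hourly_means - hourly_means.mean()
seasonal = pd.Series(hourly_means.reindex(series.index.hour).to_numpy(), index=idx)

# Residual: whatever trend and seasonality leave unexplained
resid = series - trend - seasonal

print(resid.abs().max())
```

Because the synthetic series is exactly trend plus a daily wave, the residual is essentially zero here. Real temperature data will leave a visible residual.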

In [0]:
from statsmodels.graphics.tsaplots import plot_acf


plot_acf(hourly_weather);
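
Each bar in an ACF chart is the correlation between the series and a lagged copy of itself. The sketch below computes that quantity directly with NumPy on a synthetic signal with a clean 24-hour cycle. The helper `autocorr` is a hypothetical name introduced for this example, and it uses the common estimator that divides by the full-sample variance.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of x at a given lag."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    if lag == 0:
        return 1.0
    return float(np.sum(centered[:-lag] * centered[lag:]) / np.sum(centered ** 2))

# Synthetic hourly signal with a clean 24-hour cycle (illustrative only)
t = np.arange(240)
signal = np.sin(2 * np.pi * t / 24)

print(round(autocorr(signal, 12), 2))  # half a cycle apart
print(round(autocorr(signal, 24), 2))  # one full cycle apart
```

Strong positive correlation at lag 24 and strong negative correlation at lag 12 is the signature of a daily cycle, which is the pattern to look for in the hourly weather ACF.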

In [0]:
monthly_weather = (
    pd.read_csv(
        "../Data/madrid_weather.csv", 
        usecols=["time", "temperature"],
        parse_dates=["time"],
        index_col="time")
    .resample("M")
    .mean()
)

monthly_weather.head()

In [0]:
monthly_weather.plot()

In [0]:
from statsmodels.tsa.seasonal import seasonal_decompose

result = seasonal_decompose(monthly_weather)

result.plot();

In [0]:
from statsmodels.graphics.tsaplots import plot_acf


plot_acf(monthly_weather);

## Assignment 3: Forecasting

1. Engineer monthly dummy and trend variables, then fit a regression model and forecast two years' worth of airline data (24 months).

2. Review the accuracy, then try fitting the regression on a log transform of the target variable.

3. Next, fit a Facebook Prophet model and compare the accuracy of Prophet to your Linear Regression model.

In [0]:
import pandas as pd

air_traffic = pd.read_csv("../Data/AirPassengers.csv", parse_dates=["Month"])

air_traffic.head()

In [0]:
air_traffic.set_index("Month").plot()

### Linear Regression Forecasting

In [0]:
import numpy as np

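air_traffic = air_traffic.assign(
    trend=air_traffic.index,  # 0, 1, 2, ... acts as a linear time trend
    month=air_traffic["Month"].dt.month.astype("string"),  # string so get_dummies encodes it
)

# drop_first avoids perfect collinearity with the intercept;
# dtype=int keeps the dummies numeric so statsmodels can use them directly
air_traffic = pd.get_dummies(air_traffic, drop_first=True, dtype=int)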

air_traffic.head()
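
The feature engineering step can be seen in isolation with a small sketch on made-up dates (two years of month-start stamps, not the airline data). The names `features`, `month_dummies`, and `design` are illustrative.

```python
import pandas as pd

# Two years of made-up month-start dates (illustrative only)
dates = pd.date_range("2000-01-01", periods=24, freq="MS")

# Trend: a simple counter 0, 1, 2, ... that lets the model fit a slope
features = pd.DataFrame({"trend": range(len(dates))}, index=dates)

# Seasonality: one 0/1 column per month, dropping the first so the
# dummies are not perfectly collinear with the intercept
month_dummies = pd.get_dummies(
    pd.Series(dates.month, index=dates), prefix="month", drop_first=True, dtype=int
)

design = features.join(month_dummies)
print(design.shape)
```

January becomes the baseline: its rows have zeros in every dummy column, and each remaining coefficient is read as a difference from January.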

In [0]:
air_traffic_train = air_traffic[:-24]
air_traffic_test = air_traffic[-24:]

In [0]:
import statsmodels.api as sm

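# Log-transformed target; swap in the commented line to fit on the raw scale
y = np.log(air_traffic_train["Passengers (k)"])
# y = air_traffic_train["Passengers (k)"]
# Columns from position 2 onward are the trend and the month dummies
X = sm.add_constant(air_traffic_train.iloc[:, 2:])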

model = sm.OLS(y, X).fit()

model.summary()

In [0]:
air_traffic_test.head()

In [0]:
from sklearn.metrics import mean_absolute_percentage_error as mape
from sklearn.metrics import mean_absolute_error as mae

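# Scores predictions as-is. Only meaningful when the model was fit on the
# untransformed target; for the log model, use the next cell instead.
print(f"MAPE: {mape(air_traffic_test['Passengers (k)'], model.predict(sm.add_constant(air_traffic_test.iloc[:, 2:])))}")
print(f"MAE: {mae(air_traffic_test['Passengers (k)'], model.predict(sm.add_constant(air_traffic_test.iloc[:, 2:])))}")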
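
For reference, both metrics are simple averages of absolute errors. The sketch below writes them out with NumPy on two made-up data points. The function names are hypothetical, and the MAPE version omits the guard against zero denominators that scikit-learn applies.

```python
import numpy as np

def mae_manual(actual, predicted):
    """Mean absolute error, in the units of the target."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(actual - predicted)))

def mape_manual(actual, predicted):
    """Mean absolute percentage error as a fraction (0.10 means 10%)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((actual - predicted) / actual)))

actual = [100, 200]
predicted = [110, 180]
print(mae_manual(actual, predicted), mape_manual(actual, predicted))
```

Note that scikit-learn's MAPE is also reported as a fraction rather than a percentage, so a printed value of 0.05 corresponds to a 5% average error.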

In [0]:
# Undo the log transform: exponentiate predictions back to the original scale before scoring
print(f"MAPE: {mape(air_traffic_test['Passengers (k)'], np.exp(model.predict(sm.add_constant(air_traffic_test.iloc[:, 2:]))))}")
print(f"MAE: {mae(air_traffic_test['Passengers (k)'], np.exp(model.predict(sm.add_constant(air_traffic_test.iloc[:, 2:]))))}")
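
The reason for exponentiating can be shown on synthetic data. The sketch below fits a straight line to the log of a series that grows 2% per period (made-up numbers, with plain NumPy least squares standing in for `sm.OLS`), then compares scoring the back-transformed predictions against scoring the raw log-scale predictions.

```python
import numpy as np

# Synthetic series growing 2% per period (illustrative, not airline data)
t = np.arange(48)
y = 100 * 1.02 ** t

train_t, test_t = t[:-12], t[-12:]
train_y, test_y = y[:-12], y[-12:]

# Fit a straight line to log(y): intercept + slope * t
X_train = np.column_stack([np.ones(len(train_t)), train_t])
coefs, *_ = np.linalg.lstsq(X_train, np.log(train_y), rcond=None)

# Predictions come out on the log scale, so exponentiate before scoring
X_test = np.column_stack([np.ones(len(test_t)), test_t])
log_pred = X_test @ coefs
pred = np.exp(log_pred)

mape_value = np.mean(np.abs((test_y - pred) / test_y))
mape_wrong_scale = np.mean(np.abs((test_y - log_pred) / test_y))
print(mape_value, mape_wrong_scale)
```

Scoring log-scale predictions against raw passenger counts produces an error close to 100%, which is a useful sanity check if a metric looks implausibly bad.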

In [0]:
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

ax.plot(air_traffic_test["Month"], air_traffic_test["Passengers (k)"], label="Actual")
# ax.plot(air_traffic_test["Month"], model.predict(sm.add_constant(air_traffic_test.iloc[:, 2:])), label="Forecast")
ax.plot(air_traffic_test["Month"], np.exp(model.predict(sm.add_constant(air_traffic_test.iloc[:, 2:]))), label="Forecast")
ax.legend()

In [0]:
air_traffic = pd.read_csv("../Data/AirPassengers.csv", parse_dates=["Month"])

air_traffic.head()

### Facebook Prophet

In [0]:
air_traffic = (
    pd.read_csv(
        "../Data/AirPassengers.csv", 
        usecols=["Passengers (k)", "Month"], 
        parse_dates=["Month"])
    .rename({"Month": "ds", "Passengers (k)": "y"}, axis=1)
)

air_traffic.head()

In [0]:
air_traffic_train = air_traffic[:-24]
air_traffic_test = air_traffic[-24:]

In [0]:
from prophet import Prophet

m = Prophet(seasonality_mode="multiplicative")  # seasonal swings scale with the level of the series
m.fit(air_traffic_train)

In [0]:
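# "MS" (month start) matches data stamped on the first of each month;
# "M" would generate month-end dates that do not line up with the test set
future = m.make_future_dataframe(periods=24, freq="MS")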

forecast = m.predict(future)

m.plot(forecast);
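
The `freq` argument controls which timestamps the future rows receive. The sketch below uses plain pandas (no Prophet) to contrast month-end and month-start stamps. It assumes the training data ends on 1958-12-01, which is a guess at how `AirPassengers.csv` is dated rather than something confirmed here.

```python
import pandas as pd

last_training_date = pd.Timestamp("1958-12-01")  # assumed month-start stamp

# Month-end frequency (what the alias "M" means in pandas)
month_end = pd.date_range(last_training_date, periods=3, freq=pd.offsets.MonthEnd())

# Month-start frequency (alias "MS")
month_start = pd.date_range(last_training_date, periods=3, freq=pd.offsets.MonthBegin())

print(month_end.strftime("%Y-%m-%d").tolist())
print(month_start.strftime("%Y-%m-%d").tolist())
```

If the observations are stamped at month start, forecasts stamped at month end will not match them by date, so it is worth checking `future.tail()` against the test set before computing accuracy.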

In [0]:
fig = m.plot_components(forecast)

In [0]:
import seaborn as sns

(air_traffic_test
 .assign(predictions = m.predict(future)["yhat"])
 .set_index("ds")
 .plot()
)

sns.despine()

In [0]:
# Keep only the point forecasts for the 24 held-out months, selected by name
forecast = m.predict(future)["yhat"].iloc[-24:]

In [0]:
print(f"MAPE: {mape(air_traffic_test['y'], forecast)}")
print(f"MAE: {mae(air_traffic_test['y'], forecast)}")