# Business Understanding

Our project builds a machine learning model that predicts sales for the Rossmann store brand, based on data gathered from over a thousand stores. Rossmann already used sales prediction at the local level, tasking store managers with predicting sales up to 6 weeks in advance in order to improve the efficiency of supply procedures and lower the logistical cost of running the brand. The selected case therefore has a practical application: the resulting model could be used as part of Rossmann's official predictions.

Aside from its practical business application, the case study has advantages from a computer science point of view. As the data is provided directly by Dirk Rossmann GmbH, it has an above-average level of completeness and accuracy, since it originates from the primary source rather than an outside observer.

The project takes data from over 1000 Rossmann stores and aims to predict their sales. We implement three prediction methods (FBProphet, fast.ai and random forest) and compare their accuracy in relation to the effort each prediction requires.


# Data Description
The data analysed in the project comes from 1115 Rossmann stores.
The data is split into three sets: the train set, containing data for training models; the test set, containing data for testing model accuracy; and the store set, containing additional data on the stores.

# Data Fields
Data fields that are not self-explanatory are described below.

train.csv set
* Store - unique ID for the store
* Sales - turnover for any given day
* Customers - number of customers for any given day
* Open - binary value, denotes if the store is open (0 - closed, 1 - open)
* StateHoliday - denotes days with state holidays: a - public holiday; b - Easter holiday; c - Christmas; 0 - no holiday
* SchoolHoliday - binary value, shows if the store was affected by school holiday
* Promo - indicates if the store has a promotion on that day

test.csv set
* ID - tuple containing store ID and date

store.csv set
* StoreType - shows one of four store types: a, b, c and d
* Assortment - describes an assortment level: a - basic, b - extra, c - extended
* CompetitionDistance - distance in meters to the nearest competitor store
* CompetitionOpenSince[Month/Year] - gives the approximate year and month in which the nearest competitor was opened
* Promo - shows whether a store is running a promo on that day
* Promo2 - a continuing and consecutive promotion for some stores: 0 = store is not participating, 1 = store is participating
* Promo2Since[Year/Week] - describes the year and calendar week when the store started participating in Promo2
* PromoInterval - describes the consecutive intervals in which Promo2 is started, naming the months in which the promotion starts anew

In [None]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
import datetime as dt
from fastai.tabular.all import *
from sklearn.ensemble import RandomForestRegressor
import fbprophet as fbp
import seaborn as sns
import itertools

In [None]:
#Calculating metric used in competition
from sklearn.metrics import mean_squared_error
from math import sqrt
def ToWeight(y):
    w = np.zeros(y.shape, dtype=float)
    ind = y != 0
    w[ind] = 1./(y[ind]**2)
    return w

def rmspe(yhat, y):
    w = ToWeight(y)
    rmspe = np.sqrt(np.mean( w * (y - yhat)**2 ))
    return rmspe


In [None]:
train = pd.read_csv("../input/rossmann-store-sales/train.csv", parse_dates=['Date'], low_memory=False)
store = pd.read_csv("../input/rossmann-store-sales/store.csv", low_memory=False)


In [None]:
train.head()

In [None]:
train.dtypes

## Clear the data

In [None]:
#First we get rid of useless rows where there are no sales and shops are closed
initial_len = train.shape[0]
train=train[(train['Sales']!=0) & (train['Open']!=0)]
new_len = train.shape[0]
print(f"We removed {(initial_len-new_len)/initial_len*100}% of rows")

In [None]:
#Nothing to clear in train dataframe
train.isnull().any()

In [None]:
#Some missing values in Store
store.isnull().sum()

In [None]:
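# Filling missing values with 0
cols = ['CompetitionOpenSinceMonth',
       'CompetitionOpenSinceYear',
       'Promo2SinceWeek',
       'Promo2SinceYear']
# Assign the result back instead of calling fillna(inplace=True) on a column,
# which newer pandas versions may not apply to the original DataFrame
store[cols] = store[cols].fillna(0)
store['CompetitionDistance'] = store['CompetitionDistance'].fillna(0) # float
store[cols].isnull().any()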


In [None]:
# Setting Promo Interval equal to zero for those who are not continuing Promo and for missing values
index=store[(store['Promo2']==0)&(store['PromoInterval'].isnull())].index
store.loc[index,'PromoInterval']=0

store['PromoInterval'].isnull().any() # To check

In [None]:
# Converting from float into integer type
store[cols]=store[cols].astype(int)
store[cols].dtypes # To check

In [None]:
#object type for categorical variable
store['Promo2']=store['Promo2'].astype(object)
store.dtypes

In [None]:
#Check if no more nulls
store.isnull().any()

## New features

We can extract additional information from the date column.

In [None]:
def add_date_info(df, field_name='Date'):
    field = df[field_name]
    attr = ['Year', 'Month', 'Day','Dayofyear', 'Is_month_end', 'Is_month_start',
                'Is_quarter_end', 'Is_quarter_start', 'Is_year_end', 'Is_year_start']
    for n in attr: df[n] = getattr(field.dt, n.lower())
    # Use the selected field rather than a hard-coded 'Date' column
    df['Weekofyear'] = field.dt.isocalendar().week


In [None]:
add_date_info(train)

We can create new features based on sales information that should make training the model easier.

In [None]:
train['SalesPerCustomer'] = train['Sales']/train['Customers']

In [None]:
table_1 = pd.pivot_table(data=train, index=['DayOfWeek','Promo'], values=['Sales','Customers', 'SalesPerCustomer'], aggfunc='mean')

In [None]:
table_1

In [None]:
table_1.plot(kind='bar',y=['SalesPerCustomer'],title='Average Sales per customer', figsize=(15,5))

Worth noting:
- no promos are run on weekends
- promos increased both the number of customers and the sales per customer

We can also create new features for each store
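
As a rough illustration of what such per-store features look like, the sketch below computes the same kinds of statistics (mean, maximum, minimum, standard deviation, median) for a few made-up rows using only the standard library. The rows and the helper name `store_stats` are hypothetical; the notebook itself derives these values with pandas `groupby`.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical (store_id, daily_sales) rows, invented for illustration
rows = [(1, 5000), (1, 6000), (1, 7000), (2, 3000), (2, 5000)]

def store_stats(rows):
    """Group sales by store and summarise each group."""
    by_store = defaultdict(list)
    for store_id, sales in rows:
        by_store[store_id].append(sales)
    return {
        store_id: {
            "Avg_Sales": mean(values),
            "Max_Sales": max(values),
            "Min_Sales": min(values),
            "Std_Sales": stdev(values),  # sample standard deviation, like pandas .std()
            "Med_Sales": median(values),
        }
        for store_id, values in by_store.items()
    }

stats = store_stats(rows)
print(stats[1]["Avg_Sales"], stats[2]["Med_Sales"])
```

Each store ends up with one row of summary values, which can then be joined back onto every daily record of that store.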

In [None]:
# avg_store DataFrame with per-store averages: Avg_Sales, Avg_Customers, Avg_SalesPerCustomer
avg_store = train.groupby('Store')[['Sales','Customers','SalesPerCustomer']].mean().add_prefix('Avg_')

# Add maximum, minimum, standard deviation and median of Customers and Sales for each store
# (columns Max_Customers, Min_Customers, Std_Customers, Med_Customers, Max_Sales, ...)
for col in ['Customers', 'Sales']:
    stats = train.groupby('Store')[col].agg(['max', 'min', 'std', 'median'])
    stats.columns = [f'{prefix}_{col}' for prefix in ['Max', 'Min', 'Std', 'Med']]
    avg_store = avg_store.join(stats)

avg_store = avg_store.reset_index()
avg_store.head()

In [None]:
store=pd.merge(store,avg_store,how='inner',on='Store')
store.head()

In [None]:
# Merging
new_train=pd.merge(train,store,how='left',on='Store')
print('New training dataset shape :',new_train.shape)
new_train.head()



In [None]:
# Making column "MonthCompetitionOpen" which contains date information in months since the competition was opened 
new_train['MonthCompetitionOpen']=12*(new_train['Year']-new_train['CompetitionOpenSinceYear'])+\
new_train['Month']-new_train['CompetitionOpenSinceMonth']

new_train.loc[(new_train['CompetitionOpenSinceYear']==0),'MonthCompetitionOpen']=0
# Negative values indicate that the competitor's store opened after the date of the given row.

    

In [None]:
# Making column "WeekPromoOpen" which contains date information in weeks since the promo is running
new_train['WeekPromoOpen']=52.14298*(new_train['Year']-new_train['Promo2SinceYear'])+\
new_train['Weekofyear']-new_train['Promo2SinceWeek']

new_train.loc[(new_train['Promo2SinceYear']==0),'WeekPromoOpen']=0
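
The arithmetic behind `MonthCompetitionOpen` and `WeekPromoOpen` can be checked by hand on single values. The sketch below repeats the two formulas as plain functions; the function names and the example dates are hypothetical and only serve as a worked example.

```python
def months_competition_open(year, month, since_year, since_month):
    """Months elapsed since the competitor opened; 0 when the opening date is unknown."""
    if since_year == 0:  # 0 is the placeholder used for missing values
        return 0
    return 12 * (year - since_year) + (month - since_month)

def weeks_promo_open(year, week, since_year, since_week):
    """Weeks elapsed since the store joined Promo2; 0 when it never joined."""
    if since_year == 0:
        return 0
    return 52.14298 * (year - since_year) + (week - since_week)

# Row dated July 2015, competitor opened in September 2008: 12*7 + (7-9)
print(months_competition_open(2015, 7, 2008, 9))
# Row dated January 2013, competitor opened in April 2014: the result is negative
print(months_competition_open(2013, 1, 2014, 4))
# Row in week 31 of 2015, Promo2 joined in week 31 of 2014: one year of weeks
print(weeks_promo_open(2015, 31, 2014, 31))
```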


In [None]:
new_train

# Sales and Customers by Store Type

In [None]:
# Compute the group means once and reuse them for all three panels
means = new_train.groupby("StoreType")[["Sales", "Customers", "SalesPerCustomer"]].mean()
colors = sns.color_palette(n_colors=len(means))
fig, axes = plt.subplots(1, 3, figsize=(17, 10))
axes[0].bar(means.index, means["Sales"], color=colors)
axes[0].set_title("Average Sales per Store Type")
axes[1].bar(means.index, means["Customers"], color=colors)
axes[1].set_title("Average Number of Customers per Store Type")
axes[2].bar(means.index, means["SalesPerCustomer"], color=colors)
axes[2].set_title("Average Sales per Customer per Store Type")
plt.show()

# Sales and Customers by Week Day

In [None]:
# Compute the group means once and reuse them for all three panels
means = new_train.groupby("DayOfWeek")[["Sales", "Customers", "SalesPerCustomer"]].mean()
colors = sns.color_palette(n_colors=len(means))
fig, axes = plt.subplots(1, 3, figsize=(17, 10))
axes[0].bar(means.index, means["Sales"], color=colors)
axes[0].set_title("Average Sales per Week Day")
axes[1].bar(means.index, means["Customers"], color=colors)
axes[1].set_title("Average Number of Customers per Week Day")
axes[2].bar(means.index, means["SalesPerCustomer"], color=colors)
axes[2].set_title("Average Sales Per Customer per Week Day")
plt.show()

The day of the week significantly influences sales and the number of customers per store.
The highest sales are generated on Mondays and Sundays, while the highest customer count occurs on Sundays, with almost 75% more customers than on Mondays. This difference arises because most stores are closed on Sundays, which drives additional sales and, most importantly, additional customers to the few stores that remain open. Increased sales on Mondays are likewise caused by stores being closed over the weekend in areas where no open store is accessible nearby.

# Sales and Customers by Month

In [None]:
# Compute the group means once and reuse them for all three panels
means = new_train.groupby("Month")[["Sales", "Customers", "SalesPerCustomer"]].mean()
colors = sns.color_palette(n_colors=len(means))
fig, axes = plt.subplots(1, 3, figsize=(17, 10))
axes[0].bar(means.index, means["Sales"], color=colors)
axes[0].set_title("Average Sales per Month")
axes[1].bar(means.index, means["Customers"], color=colors)
axes[1].set_title("Average Number of Customers per Month")
axes[2].bar(means.index, means["SalesPerCustomer"], color=colors)
axes[2].set_title("Average Sales Per Customer per Month")
plt.show()

There is significant growth in sales and customers in December, most probably due to the holidays that take place during the month (Christmas, Hanukkah etc.), whose gift-giving traditions generate additional sales.

# How do Promos influence Sales?

In [None]:
plt.figure(figsize=(10,10))
plt.title("Sales depending on Promos")
sns.set(style="whitegrid",palette="pastel",color_codes=True)
sns.violinplot(x="DayOfWeek",y="Sales",hue="Promo",split=True, data=new_train)

In [None]:
plt.figure(figsize=(10,10))
plt.title("Customers depending on Promos")
sns.set(style="whitegrid",palette="pastel",color_codes=True)
sns.violinplot(x="DayOfWeek",y="Customers",hue="Promo",split=True, data=new_train)

In [None]:
plt.figure(figsize=(10,10))
plt.title("Sales per Customer depending on Promos")
sns.set(style="whitegrid",palette="pastel",color_codes=True)
sns.violinplot(x="DayOfWeek",y="SalesPerCustomer",hue="Promo",split=True, data=new_train)

Promos seem to influence sales, with significantly different results depending on the day of the week. The largest sales difference between stores with and without promos occurs on Mondays, which is to be expected given that most stores are closed on Sundays.
No promos are run on weekends, which lowers the sales per customer, but the increased number of customers on Sundays balances out the loss in sales.

# Modeling

Our project goal is to predict a time series, so a simple random split into training and validation sets would be the wrong approach. The original competition was about predicting sales several weeks into the future, which is why we set our goal at a 30-day prediction.
For validation, we split the data into a test set, used to assess model performance, and a training set, used to fit the models.
The test set consists of sales information from the last 30 days available in the dataset.
The training set contains sales information from the previous 3 years.
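
A minimal sketch of such a time-based split, using only the standard library and made-up daily records, is shown below. The helper name `split_by_last_days` and the data are hypothetical; the point is that every training date precedes every test date, which a random split would not guarantee.

```python
from datetime import date, timedelta

# Hypothetical daily records (date, sales), invented for illustration
start = date(2015, 1, 1)
records = [(start + timedelta(days=i), 100 + i) for i in range(90)]

def split_by_last_days(records, period):
    """Hold out the rows that fall on the last `period` distinct dates."""
    dates = sorted({d for d, _ in records})
    test_dates = set(dates[-period:])
    train_rows = [r for r in records if r[0] not in test_dates]
    test_rows = [r for r in records if r[0] in test_dates]
    return train_rows, test_rows

train_rows, test_rows = split_by_last_days(records, period=30)
print(len(train_rows), len(test_rows))
```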

Models are verified by comparing their predictions to the data in the test set. The verification uses the root mean square percentage error (RMSPE) and a direct comparison of sales differences to check the accuracy of each model.
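
For reference, RMSPE is the square root of the mean of the squared relative errors, $\sqrt{\frac{1}{n}\sum_i \left(\frac{y_i - \hat{y}_i}{y_i}\right)^2}$. The sketch below is a plain-Python version that mirrors the `rmspe` helper defined at the top of the notebook (entries with zero sales get zero weight); the function name `rmspe_plain` and the numbers are made up for the worked example.

```python
from math import sqrt

def rmspe_plain(y_true, y_pred):
    """Root mean square percentage error; entries with zero true sales get zero weight."""
    terms = [((t - p) / t) ** 2 if t != 0 else 0.0
             for t, p in zip(y_true, y_pred)]
    return sqrt(sum(terms) / len(terms))

# Hypothetical values: relative errors of 10%, 10% and 0%
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
score = rmspe_plain(y_true, y_pred)
print(round(score, 4))
```

Because every error is divided by the true value, a miss of 10 units on a store selling 100 counts as much as a miss of 20 units on a store selling 200.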

##  Model Random Forest

Random forest is an ensemble model, used for regression and classification, that combines multiple decision trees. It builds on bootstrap aggregation, also known as bagging: the algorithm draws many random samples from the dataset with replacement and trains a separate decision tree on each, considering only a random subset of features when splitting. For regression, the predictions of all trees are averaged to produce the final result.
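
The bagging idea can be sketched in a few lines of plain Python. This is a simplified illustration, not scikit-learn's implementation: each "tree" is a one-split stump on a single feature, so the random feature subsetting of a real random forest is left out, and the data and function names are invented.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression tree (a 'stump') on 1-D inputs."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        l_mean, r_mean = mean(left), mean(right)
        error = (sum((y - l_mean) ** 2 for y in left)
                 + sum((y - r_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, l_mean, r_mean)
    if best is None:  # all x values identical: predict the overall mean
        overall = mean(ys)
        return lambda x: overall
    _, threshold, l_mean, r_mean = best
    return lambda x: l_mean if x <= threshold else r_mean

def bagged_predict(xs, ys, x_new, n_trees=25, seed=0):
    """Average the predictions of stumps trained on bootstrap samples."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stump = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(stump(x_new))
    return mean(predictions)  # regression: average over all trees

# Hypothetical data with a jump between x=4 and x=5
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [10, 11, 9, 10, 30, 31, 29, 30]
low, high = bagged_predict(xs, ys, 2), bagged_predict(xs, ys, 7)
print(low, high)
```

Individual stumps differ because each sees a different bootstrap sample; averaging them smooths out that variation.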

In [None]:
# Get dummies for categorical variables
df_forest=pd.get_dummies(data=new_train,columns=['StoreType','StateHoliday','Assortment','PromoInterval'])

In [None]:
#Period for which we will be making prediction, we take last 30 working days which is similar to the goal of the 
#competition
period = 30
to_drop = ['Customers','Sales','Open','Date']

In [None]:
df_forest.sort_values('Date',inplace=True)
dates = df_forest['Date'].unique()

forest_train = df_forest[df_forest['Date'].isin(dates[:-period])]
Y_train = forest_train['Sales']
X_train = forest_train.drop(to_drop,axis=1)
forest_test = df_forest[df_forest['Date'].isin(dates[-period:])]
Y_test = forest_test['Sales']
X_test = forest_test.drop(to_drop,axis=1)


In [None]:
X_train.columns

In [None]:
num_leaves = [4,6,8,10]
num_features = ['sqrt',0.5,0.6,0.7,0.8,1]


In [None]:
# criterion is left at its default (squared error); the name 'mse' was
# removed in newer scikit-learn releases
rfr=RandomForestRegressor(n_estimators=100,
                          oob_score=True,
                          n_jobs=12,
                          verbose=1,
                          random_state=404,max_features=0.8
                         )

rfr.fit(X_train,Y_train)


In [None]:
# Prediction
predict=rfr.predict(X_test)
predict

In [None]:
predict.shape, Y_test.dtype

In [None]:
# Out of Bag score
print('oob score :',rfr.oob_score_)

In [None]:
# Root mean square error
mse=mean_squared_error(Y_test,predict)
print('Root Mean Square Percent Error {}, RMSE = {}'.format(rmspe(predict,Y_test), sqrt(mse)))

In [None]:
# Feature importances according to the model

pd.options.display.float_format='{:.5f}'.format
important_features=pd.DataFrame(rfr.feature_importances_,index=X_train.columns)
important_features.sort_values(by=0,ascending=False)



In [None]:
plt.plot(Y_test-predict)

## Model fast-ai

Fast-ai is a deep learning library developed to simplify the process of training neural networks. For tabular data it provides a fully connected neural network (FCNN), a type of network in which every neuron in a layer is connected to every neuron in the next layer. Fully connected networks are general purpose and make no special assumptions about the input data, which makes them easy to implement but lowers their learning efficiency.
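
To make "fully connected" concrete, the sketch below runs a forward pass through a tiny 3 -> 2 -> 1 network in plain Python. The weights are chosen by hand for illustration and the function names are hypothetical; fastai builds and trains such layers on top of PyTorch instead.

```python
def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: every output unit sees every input."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs

def relu(z):
    """Rectified linear unit: negative values are clipped to zero."""
    return max(0.0, z)

# Hidden layer with 2 units, each connected to all 3 inputs
hidden = dense([1.0, 2.0, 3.0],
               weights=[[0.5, -1.0, 0.5], [1.0, 1.0, -1.0]],
               biases=[0.0, 1.0],
               activation=relu)
# Output layer with 1 unit, connected to both hidden units
output = dense(hidden, weights=[[2.0, -1.0]], biases=[0.5])
print(hidden, output)
```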

In [None]:
df_fast = new_train.copy()
df_fast.head()

In [None]:
df_fast.dtypes

In [None]:
period = 30
to_drop = ['Customers','Open','Date']

In [None]:
change_dtypes = {'Weekofyear': np.int64, 'Sales': np.float64, 'Is_month_end':np.int64, 'Is_month_start':np.int64,
                'Is_quarter_end':np.int64, 'Is_quarter_start':np.int64, 'Is_year_end':np.int64, 'Is_year_start':np.int64}
df_fast = df_fast.astype(change_dtypes)

In [None]:
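# fastai's TabularPandas selects the split rows by position, so the index is
# reset after sorting to make index labels and row positions coincide
df_fast = df_fast.sort_values('Date').reset_index(drop=True)
dates = df_fast['Date'].unique()
# Build the masks from df_fast itself rather than from df_forest
index_train = df_fast[df_fast['Date'].isin(dates[:-period])].index
index_test = df_fast[df_fast['Date'].isin(dates[-period:])].index
df_fast.drop(to_drop,axis=1,inplace=True)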

In [None]:
dep_var = 'Sales'
cont_nn,cat_nn = cont_cat_split(df_fast, max_card=1000, dep_var=dep_var)

In [None]:
cont_nn, cont_nn.pop(0)

In [None]:
for col in cont_nn:
    change_dtypes = {col: np.float64}
    df_fast = df_fast.astype(change_dtypes)

In [None]:
df_fast['Sales'] = np.log(df_fast['Sales'])

In [None]:
cat_nn.append('Store')

In [None]:
splits = (list(index_train), list(index_test))

In [None]:
procs_nn = [Categorify, Normalize]
to_nn = TabularPandas(df_fast,procs_nn, cat_nn, cont_nn, splits=splits, y_names=dep_var)

In [None]:
dls = to_nn.dataloaders(1024)

In [None]:
y = to_nn.train.y
y.min(),y.max()

In [None]:
y.min()

In [None]:
learn = tabular_learner(dls, y_range=(y.min(),y.max()), layers=[500,250,100],
                        n_out=1, loss_func=F.mse_loss)

In [None]:
learn.lr_find()

In [None]:
learn.fit_one_cycle(5, 1e-2)

Root mean square percentage error
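
Because the network is trained on `log(Sales)`, its predictions live in log space and have to be passed through `exp` before RMSPE is computed. The toy example below, with invented numbers, shows the round trip and also why the log target suits this metric: a constant offset in log space corresponds to a constant percentage error on the original scale.

```python
from math import exp, log, sqrt

sales = [4000.0, 5500.0, 7200.0]             # hypothetical true sales
log_targets = [log(s) for s in sales]        # what the network is trained on
log_preds = [t + 0.05 for t in log_targets]  # pretend each prediction is 0.05 too high

preds = [exp(p) for p in log_preds]          # back to the original sales scale
pct_errors = [(s - p) / s for s, p in zip(sales, preds)]
rmspe_value = sqrt(sum(e ** 2 for e in pct_errors) / len(pct_errors))
print(round(rmspe_value, 4))
```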

In [None]:
preds,targs = learn.get_preds()
rmspe(np.exp(np.array(preds)),np.exp(np.array(targs)))

In [None]:
preds

In [None]:
targs

In [None]:
plt.plot(np.exp(np.array(preds))-np.exp(np.array(targs)))

## FBProphet

FBProphet is an open source library developed by Facebook for time series forecasting. It implements decomposable models, taking into account not only the trend but also seasonal and holiday changes.

$$y(t) = g(t) + s(t) + h(t) + \varepsilon_t$$

Where:

$g(t)$ is a piecewise growth curve, linear or logistic, for modelling non-periodic changes in the time series

$s(t)$ is a function responsible for modelling seasonal changes

$h(t)$ is a function responsible for modelling holidays or irregular events

$\varepsilon_t$ is the error term accounting for changes not included in the previous functions
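
The additive structure can be illustrated with hand-written components. The sketch below is not how Prophet fits its model; the slope, amplitude and holiday shift are made-up numbers and the function names are hypothetical. It only shows how trend, seasonality and holiday effect add up to a forecast.

```python
import math

def trend(t):
    """g(t): a linear trend with a hypothetical intercept and slope."""
    return 5000.0 + 2.0 * t

def weekly_seasonality(t):
    """s(t): a weekly cycle with a period of 7 days."""
    return 300.0 * math.sin(2 * math.pi * t / 7)

def holiday_effect(t, holidays):
    """h(t): a fixed shift applied on holiday days."""
    return -1500.0 if t in holidays else 0.0

def forecast(t, holidays=frozenset()):
    """Additive model g(t) + s(t) + h(t); the noise term is left out."""
    return trend(t) + weekly_seasonality(t) + holiday_effect(t, holidays)

# Day 0 without a holiday, and day 7 marked as a holiday
print(forecast(0), forecast(7, holidays={7}))
```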

In [None]:
df = pd.read_csv("../input/rossmann-store-sales/train.csv", parse_dates=['Date'], low_memory=False)

In [None]:
df.rename(columns = {'Date': 'ds', "Sales": 'y'}, inplace=True)
df_prophet = df[['Store','ds','y','StateHoliday','SchoolHoliday']]


Prophet can use information about holidays in its predictions.

In [None]:
# Rows with any state holiday (a, b or c); the original mix of | and & selected only 'a'
state = df_prophet[df_prophet.StateHoliday.isin(['a', 'b', 'c'])].loc[:,['Store','ds']]
state['holiday'] = 'state_holiday'
school = df_prophet[df_prophet.SchoolHoliday == 1].loc[:, ['Store','ds']]
school['holiday'] = 'school_holiday'

#state = pd.DataFrame({'holiday': 'state_holiday', 'ds': state_dates})
#school = pd.DataFrame({'holiday': 'school_holiday', 'ds': school_dates})

holidays_all = pd.concat((state, school))      
holidays_all.head()


In [None]:
def get_prediction_store(store_id, df_all, periods, holidays_all):
    # Select the holidays and sales history of the requested store
    holiday = holidays_all[holidays_all['Store'] == store_id][['ds','holiday']]
    df = df_all[df_all['Store'] == store_id][['ds','y']]
    df = df.sort_values('ds')
    df_cut = df[:-periods]  # hold out the last `periods` rows for evaluation
    
    model = fbp.Prophet(holidays = holiday)
    model.fit(df_cut)
    
    future_df = model.make_future_dataframe(periods=periods)
    predictions = model.predict(future_df)
    
    # Return the forecast for the held-out horizon and the actual values
    return predictions[-periods:], df['y'][-periods:]
    


In [None]:
tmp = holidays_all[holidays_all['Store'] == 1][['ds','holiday']]

In [None]:
df_prophet['Store'].unique()

In [None]:
pred, ys = get_prediction_store(1, df_prophet, 30, holidays_all)

In [None]:
pred[pred['ds'].isin(df_prophet.loc[ys.index]['ds'])]['yhat']

In [None]:
pred[pred['ds'].isin(df_prophet.loc[ys.index]['ds'])][['ds','yhat']],df_prophet.loc[ys.index][['ds','y']]

In [None]:
y_hats = pd.DataFrame(pred[pred['ds'].isin(df_prophet.loc[ys.index]['ds'])]['yhat'])
y = pd.DataFrame(df_prophet.loc[ys.index]['y'])

In [None]:
y_hats[y_hats['yhat'] < 0] = 0

In [None]:
sqrt(mean_squared_error(y_hats,y)),rmspe(np.array(y_hats),np.array(y))

In [None]:
# Evaluate Prophet on every store and pool the results
yhat_parts, y_parts = [], []
for store_id in df_prophet['Store'].unique():
    pred, ys = get_prediction_store(store_id, df_prophet, 30, holidays_all)
    y_hats = pd.DataFrame(pred[pred['ds'].isin(df_prophet.loc[ys.index]['ds'])]['yhat'])
    ys = pd.DataFrame(df_prophet.loc[ys.index]['y'])
    y_hats[y_hats['yhat'] < 0] = 0  # sales cannot be negative
    yhat_parts.append(y_hats)
    y_parts.append(ys)

# pd.concat returns a new object, so the result has to be assigned
yhats_glob = pd.concat(yhat_parts)
ys_glob = pd.concat(y_parts)

sqrt(mean_squared_error(yhats_glob,ys_glob)),rmspe(np.array(yhats_glob),np.array(ys_glob))

# Conclusions

Out of the three methods, random forest proved to be the most accurate, achieving a root mean square percentage error (RMSPE) of 13%. While it has the lowest error of all methods, it requires more work than the two other approaches.

FBProphet provided the worst results, with an error of approximately 16%, while also requiring the longest training time. Despite these drawbacks, it proved to be a good baseline approach, as it requires little to no data preparation or feature engineering, which makes it appropriate for simple, straightforward prediction cases.

Fast-ai provided worse results than expected, with an error of 15%. The accuracy could be improved further if more time were invested in preparation. It is also the fastest approach to train and is thus well suited to predicting from large data sets. We hypothesize that a different neural network architecture, for example a recurrent one, might provide better results.

The results of all predictions may be skewed by data preprocessing, as the training set contains a large portion (about 16%) of incomplete entries that had to be filled with the most fitting values.

# Bibliography

Random Forest
https://builtin.com/data-science/random-forest-algorithm
https://machinelearningmastery.com/bagging-and-random-forest-ensemble-algorithms-for-machine-learning/
https://towardsdatascience.com/understanding-random-forest-58381e0602d2

Fast-ai and FCNN
https://docs.fast.ai/
https://medium.com/swlh/fully-connected-vs-convolutional-neural-networks-813ca7bc6ee5

FBProphet
https://www.analyticsvidhya.com/blog/2018/05/generate-accurate-forecasts-facebook-prophet-python-r/