<p style='text-align:center;
          font-family: sans-serif;
          font-weight:bold;
          color:black;
          font-size:40px;
          margin: 30px;'> Walmart Sales 
    <font color='#128EB3' size='7'>Forecasting</font></p>
<p style="text-align:center;
          font-family: sans-serif;
          font-weight:bold;
          color:#616161;
          font-size:20px;
          margin: 30px;">Prophet model 📈</p>

<h3 style="text-align: left; background-color: #128EB3; font-family:Trebuchet MS; color: white; padding: 14px; line-height: 1; border-radius:10px"> Notebook goals🔎</h3>

The goal of this first version of the project is to:

* Predict store-level sales with time series models, using forecasting libraries available in Python

**Note: The project is split into several notebooks, one for each model used for sales forecasting. The links are listed below:**


* [Exploratory Data Analysis](https://www.kaggle.com/code/armandodelahoya/walmart-sales-forecasting-eda/notebook)
* [Prophet model](https://www.kaggle.com/code/armandodelahoya/walmart-sales-forecasting-prophet-model/notebook)
* [Neural prophet model](https://www.kaggle.com/code/armandodelahoya/walmart-sales-forecasting-neural-model/notebook)
* [PyCaret library](https://www.kaggle.com/code/armandodelahoya/walmart-sales-forecasting-pycaret-library/notebook)
* [Power BI Dashboard](https://app.powerbi.com/view?r=eyJrIjoiNWFjYWU0ZTktNjFiYS00Zjg3LTg0M2MtOTMzM2M0MjQ5OThmIiwidCI6IjAyNDlhNTcxLWI5YTItNGNhMi1iOTNiLTIwYzc3MDg4ZjA4YiJ9)
* [Github repository](https://github.com/ArmandoLazalde/Walmart-Sales-Forecasting)

<a id="top"></a>
<div class="list-group" id="list-tab" role="tablist">
    <h3 style="text-align: left; background-color: #128EB3; font-family:Trebuchet MS; color: white; padding: 14px; line-height: 1; border-radius:10px"><b>Table of Contents</b></h3>
    
* [1. Import necessary libraries](#1)
* [2. Best store analysis](#2)
* [3. All stores analysis](#3)
    
    

<a id="1"></a>
<h3 style="text-align: left;
           background-color: #128EB3;
           font-family:Trebuchet MS; 
           color: white; 
           padding: 14px; 
           line-height: 1; 
           border-radius:10px"> 1. Import necessary libraries
</h3>

In [None]:
#Facebook Prophet fails to install without the pystan module, and pystan v3.10 does not work (bug), so install an earlier version first
!pip install pystan~=2.19 
!pip install fbprophet

In [None]:
import pandas as pd
from datetime import datetime
from zipfile import ZipFile

#matplotlib configuration
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.style as style
%matplotlib inline

plt.style.use('fivethirtyeight')
plt.rcParams['lines.linewidth'] = 1.5
plt.rcParams['font.family'] = 'Trebuchet MS'

matplotlib.rcParams['font.size'] = 15
matplotlib.rcParams['figure.figsize'] = (8, 6)
matplotlib.rcParams['figure.facecolor'] = '#00000000'

# Prophet
#Import everything from fbprophet, the package installed above
from fbprophet import Prophet
from fbprophet.plot import plot_plotly, plot_components_plotly
from fbprophet.plot import add_changepoints_to_plot

#plotly configuration
import plotly.offline as py

#Metrics
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error, mean_squared_error, r2_score

In [None]:
with ZipFile('/kaggle/input/walmart-recruiting-store-sales-forecasting/features.csv.zip') as f:
    f.extractall(path='walmart-recruiting-store-sales-forecasting')
    
with ZipFile('/kaggle/input/walmart-recruiting-store-sales-forecasting/sampleSubmission.csv.zip') as f:
    f.extractall(path='walmart-recruiting-store-sales-forecasting') 
    
with ZipFile('/kaggle/input/walmart-recruiting-store-sales-forecasting/test.csv.zip') as f:
    f.extractall(path='walmart-recruiting-store-sales-forecasting') 
    
with ZipFile('/kaggle/input/walmart-recruiting-store-sales-forecasting/train.csv.zip') as f:
    f.extractall(path='walmart-recruiting-store-sales-forecasting') 

In [None]:
train_df = pd.read_csv('/kaggle/working/walmart-recruiting-store-sales-forecasting/train.csv')
features_df = pd.read_csv('/kaggle/working/walmart-recruiting-store-sales-forecasting/features.csv')
stores_df = pd.read_csv('/kaggle/input/walmart-recruiting-store-sales-forecasting/stores.csv')
test_df = pd.read_csv('/kaggle/working/walmart-recruiting-store-sales-forecasting/test.csv')

In [None]:
data = train_df.merge(features_df, on=['Store', 'Date', 'IsHoliday'], how = 'inner')
final_df = data.merge(stores_df, on=['Store'], how='inner')
final_df.head(5)
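
An inner merge silently drops rows whose keys have no match, so it can be worth checking that no sales rows were lost. The sketch below uses small made-up frames (`toy_train`, `toy_features`, and the `Temperature` values are illustrative, not taken from the dataset) and pandas' `validate` argument to assert that each sales row matches at most one feature row.

```python
import pandas as pd

# Made-up sales rows: several departments per store and date
toy_train = pd.DataFrame({
    "Store": [1, 1, 2],
    "Dept": [1, 2, 1],
    "Date": ["2010-02-05"] * 3,
    "Weekly_Sales": [100.0, 50.0, 80.0],
    "IsHoliday": [False, False, False],
})

# Made-up feature rows: one row per store and date
toy_features = pd.DataFrame({
    "Store": [1, 2],
    "Date": ["2010-02-05"] * 2,
    "Temperature": [42.3, 40.2],
    "IsHoliday": [False, False],
})

# validate="many_to_one" raises MergeError if the keys are duplicated on the right side
merged = toy_train.merge(
    toy_features,
    on=["Store", "Date", "IsHoliday"],
    how="inner",
    validate="many_to_one",
)

# An inner merge that keeps every left row has the same length as the left frame
print(len(merged) == len(toy_train))
```

The same length comparison could be run on `data` and `train_df` in this notebook, assuming every sales row is expected to have a matching feature row.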

<a id="2"></a>
<h3 style="text-align: left;
           background-color: #128EB3;
           font-family:Trebuchet MS; 
           color: white; 
           padding: 14px; 
           line-height: 1; 
           border-radius:10px"> 2. Best store analysis
</h3>

In [None]:
#Store 20 analysis
store20 = final_df[final_df['Store'] == 20]
store20 = store20[['Date','Weekly_Sales']]
store20 = store20.groupby('Date').sum()
store20.reset_index(inplace = True)

store20['Date'] = pd.to_datetime(store20['Date']) #converting 'date' column to a datetime type
store20.columns=['ds','y'] #Rename columns
store20

In [None]:
#Split y_train and y_test
y_train = store20.iloc[:120]
y_test = store20.iloc[120:]
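
The split above fixes the cutoff at row 120. As a minimal sketch, a hypothetical helper `chronological_split` (not part of the original notebook) could instead hold out the last `n_test` rows, so the cutoff follows the length of the series. The toy series below is made up for illustration and uses Prophet's `ds`/`y` column names.

```python
import pandas as pd


def chronological_split(df, n_test):
    """Hypothetical helper: hold out the last n_test rows of a time-ordered frame."""
    if not 0 < n_test < len(df):
        raise ValueError("n_test must be between 1 and len(df) - 1")
    # Sort by date first so the hold-out is always the most recent period
    df = df.sort_values("ds").reset_index(drop=True)
    return df.iloc[:-n_test], df.iloc[-n_test:]


# Toy weekly series (made-up values), one row per Friday
toy = pd.DataFrame({
    "ds": pd.date_range("2010-02-05", periods=10, freq="W-FRI"),
    "y": list(range(10)),
})

toy_train, toy_test = chronological_split(toy, n_test=3)
print(len(toy_train), len(toy_test))
```

Because the rows are never shuffled, every training date comes before every test date, which is what a forecasting evaluation needs.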

In [None]:
#Model training
model = Prophet() #seasonality_mode='additive' by default
model.fit(y_train)

In [None]:
#Forecasting
future = model.make_future_dataframe(periods=52, freq='W-Fri')
forecast = model.predict(future)
fig=plot_plotly(model, forecast)
py.iplot(fig)

In [None]:
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper', "trend"]]

In [None]:
#Components
plot_components_plotly(model, forecast)

In [None]:
#Plot train, test and forecast comparison

plt.figure(figsize=(22,8))

plt.plot(y_train['ds'], y_train['y'], 'bx:')

#Leave the color out of the format string so it does not conflict with the color keyword
plt.plot(y_test['ds'], y_test['y'], 'x:', color='green')
plt.plot(forecast["ds"][120:], forecast["yhat"][120:], "x-", color="red")

plt.legend(['Time Series Train ','Time Series Test','Forecast'])
plt.title("Forecast with Prophet")
plt.ylabel("Sales")
plt.xlabel("Date")
plt.show()

In [None]:
y_pred = forecast['yhat'][120:].head(len(y_test)).values

In [None]:
mape = mean_absolute_percentage_error(y_test['y'].values, y_pred).round(4)
mae = mean_absolute_error(y_test['y'].values, y_pred).round(4)
mse = mean_squared_error(y_test['y'].values, y_pred).round(4)
r2 = r2_score(y_test['y'].values, y_pred).round(4)
print(f"MAPE: {mape * 100:.2f}%") #sklearn returns MAPE as a fraction
print(f"MAE: {mae}")
print(f"MSE: {mse}")
print(f"R2: {r2}")
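
For reference, the two metrics reused in the all-stores analysis can be written out by hand. This is a sketch of the standard definitions, assuming no actual value is zero (scikit-learn guards against that case, this sketch does not). The function names `mae_manual` and `mape_manual` and the sample numbers are illustrative only.

```python
def mae_manual(y_true, y_pred):
    """Mean absolute error: average size of the errors, in sales units."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape_manual(y_true, y_pred):
    """Mean absolute percentage error as a fraction (multiply by 100 for %).

    Assumes no value in y_true is zero.
    """
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up actuals and predictions
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(f"MAE: {mae_manual(actual, predicted)}")
print(f"MAPE: {mape_manual(actual, predicted) * 100:.2f}%")
```

MAE depends on the scale of a store's sales, while MAPE is relative, which makes MAPE the easier of the two to compare across stores of different sizes.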

<a id="3"></a>
<h3 style="text-align: left; background-color: #128EB3; font-family:Trebuchet MS; color: white; padding: 14px; line-height: 1; border-radius:6px"> 3. All stores analysis</h3>

In [None]:
results = {}

for store in final_df['Store'].unique():
    
    #Analysis per store
    store_x = final_df[final_df['Store'] == store]
    store_x = store_x[['Date','Weekly_Sales']]
    store_x = store_x.groupby('Date').sum()
    store_x.reset_index(inplace = True)
    store_x['Date'] = pd.to_datetime(store_x['Date'])
    
    store_x.columns=['ds','y'] #Rename columns to the names Prophet expects
    
    #Time series split to train the model
    y_train = store_x.iloc[:120]
    y_test = store_x.iloc[120:]
    
    #Model training
    model = Prophet() 
    model.fit(y_train)
    
    #Forecast exactly as many weeks as the test set holds
    future = model.make_future_dataframe(periods=len(y_test), freq='W-Fri')
    forecast = model.predict(future)

    y_pred = forecast['yhat'].tail(len(y_test)).values
    
    mape = (mean_absolute_percentage_error(y_test['y'].values, y_pred) * 100).round(2) #in percent
    mae = mean_absolute_error(y_test['y'].values, y_pred).round(4)
    
    results[store] = [mape, mae]

In [None]:
#Results
results = pd.DataFrame(results)
results = results.transpose()
results.reset_index(inplace=True)
results.columns = ['Store', 'MAPE', 'MAE']
results
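
One way to read a table like this is to rank the stores by MAPE and look at both ends. The sketch below uses a made-up `toy_results` frame with the same three columns; the numbers are illustrative and are not results from this notebook.

```python
import pandas as pd

# Made-up per-store metrics with the same columns as the results table
toy_results = pd.DataFrame({
    "Store": [1, 2, 3, 4],
    "MAPE": [4.2, 9.8, 3.1, 6.5],
    "MAE": [61000.0, 152000.0, 40500.0, 98000.0],
})

# Lowest MAPE first: the stores the model forecasts best
ranked = toy_results.sort_values("MAPE").reset_index(drop=True)

best = ranked.head(2)   # easiest stores to forecast
worst = ranked.tail(2)  # stores that may need a different model or extra regressors

print(ranked)
print(f"Median MAPE: {ranked['MAPE'].median():.2f}%")
```

Applied to `results`, the same ranking would show which stores the default Prophet model handles well and which deserve a closer look.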

In [None]:
results.to_csv("prophet_results.csv", index=False)