# 🤖 3 Training the forecasting model

### 🎯 Objective
The objective of this notebook is to hold the code needed to train a forecasting model, using the tools you find most suitable for the exercise.
The notebook guides you through each step, but feel free to add anything you think will improve the trained model and its accuracy.

### 🔙 Prerequisites
1. Make sure you have built and transformed the dataset in the previous steps. The notebooks that guide you through this process are in this same repository.
2. Install any dependencies you need, using the dependency manager of your choice.

### 🏁 When you finish
Make sure to document what you developed to solve this problem in markdown cells. Also explain how to use your code, what needs to be installed to run it, and so on.

## 1. 🔍 Data Loading

Use this part of the notebook to load the dataset you built and, if desired, split it into train, validation and test partitions.

### 1.1 💾 Load the dataset.csv file you built in previous steps

In [1]:
import pandas as pd

In [5]:
sales_df = pd.read_csv('../data/Sales_ipc_calendar_dataset.csv')
sales_df.head()

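| Unnamed: 0 | DATE | SALES | CUMULATIVE_SALES_ALL | HOLIDAY | Year | Month | DayOfWeek | YearMonth | IPC_VALUE | A | FA | FM | C | CR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015-01-01 | 1103.868 | 1103.868 | False | 2015 | 1 | 3 | 2015-01 | -0.4 | 0 | 0 | 0 | 1521 | 0 |
| 1 | 2015-01-02 | 13170740.0 | 13171840.0 | False | 2015 | 1 | 4 | 2015-01 | -0.4 | 1508 | 0 | 0 | 13 | 0 |
| 2 | 2015-01-03 | 16166560.0 | 29338400.0 | False | 2015 | 1 | 5 | 2015-01 | -0.4 | 1521 | 0 | 0 | 0 | 0 |
| 3 | 2015-01-04 | 126725.7 | 29465120.0 | False | 2015 | 1 | 6 | 2015-01 | -0.4 | 0 | 10 | 0 | 1511 | 0 |
| 4 | 2015-01-05 | 17914980.0 | 47380100.0 | False | 2015 | 1 | 0 | 2015-01 | -0.4 | 1520 | 0 | 0 | 1 | 0 |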


### 1.2 ✂️ Split your data into partitions
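
For time-ordered data such as daily sales, one common option is to split chronologically rather than at random, so that the model is always evaluated on dates later than those it was trained on. The sketch below is only one possible approach: the helper `chronological_split`, the 70/15/15 proportions and the synthetic `demo_df` are assumptions made for illustration, and only the `DATE` and `SALES` column names come from the dataset shown above.

```python
import numpy as np
import pandas as pd


def chronological_split(df, date_col="DATE", val_frac=0.15, test_frac=0.15):
    """Split a frame into train/validation/test partitions without shuffling."""
    ordered = df.sort_values(date_col).reset_index(drop=True)
    n_rows = len(ordered)
    n_test = int(n_rows * test_frac)
    n_val = int(n_rows * val_frac)
    n_train = n_rows - n_val - n_test
    train = ordered.iloc[:n_train]
    val = ordered.iloc[n_train:n_train + n_val]
    test = ordered.iloc[n_train + n_val:]
    return train, val, test


# Synthetic stand-in for the real dataset (assumption, for illustration only).
rng = np.random.default_rng(0)
demo_df = pd.DataFrame({
    "DATE": pd.date_range("2015-01-01", periods=100, freq="D"),
    "SALES": rng.normal(1e6, 1e5, size=100),
})

train_df, val_df, test_df = chronological_split(demo_df)
print(len(train_df), len(val_df), len(test_df))
```

With the real data you would pass `sales_df` instead of `demo_df`, after parsing `DATE` with `pd.to_datetime`.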

## 2. 🎯 Feature Selection

Here, you may find it useful to select the features you will use for training and the target to forecast. Try to explain why you selected these features and discarded others.

Remember that the sales column you can find in `data/sales_train_dataset.csv` should be the target variable for your model.
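
As a starting point, the selection can be written down explicitly as a list of column names. The sketch below uses the first three rows of the `head()` output shown earlier; the particular choice of `FEATURES` is an assumption, not a recommendation. One thing worth checking in your own data: in the rows shown, `CUMULATIVE_SALES_ALL` already includes the current day's `SALES`, so using it as a feature would probably leak the target.

```python
import pandas as pd

TARGET = "SALES"
# Assumed feature choice for illustration; adjust it to your own analysis.
FEATURES = ["HOLIDAY", "Year", "Month", "DayOfWeek", "IPC_VALUE"]

# First three rows of the head() output shown earlier in this notebook.
demo_df = pd.DataFrame({
    "DATE": ["2015-01-01", "2015-01-02", "2015-01-03"],
    "SALES": [1103.868, 13170740.0, 16166560.0],
    "CUMULATIVE_SALES_ALL": [1103.868, 13171840.0, 29338400.0],
    "HOLIDAY": [False, False, False],
    "Year": [2015, 2015, 2015],
    "Month": [1, 1, 1],
    "DayOfWeek": [3, 4, 5],
    "IPC_VALUE": [-0.4, -0.4, -0.4],
})

# Keep only the selected columns and encode the boolean flag as 0/1.
X = demo_df[FEATURES].astype({"HOLIDAY": int})
y = demo_df[TARGET]
print(X.dtypes)
```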

## 3. 🤖 Modeling

In this section you will declare everything you need to train your model and fit it to your dataset.
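
The exercise leaves the choice of model open. As one possible baseline, the sketch below fits a scikit-learn `RandomForestRegressor` on calendar features. The synthetic data, the feature list and the hyperparameters are all assumptions made so that the example runs on its own; they are not taken from the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic daily data standing in for the real dataset (assumption).
rng = np.random.default_rng(42)
dates = pd.date_range("2015-01-01", periods=400, freq="D")
demo_df = pd.DataFrame({
    "Month": dates.month,
    "DayOfWeek": dates.dayofweek,
    "HOLIDAY": (rng.random(400) < 0.05).astype(int),
})
# Made-up sales: higher on weekdays, lower on holidays, plus noise.
demo_df["SALES"] = (
    1e6
    + 2e5 * (demo_df["DayOfWeek"] < 5)
    - 5e5 * demo_df["HOLIDAY"]
    + rng.normal(0, 5e4, size=400)
)

features = ["Month", "DayOfWeek", "HOLIDAY"]
# Chronological split: first 320 days for training, last 80 for testing.
train_df, test_df = demo_df.iloc[:320], demo_df.iloc[320:]

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(train_df[features], train_df["SALES"])
test_pred = model.predict(test_df[features])
print(test_pred[:5])
```

A tree ensemble is used here only because it needs little preprocessing; a classical time-series model or gradient boosting would fit the exercise equally well.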

## 4. ✅ Validation

Decide what metric to use to validate the performance of your model and explain why you chose it.
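
To make the comparison between candidate metrics concrete, the sketch below implements three common ones with NumPy: MAE, RMSE and WAPE (sum of absolute errors divided by the sum of absolute actuals). The function names and toy arrays are illustrative. A point to consider when choosing: in the rows shown earlier, `SALES` ranges from about 1e3 to over 1e7, so a per-row percentage error could be dominated by the low-sales days, whereas WAPE weights errors by volume.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target."""
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean squared error; penalises large misses more than MAE."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def wape(y_true, y_pred):
    """Weighted absolute percentage error: total |error| / total |actual|."""
    return float(np.sum(np.abs(y_true - y_pred)) / np.sum(np.abs(y_true)))


# Toy values for illustration only.
y_true = np.array([100.0, 200.0, 300.0])
y_pred = np.array([110.0, 190.0, 330.0])

print(mae(y_true, y_pred), rmse(y_true, y_pred), wape(y_true, y_pred))
```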

## 5. 📊 Presentation of Results

If you find it interesting, you can use plots to explain your results, your metrics and how your model's performance degrades over time.

Also, it may be interesting to write down future work that you think may improve the obtained result.
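
One simple way to look at degradation over time is to aggregate the absolute error per calendar month of the test period. The sketch below does this with pandas; the `results` frame, its column names and the synthetic predictions (whose noise is made to grow with the horizon) are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic test-period results (assumption): actuals plus noisy predictions.
rng = np.random.default_rng(1)
dates = pd.date_range("2016-01-01", periods=90, freq="D")
results = pd.DataFrame({
    "DATE": dates,
    "y_true": rng.normal(1e6, 1e5, size=90),
})
# Noise scale grows linearly over the horizon to mimic a degrading model.
noise_scale = np.linspace(1e4, 1e5, 90)
results["y_pred"] = results["y_true"] + rng.normal(0, 1, size=90) * noise_scale

# Mean absolute error per calendar month.
results["abs_error"] = (results["y_true"] - results["y_pred"]).abs()
monthly_mae = results.groupby(results["DATE"].dt.to_period("M"))["abs_error"].mean()
print(monthly_mae)
```

If matplotlib is installed, `monthly_mae.plot(kind="bar")` would turn this series into a chart.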

### 5.1 👨🏻‍🏫 Explainability with SHAP

You have explained the results of your trained model using plots and your own words. Now let's take the next step. We'll be using a library called SHAP, which can help you gain insight into how your model learned from the data and explain its predictions.

If you haven't heard of SHAP, you can find its website here:

https://shap.readthedocs.io/en/latest/

Don't worry, you will be guided through these exercises.

#### 5.1.1 🐝 Build a Beeswarm plot

Build a Beeswarm plot to help you explain how each feature affects the output of the model.

#### 5.1.2 💪🏻 Build a Force plot

Build a Force plot to help you explain how each feature affects the output of the model for an individual prediction.

#### 5.1.3 🤷🏻‍♂️ Optional

Explore the SHAP documentation and add any plot you may find useful to explain how the model decides its outputs.

## 6. 🗃️ Model export

If you have reached this point... Congratulations!! 🥳 You have trained a working model that can now be served through an API to make predictions.

If you feel like developing a simple REST API to serve your amazing model, continue with the next step, which you will find in the `app/main.py` file located in this repository.

But first, make sure to export your model to a file you can use to load it in your endpoint.
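
For scikit-learn models, one common way to export is `joblib.dump`, with `joblib.load` on the serving side. The sketch below assumes a scikit-learn estimator; the file name `model.joblib` and the temporary export directory are hypothetical choices, and the tiny synthetic model only stands in for the one trained above.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Tiny synthetic model standing in for your trained model (assumption).
rng = np.random.default_rng(0)
X = rng.random((50, 3))
y = X @ np.array([1.0, 2.0, 3.0])
model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)

# Hypothetical export location; in the repository you would pick a path
# that the API code can read.
export_dir = Path(tempfile.mkdtemp())
model_path = export_dir / "model.joblib"
joblib.dump(model, model_path)

# Loading it back, as the endpoint would do at start-up.
restored = joblib.load(model_path)
print(np.allclose(model.predict(X), restored.predict(X)))
```

Joblib files are pickle-based, so only load files you trust, and load them with the same scikit-learn version that was used for training.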