# Panel Forecasting Using Decision Trees/Decision Tree Ensembles

Given a dataset in panel format, e.g.:

| Time  | Store Id | Sales |
| :---: | :---:    | :---: |
| Jan | Store1   | 283   |
| ... | Store1   | ...   |
| Dec | Store1   | 200   |
| Jan | Store2   | 11   |
| ... | Store2   | ...   |
| Dec | Store2   | 31   |

We need a `make_panel_reduction` function, with support for exogenous variables, that transforms the data into the following format.
The IDs identifying the individual panels are passed as a categorical exogenous variable.

| Lag2  | Lag1 | Store Id | Target |
| :---: | :---:| :---:    | :---:  |
| 283   | 205  | Store1   | Mar Sales S1   |
| ...   | ...  | Store1   | Dec Sales S1  |
| 11    | 13   | Store2   | Mar Sales S2  |
| ...   | ...  | Store2   | Dec Sales S2  |
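
The reduction described above can be sketched in pandas roughly as follows. This is only an illustration of the idea, not an existing library function: the signature of `make_panel_reduction`, its parameter names (`id_col`, `target_col`, `n_lags`), and every sales figure other than the Jan/Feb values from the tables are assumptions made for the example.

```python
import pandas as pd


def make_panel_reduction(df, id_col, target_col, n_lags=2):
    """Hypothetical helper: long-format panel -> lagged tabular dataset.

    Assumes rows are already sorted by time within each panel.
    """
    out = df[[id_col]].copy()
    grouped = df.groupby(id_col, sort=False)[target_col]
    for lag in range(n_lags, 0, -1):
        # Shift within each panel so lags never leak across stores.
        out[f"Lag{lag}"] = grouped.shift(lag)
    out["Target"] = df[target_col]
    # The first n_lags rows of every panel have no complete lag window.
    out = out.dropna().reset_index(drop=True)
    lag_cols = [f"Lag{lag}" for lag in range(n_lags, 0, -1)]
    return out[lag_cols + [id_col, "Target"]]


# Jan/Feb values follow the tables above; Mar/Apr values are made up.
panel = pd.DataFrame({
    "Time": ["Jan", "Feb", "Mar", "Apr"] * 2,
    "Store Id": ["Store1"] * 4 + ["Store2"] * 4,
    "Sales": [283, 205, 190, 240, 11, 13, 9, 15],
})

reduced = make_panel_reduction(panel, id_col="Store Id", target_col="Sales")
print(reduced)
```

Grouping before shifting is the important design choice here: a plain `shift` on the whole column would use the last months of Store1 as lags for the first months of Store2.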

We can then train a **single model** on the reduced dataset above and use it as a panel forecaster. The decision tree/GBM/RF is expected to split on the `Panel Id` early on, ideally at the first split, whenever that split gives the largest reduction in variance. After that split, loosely speaking, the problem becomes almost univariate inside each branch of the tree. If this holds, a feature importance plot for the tree should show the Panel ID column as the most important feature.
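
A minimal way to check this on toy data is sketched below. It assumes scikit-learn's `DecisionTreeRegressor` as the single model, integer codes as the encoding of the panel ID, and a made-up two-store panel with very different sales levels; none of these choices come from the proposal itself.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Made-up panel: two stores with different sales levels, 24 months each.
frames = []
for store, level in [("Store1", 250.0), ("Store2", 12.0)]:
    sales = level + rng.normal(scale=0.1 * level, size=24)
    frame = pd.DataFrame({"Store Id": store, "Sales": sales})
    frame["Lag2"] = frame["Sales"].shift(2)
    frame["Lag1"] = frame["Sales"].shift(1)
    frames.append(frame.dropna())
reduced = pd.concat(frames, ignore_index=True)

# scikit-learn trees need numeric input, so encode the panel ID as integer codes.
X = reduced[["Lag2", "Lag1"]].copy()
X["Store Id"] = reduced["Store Id"].astype("category").cat.codes
y = reduced["Sales"]

# One model for all panels.
model = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
```

In this toy setup the lag columns also encode each store's sales level, so a lag feature can separate the stores just as well as `Store Id` does. Which column ends up on top of the importance ranking is therefore something to verify on real data rather than assume.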



                            Split on "Store Id"
                                    |
            Store Id == "Store1"    |    Store Id == "Store2"
             _______________________|_______________________
            |                                               |
            |                                               |
       Store1 rows                                     Store2 rows

Rows routed to the `Store1` branch:

| Lag2  | Lag1 | Store Id | Target |
| :---: | :---:| :---:    | :---:  |
| 283   | 205  | Store1   | Mar Sales S1   |
| ...   | ...  | Store1   | Dec Sales S1  |

Rows routed to the `Store2` branch:

| Lag2  | Lag1 | Store Id | Target |
| :---: | :---:| :---:    | :---:  |
| 11    | 13   | Store2   | Mar Sales S2  |
| ...   | ...  | Store2   | Dec Sales S2  |

**Advantage**

In a practical setting such as retail, where there may be thousands of SKU time series to forecast, a single model is sufficient instead of thousands of separate univariate models.

**Caveats**

* Boosting models, like tree-based models in general, do not capture **trend** well, because a tree cannot predict values outside the range of targets seen in training. This can be mitigated by adding differences between lagged values as exogenous features, or by fitting the model on a differenced series, as ARIMA does.

  *Note*: Seasonality can be captured by using date-time indicators such as month or day of week as categorical exogenous variables.

* No sound research paper on this approach has been found yet.
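
The trend caveat can be illustrated with a small experiment. The sketch below uses a made-up series with a pure linear trend and scikit-learn's `DecisionTreeRegressor` as a stand-in for any tree-based model; the variable names are illustrative only.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Made-up series with a pure linear trend: 0, 10, 20, ...
y = 10.0 * np.arange(30)
train, last_seen, future = y[:24], y[23], y[24]

# (a) Model the levels directly: Lag1 -> next value.
levels = DecisionTreeRegressor(random_state=0)
levels.fit(train[:-1].reshape(-1, 1), train[1:])
pred_levels = levels.predict([[last_seen]])[0]

# (b) Model first differences instead, then add the last observed level back.
diffs = np.diff(train)
changes = DecisionTreeRegressor(random_state=0)
changes.fit(diffs[:-1].reshape(-1, 1), diffs[1:])
pred_diff = last_seen + changes.predict([[diffs[-1]]])[0]

print("true next value:      ", future)
print("tree on levels:       ", pred_levels)
print("tree on differences:  ", pred_diff)
```

The model fitted on levels cannot predict above the largest target it saw during training, while the model fitted on differences only has to learn the step size and recovers the next value once the last observed level is added back.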

**References**

1. [Notebook by MS](https://github.com/microsoft/forecasting/blob/master/examples/grocery_sales/python/00_quick_start/lightgbm_single_round.ipynb)
2. [Something similar in Neural Nets by GluonTS](https://github.com/awslabs/gluon-ts/blob/master/examples/m5_gluonts_template.ipynb)
