# Template NUM

**Prerequisites:**

- This notebook must have been generated using Gabarit's numerical template.


- **Launch this notebook with a kernel using your project virtual environment**. To create a kernel linked to your virtual environment, run `pip install ipykernel` and then `python -m ipykernel install --user --name=your_venv_name` (once your virtual environment is activated). The project must also be installed in this virtual environment.

---
---
---

## 1. How this template works

**Why use Gabarit's numerical template?**

The numerical template automatically generates a project folder and Python code containing mainstream models and facilitating their industrialization.

The generated project can be used for **classification** and **regression** tasks on numerical data. Of course, you have to adapt it to your particular use case. 

**Structure of the generated project**

<div style="font-family: monospace; display: grid; grid-template-columns: 1fr 2fr;">
  <div>.                                </div>  <div style="color: green;"></div>
  <div>.                                </div>  <div style="color: green;"></div>
  <div>├── {{package_name}}             </div>  <div style="color: green;"># The package</div>
  <div>│ ├── models_training            </div>  <div style="color: green;"># Folder containing all the modules related to the models</div>
  <div>│ ├── monitoring                 </div>  <div style="color: green;"># Folder containing all the modules related to the explainers and MLflow</div>
  <div>│ └── preprocessing              </div>  <div style="color: green;"># Folder containing all the modules related to the preprocessing</div>
  <div>├── {{package_name}}-data        </div>  <div style="color: green;"># Folder containing all the data (datasets, embeddings, etc.)</div>
  <div>├── {{package_name}}-exploration </div>  <div style="color: green;"># Folder where all your experiments and explorations must go</div>
  <div>├── {{package_name}}-models      </div>  <div style="color: green;"># Folder containing all the generated models</div>
  <div>├── {{package_name}}-pipelines   </div>  <div style="color: green;"># Folder containing fitted pipelines</div>
  <div>├── {{package_name}}-ressources  </div>  <div style="color: green;"># Folder containing some resources such as the instructions to upload a model</div>
  <div>├── {{package_name}}-scripts     </div>  <div style="color: green;"># Folder containing example scripts to preprocess data, train models, predict and use a demonstrator</div>
  <div>│ └── utils                      </div>  <div style="color: green;"># Folder containing utils scripts (such as split train/test, sampling, etc...)</div>
  <div>├── {{package_name}}-tutorials    </div>  <div style="color: green;"># Folder containing notebook tutorials, including this one</div>
  <div>├── tests                        </div>  <div style="color: green;"># Folder containing all the unit tests</div>
  <div>├── .gitignore                   </div>  <div style="color: green;"></div>
  <div>├── .coveragerc                  </div>  <div style="color: green;"></div>
  <div>├── Makefile                     </div>  <div style="color: green;"></div>
  <div>├── nose_setup_coverage.cfg      </div>  <div style="color: green;"></div>
  <div>├── README.md                    </div>  <div style="color: green;"></div>
  <div>├── requirements.txt             </div>  <div style="color: green;"></div>
  <div>├── setup.py                     </div>  <div style="color: green;"></div>
  <div>└── version.txt                  </div>  <div style="color: green;"></div>
</div>

**General principles on the generated packages**

- Data must be saved in the `{{package_name}}-data` folder<br>
<br>
- Trained models will automatically be saved in the `{{package_name}}-models` folder<br>
<br>
- Fitted pipelines will automatically be saved in the `{{package_name}}-pipelines` folder<br>
<br>
- Be aware that all the functions/methods for writing/reading files use the data and models folders as base directories. Thus, when a script takes the path of a file/model as an argument, the given path should be **relative** to the `{{package_name}}-data`/`{{package_name}}-models` folders.<br>
<br>
- The provided scripts in `{{package_name}}-scripts` are given as examples. You can use them as accelerators, but their use is not required.<br>
<br>
- You can use this package for mono-label and multi-label tasks (`multi_label` argument in the model classes)<br>
<br>
- The modelling part is structured as follows:
    - `ModelClass`: main class taking care of saving data and metrics (among other things)
    - `ModelPipeline`: child class of ModelClass managing all models related to a sklearn pipeline
    - `ModelKeras`: child class of ModelClass managing all models using Keras<br>
<br>
- Each task (regression and classification) has a mixin class (`ModelRegressorMixin` and `ModelClassifierMixin`) and specific models located in corresponding subfolders.
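To make the structure above more concrete, here is a minimal sketch of how a base class, a child class and a task mixin can be combined. The class bodies are simplified stand-ins written for illustration only; they are not the code of the generated package, and `DemoRidgeClassifier` is a hypothetical name.

```python
class ModelClass:
    """Stand-in for the main class: stores the shared configuration."""

    def __init__(self, x_col=None, y_col=None, multi_label=False):
        self.x_col = x_col
        self.y_col = y_col
        self.multi_label = multi_label


class ModelPipeline(ModelClass):
    """Stand-in for the child class wrapping a sklearn-style pipeline."""

    def __init__(self, pipeline=None, **kwargs):
        super().__init__(**kwargs)
        self.pipeline = pipeline


class ModelClassifierMixin:
    """Stand-in for the task mixin: adds task-specific attributes."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.model_type = "classifier"


class DemoRidgeClassifier(ModelClassifierMixin, ModelPipeline):
    """Hypothetical concrete model: task mixin first, then the base class."""


demo_model = DemoRidgeClassifier(x_col=["alcohol"], y_col="target")
print(demo_model.model_type, demo_model.multi_label)
print([cls.__name__ for cls in DemoRidgeClassifier.__mro__])
```

Listing the mixin first lets its `__init__` run before the base classes, while `super()` forwards the remaining keyword arguments down the chain.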

---
---
---

### Load utility functions

Please run the following cell to load needed utility functions. These functions are only needed in this notebook.

In [None]:
%load_ext autoreload
%autoreload 2

# Import utility functions
import os
import sys
sys.path.append(os.path.abspath(''))
from tutorial_exercices import answers, verify, utils

---

## 2. Use the template to train your first model

### Wine dataset

[![glass of wine](https://archive.ics.uci.edu/ml/assets/MLimages/Large109.jpg)](https://archive.ics.uci.edu/ml/datasets/wine)

We are going to use the generated python package in a classification problem. 

We are going to work with the [Wine recognition dataset](https://archive.ics.uci.edu/ml/datasets/wine) from [sklearn](https://scikit-learn.org/stable/datasets/toy_dataset.html#wine-recognition-dataset).
This dataset results from a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars.

First we load the dataset thanks to `sklearn.datasets.load_wine` and save the dataset as a csv file in `{{package_name}}-data` :

In [None]:
from {{package_name}}.utils import get_data_path
from sklearn.datasets import load_wine

# Load dataset
wine_dataset = load_wine(as_frame=True)
df_wine = wine_dataset["data"]
df_wine["target"] = wine_dataset["target"]

# Save dataset to {{package_name}}-data
DATA_PATH = get_data_path()
DATASET_WINE_FILENAME = "wine.csv"
DATASET_WINE_PATH = os.path.join(DATA_PATH, DATASET_WINE_FILENAME)
df_wine.to_csv(DATASET_WINE_PATH, sep=";", index=None)
print(f"Wine dataset saved to {DATASET_WINE_PATH}")

# Display first rows
df_wine.head()

You can verify that a new file called `wine.csv` is present in your `{{package_name}}-data` directory. Notice the use of the `get_data_path` function from `{{package_name}}.utils`. It returns the path to the data folder.

---

<span style="color:red">**Exercice 1**</span> : **train / valid / test split**

**Goal:**

- Split the main dataset in train / valid / test sets

**TODO:**
- Use the script `utils/0_split_train_valid_test.py` on the dataset `{{package_name}}-data/wine.csv`
- We want a 'random' split but **with a random seed set to 42** (in order to always reproduce the same results)
- We use the default splitting ratios (0.6 / 0.2 / 0.2)

**Help:**
- The file `utils/0_split_train_valid_test.py` splits a dataset in 3 .csv files:
    - `{filename}_train.csv`: the training dataset
    - `{filename}_valid.csv`: the validation dataset
    - `{filename}_test.csv`: the test dataset
- You can specify the type of split : random, stratified or hierarchical (here, use random)
- Reminder: the path to the file to process is relative to `{{package_name}}-data`
- To get the possible arguments of the script: `python 0_split_train_valid_test.py --help`
- Don't forget to activate your virtual environment ...
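To illustrate what a seeded 'random' split means, here is a minimal sketch using only the standard library. It is not the implementation of `0_split_train_valid_test.py`; the function name and the rounding rule are assumptions made for this example.

```python
import random


def split_train_valid_test(rows, ratios=(0.6, 0.2, 0.2), seed=42):
    """Shuffle rows with a fixed seed, then cut them into three parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_valid = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # the test set takes the remainder
    return train, valid, test


rows = list(range(178))  # the wine dataset has 178 rows
train, valid, test = split_train_valid_test(rows)
print(len(train), len(valid), len(test))

# Same seed, same split: this is what makes the results reproducible
assert split_train_valid_test(rows) == (train, valid, test)
```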

**Exercice 1** : Verify your answer ✔

In [None]:
verify.verify_exercice_1()

**Exercice 1** : Solution 💡

In [None]:
answers.answer_exercice_1()

---

<span style="color:red">**Exercice 2**</span> : **random sample**

**Goal:**

- Get a random sample of the file `wine.csv` (n=10) (we won't use it, this exercise is just here to show what can be done)

**TODO:**
- Use the script `utils/0_create_samples.py` on the dataset `{{package_name}}-data/wine.csv`
- We want a sample of 10 lines

**Help:**
- The file `utils/0_create_samples.py` samples a dataset
- To get the possible arguments of the script: `python 0_create_samples.py --help`
- Don't forget to activate your virtual environment ...

**Exercice 2** : Verify your answer ✔

In [None]:
verify.verify_exercice_2()

**Exercice 2** : Solution 💡

In [None]:
answers.answer_exercice_2()

---

<span style="color:red">**Exercice 3**</span> : **EDA**

**Goal:**

- Visualize train and test dataset statistics thanks to [Sweetviz](https://github.com/fbdesignpro/sweetviz)

**TODO:**
- Use the script `utils/0_sweetviz_report.py` to generate a Sweetviz report that compares the train and test datasets

**Help:**
- To get the possible arguments of the script: `python 0_sweetviz_report.py --help`
- Use `wine_train.csv` as source and `wine_test.csv` as comparison.
- Don't forget to activate your virtual environment ...

**Exercice 3** : Verify your answer ✔

In [None]:
verify.verify_exercice_3()

**Exercice 3** : Solution 💡

In [None]:
answers.answer_exercice_3()

---

<span style="color:red">**Exercice 4**</span> : **pre-processing**

- The script `1_preprocess_data.py` applies a preprocessing pipeline **to all columns of specified files except target columns**
- The argument `--target_cols` is used to specify which columns are targets in order to preserve these columns. Do not forget to specify that `target` is the target column.
- It works as follows:
    - In `preprocessing/preprocess.py`: 
        - There is a dictionary of functions (`pipelines_dict`): key: str -> function 
            - /!\ Don't remove the default element 'no_preprocess': lambda x: x /!\ 
        - There are preprocessing functions
    - In `1_preprocess_data.py` :
        - We retrieve the dictionary of functions from `preprocessing/preprocess.py` 
        - If a `preprocessing` argument is specified, we keep only the corresponding key from the dictionary
        - Otherwise, we keep all keys (except `no_preprocess`)
        - For each entry of the dictionary, we:
            - Get the associated preprocessing function
            - Load data
            - Apply the preprocessing function
            - Save the result -> `{file_name}_{key}.csv`
- To get the possible arguments of the script: `python 1_preprocess_data.py --help`
- Usually, pipelines are sklearn's ColumnTransformer objects :
    - These transformers must specify input columns.
    - Hence, even if all columns are loaded, some may be dropped if no transformer uses them.
    - One can specify the option `remainder='passthrough'` in the ColumnTransformer to keep the columns that no transformer selects.
    - Otherwise, it's easy to add a passthrough transformer to keep specific columns without preprocessing.
- We provide an example with the `preprocess_P1`. It uses different pipelines for numerical, categorical (commented), and textual data (commented).
- The `preprocess_P1` pipeline is displayed below (just run the next cell).
- Don't forget to activate your virtual environment ...
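The dictionary-based mechanism described above can be sketched as follows. This is a simplified illustration in plain Python, not the code of `1_preprocess_data.py`: `scale_by_ten` and `run_preprocessing` are hypothetical names, and data is represented as nested lists instead of files.

```python
def scale_by_ten(rows):
    """Hypothetical preprocessing function: multiplies every value by 10."""
    return [[value * 10 for value in row] for row in rows]


def get_pipelines_dict():
    """Map a preprocessing name to the function that implements it."""
    return {
        "no_preprocess": lambda x: x,  # default entry, must stay
        "scale_by_ten": scale_by_ten,
    }


def run_preprocessing(rows, file_name, preprocessing=None):
    """Apply the selected preprocessing functions and name each output."""
    pipelines = get_pipelines_dict()
    if preprocessing is not None:
        # Keep only the requested key
        pipelines = {preprocessing: pipelines[preprocessing]}
    else:
        # Keep every key except the default one
        pipelines = {k: v for k, v in pipelines.items() if k != "no_preprocess"}
    return {f"{file_name}_{key}.csv": func(rows) for key, func in pipelines.items()}


outputs = run_preprocessing([[1.0, 2.0]], "wine")
print(outputs)
```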

**Important:**
- Each preprocessed file is saved in the `{{package_name}}-data` folder.
- To track which pipeline has been used, we add a first line to these files as a metadata line (e.g. `#preprocess_P1_2022_09_08-17_35_06`).
- Each fitted pipeline is saved in the `{{package_name}}-pipelines` folder.
- If you need to use a custom preprocessing function `funcA` using `FunctionTransformer`, be aware that the pickled pipeline may not return the expected results if you later modify the definition of `funcA`. Please check https://github.com/OSS-Pole-Emploi/gabarit/issues/63.
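Because of the metadata line, a preprocessed file cannot be read like a plain CSV. The sketch below shows one way to separate that first line from the data. The file content is made up, the `;` separator is an assumption based on how `wine.csv` is saved in this notebook, and `read_preprocessed` is a hypothetical helper, not a function of the package.

```python
import csv
import io

# Hypothetical content of a preprocessed file: metadata line, header, then data
content = "#preprocess_P1_2022_09_08-17_35_06\nalcohol;ash;target\n0.5;-1.2;0\n"


def read_preprocessed(stream, sep=";"):
    """Return (pipeline_name, header, rows); pipeline_name is None if absent."""
    first_line = stream.readline().strip()
    if first_line.startswith("#"):
        pipeline_name = first_line[1:]
        header_line = stream.readline().strip()
    else:
        pipeline_name = None
        header_line = first_line
    header = header_line.split(sep)
    rows = list(csv.reader(stream, delimiter=sep))
    return pipeline_name, header, rows


pipeline_name, header, rows = read_preprocessed(io.StringIO(content))
print(pipeline_name, header, rows)
```

With pandas, `pd.read_csv(path, sep=";", skiprows=1)` is one way to skip that first line when you only need the data.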

In [None]:
from {{package_name}}.preprocessing.preprocess import preprocess_P1

# Source code of preprocess_P1 function
utils.display_source(preprocess_P1)

As you can see, the default pipeline `preprocess_P1` uses one transformer (called `num`, that's just a name) that applies the pipeline `numeric_pipeline` to all columns with dtype `number`.  

This pipeline uses a `SimpleImputer` to fill NA values with the median and applies a `StandardScaler`.
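For readers who want to experiment outside the package, a pipeline with the same ingredients can be built directly with scikit-learn. This is a sketch based on the description above, not the source of `preprocess_P1`; the toy DataFrame is made up for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Numeric pipeline: fill NA values with the median, then standardize
numeric_pipeline = make_pipeline(SimpleImputer(strategy="median"), StandardScaler())

# One transformer named "num", applied to every column with a numeric dtype
transformers = [
    ("num", numeric_pipeline, make_column_selector(dtype_include="number")),
]
pipeline = ColumnTransformer(transformers, remainder="drop")

# Toy data (made up): one missing value and one non-numeric column
df = pd.DataFrame({
    "alcohol": [12.0, 13.0, np.nan, 14.0],
    "ash": [2.0, 2.5, 3.0, 3.5],
    "label": ["a", "b", "c", "d"],
})
transformed = pipeline.fit_transform(df)
print(transformed.shape)  # the non-numeric column is dropped
```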

**Goal:**

- Apply the default preprocessing to `wine.csv`

**TODO:**
- Use the script `1_preprocess_data.py` on the dataset `{{package_name}}-data/wine.csv` to apply the default pipeline (`preprocess_P1`)
- The target column `target` should be preserved by the preprocessing script

**Exercice 4** : Verify your answer ✔

In [None]:
verify.verify_exercice_4()

**Exercice 4** : Solution 💡

In [None]:
answers.answer_exercice_4()

---

<span style="color:red">**Exercice 5**</span> : **custom pre-processing**


We are going to use a [`sklearn.preprocessing.MinMaxScaler`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html#sklearn.preprocessing.MinMaxScaler) instead of the [`sklearn.preprocessing.StandardScaler`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html#sklearn.preprocessing.StandardScaler) used in `preprocess_P1` to pre-process the train dataset.

**Goal:**

- Apply a new preprocess with `MinMaxScaler` to the train dataset

**TODO:**

- Create a `preprocess_P2` function in `{{package_name}}/preprocessing/preprocess.py` that uses `MinMaxScaler` instead of `StandardScaler`
- Add `preprocess_P2` in `get_pipelines_dict` from `{{package_name}}/preprocessing/preprocess.py`
- Use the script `1_preprocess_data.py` on the dataset `{{package_name}}-data/wine.csv` to apply `preprocess_P2` pipeline

**Help:**

- You can use the source code of `preprocess_P1` seen in the previous exercise

**Exercice 5** : Verify your answer ✔

In [None]:
verify.verify_exercice_5()

**Exercice 5** : Solution 💡

In [None]:
answers.answer_exercice_5_preprocess_P2()

In [None]:
answers.answer_exercice_5_preprocess_script()

---

<span style="color:red">**Exercice 6**</span> : **Pre-processing on train and validation data**

In the previous exercise we used `1_preprocess_data.py` to apply a preprocessing pipeline to `wine.csv`. To prevent [data leakage](https://machinelearningmastery.com/data-leakage-machine-learning/), the pipeline should be fitted on the training data only and then applied to the validation and test data.

Each time `1_preprocess_data.py` is called, a new folder is created in `{{package_name}}-pipelines`. It contains a pickled pipeline that can be used in `2_apply_existing_pipeline.py` to transform the validation dataset.

We are going to use `1_preprocess_data.py` to apply the default `preprocess_P1` to `wine_train.csv` and then apply the saved pipeline to `wine_valid.csv`. 
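The fit-on-train, apply-on-validation principle can be illustrated with scikit-learn alone. The sketch below uses made-up random data and an in-memory pickle round trip to mimic a saved pipeline; it does not reproduce what the package scripts do internally.

```python
import pickle

import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
x_train = rng.normal(loc=10.0, scale=2.0, size=(60, 3))  # made-up training data
x_valid = rng.normal(loc=10.0, scale=2.0, size=(20, 3))  # made-up validation data

# Fit on the training data only, so no validation statistics leak in
scaler = StandardScaler().fit(x_train)
x_train_prep = scaler.transform(x_train)

# Persist the fitted object, then reload it and reuse it as-is
reloaded = pickle.loads(pickle.dumps(scaler))
x_valid_prep = reloaded.transform(x_valid)

print(x_train_prep.mean(axis=0))  # close to 0 by construction
print(x_valid_prep.mean(axis=0))  # generally not exactly 0
```

The validation means are not exactly zero because the scaler only knows the training statistics, which is the intended behaviour.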


**Goal:**


- Fit a `preprocess_P1` pipeline to the train dataset.
- Apply the fitted pipeline to the validation dataset.


**TODO:**

- Apply the default `preprocess_P1` to `wine_train.csv` (previously we did it on the whole dataset, which is wrong).
- Find the name of the fitted pipeline inside `{{package_name}}-pipelines`.
- Use the script `2_apply_existing_pipeline.py` to transform the validation set `wine_valid.csv` using the fitted `preprocess_P1` pipeline.


**Important:**

- Do not worry about applying the fitted pipeline to `wine_test.csv`. Our models will store the preprocessing pipelines and :
    - The prediction script `4_predict.py` will preprocess the test dataset with the model's preprocessing pipeline before sending the data to the model's predict function. This is the **batch mode**.
    - We also expose an agnostic `predict` function (in `utils_models`) to handle new data on the fly. It will preprocess it with the model's preprocessing pipeline before sending the data to the model's predict function. This is the **API mode**.

**Exercice 6** : Verify your answer ✔

In [None]:
verify.verify_exercice_6()

**Exercice 6** : Solution 💡

In [None]:
answers.answer_exercice_6()

---

<span style="color:red">**Exercice 7**</span> : **Train a model**


**Goal:**

- Train a classification model on the preprocessed data. 

- Use the default model in `3_training_classification.py` : [`ModelRidgeClassifier`](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RidgeClassifier.html).

**TODO:**

- Use the script `3_training_classification.py` to train a `ModelRidgeClassifier` on `wine_train_preprocess_P1.csv`
- Use `wine_valid_preprocess_P1.csv` as validation dataset

**Help:**

- Use `3_training_classification.py -h` to see the CLI help.
- The script `3_training_classification.py` trains a model on a dataset
- It works as follows:
    - Reads a train .csv file as input 
        - If a validation file is given, it will use it as validation data 
    - Manages `y_col` argument: 
        - If there is only one value, training in mono-label mode 
        - If several values, training in multi-label mode 
    - Instantiates a model class
    - Fits the model
    - Saves the model & some metrics
- **Manual modifications of the script**: 
    - **To change the model used** -> you have to comment / uncomment / modify the code in the "training" part (not compulsory for this exercise) 
    - **To load datasets** -> if a dataset is not in the right format, you have to adapt the loading part (not compulsory for this exercise) 
- Don't forget to activate your virtual environment ...
- If you get a `pkg_resources.DistributionNotFound` error, try installing {{package_name}} by running `pip install -e .` from the folder containing `setup.py`
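As a point of comparison, the steps listed above (read data, detect mono-label mode, instantiate, fit, report a metric) can be reproduced with plain scikit-learn. This sketch is not what `3_training_classification.py` does internally: it works in memory, uses its own split, and the variable names are chosen for the example.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

x, y = load_wine(return_X_y=True, as_frame=True)
x_tr, x_va, y_tr, y_va = train_test_split(x, y, test_size=0.2, random_state=42)

y_col = ["target"]
multi_label = len(y_col) > 1  # a single target column means mono-label mode

# Instantiate and fit a scaler followed by a ridge classifier
ridge = make_pipeline(StandardScaler(), RidgeClassifier(alpha=1.0))
ridge.fit(x_tr, y_tr)

# Report a metric on the validation data
valid_accuracy = ridge.score(x_va, y_va)
print(f"Validation accuracy: {valid_accuracy:.2%}")
```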


**Important:**

- If you use this script with data that has not been preprocessed, it will consider the preprocessing pipeline to be a 'passthrough', i.e. no preprocessing and all columns kept.
- The training script will raise a warning if the train and validation datasets do not have the same preprocessing pipeline. Training still continues, even though this will probably lead to poor results.


**Exercice 7** : Verify your answer ✔

In [None]:
verify.verify_exercice_7()

**Exercice 7** : Solution 💡

In [None]:
answers.answer_exercice_7()

---

<span style="color:red">**Exercice 8**</span> : **Try another classification model**

The previous model already achieves a perfect score on the validation dataset, but the goal here is to see how to use a different kind of model for training.

**Goal:**

Train a `ModelLGBMClassifier` classification model on preprocessed data. 

**TODO:**

- Change the script `3_training_classification.py` so it uses a `ModelLGBMClassifier`
- Use the script `3_training_classification.py` to train a `ModelLGBMClassifier` on `wine_train_preprocess_P1.csv` with `wine_valid_preprocess_P1.csv` as validation dataset

**Help:**

If you look at `3_training_classification.py`, you will see that many models are commented out:

```python
model = model_ridge_classifier.ModelRidgeClassifier(
    x_col=x_col,
    y_col=y_col,
    level_save=level_save,
    preprocess_pipeline=preprocess_pipeline,
    ridge_params={"alpha": 1.0},
    multi_label=multi_label,
)
# model = model_logistic_regression_classifier.ModelLogisticRegressionClassifier(
#     x_col=x_col,
#     y_col=y_col,
#     level_save=level_save,
#     preprocess_pipeline=preprocess_pipeline,
#     lr_params={"penalty": "l2", "C": 1.0, "max_iter": 100},
#     multi_label=multi_label,
# )
#
# [...]
```
Comment and uncomment the appropriate lines to use `ModelLGBMClassifier` instead of `ModelRidgeClassifier`.


**Exercice 8** : Verify your answer ✔

In [None]:
verify.verify_exercice_8()

**Exercice 8** : Solution 💡

In [None]:
answers.answer_exercice_8()

As you can see, this model does not perform perfectly on the validation dataset, probably due to overfitting. We are going to stick with our `ModelRidgeClassifier`.

---

<span style="color:red">**Exercice 9**</span> : **Test your model on the test dataset**

**Goal:**

- Use your `ModelRidgeClassifier` model to predict on the test dataset

**TODO:**

- Use the script `4_predict.py` to make predictions about cultivars in the test data.
- Use argument `[-y Y_COL [Y_COL ...]]` to obtain performance on test data.

**Help:**

- Use `4_predict.py -h` to see the CLI help.
- You **DO NOT** need to preprocess the test data! As we said above, the preprocessing pipeline is saved alongside the model, and the script will preprocess the test data before sending it to the model's predict function.
- If you get a `ValueError: There are some missing mandatory columns`, you probably made predictions on a preprocessed version of `wine_test.csv` when you should have made predictions on `wine_test.csv`. This error is caused by the first row of the preprocessed file, which contains the metadata line.

**Exercice 9** : Verify your answer ✔

In [None]:
verify.verify_exercice_9()

**Exercice 9** : Solution 💡

In [None]:
answers.answer_exercice_9()

---
---
---


## 3. Use a saved model in python

In this section, we will see how to load a saved model in python and use it with new data

### Load a saved model

First choose one of your saved models :

In [None]:
import os
from pathlib import Path
from {{package_name}}.utils import get_models_path
from {{package_name}}.models_training import utils_models

MODELS_PATH = Path(get_models_path())

# List the models saved in {{package_name}}-models
saved_model_names = sorted([model.name for model in MODELS_PATH.glob("*/*")])
print("\n".join(saved_model_names))

Then load it with `utils_models.load_model` :

In [None]:
model_name = saved_model_names[-1]
print(model_name)

model, model_conf = utils_models.load_model(model_name)

### Make predictions on new data

In [None]:
from sklearn.datasets import load_wine
wine_dataset = load_wine(as_frame=True)

wine_dataset["data"].sample(3)

The `model` object has a `preprocess_pipeline` attribute that can be used in combination with `utils_models.apply_pipeline` to apply the same preprocessing to the new data:

In [None]:
wine_dataset_prep = utils_models.apply_pipeline(wine_dataset["data"], model.preprocess_pipeline)
wine_dataset_prep.sample(3)

We can then simply use `model.predict` to make predictions : 

In [None]:
predictions1 = model.predict(wine_dataset_prep)

# Verifying accuracy :
accuracy1 = sum(predictions1 == wine_dataset["target"].astype(str)) / predictions1.shape[0]
print(f"Accuracy v1 : {accuracy1:.2%}")

### Make predictions on new data - using the `utils_models.predict` function

An alternative is to use the provided (model agnostic) `utils_models.predict` function.  

This function **does not need the data to be preprocessed**. Everything is managed inside the function; you just have to provide the dataset and the model.

In [None]:
predictions2 = utils_models.predict(wine_dataset["data"], model)

# Verifying accuracy :
accuracy2 = sum(predictions2 == wine_dataset["target"].astype(str)) / len(predictions2)
print(f"Accuracy v2 : {accuracy2:.2%}")
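Conceptually, the two approaches differ only in who applies the preprocessing. The toy sketch below illustrates this equivalence; `ToyModel` and `predict_raw` are hypothetical stand-ins written for this example and do not reflect how `utils_models.predict` is implemented.

```python
class ToyModel:
    """Hypothetical model that stores its own preprocessing step."""

    def __init__(self, offset, threshold):
        self.offset = offset
        self.threshold = threshold

    def preprocess_pipeline(self, values):
        """Center the raw values (stand-in for a fitted pipeline)."""
        return [v - self.offset for v in values]

    def predict(self, preprocessed):
        """Predict from values that are already preprocessed."""
        return ["1" if v > self.threshold else "0" for v in preprocessed]


def predict_raw(values, toy_model):
    """Preprocess with the model's own pipeline, then predict."""
    return toy_model.predict(toy_model.preprocess_pipeline(values))


toy = ToyModel(offset=10.0, threshold=0.0)
raw = [8.0, 11.5, 10.0]

manual = toy.predict(toy.preprocess_pipeline(raw))  # two explicit steps
wrapped = predict_raw(raw, toy)                     # one call on raw data
print(manual == wrapped, wrapped)
```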

---
---
---

## 4. Use the template for regression

In the previous sections we saw how to train a model to solve a classification problem thanks to the script `3_training_classification.py`. Here we are going to see how to use the `3_training_regression.py` script for regression.




### Reuse of wine dataset

[![glass of wine](https://archive.ics.uci.edu/ml/assets/MLimages/Large109.jpg)](https://archive.ics.uci.edu/ml/datasets/wine)

We are going to reuse the wine dataset but instead of predicting cultivars, we are going to predict alcohol content of samples based on the other constituents and the cultivar.

We first reload the dataset and save it as `wine_reg.csv` in order to avoid conflicts with previous datasets : 

In [None]:
# Reload wine dataset
wine_dataset = load_wine(as_frame=True)
df_wine_reg = wine_dataset["data"]
df_wine_reg["cultivar"] = wine_dataset["target"]
del wine_dataset["target"]

# Save it as wine_reg.csv
WINE_REG_PATH = os.path.join(DATA_PATH, "wine_reg.csv")
df_wine_reg.to_csv(WINE_REG_PATH, sep=";", index=None)
print(f"Wine regression dataset saved to {WINE_REG_PATH}")

# See first rows
df_wine_reg.head()

<span style="color:red">**Exercice 10**</span> : **Preprocess, train and predict**

**Goal:**
- Use everything we learned from the previous exercises to create a regression model capable of predicting alcohol content based on cultivar and other constituents.
- Split `wine_reg.csv`, adapt the preprocess step and train a regressor to make predictions on test data

**TODO:**

- Split `wine_reg.csv` into train / valid / test datasets thanks to `utils/0_split_train_valid_test.py`
- Create a `preprocess_P3` pipeline to handle the `cultivar` feature, which is a categorical column.
    - We want this variable to be one hot encoded.
    - Help yourself with comments from `preprocess_P1` source code and [scikit-learn pipeline documentation](https://scikit-learn.org/stable/modules/compose.html).
- Preprocess `wine_reg_train.csv` with your `preprocess_P3` pipeline thanks to `1_preprocess_data.py`
- Apply preprocess pipeline on `wine_reg_valid.csv` thanks to `2_apply_existing_pipeline.py`
- Comment / uncomment `3_training_regression.py` to use a [`ModelKNNRegressor`](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsRegressor.html)
- Train a regressor model on `wine_reg_train_preprocess_P3.csv` with `wine_reg_valid_preprocess_P3.csv` as validation dataset thanks to `3_training_regression.py`
- Make predictions on `wine_reg_test.csv` thanks to `4_predict.py`

**Help:**

- Each script has a CLI helper.
- `cultivar` column is also of dtype `number`. We can ignore it in the numeric pipeline by adding a negative pattern to the `make_column_selector` function. (e.g. `make_column_selector(pattern="^(?!cultivar).*$", dtype_include="number")`)
- To apply an OHE to `cultivar`, you will need to create a new pipeline and a new transformer. You can use the lines below :
    ```python
    ...
    cat_pipeline = make_pipeline(SimpleImputer(strategy='most_frequent'), OneHotEncoder(handle_unknown='ignore'))
    ...
    transformers = [
        # ...
        ("cat", cat_pipeline, make_column_selector(pattern="cultivar"), ),
    ]
    ```
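To see which columns each pattern selects, you can test the regular expressions on their own. The sketch below assumes, as described in the scikit-learn documentation, that `make_column_selector` keeps the columns whose name contains a match for the pattern, which `re.search` mimics here. The column list is a short made-up subset.

```python
import re

# A few column names, including the categorical `cultivar` column
columns = ["malic_acid", "ash", "magnesium", "cultivar"]

# Negative lookahead: the name must not start with "cultivar"
numeric_pattern = "^(?!cultivar).*$"
# Plain pattern: the name must contain "cultivar"
categorical_pattern = "cultivar"

numeric_cols = [c for c in columns if re.search(numeric_pattern, c)]
categorical_cols = [c for c in columns if re.search(categorical_pattern, c)]

print(numeric_cols)
print(categorical_cols)
```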

**Exercice 10** : Verify your answer ✔

In [None]:
verify.verify_exercice_10()

**Exercice 10** : Solution 💡

In [None]:
answers.answer_exercice_10_preprocess_P3()

In [None]:
answers.answer_exercice_10_scripts()

---
---
---


## 5. BONUS : Start up a small web app to introduce your models 🚀 

You are now ready to demonstrate how well your models work. We implemented a default ***Streamlit*** app, let's try it!

```bash
# do not forget to activate your virtual environment
# source venv_num_template/bin/activate 

streamlit run {{package_name}}-scripts/5_demonstrator.py
```

It will start a Streamlit app on the default port (8501).

Visit [http://localhost:8501](http://localhost:8501) to see your demonstrator.

*Note: the default demonstrator considers all input columns as numbers. Hence, you should adapt it to your dataset if you have other types of data.*