# **Lab: ML Product**




## Exercise 2: AdaBoost

This time we will train an AdaBoost model.

**Pre-requisites:**
- Create a DockerHub account (https://hub.docker.com/)
- Create a Render account (https://render.com/)

The steps are:
1.   Create new Git branch
2.   Load the dataset
3.   Train AdaBoost model
4.   Hyperparameter Tuning
5.   Push changes


## 1. Create new Git branch


**[1.1]** Create a new git branch called `adv_mla_4_adaboost`


In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
git checkout -b adv_mla_4_adaboost

**[1.2]** Launch Jupyter Lab from your virtual environment

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
poetry run jupyter lab

**[1.3]** Navigate to the folder `notebooks` and create a new Jupyter notebook called `2_adaboost.ipynb`

## 2. Load the dataset


**[2.1]** Launch magic commands to automatically reload modules

In [None]:
# Placeholder for student's code (Python code)

In [1]:
# Solution
%load_ext autoreload
%autoreload 2

**[2.2]** Import the `pandas` and `numpy` packages, and `dump` from `joblib`

In [None]:
# Placeholder for student's code (Python code)

In [2]:
# Solution
import pandas as pd
import numpy as np
from joblib import dump

**[2.3]** Import the `load_sets()` function from your custom package

In [3]:
# Placeholder for student's code (Python code)

In [4]:
# Solution
from my_krml_studentid.data.sets import load_sets

**[2.4]** Load the saved sets from `data/processed`

In [5]:
# Placeholder for student's code (Python code)

In [6]:
# Solution:
X_train, y_train, X_val, y_val, X_test, y_test = load_sets(path='../data/processed/')

## 3. Train AdaBoost Regression model

**[3.1]** Import the `AdaBoostRegressor` class from `sklearn.ensemble`

In [7]:
# Placeholder for student's code (Python code)

In [8]:
# Solution:
from sklearn.ensemble import AdaBoostRegressor

**[3.2]** Instantiate the `AdaBoostRegressor` class with `random_state=0` and `n_estimators=100` into a variable called `adaboost`

In [9]:
# Placeholder for student's code (Python code)

In [10]:
# Solution
adaboost = AdaBoostRegressor(random_state=0, n_estimators=100)

**[3.3]** Fit the model with the prepared data

In [11]:
# Placeholder for student's code (Python code)

In [12]:
# Solution
adaboost.fit(X_train, y_train)

**[3.4]** Import `dump` from `joblib` and save the fitted model into the folder `models` as a file called `adaboost_default.joblib`

In [13]:
# Placeholder for student's code (Python code)

In [14]:
# Solution:
from joblib import dump

dump(adaboost,  '../models/adaboost_default.joblib')

['../models/adaboost_default.joblib']
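
To check that a saved model can be used later, it can be reloaded with `joblib.load` and compared against the in-memory model. The sketch below is illustrative only: it uses synthetic data from `make_regression` and a temporary folder rather than the lab's dataset and `models` folder, and the names `X_demo`, `y_demo`, `demo_model` and `adaboost_demo.joblib` are assumptions made for the example.

```python
import tempfile
from pathlib import Path

import numpy as np
from joblib import dump, load
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor

# Synthetic stand-in data (assumption: this is not the lab's dataset)
X_demo, y_demo = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Small model so the example runs quickly
demo_model = AdaBoostRegressor(random_state=0, n_estimators=20)
demo_model.fit(X_demo, y_demo)

# Save into a temporary folder so the lab's `models` folder is left untouched
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "adaboost_demo.joblib"
    dump(demo_model, model_path)
    reloaded_model = load(model_path)

# The reloaded model should reproduce the original model's predictions
same_predictions = bool(
    np.allclose(demo_model.predict(X_demo), reloaded_model.predict(X_demo))
)
print(same_predictions)
```

In the lab itself, the equivalent call would be `load('../models/adaboost_default.joblib')`.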

**[3.5]** Save the predictions from this model for the training and validation sets into 2 variables called `y_train_preds` and `y_val_preds`


In [15]:
# Placeholder for student's code (Python code)

In [16]:
# Solution:
y_train_preds = adaboost.predict(X_train)
y_val_preds = adaboost.predict(X_val)

**[3.6]** Import the function `print_regressor_scores()` from your custom package

In [17]:
# Placeholder for student's code (Python code)

In [18]:
# Solution:
from my_krml_studentid.models.performance import print_regressor_scores

**[3.7]** Display the RMSE and MAE scores of this model on the training set

In [19]:
# Placeholder for student's code (Python code)

In [20]:
# Solution:
print_regressor_scores(y_preds=y_train_preds, y_actuals=y_train, set_name='Training')

RMSE Training: 35543.463493274976
MAE Training: 33988.39756811933




**[3.8]** Display the RMSE and MAE scores of this model on the validation set

In [21]:
# Placeholder for student's code (Python code)

In [22]:
# Solution:
print_regressor_scores(y_preds=y_val_preds, y_actuals=y_val, set_name='Validation')

RMSE Validation: 35736.93071901778
MAE Validation: 33856.46349697021
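
For reference, the two metrics printed above can be computed directly with NumPy. This is a minimal sketch of the formulas, not the implementation in your custom package: `rmse_mae` and `print_scores_sketch` are hypothetical names, and the real `print_regressor_scores()` may differ.

```python
import numpy as np


def rmse_mae(y_preds, y_actuals):
    """Return (RMSE, MAE) for two equally sized sequences of numbers."""
    errors = np.asarray(y_preds, dtype=float) - np.asarray(y_actuals, dtype=float)
    rmse = float(np.sqrt(np.mean(errors ** 2)))  # root of the mean squared error
    mae = float(np.mean(np.abs(errors)))         # mean of the absolute errors
    return rmse, mae


def print_scores_sketch(y_preds, y_actuals, set_name=None):
    """Hypothetical stand-in that prints both scores in the lab's output format."""
    rmse, mae = rmse_mae(y_preds, y_actuals)
    print(f"RMSE {set_name}: {rmse}")
    print(f"MAE {set_name}: {mae}")


# Toy values: the errors are [0, 0, 3], so MSE = 3 and MAE = 1
demo_rmse, demo_mae = rmse_mae(y_preds=[2.0, 4.0, 9.0], y_actuals=[2.0, 4.0, 6.0])
print_scores_sketch(y_preds=[2.0, 4.0, 9.0], y_actuals=[2.0, 4.0, 6.0], set_name="Demo")
```

RMSE is never smaller than MAE on the same predictions, and the gap widens when a few errors are much larger than the rest.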




## 4. Hyperparameter Tuning

**[4.1]** Import the `fit_assess_regressor()` function from your custom package

In [23]:
# Placeholder for student's code (Python code)

In [24]:
# Solution:
from my_krml_studentid.models.performance import fit_assess_regressor
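
If you want a picture of what such a helper might do, the sketch below fits a model, prints RMSE and MAE on both sets, and returns the fitted model. It is an assumption about the behaviour, not the code from your custom package: `fit_assess_sketch` is a hypothetical name and the data is synthetic.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split


def fit_assess_sketch(model, X_train, y_train, X_val, y_val):
    """Hypothetical stand-in: fit, print scores on both sets, return the model."""
    model.fit(X_train, y_train)
    for set_name, X, y in [("Training", X_train, y_train), ("Validation", X_val, y_val)]:
        preds = model.predict(X)
        print(f"RMSE {set_name}: {np.sqrt(mean_squared_error(y, preds))}")
        print(f"MAE {set_name}: {mean_absolute_error(y, preds)}")
    return model


# Synthetic stand-in data (assumption: this is not the lab's dataset)
X_demo, y_demo = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)

fitted = fit_assess_sketch(
    AdaBoostRegressor(random_state=0, n_estimators=20), X_tr, y_tr, X_va, y_va
)
```

Returning the fitted model is what allows the result to be stored in a variable and saved later with `dump`.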

**[4.2]** Train an AdaBoost model with `random_state=0`, `n_estimators=100` and `learning_rate=0.05`, and print its scores on the training and validation sets

In [25]:
# Placeholder for student's code (Python code)

In [26]:
# Solution:
model1 = fit_assess_regressor(AdaBoostRegressor(random_state=0, n_estimators=100, learning_rate=0.05), X_train, y_train, X_val, y_val)

RMSE Training: 8946.199713874501
MAE Training: 4439.217877513725
RMSE Validation: 8675.471614917285
MAE Validation: 4387.799879796779




**[4.3]** Train an AdaBoost model with `random_state=0`, `n_estimators=100`, `learning_rate=0.05` and `loss='square'`, and print its scores on the training and validation sets

In [27]:
# Placeholder for student's code (Python code)

In [28]:
# Solution:
model2 = fit_assess_regressor(AdaBoostRegressor(random_state=0, n_estimators=100, learning_rate=0.05, loss='square'), X_train, y_train, X_val, y_val)

RMSE Training: 8654.646234771952
MAE Training: 3501.2813387746296
RMSE Validation: 8528.402661471575
MAE Validation: 3457.20424288861
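
The two manual experiments above can be generalised into a small loop over candidate values. The sketch below is one possible way to do this, assuming synthetic data and an illustrative grid: the values in `learning_rates` are chosen for the example only, while `losses` lists the three options `AdaBoostRegressor` accepts for `loss`.

```python
from itertools import product

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: this is not the lab's dataset)
X_demo, y_demo = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)

learning_rates = [0.05, 0.5, 1.0]              # illustrative values
losses = ["linear", "square", "exponential"]   # options supported by AdaBoostRegressor

results = []
for learning_rate, loss in product(learning_rates, losses):
    model = AdaBoostRegressor(
        random_state=0, n_estimators=20, learning_rate=learning_rate, loss=loss
    )
    model.fit(X_tr, y_tr)
    # Score each combination on the validation set only
    val_rmse = float(np.sqrt(mean_squared_error(y_va, model.predict(X_va))))
    results.append({"learning_rate": learning_rate, "loss": loss, "val_rmse": val_rmse})

# Keep the combination with the lowest validation RMSE
best = min(results, key=lambda r: r["val_rmse"])
print(best)
```

Selecting on the validation set keeps the test set untouched for a final, unbiased check.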




**[4.4]** Import `dump` from the `joblib` package and save the trained model (`model2`) into the `models` folder as `ada_reg.joblib`

In [None]:
# Placeholder for student's code (Python code)

In [None]:
# Solution:
from joblib import dump

dump(model2,  '../models/ada_reg.joblib')

## 5. Push changes

**[5.1]** Add your changes to the git staging area

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
git add .

**[5.2]** Create the snapshot of your repository and add a description

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
git commit -m "adaboost with hyperparameter tuning"

**[5.3]** Push your snapshot to GitHub

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
git push --set-upstream origin adv_mla_4_adaboost

**[5.4]** Go to GitHub and merge your changes into the master/main branch

**[5.5]** Check out the master branch

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
git checkout master

**[5.6]** Pull the latest updates

In [None]:
# Placeholder for student's code (command line)

In [None]:
# Solution:
git pull

**[5.7]** Stop Jupyter Lab