# Loss Functions

In this exercise, you will compare the effects of different loss functions on a linear regression model.

👇 Import the data from the attached csv file

In [10]:
!ls
import pandas as pd
data = pd.read_csv('data.csv')
data.head()

data.csv  Loss-Functions.ipynb	README.md


| Unnamed: 0 | Relative Compactness | Surface Area | Wall Area | Roof Area | Overall Height | Glazing Area | Average Temperature |
|---|---|---|---|---|---|---|---|
| 0 | 0.98 | 514.5 | 294.0 | 110.25 | 7.0 | 0.0 | 18.44 |
| 1 | 0.98 | 514.5 | 294.0 | 110.25 | 7.0 | 0.0 | 18.44 |
| 2 | 0.98 | 514.5 | 294.0 | 110.25 | 7.0 | 0.0 | 18.44 |
| 3 | 0.98 | 514.5 | 294.0 | 110.25 | 7.0 | 0.0 | 18.44 |
4,0.9,563.5,318.5,122.5,7.0,0.0,24.56


🎯 Your task is to predict the average temperature inside a greenhouse based on its design. Your temperature predictions will help you select the appropriate greenhouse design for each plant, based on its climatic needs.

🌿 You know that plants can handle small temperature variations, but are exponentially more sensitive as the temperature variations increase. 

## 1. Theory 

❓ Theoretically, which Loss function would you train your model on to limit the risk of killing plants?

<details>
<summary> Answer </summary>
    
In theory, you would use a Mean Squared Error (MSE) loss function. Because it squares each error, it penalizes large prediction errors much more heavily than small ones, which pushes the model away from committing large errors. This should keep temperature errors small and lower the risk for plants.
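
To make the argument concrete, here is a minimal sketch with made-up error values (illustrative assumptions, not taken from `data.csv`). Both sets of errors have the same MAE, but the MSE is much higher for the set that contains one large miss:

```python
# Two hypothetical sets of prediction errors in °C (illustrative values only)
even_errors = [3.0, 3.0, 3.0, 3.0]    # four moderate misses
spiky_errors = [1.0, 1.0, 1.0, 9.0]   # three small misses, one large one

def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)

def mse(errors):
    """Mean squared error."""
    return sum(e ** 2 for e in errors) / len(errors)

print(mae(even_errors), mae(spiky_errors))  # 3.0 3.0
print(mse(even_errors), mse(spiky_errors))  # 9.0 21.0
```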

</details>

## 2. Application

### 2.1 Preprocessing

👇 Scale the features

In [11]:
from sklearn.preprocessing import MinMaxScaler

# Instantiate the scaler
scaler = MinMaxScaler()

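# Fit the scaler and transform the features
# (every column except the target, so the 'Unnamed: 0' index column is included too)
X_scaled = scaler.fit_transform(data.drop(columns = 'Average Temperature'))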
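
As a rough sketch of what min-max scaling does to a single column: each value is mapped to `(x - min) / (max - min)`, so the column ends up in the range [0, 1]. The helper `min_max_scale` below is a hypothetical illustration written for this note, not part of scikit-learn. The first two sample values are the 'Surface Area' values visible in the data preview; the third is made up.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the [0, 1] range (hypothetical helper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: map everything to 0.0 to avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

surface_area = [514.5, 563.5, 612.5]  # last value is made up for illustration
print(min_max_scale(surface_area))    # [0.0, 0.5, 1.0]
```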

### 2.2 Modelling

In this section, you are going to verify the theory by evaluating models optimized on different Loss functions.
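
Before using `SGDRegressor`, it may help to see how the choice of loss changes a single stochastic gradient step. The sketch below is a simplified, assumed illustration (one feature, fixed learning rate, fixed sample order, toy data generated from `y = 2x + 1`), not scikit-learn's actual implementation. With a squared loss the update is proportional to the error; with an absolute loss only the sign of the error matters.

```python
def sgd_fit(xs, ys, loss="squared", lr=0.1, epochs=500):
    """Fit y ≈ w * x + b with plain per-sample gradient steps (illustrative only)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            error = (w * x + b) - y
            if loss == "squared":
                # Derivative of 0.5 * error**2 w.r.t. the prediction is the error itself
                grad = error
            else:
                # Derivative of |error| is its sign (0 when the error is exactly 0)
                grad = (error > 0) - (error < 0)
            w -= lr * grad * x
            b -= lr * grad
    return w, b

# Toy, noise-free data generated from y = 2x + 1 (made-up values)
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2 * x + 1 for x in xs]

w_sq, b_sq = sgd_fit(xs, ys, loss="squared")
w_abs, b_abs = sgd_fit(xs, ys, loss="absolute", lr=0.01, epochs=2000)
print(w_sq, b_sq)
print(w_abs, b_abs)
```

With the fixed step size used here, the absolute-loss fit tends to hover near the solution instead of settling exactly on it, because every step has the same size regardless of how small the error is.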

### Least Squares (MSE) Loss

👇 **10-Fold Cross-validate** a **Linear Regression** model optimized by **Stochastic Gradient Descent** (SGD) on a **Least Squares Loss** (MSE). What are its R2 score and biggest error?




<details>
<summary>💡 Hint</summary>

    
- Sklearn's `SGDRegressor()` allows you to specify the loss function you wish to optimize your model on.

</details>



In [37]:
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import cross_validate
# Squared-error (least squares) loss; spelled 'squared_loss' in scikit-learn < 1.0
model = SGDRegressor(loss='squared_error')

# 10-fold cross-validate the model
cv_results_all = cross_validate(model, X_scaled, data['Average Temperature'], cv=10, scoring=['r2','max_error'])
meanscore = round(cv_results_all['test_r2'].mean(),2)
display(f'R2 score = {meanscore}')
# Mean over the 10 folds of each fold's max error score
error = round(cv_results_all['test_max_error'].mean(), 2)
display(f'Biggest error = {error}°C')

'R2 score = 0.89'

'Biggest error = -9.17°C'
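
The biggest error is printed as a negative number because the `max_error` scorer reports a negated value here, as the output above shows. The cell also averages the per-fold worst errors rather than taking the single worst one. Here is a small sketch of how fold scores could be turned into a positive worst-case error; the fold values are made-up placeholders, not the actual results of this notebook.

```python
# Hypothetical negated per-fold max errors, shaped like cv_results_all['test_max_error']
fold_scores = [-8.0, -10.0, -9.0]

# Flip the sign to get error magnitudes in °C
fold_max_errors = [-s for s in fold_scores]

mean_of_fold_maxima = sum(fold_max_errors) / len(fold_max_errors)  # what .mean() summarises
worst_overall = max(fold_max_errors)                                # single largest miss

print(mean_of_fold_maxima, worst_overall)  # 9.0 10.0
```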

In [40]:
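import sklearn.metrics
# List the available scorer names
# (newer scikit-learn releases expose them via sklearn.metrics.get_scorer_names() instead)
sklearn.metrics.SCORERS.keys()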

dict_keys(['explained_variance', 'r2', 'max_error', 'neg_median_absolute_error', 'neg_mean_absolute_error', 'neg_mean_squared_error', 'neg_mean_squared_log_error', 'neg_root_mean_squared_error', 'neg_mean_poisson_deviance', 'neg_mean_gamma_deviance', 'accuracy', 'roc_auc', 'roc_auc_ovr', 'roc_auc_ovo', 'roc_auc_ovr_weighted', 'roc_auc_ovo_weighted', 'balanced_accuracy', 'average_precision', 'neg_log_loss', 'neg_brier_score', 'adjusted_rand_score', 'homogeneity_score', 'completeness_score', 'v_measure_score', 'mutual_info_score', 'adjusted_mutual_info_score', 'normalized_mutual_info_score', 'fowlkes_mallows_score', 'precision', 'precision_macro', 'precision_micro', 'precision_samples', 'precision_weighted', 'recall', 'recall_macro', 'recall_micro', 'recall_samples', 'recall_weighted', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'jaccard', 'jaccard_macro', 'jaccard_micro', 'jaccard_samples', 'jaccard_weighted'])

### Mean Absolute Error (MAE) Loss

👇 **10-Fold Cross-validate** a **Linear Regression** model optimized by **Stochastic Gradient Descent** (SGD) on a **Mean Absolute Error** (MAE) Loss. What are its R2 score and biggest error?

<details>
<summary>💡 Hint 1</summary>

- MAE loss cannot be specified directly in `SGDRegressor`. It must be engineered by adjusting the right parameters.

</details>

<details>
<summary>💡 Hint 2 </summary>

- In `SGDRegressor`, one type of loss "ignores errors less than epsilon and is linear past that"

</details>

In [39]:
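# Epsilon-insensitive loss is linear beyond epsilon; epsilon=0 would make it exactly the absolute error (MAE)
MAE_model = SGDRegressor(loss='epsilon_insensitive', epsilon=0.1)

# 10-fold cross-validate the model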
cv_results = cross_validate(MAE_model, X_scaled, data['Average Temperature'], cv=10, scoring=['r2','max_error'])
meanscore = round(cv_results['test_r2'].mean(),2)
display(f'R2 score = {meanscore}')
error = round(cv_results['test_max_error'].mean(), 2)
display(f'Biggest error = {error}°C')

'R2 score = 0.86'

'Biggest error = -11.05°C'
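
As a sketch of why the epsilon-insensitive loss can stand in for MAE: it is `max(0, |y - prediction| - epsilon)`, so with `epsilon = 0` it equals the absolute error, while `epsilon = 0.1` ignores errors smaller than 0.1. The function below is a hand-written illustration with made-up numbers, not scikit-learn's internal code.

```python
def epsilon_insensitive(y_true, y_pred, epsilon):
    """Loss that is zero inside the epsilon tube and linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Made-up target and predictions in °C
print(epsilon_insensitive(20.0, 20.05, epsilon=0.1))  # 0.0: error below epsilon is ignored
print(epsilon_insensitive(20.0, 22.0, epsilon=0.1))   # about 1.9
print(epsilon_insensitive(20.0, 22.0, epsilon=0.0))   # 2.0: identical to the absolute error
```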

## 3. Conclusion

❓ Which of the models you evaluated seems the most appropriate for your task?

<details>
<summary> Answer </summary>
    
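- The model optimized on an MSE loss should, as predicted by the theory, make the smallest errors and be better suited to the task. In the runs above, it reached a higher R2 (0.89 vs 0.86) and a smaller biggest error (9.17°C vs 11.05°C in absolute value).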
    
</details>

### ⚠️ Please push your exercise when you are done 🙃

# 🏁