# Other Machine Learning Models

We will redo the exercise from the previous notebook, this time adding more complex models to the comparison:
- **Linear Regression** (already done)
- **Ridge Regression**
- **Lasso**
- **Elastic Net**
- **Polynomial Regression**
- **Support Vector Machine (SVM)**
- **Decision Tree Regression**
- **Random Forest (RF)**

## Libraries

```python
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Linear models
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Ridge
from sklearn.linear_model import Lasso
from sklearn.linear_model import ElasticNet
# Non-linear models
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

import seaborn as sns
from matplotlib import pyplot as plt

%matplotlib inline
sns.set_theme()
```

## Data

- Download the **Bottle Database** (CSV file) from the **California Cooperative Oceanic Fisheries Investigations (CalCOFI)** portal:<br>
download: https://www.kaggle.com/datasets/sohier/calcofi<br>
info: https://calcofi.org/data/oceanographic-data/bottle-database/
- Import the data and inspect them with `pandas`.
- Select only the following columns of the dataset:<br>
``columns = ["T_degC", "O2Sat", "O2ml_L", "STheta", "Salnty"]``
- Remove rows that contain missing values.<br>
`data = data[data[columns].notnull().all(axis=1)]`
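
The selection and cleaning steps above can be sketched as follows. This is a minimal illustration, not the reference solution: the file name `bottle.csv` in the comment is an assumption about the download, and the small `raw` frame is made-up stand-in data so the snippet runs without the real file. It uses `dropna()`, which has the same effect as the `notnull().all(axis=1)` mask once the columns are selected.

```python
import numpy as np
import pandas as pd

columns = ["T_degC", "O2Sat", "O2ml_L", "STheta", "Salnty"]

# With the real file, the frame would come from something like
#   raw = pd.read_csv("bottle.csv")   # file name is an assumption
# A tiny stand-in frame (made-up values) keeps this sketch runnable.
raw = pd.DataFrame(
    {
        "T_degC": [10.5, 10.4, np.nan, 9.8],
        "O2Sat": [92.1, np.nan, 88.0, 85.3],
        "O2ml_L": [5.9, 5.8, 5.6, 5.4],
        "STheta": [25.6, 25.7, 25.8, 25.9],
        "Salnty": [33.44, 33.44, 33.42, 33.40],
        "Depthm": [0, 8, 10, 19],  # extra column, dropped by the selection
    }
)

# Keep the five columns of interest, then drop every row with a missing value.
data = raw[columns].dropna()

print(data.shape)
print(data.describe())
```

On the real dataset, `data.info()` and `data.describe()` are quick ways to check that no missing values remain.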

## Features and Target

The **feature variables** are `"T_degC", "O2Sat", "O2ml_L", "STheta"`.

The **target variable** is `"Salnty"`.

We want to predict the **target** using the **features**.

- Create the feature tensor $\boldsymbol{X}$ (2D) and the target tensor $\boldsymbol{y}$ (1D).
- Shuffle the data and split them into train and test sets:<br>
(80% train / 20% test, use `train_test_split(...)`)
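
A possible way to build $\boldsymbol{X}$ and $\boldsymbol{y}$ and split them is sketched below. It assumes the four non-target columns are used as features and replaces the cleaned bottle data with random stand-in values; `random_state=0` is an arbitrary choice made only for reproducibility.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in for the cleaned data (not real measurements).
data = pd.DataFrame(
    rng.normal(size=(100, 5)),
    columns=["T_degC", "O2Sat", "O2ml_L", "STheta", "Salnty"],
)

features = ["T_degC", "O2Sat", "O2ml_L", "STheta"]  # assumed feature set
target = "Salnty"

X = data[features].to_numpy()  # 2D array, shape (n_samples, n_features)
y = data[target].to_numpy()    # 1D array, shape (n_samples,)

# Shuffle, then hold out 20% of the rows as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=0
)
print(X_train.shape, X_test.shape)
```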

## Models

For each of the following models:
- **Linear Regression**<br>
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression
- **Ridge Regression**<br>
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html#sklearn.linear_model.Ridge
- **Lasso**<br>
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html#sklearn.linear_model.Lasso
- **Elastic Net**<br>
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html#sklearn.linear_model.ElasticNet
- **Polynomial Regression**<br>
https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html#sklearn.preprocessing.PolynomialFeatures<br>
and<br>
https://scikit-learn.org/stable/modules/linear_model.html#polynomial-regression-extending-linear-models-with-basis-functions
- **Support Vector Machine (SVM)**<br>
https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html#sklearn.svm.SVR
- **Decision Tree Regression**<br>
https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html#sklearn.tree.DecisionTreeRegressor
- **Random Forest**<br>
https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor

1. Train the model on the train set.
2. Compute the *predictions* and the *score* on the test set (for scikit-learn regressors, `score` returns the coefficient of determination $R^2$; see the documentation).
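
The two steps above follow the same pattern for every scikit-learn regressor, so they can be wrapped in a small helper. The sketch below is one way to do it; `fit_and_score` is a hypothetical name introduced for illustration, the data are synthetic, and `alpha=1.0` is simply the `Ridge` default.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic linear data standing in for the bottle features and target.
X = rng.normal(size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)


def fit_and_score(model, X_train, y_train, X_test, y_test):
    """Fit on the train set; return test predictions and the test R^2 score."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return y_pred, model.score(X_test, y_test)


models = {"Linear": LinearRegression(), "Ridge": Ridge(alpha=1.0)}
scores = {}
for name, model in models.items():
    _, scores[name] = fit_and_score(model, X_train, y_train, X_test, y_test)
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

Collecting the scores in a dictionary keyed by model name makes the final comparison plot straightforward.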

### Linear Regression

### Ridge Regression

### Lasso Regression

### Elastic Net

### Polynomial Regression

Fit a polynomial regression of degree 3:
1. Transform your features with `PolynomialFeatures`.
2. Fit a standard `LinearRegression` on the transformed features.

For an example, see:<br>
https://scikit-learn.org/stable/modules/linear_model.html#polynomial-regression-extending-linear-models-with-basis-functions
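
As an illustration of these two steps, the sketch below chains `PolynomialFeatures` and `LinearRegression` with `make_pipeline`, so the transform is applied consistently at fit and predict time. The data are synthetic (two features with a cubic relationship), chosen only to show the mechanics.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
# Synthetic data with a cubic term and an interaction term.
X = rng.uniform(-1, 1, size=(200, 2))
y = X[:, 0] ** 3 - 2 * X[:, 0] * X[:, 1] + 0.01 * rng.normal(size=200)
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Step 1 (polynomial expansion) and step 2 (linear fit) in one estimator.
poly_model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
poly_model.fit(X_train, y_train)

# Number of polynomial terms generated from the 2 input features.
n_terms = poly_model.named_steps["polynomialfeatures"].n_output_features_
score = poly_model.score(X_test, y_test)
print(n_terms, round(score, 3))
```

The number of generated terms grows quickly with the degree and the number of features, which is worth keeping in mind on the full dataset.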

### Support Vector Machine (SVM)

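Before fitting your SVM, use a `StandardScaler` to standardize your features (zero mean and unit variance). Fit the scaler on the train set only, then apply it to both the train and test sets.<br>
For further details, see:<br>
https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html#sklearn.svm.SVR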
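
A minimal sketch of scaling followed by an SVM regressor is given below, on synthetic data with features on very different scales. `StandardScaler` standardizes each feature to zero mean and unit variance (the values are not bounded to a fixed interval), and putting it in a pipeline ensures it is fitted on the train set only. The default `SVR()` hyperparameters are used here as an assumption, not as tuned values.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Two synthetic features with very different means and spreads.
X = rng.normal(loc=[10.0, 50.0], scale=[2.0, 20.0], size=(300, 2))
y = 0.5 * X[:, 0] + 0.02 * X[:, 1] + 0.05 * rng.normal(size=300)
X_train, X_test = X[:240], X[240:]
y_train, y_test = y[:240], y[240:]

# The scaler is fitted on X_train only, then reused when scoring X_test.
svm_model = make_pipeline(StandardScaler(), SVR())
svm_model.fit(X_train, y_train)
score = svm_model.score(X_test, y_test)

# Inspect the standardized train features: mean ~0 and std ~1 per column.
scaled = svm_model.named_steps["standardscaler"].transform(X_train)
print(scaled.mean(axis=0), scaled.std(axis=0))
print(round(score, 3))
```

Note that `SVR` training time grows quickly with the number of samples, so on the full bottle dataset it may be necessary to train on a subsample.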

### Decision Tree Regression

### Random Forest (RF)

### Plot

Plot the respective scores of the models.
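
One way to draw this comparison is a horizontal bar chart of the test scores, sketched below. Only three models are fitted, on synthetic data, to keep the example short; in the notebook the `scores` dictionary would hold the scores computed in the sections above. The hyperparameters (`n_estimators=50`, `random_state=0`) are arbitrary illustrative choices.

```python
import numpy as np
from matplotlib import pyplot as plt
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
# Synthetic data with a linear part and one interaction term.
X = rng.normal(size=(300, 4))
y = X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] * X[:, 3] + 0.1 * rng.normal(size=300)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Linear": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
}
# fit() returns the estimator itself, so score() can be chained.
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}

# One bar per model, labelled with the model name.
fig, ax = plt.subplots(figsize=(6, 3))
ax.barh(list(scores.keys()), list(scores.values()))
ax.set_xlabel("R^2 score on the test set")
ax.set_title("Model comparison")
fig.tight_layout()
```

In a notebook with `%matplotlib inline`, the figure is displayed at the end of the cell.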