# Partial Least Squares (PLS) Regression
- Partial least squares has the same motivation as PCA regression: to make ill-conditioned regression problems tractable.


- Where PCA regression builds uncorrelated components as linear combinations of the input variables only, chosen to capture the variance of the inputs, PLS regression builds linear combinations of the input variables that maximise the covariance between the inputs and the output variables.


- PLS regression is a supervised alternative to PCA regression, because it uses the target (response) vector when building the components.


- This is useful when there are more variables than observations (many columns compared to rows), or when the input variables are collinear.

## Exercise: Creating PLS models

In this session we'll try to make a PLS model using the octane dataset. The principles are the same as what we've done throughout today, and we're going to take it in small steps. Initially, don't worry about cross-validation; just try to get some code working with a fixed number of components. We will then progress to using cross-validation to choose the best number of components.

### 1) Import the required libraries

PLSRegression is used like the other supervised learning models and is found in:

```
from sklearn.cross_decomposition import PLSRegression
from sklearn import model_selection
from sklearn.metrics import mean_squared_error
```

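And don't forget numpy, pandas and matplotlib. The later code in this exercise uses `np` and `plt`, the conventional aliases for `numpy` and `matplotlib.pyplot`.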


### 2) Import the dataset

Import the Octane dataset into a dataframe. Use `.head()` to have a look at it.

###  3) Train a model

Separate the inputs and the outputs. Then train a model using a small number of components.

Use it like this for a given number of components:

```
model = PLSRegression(n_components=number_of_components)
model.fit(X_training_data, y_training_data)
```


### 4) Look at how the model changes as you change the number of components.

i) Check the R-squared value of the model.

ii) Create a plot of fitted vs true value.

Some options: matplotlib's ```plt.plot()``` or seaborn's ```sns.scatterplot()```, or seaborn's ```sns.regplot()```

iii) Create a score plot like you did in PCA using a loop over the number of components. 
     Try between 1 and 20 components.

iv) Find out how to look at the "components", called "loading vectors" in PLS jargon.  

v) Plot the component vectors

vi) (Optional) Plot the observations in the new component space. E.g., plot the observations using component 1 and 2 as the new axes. Try with component 2 and 3. Are there any possible clusters? Hint: See am2_dimreduction notebook for a similar example for PCA. 

###  5) Use cross validation to determine the best number of components

You can cross validate in multiple ways.  This may be the simplest:

```
kf_10 = model_selection.KFold(n_splits=10, shuffle=True, random_state=1)
mse = []

# np.arange(1, 20) covers 1 to 19 components
for idx in np.arange(1, 20):
    model = PLSRegression(n_components=idx)
    # 'neg_mean_squared_error' returns the negated MSE (higher is better),
    # so flip the sign to recover the MSE itself
    neg_mse = model_selection.cross_val_score(model, X_training_data,
        y_training_data, cv=kf_10, scoring='neg_mean_squared_error').mean()
    mse.append(-neg_mse)

# plot the validation curve
plt.plot(np.arange(1, 20), np.array(mse), '-v')
plt.xlabel('Number of components for 10-fold CV')
plt.ylabel('MSE')
plt.title('PLS Cross-validation')
```
