## - In this notebook I will explain:
#### 1- Pipelines
#### 2- Support Vector Regression (SVR) & Principal Component Analysis (PCA) Techniques
#### 3- Cross validation with pipelines

### Introduction :

1- Pipelines :
A machine learning pipeline helps automate a machine learning workflow. It chains a sequence of data transformations together with a model, so that the whole sequence can be fitted, tested, and evaluated as a single unit.
You can think of it as combining several steps into one pipe and then using that pipe wherever a model is expected, which keeps the code shorter and less error-prone. It is also very useful with cross-validation, because the transformations are re-fitted on the training part of each fold!
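
To make the idea concrete, here is a minimal sketch that compares doing the two steps by hand with wrapping them in a pipeline. The data is synthetic, and the names `X_demo`, `y_demo` and the choice of 3 components are assumptions made only for this illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 5))
y_demo = X_demo[:, 0] + 0.5 * X_demo[:, 1] + rng.normal(scale=0.1, size=100)

# Manual version: fit the transformer, then fit the model on the transformed data
pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X_demo)
svr = SVR().fit(X_reduced, y_demo)
manual_pred = svr.predict(pca.transform(X_demo))

# Pipeline version: the same two steps wrapped in a single estimator
pipe = make_pipeline(PCA(n_components=3), SVR())
pipe_pred = pipe.fit(X_demo, y_demo).predict(X_demo)

# Both routes should give the same predictions
print(np.allclose(manual_pred, pipe_pred))
```

The pipeline version needs only one `fit` and one `predict` call, which is what makes it easy to hand over to cross-validation later.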

2- Support Vector Regression (SVR):
Support Vector Regression (SVR) uses the same principle as SVM, but for regression problems. 
![](https://cdn.analyticsvidhya.com/wp-content/uploads/2020/03/SVR1.png)

Consider the two red lines as the decision boundary and the green line as the hyperplane. In SVR, the objective is to consider the points that fall within the decision boundary lines. The best-fit line is the hyperplane that contains the maximum number of points within that boundary.
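
As a rough sketch of this idea, the snippet below fits an `SVR` on synthetic data and counts how many points fall inside the boundary. In scikit-learn the half-width of that boundary is the `epsilon` parameter. The data and the values `C=1.0` and `epsilon=0.2` are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X_demo = np.sort(rng.uniform(0, 5, size=(80, 1)), axis=0)
y_demo = np.sin(X_demo).ravel() + rng.normal(scale=0.1, size=80)

# epsilon sets how far a point may be from the fitted curve without being penalized
svr = SVR(kernel="rbf", C=1.0, epsilon=0.2).fit(X_demo, y_demo)

# Points whose absolute error is at most epsilon lie inside the boundary
residuals = np.abs(y_demo - svr.predict(X_demo))
inside_tube = residuals <= svr.epsilon

print(f"{inside_tube.sum()} of {len(y_demo)} points lie inside the boundary")
print(f"{len(svr.support_)} points are support vectors")
```

Making `epsilon` larger widens the boundary, so more points fall inside it and fewer of them end up as support vectors.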

3- Principal Component Analysis (PCA):
PCA is a statistical procedure that uses an orthogonal transformation to convert a set of correlated variables into a set of uncorrelated variables. It is one of the most widely used tools in exploratory data analysis and in machine learning for predictive models. PCA is also an unsupervised technique used to examine the interrelations among a set of variables. It is sometimes described as a general factor analysis where regression determines a line of best fit.
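
The "correlated in, uncorrelated out" property can be checked with a small sketch. The three columns below are synthetic and built to be strongly correlated. The names and noise level are assumptions for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA

# Three noisy copies of the same signal, so the columns are strongly correlated
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X_demo = np.hstack([base + rng.normal(scale=0.3, size=(200, 1)) for _ in range(3)])

# Correlation matrix before PCA: large off-diagonal values
corr_before = np.corrcoef(X_demo, rowvar=False)
print(np.round(corr_before, 2))

# Correlation matrix after PCA: off-diagonal values are numerically zero
components = PCA().fit_transform(X_demo)
corr_after = np.corrcoef(components, rowvar=False)
print(np.round(corr_after, 2))
```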

___________________________________________________________________________________________________________________________

#### So we will combine PCA & SVR into a pipeline, then pass the pipeline into cross-validation and get the score!

### - Importing data and libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

In [None]:
df_happ2019 = pd.read_csv('../input/world-happiness/2019.csv')
df_happ2019.head()

### - Quick visualization:

In [None]:
plt.figure(figsize=(10,10))
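# Use only numeric columns: recent pandas versions raise an error on text columns in corr()
sns.heatmap(df_happ2019.select_dtypes('number').corr(), annot=True)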

In [None]:
# Same numeric-only selection as for the heatmap
sns.clustermap(df_happ2019.select_dtypes('number').corr())

In [None]:
sns.lmplot(x='Perceptions of corruption',y='Healthy life expectancy',data=df_happ2019)

In [None]:
x = df_happ2019['Freedom to make life choices']
y = df_happ2019['Score']
cmap = sns.cubehelix_palette(light=1, as_cmap=True)
# Recent seaborn versions expect x/y as keywords and use fill= instead of shade=
sns.kdeplot(x=x, y=y, cmap=cmap, fill=True);

In [None]:
sns.jointplot(x='GDP per capita',y='Social support',data=df_happ2019,kind='hex')

In [None]:
sns.jointplot(x='Healthy life expectancy',y='Social support',data=df_happ2019,kind='reg')

### - Remove the "Country or region" column, because PCA and SVR deal with numeric values only

In [None]:
df_happ2019=df_happ2019.drop('Country or region', axis=1)

### - Check NaN values:

In [None]:
plt.figure(figsize=(10,10))
sns.heatmap(df_happ2019.isnull(),cmap="YlGnBu")

### - Prepare the data and imports for our model:

In [None]:
X= df_happ2019.drop('Score',axis=1)
y=df_happ2019['Score']

In [None]:
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR
from sklearn.decomposition import PCA


#### Then we combine PCA & SVR with make_pipeline:

In [None]:
my_pipeline= make_pipeline(PCA(),SVR())
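
`make_pipeline` names each step automatically after its class in lowercase, and those names can be used to change a step's parameters. The sketch below builds a separate `demo_pipeline` so that `my_pipeline` is left unchanged. The values `3` and `10.0` are arbitrary assumptions, not tuned settings.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR

# A separate pipeline used only for this illustration
demo_pipeline = make_pipeline(PCA(), SVR())

# Step names are generated from the class names
step_names = list(demo_pipeline.named_steps)
print(step_names)

# Parameters are addressed as <step name>__<parameter name>
demo_pipeline.set_params(pca__n_components=3, svr__C=10.0)
print(demo_pipeline.named_steps['pca'].n_components)
print(demo_pipeline.named_steps['svr'].C)
```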

#### Pass my_pipeline into cross_val_score and get the scores!

In [None]:
from sklearn.model_selection import cross_val_score
scores= cross_val_score(my_pipeline,X,y,scoring='neg_mean_absolute_error')
scores

#### Note :
Negative mean absolute error: as its name implies, negative MAE is simply the negative of the MAE, which is by definition a non-negative quantity. scikit-learn uses the negated value because its scoring convention is that higher is better, so a score closer to zero means a smaller error. This makes it convenient for finding the best algorithm when you compare several of them through GridSearchCV() or cross_val_score().
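
To read the scores as ordinary errors, flip the sign. Here is a minimal sketch on synthetic data. The names `X_demo`, `y_demo` and the choice of `cv=5` are assumptions for illustration, and the numbers it prints say nothing about the happiness dataset.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVR

# Synthetic regression data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(120, 4))
y_demo = X_demo @ np.array([1.0, 0.5, -0.5, 0.2]) + rng.normal(scale=0.1, size=120)

# One negative MAE value per fold
neg_mae = cross_val_score(
    make_pipeline(PCA(), SVR()),
    X_demo,
    y_demo,
    cv=5,
    scoring='neg_mean_absolute_error',
)

# Flip the sign to get the usual MAE, then summarize across folds
mae_per_fold = -neg_mae
print(mae_per_fold)
print(mae_per_fold.mean(), mae_per_fold.std())
```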

### Sources: 
1- [Pipelines](https://medium.com/analytics-vidhya/what-is-a-pipeline-in-machine-learning-how-to-create-one-bda91d0ceaca)

2- [SVR](https://www.analyticsvidhya.com/blog/2020/03/support-vector-regression-tutorial-for-machine-learning/)

3- [PCA](https://www.geeksforgeeks.org/ml-principal-component-analysispca/)

4- [Neg Mean Abs Error](https://stackoverflow.com/questions/55786121/what-is-the-negative-mean-absolute-error-in-scikit-learn-svm-library)