<h1 align="center">Regularization</h1>

In [1]:
import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error

## Dataset

In [14]:
# Loading the data for Mini-Challenges
df = pd.read_csv('../data/auto.csv')
mpg = df.iloc[:,0]
df.drop(['origin','mpg'],axis=1,inplace=True)
df = pd.concat([df,mpg],axis=1)

# Printing the first 5 records 
print (df.head())

   displ   hp  weight  accel  size   mpg
0  250.0   88    3139   14.5  15.0  18.0
1  304.0  193    4732   18.5  20.0   9.0
2   91.0   60    1800   16.4  10.0  36.1
3  250.0   98    3525   19.0  15.0  18.5
4   97.0   78    2188   15.8  10.0  34.3


### Mini Challenge - 1
***
### Instructions
* Store all independent variables in `X` and the target variable (i.e. `mpg`) in `y`.
* Perform hold-out validation by splitting the data into train and test sets with `random_state=9` and `test_size=0.2`.

In [3]:
X = df.loc [:,['displ','hp','weight', 'accel', 'size']]
y = df.loc [:, 'mpg']
print (X.shape)
print (y.shape)


(392, 5)
(392,)


In [4]:
from sklearn.model_selection import train_test_split

# Split the data into training and test sets (the test set is kept aside for the final evaluation)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=6)

# Hold-out method: split the training data again into training and validation sets
# (test_feat / test_tar hold the validation split)
train_feat, test_feat, train_tar, test_tar = train_test_split(X_train, y_train, test_size=0.2, random_state=9)
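
The two-stage split above leaves three disjoint subsets. As a minimal sketch, the cell below repeats the same two calls on a placeholder array with 392 rows (the values are arbitrary and only the row count mirrors the dataset) to show how many rows land in each subset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 392 rows like the dataset above, values are arbitrary
X_demo = np.arange(392 * 5).reshape(392, 5)
y_demo = np.arange(392)

# Stage 1: hold out 30% of all rows as the final test set
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.3, random_state=6)

# Stage 2: hold out 20% of the remaining rows as the validation set
X_fit, X_val, y_fit, y_val = train_test_split(X_tr, y_tr, test_size=0.2, random_state=9)

# Rows used for fitting, validation and final testing
print(len(X_fit), len(X_val), len(X_te))
```

The model is therefore fitted on roughly 56% of the rows, validated on about 14%, and tested on the remaining 30%.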




### Mini Challenge - 2
***
### Instructions
* Fit a Linear Regression model on the training split from the hold-out step above and compute its mean squared error on the validation split.

In [5]:
from sklearn.linear_model import LinearRegression

# instantiate linear regression model
regressor = LinearRegression()

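# fit model on training data
regressor.fit(train_feat, train_tar)
# predict on the validation split
y_pred = regressor.predict(test_feat)

# mean squared error between the validation targets and the predictions
mse = mean_squared_error(test_tar, y_pred)
print('Mean Squared Error for Linear Regression: ', mse)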

Mean Squared Error for Linear Regression:  15.514624539685553
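
For reference, the mean squared error is the average of the squared differences between targets and predictions. The sketch below uses made-up numbers (not taken from the dataset) to check a manual computation against `mean_squared_error`.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Made-up targets and predictions, only to illustrate the formula
y_true_demo = np.array([18.0, 9.0, 36.1, 18.5])
y_pred_demo = np.array([20.0, 11.0, 33.1, 18.5])

# MSE = mean of the squared residuals
mse_manual = np.mean((y_true_demo - y_pred_demo) ** 2)
mse_sklearn = mean_squared_error(y_true_demo, y_pred_demo)

# RMSE is the square root of the MSE and is expressed in the target's units
rmse_manual = np.sqrt(mse_manual)

print(mse_manual, mse_sklearn, rmse_manual)
```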


### Mini Challenge - 3
***
### Instructions
* Fit a Lasso model on the same training split with `alpha=140`, `max_iter=100000`, `random_state=9`

In [6]:
from sklearn.linear_model import Lasso
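# Note: the `normalize` argument was removed in scikit-learn 1.2, so this cell runs as written only on older releases
lassoreg = Lasso(alpha=140, normalize=True, max_iter=1e5, random_state=9)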
lassoreg.fit(train_feat, train_tar)
lasso_pred= lassoreg.predict(test_feat)
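
A penalty as large as `alpha=140` can shrink every Lasso coefficient to exactly zero, in which case the model predicts the training mean for every row. The sketch below illustrates this on synthetic data (an assumption, not the auto dataset) and leaves out the removed `normalize` argument. This could be one reason the Lasso error reported later in this notebook is much larger than the Ridge error, although the notebook does not say so.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic regression data, used only for illustration
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 5))
y_demo = X_demo @ np.array([3.0, -2.0, 0.0, 1.0, 0.5]) + rng.normal(scale=0.1, size=100)

# A penalty this strong outweighs the gain from any non-zero coefficient
strong_lasso = Lasso(alpha=140, max_iter=100000).fit(X_demo, y_demo)

print(strong_lasso.coef_)                       # coefficient vector
print(strong_lasso.intercept_, y_demo.mean())   # intercept vs. mean of the target
```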

### Mini Challenge - 4
***
### Instructions
* Fit a Ridge model having `alpha=0.00001`, `max_iter=100000`, `random_state=9`

In [7]:
from sklearn.linear_model import Ridge
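# Note: the `normalize` argument was removed in scikit-learn 1.2, so this cell runs as written only on older releases
ridgereg = Ridge(alpha=0.00001, normalize=True, max_iter=1e5, random_state=9)
ridgereg.fit(train_feat, train_tar)
ridge_pred = ridgereg.predict(test_feat)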

### Mini Challenge - 5
***
### Instructions
* Perform k-fold cross-validation with `cv=5` on the models fitted above, using negative mean squared error (`neg_mean_squared_error`) as the scoring metric.

In [12]:
from sklearn.metrics import make_scorer
from sklearn.model_selection import cross_val_score
scorer = make_scorer(mean_squared_error, greater_is_better = False)


# cross validation with Lasso (the scorer returns negated MSE, so take the absolute value of the mean)
mse_lasso = abs(np.mean(cross_val_score(lassoreg, X_train, y_train, scoring=scorer, cv=5)))
print('Lasso mean squared error: ', mse_lasso)
# cross validation with Ridge
mse_ridge = abs(np.mean(cross_val_score(ridgereg, X_train, y_train, scoring=scorer, cv=5)))
print('Ridge mean squared error:', mse_ridge)
# select the model with the lower cross-validated error
Model = lassoreg if mse_lasso < mse_ridge else ridgereg
print ("Best Model is ", Model, sep ="\n")


Lasso mean squared error:  56.38551015628574
Ridge mean squared error: 15.669132031402977
Best Model is 
Ridge(alpha=1e-05, copy_X=True, fit_intercept=True, max_iter=100000.0,
   normalize=True, random_state=9, solver='auto', tol=0.001)
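
The custom scorer built with `make_scorer(..., greater_is_better=False)` returns negated MSE values, which is also what the built-in `'neg_mean_squared_error'` scoring string named in the instructions does. A minimal sketch with plain linear regression on synthetic data (a stand-in, not the auto dataset):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the training split
X_demo, y_demo = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=9)

# One negated MSE per fold; values closer to 0 are better
neg_mse = cross_val_score(LinearRegression(), X_demo, y_demo,
                          scoring='neg_mean_squared_error', cv=5)

cv_mse = -neg_mse.mean()     # average MSE across the 5 folds
cv_rmse = np.sqrt(cv_mse)    # square root of the average MSE
print(neg_mse)
print(cv_mse, cv_rmse)
```

Flipping the sign (or taking the absolute value, as in the cell above) recovers an ordinary MSE. A root mean squared error needs the extra square root.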


In [9]:
model_pred = Model.predict(X_test)
mse=mean_squared_error(y_test,model_pred)
print ('Mean Squared Error for Linear Regression with Regularisation is : ', mse)

Mean Squared Error for Linear Regression with Regularisation is :  24.76387714705963


## Quiz

Q1. Which of the following statement(s) is/are true for Gradient Descent (GD) and Stochastic Gradient Descent (SGD)?
```python
1. In GD and SGD, you update a set of parameters in an iterative manner to minimize the error function. 
2. In SGD, you have to run through all the samples in your training set for a single update of a parameter 
   in each iteration. 
3. In GD, you either use the entire data or a subset of training data to update a parameter in each 
   iteration. 

A) Only 1
B) Only 2
C) Only 3
D) 1 and 2
E) 2 and 3
F) 1,2 and 3

Ans: A

```
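
To make the difference in Q1 concrete, here is a minimal sketch of both update rules for a linear model with squared error. The data, learning rate and iteration counts are illustrative assumptions. Batch GD computes each update from every sample, while SGD computes each update from a single randomly chosen sample.

```python
import numpy as np

# Synthetic data with known weights, used only for illustration
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y_demo = X_demo @ true_w + rng.normal(scale=0.01, size=200)

def gradient(w, X, y):
    """Gradient of the mean squared error for a linear model y ~ X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

lr = 0.02  # learning rate (assumed value)

# Batch gradient descent: every update uses the full training set
w_gd = np.zeros(3)
for _ in range(500):
    w_gd -= lr * gradient(w_gd, X_demo, y_demo)

# Stochastic gradient descent: every update uses one random sample
w_sgd = np.zeros(3)
for _ in range(5000):
    i = rng.integers(len(y_demo))
    w_sgd -= lr * gradient(w_sgd, X_demo[i:i + 1], y_demo[i:i + 1])

print(w_gd)
print(w_sgd)
```

Both loops iteratively update the same parameters to reduce the error, which is why only statement 1 holds for both methods.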
Q2. Which of the following options is/are true for K-fold cross-validation?
```python
1. Increase in K will result in higher time required to cross validate the result.
2. Higher values of K will result in higher confidence on the cross-validation result as compared to lower 
   value of K.
3. If K=N, then it is called Leave one out cross validation, where N is the number of observations.
 

A) 1 and 2
B) 2 and 3
C) 1 and 3
D) 1,2 and 3

Ans: D

```
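
Statement 3 of Q2 can be checked directly: setting the number of folds equal to the number of observations gives the same splits as leave-one-out. The sketch below uses a tiny placeholder array with N = 6.

```python
import numpy as np
from sklearn.model_selection import KFold, LeaveOneOut

X_demo = np.arange(12).reshape(6, 2)   # N = 6 observations
n = len(X_demo)

# K-fold with K = N versus leave-one-out
kfold_splits = list(KFold(n_splits=n).split(X_demo))
loo_splits = list(LeaveOneOut().split(X_demo))

print(len(kfold_splits), len(loo_splits))
# Each test fold contains exactly one observation
print([test.tolist() for _, test in kfold_splits])
```

Each of the N folds requires fitting the model once, which also illustrates statement 1: a larger K means more fits and more time.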
Q3. Which of the following is true about “Ridge” or “Lasso” regression methods in case of feature selection?
```python
A. Ridge regression uses subset selection of features
B. Lasso regression uses subset selection of features
C. Both use subset selection of features
D. None of above

Ans: B
```
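
The answer to Q3 follows from the shape of the two penalties: the L1 penalty in Lasso can set coefficients exactly to zero, while the L2 penalty in Ridge only shrinks them. A minimal sketch on synthetic data in which only the first two of six features matter (the data and `alpha=0.5` are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: only the first two features influence the target
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 6))
y_demo = 4.0 * X_demo[:, 0] - 3.0 * X_demo[:, 1] + rng.normal(scale=0.5, size=200)

lasso = Lasso(alpha=0.5).fit(X_demo, y_demo)
ridge = Ridge(alpha=0.5).fit(X_demo, y_demo)

print("Lasso coefficients:", lasso.coef_)
print("Ridge coefficients:", ridge.coef_)
print("Lasso zero coefficients:", int(np.sum(lasso.coef_ == 0)))
print("Ridge zero coefficients:", int(np.sum(ridge.coef_ == 0)))
```

The features whose Lasso coefficients are exactly zero are effectively dropped from the model, which is the sense in which Lasso performs subset selection.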
Q4. What is/are true about ridge regression?
```python
1. When lambda is 0, model works like linear regression model
2. When lambda is 0, model doesn’t work like linear regression model
3. When lambda goes to infinity, we get very, very small coefficients approaching 0
4. When lambda goes to infinity, we get very, very large coefficients approaching infinity

A. 1 and 3
B. 1 and 4
C. 2 and 3
D. 2 and 4

Ans: A
```
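
Both limits in Q4 can be seen from the closed-form ridge solution, `(X^T X + lambda * I)^-1 X^T y`. The sketch below implements it with NumPy on synthetic data, without an intercept for simplicity. The helper name `ridge_coefficients` is hypothetical.

```python
import numpy as np

# Synthetic data, used only for illustration
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 3))
y_demo = X_demo @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

def ridge_coefficients(X, y, lam):
    """Closed-form ridge solution without intercept: (X^T X + lam * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Ordinary least squares solution for comparison
ols = np.linalg.lstsq(X_demo, y_demo, rcond=None)[0]
print("OLS:", ols)

# lambda = 0 reproduces OLS; larger lambda shrinks the coefficients toward 0
for lam in [0.0, 1.0, 100.0, 1e6]:
    print(lam, ridge_coefficients(X_demo, y_demo, lam))
```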

# Thank You