# Part 1: Polynomial Regression

### A) Using the Auto dataset, find the test $R^2$ score of a linear regression model that predicts miles per gallon (mpg) from horsepower.

### B) Use polynomial regression to include both the horsepower feature and $(horsepower)^2$ in the regression model. Find the test $R^2$ score.

Hint: You can use [numpy.concatenate](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.concatenate.html). For example, to append the column vector $W^2$ to an array `U`, use `X = np.concatenate((U, W**2), axis=1)`.
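
As a minimal sketch of the hint, the snippet below appends a squared column to a small feature matrix. The arrays `U` and `W` are made-up illustrative values, not taken from the Auto dataset; both must be two-dimensional with the same number of rows for `axis=1` to work.

```python
import numpy as np

# Illustrative values only (assumption): a 3x1 feature matrix and a 3x1 column vector
U = np.array([[1.0], [2.0], [3.0]])
W = np.array([[1.0], [2.0], [3.0]])

# Stack W squared next to U as a second column
X = np.concatenate((U, W**2), axis=1)
print(X.shape)
print(X)
```

If `W` were one-dimensional, it would first need `W.reshape(-1, 1)` so that the shapes line up.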

In [39]:
from sklearn.model_selection import train_test_split
from pandas import read_csv
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score



AutoData=read_csv('Auto_modify.csv') # read the data

X_auto_hp=AutoData.horsepower.values.reshape(-1,1) # define features: horsepower 
Y_auto_mpg=AutoData.mpg.values.reshape(-1,1) # define label: miles per gallon

X_train, X_test, Y_train, Y_test = train_test_split(X_auto_hp,Y_auto_mpg,random_state=0)
linreg=LinearRegression().fit(X_train,Y_train)
Target_predicted= linreg.predict(X_test)
print("R2 score for  linear regression",r2_score(Y_test,Target_predicted))

# Add the squared horsepower as a second feature for polynomial regression
AutoData['horsepower2']=AutoData.horsepower*AutoData.horsepower
X = AutoData[['horsepower','horsepower2']].values
Y = AutoData.mpg
X_train, X_test, Y_train, Y_test = train_test_split(X,Y,random_state=0)
linreg=LinearRegression().fit(X_train,Y_train) # linear regression on [horsepower, horsepower^2]
Target_predicted= linreg.predict(X_test)
print("R2 score for polynomial regression",r2_score(Y_test,Target_predicted))

R2 score for  linear regression 0.6217658811398383
R2 score for polynomial regression 0.7271031504642005
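
The squared column can also be generated by scikit-learn instead of by hand. The sketch below is one possible alternative, not the method used in the cell above: it chains `PolynomialFeatures` and `LinearRegression` in a pipeline. It runs on synthetic data (an assumption, since `Auto_modify.csv` is not available here), so its scores are not comparable to the ones printed above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the Auto data (assumption, not the real dataset):
# mpg falls with horsepower and flattens out at high horsepower.
rng = np.random.default_rng(0)
hp = rng.uniform(50, 200, size=(300, 1))
mpg = 60 - 0.45 * hp[:, 0] + 0.0012 * hp[:, 0] ** 2 + rng.normal(0, 1.0, size=300)

X_train, X_test, y_train, y_test = train_test_split(hp, mpg, random_state=0)

# Plain linear regression on horsepower only
linear_model = LinearRegression().fit(X_train, y_train)

# Degree-2 expansion (hp, hp^2) followed by linear regression
quadratic_model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
).fit(X_train, y_train)

r2_linear = linear_model.score(X_test, y_test)
r2_quadratic = quadratic_model.score(X_test, y_test)
print("linear R2:", r2_linear)
print("quadratic R2:", r2_quadratic)
```

Keeping the expansion inside the pipeline means the same transformation is applied automatically at prediction time.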


### C) Use KNN regression with K=7 to predict miles per gallon (mpg), and find the $R^2$ score in the following cases:

- One feature: Horsepower only

- Two features: horsepower and $(horsepower)^2$ 

Hint: 

    Create KNN regression object using neighbors.KNeighborsRegressor:

    knnRegression = neighbors.KNeighborsRegressor(n_neighbors=7)

    Use the .fit and .score methods as before
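
A minimal sketch of the hint follows, run on synthetic data (an assumption, used only so the snippet is self-contained). For a regressor, `.score` returns the $R^2$ of the predictions on the data passed to it.

```python
import numpy as np
from sklearn import neighbors
from sklearn.model_selection import train_test_split

# Synthetic one-feature regression problem (assumption, not the Auto data)
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each prediction is the mean target of the 7 nearest training points
knnRegression = neighbors.KNeighborsRegressor(n_neighbors=7)
knnRegression.fit(X_train, y_train)

score = knnRegression.score(X_test, y_test)  # R^2 on the test split
print("KNN test R2:", score)
```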



In [2]:
from sklearn import neighbors
# Add the squared horsepower as a second feature
AutoData['horsepower2']=AutoData.horsepower*AutoData.horsepower


X = AutoData[['horsepower']].values
Y = AutoData.mpg

knnRegression= neighbors.KNeighborsRegressor(n_neighbors=7)
X_train, X_test, Y_train, Y_test = train_test_split(X,Y,random_state=100)
knnRegression.fit(X_train,Y_train)
Target_predicted=knnRegression.predict(X_test)
print("R2 Score for KNN Regression with linear feature",r2_score(Y_test,Target_predicted))

X = AutoData[['horsepower','horsepower2']].values
X_train, X_test, Y_train, Y_test = train_test_split(X,Y,random_state=100)
knnRegression.fit(X_train,Y_train)
Target_predicted=knnRegression.predict(X_test)
print("R2 Score for KNN Regression with quadratic feature",r2_score(Y_test,Target_predicted))

# Adding the quadratic feature improved the test R2 only marginally (0.601 to 0.615)

R2 Score for KNN Regression with linear feature 0.6010051329011783
R2 Score for KNN Regression with quadratic feature 0.6147596717312067
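
One likely reason for the small change is that $(horsepower)^2$ is a function of horsepower, so it adds no new information for KNN and only reweights the distances between points. Because KNN is distance-based, the much larger scale of the squared column also dominates the distance unless the features are standardized. The sketch below shows one way to standardize inside a pipeline; it uses synthetic data (an assumption), so the scores do not correspond to the Auto results above.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for horsepower and mpg (assumption, not the real dataset)
rng = np.random.default_rng(0)
hp = rng.uniform(50, 200, size=300)
mpg = 60 - 0.45 * hp + 0.0012 * hp**2 + rng.normal(0, 1.0, size=300)

X_one = hp.reshape(-1, 1)              # horsepower only
X_two = np.column_stack((hp, hp**2))   # horsepower and its square

scores = {}
for name, X in [("hp", X_one), ("hp + hp^2", X_two)]:
    X_train, X_test, y_train, y_test = train_test_split(X, mpg, random_state=0)
    # Standardize each feature before computing neighbor distances
    model = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=7))
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(name, scores[name])
```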


#### COMMENT on your results on (E) and (F): which model performs better? How does performance change when adding the quadratic feature?

# Part 2: Regularization

### A) Using the Boston dataset, fit a Ridge regression model with the tuning parameter set to 100 (alpha=100). Find the $R^2$ score and the number of non-zero coefficients.

### B) Use Lasso regression instead of Ridge regression, again with the tuning parameter set to 100. Find the $R^2$ score and the number of non-zero coefficients.

### C) Change the tuning parameter of the Lasso model to a very low value (alpha=0.001). What is the $R^2$ score?
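
The difference between the two penalties can be sketched on synthetic data. The example below is an illustration under stated assumptions: it uses `make_regression` with 13 features (5 of them informative) rather than the Boston data, and `alpha=10.0` is an arbitrary choice for this synthetic problem. Ridge shrinks coefficients toward zero without making them exactly zero, while Lasso can set some of them exactly to zero.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): 13 features, only 5 of which drive the target
X, y = make_regression(n_samples=200, n_features=13, n_informative=5,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ridge = Ridge(alpha=10.0).fit(X_train, y_train)   # L2 penalty
lasso = Lasso(alpha=10.0).fit(X_train, y_train)   # L1 penalty

# Count coefficients that are not exactly zero
n_ridge = int(np.sum(ridge.coef_ != 0))
n_lasso = int(np.sum(lasso.coef_ != 0))

print("Ridge: R2 =", ridge.score(X_test, y_test), "non-zero coefficients =", n_ridge)
print("Lasso: R2 =", lasso.score(X_test, y_test), "non-zero coefficients =", n_lasso)
```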



In [38]:
# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell requires an older scikit-learn version.
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.linear_model import Lasso
import numpy as np

dataset = load_boston()
X=dataset.data
Y=dataset.target
X_train, X_test, Y_train, Y_test = train_test_split(X,Y,random_state=100)

RidgeModel=Ridge(alpha=100).fit(X_train, Y_train)                 # Ridge model with alpha =100
print("R2 score for ridge model",RidgeModel.score(X_test,Y_test))
nozr=np.sum(RidgeModel.coef_!=0)                                  # count coefficients that are not exactly zero
print("The number of non-zero coefficients in the Ridge model is",nozr)


LassoModel=Lasso(alpha=100).fit(X_train, Y_train)                 # Lasso model with alpha =100
print("R2 score for lasso model",LassoModel.score(X_test,Y_test))
nozl=np.sum(LassoModel.coef_!=0)
print("The number of non-zero coefficients in the lasso model is",nozl)

Lassolowalpha=Lasso(alpha=0.001).fit(X_train, Y_train)          # Lasso model with alpha =0.001
print("R2 score for low alpha lasso model",Lassolowalpha.score(X_test,Y_test))
nozl1=np.sum(Lassolowalpha.coef_!=0)
print("The number of non-zero coefficients in the low alpha lasso model is",nozl1)

R2 score for ridge model 0.6916568212105104
The number of non-zero coefficients in the Ridge model is 13
R2 score for lasso model 0.2247796373828993
The number of non-zero coefficients in the lasso model is 2
R2 score for low alpha lasso model 0.7241866900918309
The number of non-zero coefficients in the low alpha lasso model is 13
R2 score for lasso2 model 0.7245555583182401
The number of non-zero coefficients in the Ridge model is 13


### D) Comment on your results. In this problem, do all features seem important in making predictions?


In [None]:
# For the Lasso model, a high alpha (100) yields a model with only 2 non-zero coefficients, while a very low
# alpha (0.001) yields a model with 13 non-zero coefficients. The much better R2 score of the low-alpha model
# (0.72 vs 0.22) suggests that more than 2 features are needed to make good predictions. The number of non-zero
# coefficients depends on alpha, and we cannot say with certainty that the R2 score peaks at alpha=0.001. The
# features also differ in scale, so a regularized model fitted without feature scaling has shortcomings. Still,
# the R2 score of the low-alpha Lasso model suggests that all features contribute to the predictions.
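
The two caveats in the comment above (the dependence on alpha and the missing feature scaling) could be checked with a small sweep. The sketch below is one way to do that, under stated assumptions: it uses synthetic data from `make_regression` in place of the Boston data, rescales the columns artificially so the features differ in scale, and tries an arbitrary grid of alpha values.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 13 features, 5 informative
X, y = make_regression(n_samples=200, n_features=13, n_informative=5,
                       noise=10.0, random_state=0)
X = X * np.arange(1, 14)  # put the features on different scales
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

results = []
for alpha in [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]:
    # Standardize first so the L1 penalty treats all features alike
    model = make_pipeline(StandardScaler(), Lasso(alpha=alpha, max_iter=10000))
    model.fit(X_train, y_train)
    n_nonzero = int(np.sum(model.named_steps["lasso"].coef_ != 0))
    results.append((alpha, n_nonzero, model.score(X_test, y_test)))
    print(alpha, n_nonzero, results[-1][2])
```

Scoring on the test split here is only for illustration; to choose alpha properly, cross-validation on the training data (for example with `LassoCV`) would avoid tuning on the test set.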