# Part 1: Regularization

A) Using the Boston dataset, fit a Ridge regression model with the tuning parameter set to 100 (alpha = 100). Find the $R^2$ score and the number of non-zero coefficients.

B) Use Lasso regression instead of Ridge regression, again with the tuning parameter set to 100. Find the $R^2$ score and the number of non-zero coefficients.

C) Change the tuning parameter of the Lasso model to a very low value (alpha = 0.001). What is the $R^2$ score?

D) Comment on your results. In this problem, do all features seem important in making predictions?
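
The $R^2$ score requested in parts A to C is the coefficient of determination, which is what the `score` method of scikit-learn regressors returns. As a minimal sketch of what that number measures, the helper below (`r2_score_manual`, a hypothetical name not used elsewhere in this notebook) computes it by hand on made-up target values:

```python
def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean of y_true.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values, for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]

r2_perfect = r2_score_manual(y_true, y_true)                # exact predictions
r2_mean = r2_score_manual(y_true, [6.0, 6.0, 6.0, 6.0])     # always predict the mean
r2_close = r2_score_manual(y_true, [4.0, 5.0, 7.0, 8.0])    # small errors

print(r2_perfect, r2_mean, r2_close)
```

A score of 1 means perfect predictions, 0 means the model does no better than predicting the mean target, and negative values are possible when it does worse than that.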

In [57]:
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older release.
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.linear_model import Lasso
import numpy as np

# Load the data and hold out a test set (default split: 75% train, 25% test).
dataset = load_boston()
X = dataset.data
Y = dataset.target
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

# A) Ridge regression with a strong penalty.
RidgeModel = Ridge(alpha=100).fit(X_train, Y_train)
print("R squared score using Ridge is: ", RidgeModel.score(X_test, Y_test))
print("number of non zero coefficients using Ridge are: ", np.sum(RidgeModel.coef_ != 0))
print("\n")
# B) Lasso regression with the same penalty strength.
LassoModel = Lasso(alpha=100).fit(X_train, Y_train)
print("R squared score using Lasso is: ", LassoModel.score(X_test, Y_test))
print("number of non zero coefficients using Lasso are: ", np.sum(LassoModel.coef_ != 0))
print("\n")
# C) Lasso regression with a very weak penalty.
LassoModel = Lasso(alpha=0.001).fit(X_train, Y_train)
print("R squared score using Lasso with alpha = 0.001 is: ", LassoModel.score(X_test, Y_test))

R squared score using Ridge is:  0.592535803616
number of non zero coefficients using Ridge are:  13


R squared score using Lasso is:  0.118669161755
number of non zero coefficients using Lasso are:  2


R squared score using Lasso with alpha = 0.001 is:  0.635035312517


COMMENT:
    The highest test $R^2$ (0.635) was obtained with Lasso at alpha = 0.001. With a penalty this small, Lasso behaves almost like ordinary least-squares linear regression, so for this dataset a nearly unregularized linear model predicts best. Ridge with alpha = 100 still reaches a reasonable $R^2$ (0.593) and keeps all 13 coefficients non-zero, as expected, since an L2 penalty shrinks coefficients without setting them exactly to zero. Lasso with alpha = 100 keeps only 2 non-zero coefficients and its $R^2$ drops to 0.119. Discarding 11 of the 13 features costs most of the predictive power, which suggests that most features contribute to the prediction, although this experiment alone does not show that every single feature is needed.
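
The difference in non-zero counts between Ridge and Lasso follows from how each penalty acts on a coefficient. A minimal sketch for the simplest one-coefficient case: minimizing `0.5*(w - b)**2 + lam*abs(w)` gives soft-thresholding (exact zeros for small `b`), while minimizing `0.5*(w - b)**2 + 0.5*lam*w**2` gives proportional shrinkage (never exactly zero unless `b` is). The function names, the coefficient values, and `lam` are illustrative assumptions, and `lam` does not map one-to-one onto scikit-learn's `alpha`.

```python
def lasso_shrink(b, lam):
    """Minimizer of 0.5*(w - b)**2 + lam*abs(w): soft-thresholding."""
    if abs(b) <= lam:
        return 0.0              # small coefficients are set exactly to zero
    return b - lam if b > 0 else b + lam


def ridge_shrink(b, lam):
    """Minimizer of 0.5*(w - b)**2 + 0.5*lam*w**2: proportional shrinkage."""
    return b / (1.0 + lam)


# Hypothetical unpenalized coefficients, for illustration only.
ols_coefs = [3.0, -0.5, 0.2, -4.0]
lam = 1.0

lasso_coefs = [lasso_shrink(b, lam) for b in ols_coefs]
ridge_coefs = [ridge_shrink(b, lam) for b in ols_coefs]

lasso_nonzero = sum(1 for w in lasso_coefs if w != 0)
ridge_nonzero = sum(1 for w in ridge_coefs if w != 0)

print("Lasso:", lasso_coefs, "non-zero:", lasso_nonzero)
print("Ridge:", ridge_coefs, "non-zero:", ridge_nonzero)
```

In this toy case the Lasso-style update zeroes the two small coefficients while the Ridge-style update keeps all four, which mirrors the 2 versus 13 non-zero coefficients reported above.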

# Part 2: Logistic Regression

In this exercise, you will use logistic regression to classify breast tumors as malignant or benign using the sklearn breast cancer data set. Run the code below to print and read the description of the data set. Use logistic regression with Lasso (L1) regularization (penalty="l1") and the default regularization parameter to build the classifier. What is the accuracy?


In [65]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
import numpy as np

# Load the data set and print its keys and description.
DataCancer = load_breast_cancer()
print(DataCancer.keys())
print(DataCancer.DESCR)

X_features = DataCancer.data
Y_targetClass = DataCancer.target

# Hold out a test set (default split: 75% train, 25% test).
X_train, X_test, Y_train, Y_test = train_test_split(X_features, Y_targetClass, random_state=0)

dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names'])
Breast Cancer Wisconsin (Diagnostic) Database

Notes
-----
Data Set Characteristics:
    :Number of Instances: 569

    :Number of Attributes: 30 numeric, predictive attributes and the class

    :Attribute Information:
        - radius (mean of distances from center to points on the perimeter)
        - texture (standard deviation of gray-scale values)
        - perimeter
        - area
        - smoothness (local variation in radius lengths)
        - compactness (perimeter^2 / area - 1.0)
        - concavity (severity of concave portions of the contour)
        - concave points (number of concave portions of the contour)
        - symmetry 
        - fractal dimension ("coastline approximation" - 1)

        The mean, standard error, and "worst" or largest (mean of the three
        largest values) of these features were computed for each image,
        resulting in 30 features.  For instance, field 3 is Mean Ra

In [67]:
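# liblinear supports the L1 penalty; the default solver in scikit-learn >= 0.22 (lbfgs) does not.
LogRegModel = LogisticRegression(C=1, penalty="l1", solver="liblinear").fit(X_train, Y_train)
print("The accuracy using Lasso Regularization is: ", LogRegModel.score(X_test, Y_test))
# Per-sample class probabilities: column 0 is class 0, column 1 is class 1.
Probabilities = LogRegModel.predict_proba(X_test)
print("\n", Probabilities)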

The accuracy using Lasso Regularization is:  0.958041958042

 [[  9.93838691e-01   6.16130908e-03]
 [  2.60271961e-02   9.73972804e-01]
 [  1.48948192e-03   9.98510518e-01]
 [  1.55831805e-01   8.44168195e-01]
 [  5.79824241e-05   9.99942018e-01]
 [  2.28919439e-03   9.97710806e-01]
 [  6.33236003e-03   9.93667640e-01]
 [  1.06936823e-03   9.98930632e-01]
 [  3.66236698e-02   9.63376330e-01]
 [  1.58492590e-04   9.99841507e-01]
 [  4.37898303e-01   5.62101697e-01]
 [  1.47799029e-01   8.52200971e-01]
 [  3.10041566e-03   9.96899584e-01]
 [  7.65816380e-01   2.34183620e-01]
 [  1.88852569e-01   8.11147431e-01]
 [  9.93205761e-01   6.79423912e-03]
 [  2.11321868e-02   9.78867813e-01]
 [  9.99999999e-01   1.05608009e-09]
 [  9.99300338e-01   6.99662072e-04]
 [  1.00000000e+00   1.40527168e-12]
 [  9.99981881e-01   1.81185072e-05]
 [  9.39924827e-01   6.00751727e-02]
 [  1.20668753e-03   9.98793312e-01]
 [  8.58693728e-03   9.91413063e-01]
 [  9.96879932e-01   3.12006778e-03]
 [  8.1726836

COMMENT:
    The test accuracy with Lasso (L1) regularization is high, about 95.8%. This indicates that the features carry strong information about the diagnosis and that a regularized linear classifier separates the two classes well on this train/test split.
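
Each row of the `predict_proba` output printed above holds two probabilities that sum to 1, one per class, in the order of the class labels (0, then 1). In this data set `target_names` lists malignant first and benign second, so label 0 is malignant and label 1 is benign. The sketch below uses the first five rows copied from that output to show how probabilities become class predictions, and checks how the reported accuracy relates to a count of correct predictions. Two things here are assumptions rather than notebook output: the test set size of 143 follows from the default 25% split of the 569 instances, and the figure of 137 correct predictions is inferred from the score.

```python
import math

# First five rows of the predict_proba output shown above.
probabilities = [
    [9.93838691e-01, 6.16130908e-03],
    [2.60271961e-02, 9.73972804e-01],
    [1.48948192e-03, 9.98510518e-01],
    [1.55831805e-01, 8.44168195e-01],
    [5.79824241e-05, 9.99942018e-01],
]
target_names = ["malignant", "benign"]   # label 0, label 1

# The predicted class is the column with the larger probability.
predicted_labels = [row.index(max(row)) for row in probabilities]
predicted_names = [target_names[label] for label in predicted_labels]
print(predicted_labels, predicted_names)

# Accuracy is the fraction of test samples classified correctly.
n_test = math.ceil(0.25 * 569)   # assumed default test_size of 0.25
n_correct = 137                  # inferred from the reported score
accuracy = n_correct / n_test
print(n_test, round(accuracy, 12))
```

Under these assumptions the score of 0.958041958042 corresponds to 6 misclassified test samples out of 143.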