# Linear Regression


In this code, we apply linear regression. To do so, we use the Boston Housing Dataset, which can be loaded from sklearn.datasets.

The Boston Housing Dataset contains house prices for various areas of Boston. Here is the list of all attributes:

    CRIM      per capita crime rate by town
    ZN        proportion of residential land zoned for lots over 25,000 sq.ft.
    INDUS     proportion of non-retail business acres per town
    CHAS      Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
    NOX       nitric oxides concentration (parts per 10 million)
    RM        average number of rooms per dwelling
    AGE       proportion of owner-occupied units built prior to 1940
    DIS       weighted distances to five Boston employment centres
    RAD       index of accessibility to radial highways
    TAX       full-value property-tax rate per $10,000
    PTRATIO   pupil-teacher ratio by town
    B         1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
    LSTAT     % lower status of the population
    MEDV      Median value of owner-occupied homes in $1000's
   



Here we use linear regression to find the linear relationship between the features and the target variable. Since
we use only the first two features of the data (CRIM and ZN), the relation is

y = a_0 + a_1 x_1 + a_2 x_2 

In this code we will find the coefficients a_0, a_1 and a_2.
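As a rough illustration of what finding a_0, a_1 and a_2 means, the sketch below solves the same kind of two-feature least-squares problem with NumPy alone. The data is synthetic and noise-free, and the names `X_demo`, `y_demo` and `true_coef` are made up for this example; it does not use the Boston data.

```python
import numpy as np

# Synthetic data (an assumption for illustration, not the Boston dataset):
# y = 3 + 2*x_1 - 1*x_2, with no noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
true_coef = np.array([3.0, 2.0, -1.0])        # a_0, a_1, a_2
y_demo = true_coef[0] + X_demo @ true_coef[1:]

# Prepend a column of ones so that a_0 is estimated together with a_1 and a_2.
A = np.column_stack([np.ones(len(X_demo)), X_demo])

# Least-squares solution of A @ a = y_demo.
a_hat, *_ = np.linalg.lstsq(A, y_demo, rcond=None)
print(a_hat)
```

Because the synthetic data has no noise, the estimated coefficients should agree with `true_coef` up to floating-point error.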


In [11]:
# Line Fitting
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston

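# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older release.
boston = load_boston()
X = boston.data[:, 0:2]     # first two features: CRIM and ZN
y = boston.target           # MEDV
print(X.shape)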

#print(boston.feature_names)       # features name
#print(boston.DESCR)               # statistical description of the data
regression = LinearRegression()

model = regression.fit(X, y)

print("The model is: y =", model.intercept_,'+', model.coef_[0],"x_1 +", model.coef_[1], "x_2")



(506, 2)
The model is: y = 22.485628113468223 + -0.35207831564026765 x_1 + 0.11610909184400937 x_2


Sometimes the target variable depends not only on the feature variables x_1 and x_2 but also on the interaction between x_1 and x_2, i.e., x_1 * x_2. In this case we use PolynomialFeatures with interaction_only=True to add a new variable x_3 = x_1 * x_2. The model becomes

y = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3.
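For two input columns, adding the interaction term amounts to appending the product of the two columns. A minimal NumPy sketch on a made-up array (`X_demo` is hypothetical) shows the resulting feature matrix; in this two-feature case it should match what PolynomialFeatures with interaction_only=True and include_bias=False returns.

```python
import numpy as np

# Tiny made-up feature matrix with two columns, x_1 and x_2.
X_demo = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]])

# The interaction feature is the element-wise product of the two columns.
x3 = X_demo[:, 0] * X_demo[:, 1]

# Stack it next to the original features: columns are x_1, x_2, x_1 * x_2.
X_with_interaction = np.column_stack([X_demo, x3])
print(X_with_interaction)
```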

In [28]:
# Feature Interaction

from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston
from sklearn.preprocessing import PolynomialFeatures

boston = load_boston()
X = boston.data[:, 0:2]
y = boston.target

interaction = PolynomialFeatures(degree=3, include_bias=False, interaction_only=True)
X_interaction = interaction.fit_transform(X)

regression = LinearRegression()

model = regression.fit(X_interaction, y)

print("The model is:", model.intercept_,'+', model.coef_[0],"x_1 + ", 
                          model.coef_[1], "x_2", '+', model.coef_[2], "x_3" )


The model is: 22.07715825584366 + -0.3371515939259498 x_1 +  0.08155746534799126 x_2 + 0.806620004402748 x_3


# Non-Linear Regression
If the relation between the target variable and the feature is not linear, we should apply nonlinear regression; here we use polynomial regression. In this case, we look for a model of the form

y = a_0 + a_1 x + a_2 x^2 + ... + a_d x^d 

where d is the degree of the polynomial. We use PolynomialFeatures with degree=d to construct the features x^2, ..., x^d.
The rest is the same as linear regression, because the model is still linear in the coefficients a_0, ..., a_d.
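The feature construction described above can be sketched by hand with NumPy. The helper `polynomial_features` and the synthetic cubic data are assumptions made for this illustration only; the point is that once the columns x, x^2, ..., x^d exist, an ordinary least-squares fit finds the polynomial coefficients.

```python
import numpy as np

def polynomial_features(x, degree):
    """Return the columns x, x^2, ..., x^degree for a 1-D array x."""
    return np.column_stack([x ** k for k in range(1, degree + 1)])

# Synthetic, noise-free cubic data: y = 1 + 2x - 0.5x^2 + 0.1x^3
x = np.linspace(-3.0, 3.0, 30)
y_demo = 1.0 + 2.0 * x - 0.5 * x**2 + 0.1 * x**3

# Build the polynomial columns, then add a column of ones for a_0.
X_poly_demo = polynomial_features(x, degree=3)
A = np.column_stack([np.ones(len(x)), X_poly_demo])

# Ordinary least squares on the expanded features.
coef, *_ = np.linalg.lstsq(A, y_demo, rcond=None)
print(coef)
```

With noise-free data, `coef` should recover a_0 through a_3 of the generating polynomial up to floating-point error.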

In [29]:
# POLYNOMIAL REGRESSION

from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston
from sklearn.preprocessing import PolynomialFeatures

boston = load_boston()
X = boston.data[:, 0:1]
y = boston.target

poly = PolynomialFeatures(degree=3, include_bias=False )
X_poly = poly.fit_transform(X)

regression = LinearRegression()

model = regression.fit(X_poly, y)

print("The model is:", model.intercept_,'+', model.coef_[0],"x + ", 
                          model.coef_[1], "x^2", '+', model.coef_[2], "x^3" )


The model is: 25.190479369326752 + -1.1364007230671813 x +  0.023784825366425656 x^2 + -0.00014887208958576755 x^3


# Ridge Regression
To reduce the variance of linear regression, two regularizations can be used: Ridge and Lasso. In Ridge regression we add

alpha \sum_{j=1}^{n} a_j^2

to the cost function, while in Lasso regression we add

alpha \sum_{j=1}^{n} |a_j|.

Here alpha is a hyperparameter that controls the strength of the penalty.
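To make the two penalty terms concrete, the sketch below evaluates them for a made-up coefficient vector and then shows the shrinking effect of alpha using the closed-form ridge solution. Everything here is an assumption for illustration: the helper names, the synthetic data, and the choice to fit without an intercept.

```python
import numpy as np

def ridge_penalty(coef, alpha):
    """alpha times the sum of squared coefficients."""
    return alpha * np.sum(np.square(coef))

def lasso_penalty(coef, alpha):
    """alpha times the sum of absolute coefficients."""
    return alpha * np.sum(np.abs(coef))

coef_demo = np.array([2.0, -1.0, 0.0])          # made-up a_1, a_2, a_3
ridge_value = ridge_penalty(coef_demo, 0.5)     # 0.5 * (4 + 1 + 0) = 2.5
lasso_value = lasso_penalty(coef_demo, 0.5)     # 0.5 * (2 + 1 + 0) = 1.5
print(ridge_value, lasso_value)

# Effect of alpha on synthetic data, fitted without an intercept.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3))
y_demo = X_demo @ coef_demo + rng.normal(scale=0.1, size=40)

gram = X_demo.T @ X_demo
# alpha = 0 gives ordinary least squares; alpha = 10 adds the ridge term.
ols_coef = np.linalg.solve(gram, X_demo.T @ y_demo)
ridge_coef = np.linalg.solve(gram + 10.0 * np.eye(3), X_demo.T @ y_demo)

# The ridge coefficients have a smaller norm than the unpenalized ones.
print(np.linalg.norm(ols_coef), np.linalg.norm(ridge_coef))
```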


In [30]:
# RIDGE REGRESSION

from sklearn.linear_model import Ridge
from sklearn.datasets import load_boston
from sklearn.preprocessing import StandardScaler

boston = load_boston()
X = boston.data
y = boston.target

scaler = StandardScaler()

X_standard = scaler.fit_transform(X)

regression = Ridge(alpha=0.5)

model = regression.fit(X_standard, y)

model.coef_


array([-0.92396151,  1.07393055,  0.12895159,  0.68346136, -2.0427575 ,
        2.67854971,  0.01627328, -3.09063352,  2.62636926, -2.04312573,
       -2.05646414,  0.8490591 , -3.73711409])

In [27]:
# RIDGE REGRESSION WITH CV

from sklearn.linear_model import RidgeCV
from sklearn.datasets import load_boston
from sklearn.preprocessing import StandardScaler

boston = load_boston()
X = boston.data
y = boston.target

scaler = StandardScaler()

X_standard = scaler.fit_transform(X)

regression_cv = RidgeCV(alphas=[0.1, 1.0, 10.0])

model = regression_cv.fit(X_standard, y)

model.coef_


array([-0.91987132,  1.06646104,  0.11738487,  0.68512693, -2.02901013,
        2.68275376,  0.01315848, -3.07733968,  2.59153764, -2.0105579 ,
       -2.05238455,  0.84884839, -3.73066646])

In [31]:
# LASSO REGRESSION

from sklearn.linear_model import Lasso
from sklearn.datasets import load_boston
from sklearn.preprocessing import StandardScaler

boston = load_boston()
X = boston.data
y = boston.target

scaler = StandardScaler()

X_standard = scaler.fit_transform(X)

regression = Lasso(alpha=0.5)

model = regression.fit(X_standard, y)

model.coef_

array([-0.11526463,  0.        , -0.        ,  0.39707879, -0.        ,
        2.97425861, -0.        , -0.17056942, -0.        , -0.        ,
       -1.59844856,  0.54313871, -3.66614361])