# 1. Linear Regression


## 1.1 Fitting a Line

You want to train a model that represents a linear relationship between the feature
and target vector.

Use a linear regression (in scikit-learn, LinearRegression):

In [1]:
# Load libraries
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston

In [2]:
# Load data with only two features
boston = load_boston()
features = boston.data[:,0:2]
target = boston.target

In [3]:
# Create linear regression
regression = LinearRegression()
regression

LinearRegression()

In [5]:
# Fit the linear regression
model = regression.fit(features, target)
model

LinearRegression()

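### Formula

With two features, the fitted model has the form

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \epsilon$$

where ŷ is the target, x1 and x2 are the two features, and ε is the error.
β0, also called the bias or intercept, can be viewed using intercept_: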

In [6]:
# View the intercept
model.intercept_

22.485628113468223

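And β1 and β2 are shown using coef_:

In [8]:
# View the feature coefficients
model.coef_

array([-0.35207832,  0.11610909])

The target is the median home value in thousands of dollars, so multiplying by 1000
gives the price of the first home in dollars:

In [8]:
# First value in the target vector multiplied by 1000
target[0]*1000

24000.0

The predict method returns the model's estimate for the same home:

In [9]:
# Predict the target value of the first observation, multiplied by 1000
model.predict(features)[0]*1000

24573.366631705547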
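To see how intercept_ and coef_ combine into a prediction, the sketch below recomputes the first observation's prediction by hand. It assumes the values printed in this chapter: the intercept and (rounded) coefficients from this section, and the first observation's two feature values, 6.32e-03 and 1.80e+01, as shown in section 1.2. Because the printed coefficients are rounded, the result should agree with the predict output only up to a small rounding difference.

```python
# Values copied from the outputs printed in this chapter (coefficients are rounded)
intercept = 22.485628113468223
coefficients = [-0.35207832, 0.11610909]
first_observation = [6.32e-03, 1.80e+01]  # features[0], as printed in section 1.2

# Linear prediction: intercept plus the sum of coefficient * feature value
prediction = intercept + sum(
    beta * x for beta, x in zip(coefficients, first_observation)
)

# Residual against the true target of 24.0 (in thousands of dollars)
residual = 24.0 - prediction

print(round(prediction * 1000, 2))  # predicted price in dollars
print(round(residual * 1000, 2))    # prediction error in dollars
```

The residual is negative here because the model overestimates the price of the first home.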

## 1.2 Handling Interactive Effects

You have a feature whose effect on the target variable depends on another feature.

Create an interaction term to capture that dependence using scikit-learn’s
**PolynomialFeatures**:

In [34]:
# Load libraries
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston
from sklearn.preprocessing import PolynomialFeatures

In [35]:
# Load data with only two features
boston = load_boston()
features = boston.data[:,0:2]
target = boston.target

In [36]:
# Create interaction term
interaction = PolynomialFeatures(
    degree=3, include_bias=False, interaction_only=True)
features_interaction = interaction.fit_transform(features)

In [37]:
# Create linear regression
regression = LinearRegression()

In [38]:
# Fit the linear regression
model = regression.fit(features_interaction, target)

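### Formula

With an interaction term, the model becomes

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_1 x_2 + \epsilon$$

where x1x2 is the product of the two features.

In [47]:
# View the first observation's values for x1, x2, and the interaction term x1*x2
features_interaction[0]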

array([6.3200e-03, 1.8000e+01, 1.1376e-01])

In [41]:
# View the feature values for first observation
features[0]

array([6.32e-03, 1.80e+01])

In [42]:
# Import library
import numpy as np
# For each observation, multiply the values of the first and second feature
interaction_term = np.multiply(features[:, 0], features[:, 1])

In [48]:
# View interaction term for first observation
interaction_term[0]

0.11376
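The following is a minimal sketch of what an interaction-only expansion computes, written in plain Python rather than scikit-learn. The helper interaction_features is hypothetical and only mimics the column layout that interaction_only=True appears to produce: the original features followed by products of distinct features. It also illustrates why degree=3 adds nothing beyond degree=2 when there are only two features, since no product of three distinct features exists.

```python
from itertools import combinations
from math import prod


def interaction_features(row, degree):
    """Return the original values followed by products of distinct features.

    Hypothetical helper, not part of scikit-learn.
    """
    expanded = []
    for size in range(1, degree + 1):
        # Each combination uses every feature at most once, so no x**2 terms
        for group in combinations(row, size):
            expanded.append(prod(group))
    return expanded


first_observation = [6.32e-03, 1.80e+01]  # features[0] from the text
expanded = interaction_features(first_observation, degree=3)

# Three columns: x1, x2, and x1*x2
print(expanded)
```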

## 1.3 Fitting a Nonlinear Relationship

You want to model a nonlinear relationship.

Create a polynomial regression by including polynomial features in a linear regression
model:

In [50]:
# Load libraries
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston
from sklearn.preprocessing import PolynomialFeatures

In [51]:
# Load data with one feature
boston = load_boston()
features = boston.data[:,0:1]
target = boston.target

In [52]:
# Create polynomial features x^2 and x^3
polynomial = PolynomialFeatures(degree=3, include_bias=False)
features_polynomial = polynomial.fit_transform(features)

In [53]:
# Create linear regression
regression = LinearRegression()

In [54]:
# Fit the linear regression
model = regression.fit(features_polynomial, target)

In [55]:
# View first observation
features[0]

array([0.00632])

In [56]:
# View first observation raised to the second power, x^2
features[0]**2

array([3.99424e-05])

In [57]:
# View first observation raised to the third power, x^3
features[0]**3

array([2.52435968e-07])

In [58]:
# View the first observation's values for x, x^2, and x^3
features_polynomial[0]

array([6.32000000e-03, 3.99424000e-05, 2.52435968e-07])
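A polynomial regression is still a linear model: it is linear in its coefficients, and only the inputs are transformed. The sketch below illustrates this in plain Python. The helpers polynomial_features and predict are hypothetical, and the coefficients passed to predict are made up purely to show the arithmetic; they are not the fitted values from the model above.

```python
def polynomial_features(x, degree):
    """Return [x, x**2, ..., x**degree] for one feature value (no bias column)."""
    return [x ** power for power in range(1, degree + 1)]


def predict(x, intercept, coefficients):
    """Evaluate intercept + b1*x + b2*x**2 + ... for one feature value."""
    terms = polynomial_features(x, degree=len(coefficients))
    return intercept + sum(beta * term for beta, term in zip(coefficients, terms))


# Expansion of the first observation from the text
x = 0.00632
print(polynomial_features(x, degree=3))

# Hypothetical coefficients, chosen only to illustrate the arithmetic:
# 1.0 + 0.5*2 - 0.25*4 + 0.125*8
example = predict(2.0, intercept=1.0, coefficients=[0.5, -0.25, 0.125])
print(example)
```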

## 1.4 Reducing Variance Through Regularization

You want to reduce the variance of your linear regression model.

Use a learning algorithm that includes a shrinkage penalty (also called regularization),
such as ridge regression or lasso regression:

In [30]:
# Load libraries
from sklearn.linear_model import Ridge
from sklearn.datasets import load_boston
from sklearn.preprocessing import StandardScaler

In [31]:
# Load data
boston = load_boston()
features = boston.data
target = boston.target

In [32]:
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)

In [33]:
# Create ridge regression with an alpha value
regression = Ridge(alpha=0.5)

In [34]:
# Fit the linear regression
model = regression.fit(features_standardized, target)

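### Formula

Ridge regression minimizes the residual sum of squares plus a penalty on the squared
coefficients, where α controls the strength of the penalty:

$$\text{RSS} + \alpha \sum_{j=1}^{p} \hat{\beta}_j^2$$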

In [35]:
model

Ridge(alpha=0.5, copy_X=True, fit_intercept=True, max_iter=None,
      normalize=False, random_state=None, solver='auto', tol=0.001)
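To build intuition for what alpha does, here is a small sketch of ridge regression with a single feature, where the solution has a simple closed form. It assumes the penalty is added to the residual sum of squares and that the intercept is not penalized, which is handled here by centering the data. The function ridge_slope and the data are made up for illustration.

```python
def ridge_slope(x, y, alpha):
    """Closed-form ridge slope for one feature: sum(xy) / (sum(x^2) + alpha).

    Both x and y are centered first, so the intercept drops out and is
    not penalized.
    """
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    xc = [value - x_mean for value in x]
    yc = [value - y_mean for value in y]
    numerator = sum(a * b for a, b in zip(xc, yc))
    denominator = sum(a * a for a in xc) + alpha
    return numerator / denominator


# Made-up data for illustration, roughly y = 2x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# alpha = 0 is ordinary least squares; larger alpha shrinks the slope toward 0
slopes = {alpha: ridge_slope(x, y, alpha) for alpha in [0.0, 0.5, 10.0]}
for alpha, slope in slopes.items():
    print(alpha, slope)
```

The slope gets smaller as alpha grows but never reaches exactly zero, which is the behavior that distinguishes ridge from lasso.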

scikit-learn includes a RidgeCV class that allows us to select the best value for α
from a list of candidates:

In [36]:
# Load library
from sklearn.linear_model import RidgeCV
# Create ridge regression with three alpha values
regr_cv = RidgeCV(alphas=[0.1, 1.0, 10.0])


In [37]:
# Fit the linear regression
model_cv = regr_cv.fit(features_standardized, target)

In [38]:
# View coefficients
model_cv.coef_

array([-0.91987132,  1.06646104,  0.11738487,  0.68512693, -2.02901013,
        2.68275376,  0.01315848, -3.07733968,  2.59153764, -2.0105579 ,
       -2.05238455,  0.84884839, -3.73066646])

We can then easily view the best model’s α value:

In [39]:
# View alpha
model_cv.alpha_

1.0
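The idea behind selecting alpha by cross-validation can be sketched in plain Python. By default, RidgeCV uses an efficient form of leave-one-out cross-validation; the version below is a brute-force illustration of the same idea for a single feature, not a reproduction of scikit-learn's implementation. The helpers fit_ridge and loo_error and the data are made up for illustration.

```python
def fit_ridge(x, y, alpha):
    """Fit a one-feature ridge model; return (intercept, slope)."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    numerator = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    denominator = sum((a - x_mean) ** 2 for a in x) + alpha
    slope = numerator / denominator
    return y_mean - slope * x_mean, slope


def loo_error(x, y, alpha):
    """Mean squared error when each point is predicted from all the others."""
    errors = []
    for i in range(len(x)):
        # Hold out observation i and fit on the rest
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_ridge(x_train, y_train, alpha)
        errors.append((y[i] - (intercept + slope * x[i])) ** 2)
    return sum(errors) / len(errors)


# Made-up data for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.8, 4.3, 5.9, 8.4, 9.7, 12.2]

# Score each candidate alpha and keep the one with the lowest held-out error
alphas = [0.1, 1.0, 10.0]
scores = {alpha: loo_error(x, y, alpha) for alpha in alphas}
best_alpha = min(scores, key=scores.get)
print(scores)
print(best_alpha)
```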

## 1.5 Reducing Features with Lasso Regression

You want to simplify your linear regression model by reducing the number of features.

Use a lasso regression (in scikit-learn, Lasso):

In [59]:
# Load libraries
from sklearn.linear_model import Lasso
from sklearn.datasets import load_boston
from sklearn.preprocessing import StandardScaler

In [60]:
# Load data
boston = load_boston()
features = boston.data
target = boston.target

In [61]:
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)

In [62]:
# Create lasso regression with alpha value
regression = Lasso(alpha=0.5)

In [63]:
# Fit the linear regression
model = regression.fit(features_standardized, target)

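### Explanation

The lasso penalty can shrink coefficients all the way to zero, which removes the
corresponding features from the model. With alpha set to 0.5, six of the 13
coefficients are exactly zero:
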
In [64]:
# View coefficients
model.coef_

array([-0.11526463,  0.        , -0.        ,  0.39707879, -0.        ,
        2.97425861, -0.        , -0.17056942, -0.        , -0.        ,
       -1.59844856,  0.54313871, -3.66614361])
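Why lasso produces exact zeros can be seen in the one-feature case, where the solution is a soft-thresholding of the unpenalized slope. The sketch below assumes a single standardized feature (mean 0, mean square 1), a centered target, and a loss of the form (1 / (2n)) * RSS + alpha * |slope|. The helpers soft_threshold and lasso_slope and the data are made up for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at exactly zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_slope(x, y, alpha):
    """Lasso slope for one standardized feature x and centered target y."""
    n = len(x)
    unpenalized = sum(a * b for a, b in zip(x, y)) / n  # least-squares slope
    return soft_threshold(unpenalized, alpha)


# Made-up data: x has mean 0 and mean square 1, y has mean 0
x = [-1.0, 1.0, -1.0, 1.0]
y = [-0.4, 0.2, -0.2, 0.4]

weak_penalty = lasso_slope(x, y, alpha=0.1)    # slope is shrunk but kept
strong_penalty = lasso_slope(x, y, alpha=0.5)  # slope is set exactly to 0.0
print(weak_penalty, strong_penalty)
```

Once alpha exceeds the magnitude of the unpenalized slope, the coefficient is exactly zero rather than merely small, which is how features drop out of the model.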