## CHAPTER 13
---
# LINEAR REGRESSION

---
- Linear regression is one of the simplest supervised learning algorithms in our toolkit.
- It is so simple that it is sometimes not considered machine learning at all.
- Even so, linear regression and its extensions remain a common and useful way to make predictions when the target vector is a quantitative value (e.g., home price, age).

## 13.1 Fitting a Line

**Problem:** You want to train a model that represents a linear relationship between the features and the target vector.

**Solution:** Use a linear regression (in scikit-learn, `LinearRegression`).

In [9]:
# Load libraries (load_boston was removed in scikit-learn 1.2, so this requires an older release)
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston

# Load data with only two features
boston = load_boston()
features = boston.data[:,0:2]
target = boston.target

# Create linear regression
regression = LinearRegression()

# Fit the linear regression
model = regression.fit(features, target)

# View the intercept
print('Intercept:', model.intercept_)

# View the feature coefficients
print('Coefficients:', model.coef_)

# First value in the target vector multiplied by 1000
print('First Target Value:', target[0]*1000)

# Predict the target value of the first observation, multiplied by 1000
print('First Observation:', model.predict(features)[0]*1000)

# First coefficient multiplied by 1000
print('First Coefficient:', model.coef_[0]*1000)

Intercept: 22.485628113468223
Coefficients: [-0.35207832  0.11610909]
First Target Value: 24000.0
First Observation: 24573.366631705547
First Coefficient: -352.07831564026765


#### Discussion:
- In our dataset, the target value is the median value of a Boston home (in the 1970s) in thousands of dollars.
- The major advantage of linear regression is its interpretability, in large part because each coefficient is the change in the target for a one-unit change in the corresponding feature, holding the other features fixed.
- For example, the first feature in our solution is the number of crimes per resident.
    - Our model’s coefficient for this feature was about -0.35. Because the target is in thousands of dollars, multiplying the coefficient by 1,000 gives the change in house price, in dollars, for each additional crime per capita.
- In other words, the model estimates that each additional crime per capita lowers the price of a house by approximately $350.
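
The interpretation above follows from the form of the fitted model, $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$: raising $x_1$ by one unit while holding $x_2$ fixed changes the prediction by exactly $\hat{\beta}_1$. The following is a minimal sketch of that check. It uses synthetic, noise-free data, not the Boston dataset, and the intercept 3 and coefficients 2 and -0.5 are made-up values chosen only so the result is easy to verify.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic features (an assumption for illustration, not the Boston data)
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 2))

# Noise-free target built from known values: intercept 3, coefficients 2 and -0.5
target = 3.0 + 2.0 * features[:, 0] - 0.5 * features[:, 1]

# Fit the linear regression
model = LinearRegression().fit(features, target)

# Take one observation and raise its first feature by one unit
observation = features[:1].copy()
shifted = observation.copy()
shifted[0, 0] += 1.0

# The change in the prediction matches the first coefficient
change = model.predict(shifted)[0] - model.predict(observation)[0]
print('First coefficient:', model.coef_[0])
print('Change in prediction:', change)
```

The two printed values should agree up to floating-point error, which is the sense in which a coefficient is "the effect of a one-unit change."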

## 13.2 Handling Interactive Effects

**Problem:** You have a feature whose effect on the target variable depends on another feature.

**Solution:** Create an interaction term to capture that dependence using scikit-learn’s `PolynomialFeatures`.

In [10]:
# Load libraries
from sklearn.linear_model import LinearRegression
from sklearn.datasets import load_boston
from sklearn.preprocessing import PolynomialFeatures

# Load data with only two features
boston = load_boston()
features = boston.data[:,0:2]
target = boston.target

# Create interaction term (with two features, this yields x1, x2, and x1*x2)
interaction = PolynomialFeatures(
    degree=3, include_bias=False, interaction_only=True)
features_interaction = interaction.fit_transform(features)

# Create linear regression
regression = LinearRegression()

# Fit the linear regression
model = regression.fit(features_interaction, target)

# View the feature values for first observation
print('First Observation Feature Values:', features[0])

# Import library
import numpy as np

# For each observation, multiply the values of the first and second feature
interaction_term = np.multiply(features[:, 0], features[:, 1])

# View interaction term for first observation
print('First Observation Interaction Term:', interaction_term[0])

# View the transformed values of the first observation (both features plus their product)
print('First Observation Values:', features_interaction[0])

First Observation Feature Values: [6.32e-03 1.80e+01]
First Observation Interaction Term: 0.11376
First Observation Values: [6.3200e-03 1.8000e+01 1.1376e-01]
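
With an interaction term, the fitted model has the form $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_1 x_2$, so the effect of a one-unit change in $x_1$ is $\hat{\beta}_1 + \hat{\beta}_3 x_2$ and therefore depends on $x_2$. The sketch below illustrates both points on made-up data: a tiny matrix whose interaction column can be checked by hand, and a synthetic, noise-free target whose coefficients (1, 2, 3, and 4) are assumptions chosen for illustration. It uses `degree=2`, which is enough to produce the single pairwise interaction between two features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Tiny hand-checkable feature matrix (made up for illustration)
small = np.array([[2.0, 3.0],
                  [4.0, 5.0]])

# Keep the original features and append their pairwise product
interaction = PolynomialFeatures(
    degree=2, include_bias=False, interaction_only=True)
small_interaction = interaction.fit_transform(small)
print(small_interaction)  # columns: x1, x2, x1*x2

# Synthetic, noise-free target with a known interaction effect
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2))
x1, x2 = features[:, 0], features[:, 1]
target = 1.0 + 2.0 * x1 + 3.0 * x2 + 4.0 * x1 * x2

# Fit a linear regression on the features plus the interaction term
model = LinearRegression().fit(interaction.fit_transform(features), target)
b1, b2, b3 = model.coef_

# The effect of a one-unit increase in x1 changes with the value of x2
for x2_value in (0.0, 1.0):
    print('x2 =', x2_value, '-> effect of x1:', b1 + b3 * x2_value)
```

The interaction column of `small_interaction` is simply the product of the first two columns, and the printed effects of `x1` differ between the two values of `x2`, which is the dependence the interaction term is meant to capture.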
