##### The basic steps to implement Linear Regression
1. Import the packages and classes you need.
2. Provide data to work with and, if necessary, clean it.
3. Create a regression model and fit it with existing data.
4. Check the results of model fitting to know whether the model is satisfactory.
5. Apply the model for predictions.

#### I. Linear Regression

In [1]:
# step 1 import the packages and classes
import numpy as np
from sklearn.linear_model import LinearRegression


In [2]:
# step 2 provide the data
# array.reshape(-1, 1) makes the array two-dimensional,
# with one column and as many rows as necessary.
x = np.array([5, 15, 25, 35, 45, 55]).reshape(-1, 1)
y = np.array([5, 20, 14, 32, 22, 38])
display(x, y)

array([[ 5],
       [15],
       [25],
       [35],
       [45],
       [55]])

array([ 5, 20, 14, 32, 22, 38])

In [3]:
x.shape

(6, 1)

In [4]:
y.shape    # y has a single dimension

(6,)

In [5]:
# step 3 create a model and fit it
model = LinearRegression()

#### LinearRegression provides several optional parameters
1. fit_intercept is a Boolean (True by default) that decides whether to calculate the intercept b0 (True) or treat it as zero (False).
2. normalize is a Boolean (False by default) that decides whether to normalize the input variables (True) or not (False). It was deprecated in scikit-learn 1.0 and removed in 1.2, so it applies only to older versions such as the one used here.
3. copy_X is a Boolean (True by default) that decides whether to copy (True) or possibly overwrite (False) the input variables.
4. n_jobs is an integer or None (default) giving the number of jobs used in parallel computation. None usually means one job, and -1 means use all processors.
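
As a rough illustration of the fit_intercept option described above, the sketch below fits the same sample data twice, once with the default and once with fit_intercept=False. The variable names with_intercept and no_intercept are made up for this example, and the other parameters are left at their defaults.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same sample data as in step 2 above.
x = np.array([5, 15, 25, 35, 45, 55]).reshape(-1, 1)
y = np.array([5, 20, 14, 32, 22, 38])

# Default: the intercept b0 is estimated from the data.
with_intercept = LinearRegression().fit(x, y)

# fit_intercept=False forces the fitted line through the origin (b0 = 0).
no_intercept = LinearRegression(fit_intercept=False).fit(x, y)

print(with_intercept.intercept_, with_intercept.coef_)
print(no_intercept.intercept_, no_intercept.coef_)
```

Without an intercept the slope has to absorb the offset, so the two fitted slopes differ.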

In [6]:
# Use the default values of all parameters.
# .fit() calculates the optimal values of the weights b0 and b1,
# using the existing input and output (x and y) as the arguments.
model.fit(x, y)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

In [7]:
# step 4 get results
# the arguments of .score() are also the predictor x and the response y,
# and the return value is R^2
r_sq = model.score(x, y)
r_sq        # coefficient of determination

0.715875613747954
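
To make the R^2 value less of a black box, here is a minimal sketch that recomputes it by hand as 1 - SS_res / SS_tot. It uses np.polyfit as a stand-in for the fitted model, and the names y_hat, ss_res, ss_tot, and r_sq_manual are illustrative only.

```python
import numpy as np

x = np.array([5, 15, 25, 35, 45, 55])
y = np.array([5, 20, 14, 32, 22, 38])

# Fit a straight line; for degree 1, np.polyfit returns [slope, intercept].
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

# Residual sum of squares and total sum of squares.
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)

r_sq_manual = 1 - ss_res / ss_tot
print(r_sq_manual)
```

The result should agree with model.score(x, y) up to floating-point rounding.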

In [8]:
model.intercept_    # the intercept, b0

5.633333333333329

In [9]:
model.coef_        # the slope, b1

array([0.54])
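
For a single predictor, b0 and b1 can also be obtained from the usual closed-form least-squares formulas. The sketch below is one way to check the values above with NumPy alone. It is not how scikit-learn computes them internally.

```python
import numpy as np

x = np.array([5, 15, 25, 35, 45, 55])
y = np.array([5, 20, 14, 32, 22, 38])

x_mean, y_mean = x.mean(), y.mean()

# Slope: covariance of x and y divided by the variance of x.
b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)

# Intercept: the fitted line passes through the point (x_mean, y_mean).
b0 = y_mean - b1 * x_mean

print(b0, b1)
```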

In [10]:
# step 5 predict
x_new = np.arange(5).reshape(-1, 1)
y_new = model.predict(x_new)
y_new

array([5.63333333, 6.17333333, 6.71333333, 7.25333333, 7.79333333])

#### II. Multiple Linear Regression

In [11]:
# step 1
import numpy as np
from sklearn.linear_model import LinearRegression

In [12]:
# step 2
x = [[0, 1], [5, 1], [15, 2], [25, 5], [35, 11], [45, 15], [55, 34], [60, 35]]
y = [4, 5, 20, 14, 32, 22, 38, 43]
x, y = np.array(x), np.array(y)
display(x, y)

array([[ 0,  1],
       [ 5,  1],
       [15,  2],
       [25,  5],
       [35, 11],
       [45, 15],
       [55, 34],
       [60, 35]])

array([ 4,  5, 20, 14, 32, 22, 38, 43])

In [13]:
# step 3
model2 = LinearRegression()
model2.fit(x, y)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

In [14]:
# step 4
r_sq = model2.score(x, y)
r_sq          # coefficient of determination


0.8615939258756776

In [15]:
display(model2.intercept_, model2.coef_)   # intercept and coefficients

5.52257927519819

array([0.44706965, 0.25502548])
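
The same intercept and coefficients can be cross-checked with an ordinary least-squares solve in NumPy. This is a sketch under the assumption that a column of ones is prepended to the inputs so that the first weight acts as b0. The names X and weights are illustrative.

```python
import numpy as np

x = np.array([[0, 1], [5, 1], [15, 2], [25, 5],
              [35, 11], [45, 15], [55, 34], [60, 35]])
y = np.array([4, 5, 20, 14, 32, 22, 38, 43])

# Prepend a column of ones so the first weight plays the role of the intercept.
X = np.column_stack([np.ones(len(x)), x])

# Solve the least-squares problem min ||X w - y||^2.
weights, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = weights

print(b0, b1, b2)
```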

In [17]:
# step 5
y_pred = model2.predict(x)
y_pred

array([ 5.77760476,  8.012953  , 12.73867497, 17.9744479 , 23.97529728,
       29.4660957 , 38.78227633, 41.27265006])

In [18]:
y_pred = model2.intercept_ + np.sum(model2.coef_ * x, axis = 1)
y_pred

array([ 5.77760476,  8.012953  , 12.73867497, 17.9744479 , 23.97529728,
       29.4660957 , 38.78227633, 41.27265006])

In [19]:
# predict new value
x_new = np.arange(10).reshape((-1, 2))
x_new


array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

In [22]:
y_new = model2.predict(x_new)
y_new.reshape((-1,1))

array([[ 5.77760476],
       [ 7.18179502],
       [ 8.58598528],
       [ 9.99017554],
       [11.3943658 ]])

#### III. Polynomial Regression 
There is only one extra step: transform the array of inputs to include non-linear terms such as x^2.
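
To show what "including non-linear terms" means in practice, the sketch below builds the extra x^2 column by hand with NumPy. The name x_poly is made up for this illustration, and the snippet only mimics the idea of the transformation rather than calling the library.

```python
import numpy as np

x = np.array([5, 15, 25, 35, 45, 55]).reshape(-1, 1)

# Stack the original column next to its square: each row becomes [x, x^2].
x_poly = np.hstack([x, x ** 2])

print(x_poly)
```

The regression is still linear in these two columns, so the same LinearRegression workflow applies to the transformed input.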

In [25]:
# step 1
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

In [26]:
# step 2a provide data
x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([15, 11, 2, 8, 25, 32])
display(x, y)

array([[ 5],
       [15],
       [25],
       [35],
       [45],
       [55]])

array([15, 11,  2,  8, 25, 32])

In [27]:
# step 2b transform input data
transformer = PolynomialFeatures(degree=2, include_bias=False)