# Project - Linear Regression
## Phi Le
### DATA - 4319
### FALL 2021

![](https://i2.wp.com/techvidvan.com/tutorials/wp-content/uploads/sites/2/2020/03/linear-regressions.jpg?ssl=1)

The linear regression method predicts the value of a dependent variable from the values of one or more independent variables.

The linear regression model is suitable for predicting the value of a continuous quantity.

Put another way, the linear regression model represents the relationship between the input variable (x) and the output variable (y) of a dataset as a line given by the equation

![](https://www.alpharithms.com/wp-content/uploads/1156/slope-intercept-vs-linear-regression-equations.jpg)

$$
y = b_0 + b_1x
$$

Where,

- y is the dependent variable whose value we want to predict.
- x is the independent variable whose values are used for predicting the dependent variable.
- $b_0$ and $b_1$ are constants: $b_0$ is the y-intercept and $b_1$ is the slope.

The aim of the method is to find the values of $b_0$ and $b_1$ that give the best-fit line, that is, the line that lies as close as possible to the data points overall.
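
"Best fit" is most commonly made precise through ordinary least squares, which is also the criterion that scikit-learn's LinearRegression uses. As a sketch, assuming $n$ observations written $(x_i, y_i)$ with means $\bar{x}$ and $\bar{y}$ (notation introduced here for illustration), the method chooses $b_0$ and $b_1$ to minimize the sum of squared residuals:

$$
\sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2
$$

For a single independent variable, the minimizing values have a closed form:

$$
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad b_0 = \bar{y} - b_1 \bar{x}
$$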

### Project Example

$\textbf{Step 1: Import packages and classes}$

The first step is to import the package numpy and the class LinearRegression from sklearn.linear_model:

In [1]:
import numpy as np
from sklearn.linear_model import LinearRegression

$\textbf{Step 2: Provide data}$

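The second step is to define the data to work with. The inputs (regressors, 𝑥) and output (response, 𝑦) should be arrays (instances of the class numpy.ndarray) or similar objects. The input 𝑥 must be two-dimensional, with one row per observation and one column per regressor, which is why .reshape((-1, 1)) is called on it. This is the simplest way of providing data for regression: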

In [2]:
x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([5, 20, 14, 32, 22, 38])

In [3]:
print(x)
print(y)

[[ 5]
 [15]
 [25]
 [35]
 [45]
 [55]]
[ 5 20 14 32 22 38]
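
The effect of .reshape((-1, 1)) is easiest to see by comparing array shapes. The following is a small sketch; the names x_flat and x_col are illustrative and not part of the original notebook.

```python
import numpy as np

# One-dimensional array: six values, no explicit column axis
x_flat = np.array([5, 15, 25, 35, 45, 55])
print(x_flat.shape)  # (6,)

# -1 lets numpy infer the number of rows; 1 fixes a single column
x_col = x_flat.reshape((-1, 1))
print(x_col.shape)  # (6, 1)
```

The two-dimensional form is the layout that .fit() expects for its inputs: one row per observation, one column per regressor.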


$\textbf{Step 3: Create a model and fit it}$

The next step is to create a linear regression model and fit it using the existing data.

In [4]:
model = LinearRegression()

In [5]:
model.fit(x, y)

LinearRegression()

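In [6]:
# Equivalent shorthand: .fit() returns the model itself,
# so creation and fitting can be done in one statement
model = LinearRegression().fit(x, y)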

$\textbf{Step 4: Get results}$

Once you have your model fitted, you can get the results to check whether the model works satisfactorily and interpret it.

You can obtain the coefficient of determination ($𝑅^2$) with .score() called on model:

In [7]:
r_sq = model.score(x, y)
print('coefficient of determination:', r_sq)

coefficient of determination: 0.715875613747954
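
For intuition, $R^2$ can be reproduced by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below is a cross-check rather than part of the original notebook: it uses np.polyfit instead of scikit-learn to obtain the least squares line, and the names y_hat, ss_res, ss_tot, and r_sq_manual are illustrative.

```python
import numpy as np

x = np.array([5, 15, 25, 35, 45, 55])
y = np.array([5, 20, 14, 32, 22, 38])

# Fit a degree-1 polynomial by least squares; polyfit returns [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
y_hat = intercept + slope * x

ss_res = np.sum((y - y_hat) ** 2)     # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
r_sq_manual = 1 - ss_res / ss_tot
print(round(float(r_sq_manual), 6))  # 0.715876
```

A value near 1 means the line accounts for most of the variation in y; here roughly 72% of it is explained.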


The attributes .intercept_ and .coef_ of model hold the estimated $b_0$ and $b_1$, respectively:

In [8]:
print('intercept:', model.intercept_)
print('slope:', model.coef_)

intercept: 5.633333333333329
slope: [0.54]
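
As a cross-check, the same intercept and slope can be reproduced with the standard closed-form least squares formulas for a single regressor. This is a minimal sketch using only numpy; the names x_mean, y_mean, b0, and b1 are illustrative.

```python
import numpy as np

x = np.array([5, 15, 25, 35, 45, 55])
y = np.array([5, 20, 14, 32, 22, 38])

x_mean, y_mean = x.mean(), y.mean()

# Slope: sum of cross-deviations divided by sum of squared x-deviations
b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
# Intercept: forces the line through the point (x_mean, y_mean)
b0 = y_mean - b1 * x_mean

print(round(float(b0), 6), round(float(b1), 6))  # 5.633333 0.54
```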


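If y is provided as a two-dimensional array instead, the fitted numbers are the same but gain a dimension: .intercept_ becomes a one-dimensional array and .coef_ a two-dimensional array:

In [9]: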
new_model = LinearRegression().fit(x, y.reshape((-1, 1)))
print('intercept:', new_model.intercept_)
print('slope:', new_model.coef_)

intercept: [5.63333333]
slope: [[0.54]]


$\textbf{Step 5: Predict response}$

Once there is a satisfactory model, you can use it for predictions with either existing or new data.

To obtain the predicted response, use .predict():

In [10]:
y_pred = model.predict(x)
print('predicted response:', y_pred, sep='\n')

predicted response:
[ 8.33333333 13.73333333 19.13333333 24.53333333 29.93333333 35.33333333]
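
For this model, the prediction is simply the line equation evaluated at each input. The sketch below applies it by hand, with the intercept and slope copied from the printed output of the fitted model; the name y_manual is illustrative.

```python
import numpy as np

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))

# Values copied from the printed output of the fitted model above
intercept = 5.633333333333329
slope = np.array([0.54])

# Multiply each input by the slope and add the intercept
y_manual = intercept + slope * x
print(y_manual.shape)  # (6, 1)
print(y_manual.ravel())
```

Note that the manual version keeps the two-dimensional shape of x, whereas .predict() returned a one-dimensional array above; .ravel() flattens the result for comparison.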


In [11]:
x_new = np.arange(5).reshape((-1, 1))
print(x_new)

y_new = model.predict(x_new)
print(y_new)

[[0]
 [1]
 [2]
 [3]
 [4]]
[5.63333333 6.17333333 6.71333333 7.25333333 7.79333333]


Here .predict() is applied to the new regressor x_new and yields the response y_new. This example uses arange() from numpy to generate an array with the elements from 0 (inclusive) to 5 (exclusive), that is, 0, 1, 2, 3, and 4.

Source: https://realpython.com/linear-regression-in-python/