https://realpython.com/linear-regression-in-python/

# Simple Linear Regression

* There are 7 basic steps when you’re implementing linear regression:

1. Import the packages and classes that you need.
2. Provide data to work with, and apply appropriate transformations if needed.
3. Create a regression model and fit it with existing data.
4. Check the results of model fitting to know whether the model is satisfactory.
5. Apply the model for predictions.
6. Check the error of your model.
7. Predict unknown data.

## Step 1: Import packages and classes

In [None]:
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

## Step 2: Provide data

- Create the input (x) and response (y) data shown in the table below:
![image.png](attachment:image.png)

In [None]:
x = np.array([10, 15, 25, 35, 45, 60])
y = np.array([5, 20, 22, 30, 25, 44])

* Visualize the data with a scatter plot:

In [None]:
plt.figure(figsize=(5, 5))
plt.scatter(x, y)
plt.xlabel("x")  # input
plt.ylabel("y")  # response

## Step 3: Create a model and fit it

- The next step is to create a linear regression model and fit it using the existing data.
- Create an instance of the class *LinearRegression*, which will represent the regression model:

In [None]:
model = LinearRegression()

- You should call *.reshape()* on x because this array must be two-dimensional, or more precisely, it must have one column and as many rows as necessary. That’s exactly what the argument (-1, 1) of .reshape() specifies.

In [None]:
x = x.reshape((-1, 1))
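To see what the reshape does, the short sketch below compares the array shapes before and after the call. The names `x_flat` and `x_col` are illustrative only and do not appear elsewhere in this notebook.

```python
import numpy as np

# Same input values as above, kept under separate (hypothetical) names
x_flat = np.array([10, 15, 25, 35, 45, 60])

# -1 asks NumPy to infer the number of rows; 1 fixes a single column
x_col = x_flat.reshape((-1, 1))

print(x_flat.shape)  # (6,)   one-dimensional
print(x_col.shape)   # (6, 1) six rows, one column
```

The values are unchanged; only the layout differs, which is what scikit-learn expects for a single feature.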

- It’s time to start using the model. First, you need to call .fit() on model:
- With .fit(), you calculate the optimal values of the weights 𝑏₀ and 𝑏₁, using the existing input and output, x and y, as the arguments. In other words, .fit() fits the model. 

In [None]:
model.fit(x, y)
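For a single input variable, ordinary least squares has a well-known closed-form solution. The sketch below computes 𝑏₀ and 𝑏₁ with plain NumPy to illustrate what "optimal weights" means. It is not necessarily how scikit-learn computes them internally, and the names `x_vals` and `y_vals` are illustrative.

```python
import numpy as np

# Same data as in Step 2, under hypothetical names
x_vals = np.array([10, 15, 25, 35, 45, 60], dtype=float)
y_vals = np.array([5, 20, 22, 30, 25, 44], dtype=float)

x_mean, y_mean = x_vals.mean(), y_vals.mean()

# Slope: covariance of x and y divided by the variance of x
b1 = np.sum((x_vals - x_mean) * (y_vals - y_mean)) / np.sum((x_vals - x_mean) ** 2)

# Intercept: the fitted line passes through the point (x_mean, y_mean)
b0 = y_mean - b1 * x_mean

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
```

These values should agree with `model.intercept_` and `model.coef_` after fitting.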

## Step 4: Get results
- Once you have your model fitted, you can get the results to check whether the model works satisfactorily and to interpret it.

- You can obtain the coefficient of determination, 𝑅², with .score() called on model:

In [None]:
r_sq = model.score(x, y)
print(f"coefficient of determination: {r_sq}")

- When you’re applying .score(), the arguments are also the predictor x and response y, and the return value is 𝑅².
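As a rough check on what .score() returns, 𝑅² can be computed by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below refits the same data under illustrative names (`x_demo`, `y_demo`, `demo_model`) so that it runs on its own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as in Step 2, under hypothetical names
x_demo = np.array([10, 15, 25, 35, 45, 60]).reshape((-1, 1))
y_demo = np.array([5, 20, 22, 30, 25, 44])
demo_model = LinearRegression().fit(x_demo, y_demo)

# Residual sum of squares: squared gaps between actual and predicted values
ss_res = np.sum((y_demo - demo_model.predict(x_demo)) ** 2)

# Total sum of squares: squared gaps between actual values and their mean
ss_tot = np.sum((y_demo - y_demo.mean()) ** 2)

r_sq_manual = 1 - ss_res / ss_tot
print(np.isclose(r_sq_manual, demo_model.score(x_demo, y_demo)))
```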

- The attributes of model are .intercept_, which represents the coefficient 𝑏₀, and .coef_, which represents 𝑏₁:

In [None]:
print(f"intercept: {model.intercept_}")
print(f"slope: {model.coef_}")

In [None]:
model.coef_

According to the values above, we can interpret the model as follows:

- The value of 𝑏₀ is approximately 4.95. This means that your model predicts the response 4.95 when 𝑥 is zero.
- The value 𝑏₁ = 0.61 means that the predicted response rises by 0.61 when 𝑥 is increased by one.
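Both interpretations can be checked numerically. The sketch below, which refits the same data under illustrative names, predicts at 𝑥 = 0 and compares two predictions one unit apart. The inputs 20 and 21 are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as in Step 2, under hypothetical names
x_demo = np.array([10, 15, 25, 35, 45, 60]).reshape((-1, 1))
y_demo = np.array([5, 20, 22, 30, 25, 44])
demo_model = LinearRegression().fit(x_demo, y_demo)

# The prediction at x = 0 equals the intercept b0
at_zero = demo_model.predict(np.array([[0.0]]))[0]

# Raising x by one unit raises the prediction by the slope b1
step = (demo_model.predict(np.array([[21.0]]))[0]
        - demo_model.predict(np.array([[20.0]]))[0])

print(np.isclose(at_zero, demo_model.intercept_))
print(np.isclose(step, demo_model.coef_[0]))
```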

## Step 5: Predict response
- Once you have a satisfactory model, then you can use it for predictions with either existing or new data. To obtain the predicted response, use .predict():

In [None]:
y_pred = model.predict(x)
print(f"predicted response:\n{y_pred}")

- You can also compute the predictions directly from the regression equation, 𝑏₀ + 𝑏₁𝑥:

In [None]:
# x is two-dimensional here, so the result has shape (6, 1) rather than (6,)
y_pred = model.intercept_ + model.coef_ * x
print(f"predicted response:\n{y_pred}")
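The equation-based result differs from the .predict() output only in shape: it is a column, while .predict() returns a one-dimensional array. A minimal sketch of the comparison, using illustrative names and refitting the same data so it runs on its own:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as in Step 2, under hypothetical names
x_demo = np.array([10, 15, 25, 35, 45, 60]).reshape((-1, 1))
y_demo = np.array([5, 20, 22, 30, 25, 44])
demo_model = LinearRegression().fit(x_demo, y_demo)

from_equation = demo_model.intercept_ + demo_model.coef_ * x_demo  # shape (6, 1)
from_predict = demo_model.predict(x_demo)                          # shape (6,)

# Flatten the column with .ravel() before comparing element by element
print(from_equation.shape, from_predict.shape)
print(np.allclose(from_equation.ravel(), from_predict))
```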

## Step 6: Check the error

- For linear regression problems, people typically use a cost function that measures the distance between the linear model’s predictions and the training examples; the objective is to minimize this distance.
- Let’s use the **Mean Squared Error (MSE)** as the cost function.
- MSE is the average of the squared differences between the actual and predicted values over all observations.


![image-3.png](attachment:image-3.png)
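The definition translates directly into NumPy. The sketch below computes MSE by hand and compares it with scikit-learn's `mean_squared_error`. It refits the same data under illustrative names (`x_demo`, `y_demo`, `y_hat`) so that it runs on its own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Same data as in Step 2, under hypothetical names
x_demo = np.array([10, 15, 25, 35, 45, 60]).reshape((-1, 1))
y_demo = np.array([5, 20, 22, 30, 25, 44])
y_hat = LinearRegression().fit(x_demo, y_demo).predict(x_demo)

# MSE: mean of the squared differences between actual and predicted values
mse_manual = np.mean((y_demo - y_hat) ** 2)
mse_sklearn = mean_squared_error(y_demo, y_hat)

# RMSE is the square root of MSE, expressed in the same units as y
rmse_manual = np.sqrt(mse_manual)

print(np.isclose(mse_manual, mse_sklearn))
print(f"MSE = {mse_manual:.3f}, RMSE = {rmse_manual:.3f}")
```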

In [None]:
plt.figure(figsize=(5, 5))
plt.scatter(x, y)                  # observed data
plt.plot(x, y_pred, color="red")   # fitted regression line

In [None]:
from sklearn.metrics import mean_squared_error, r2_score

In [None]:
# RMSE is the square root of MSE, in the same units as y
rmse = np.sqrt(mean_squared_error(y, y_pred))
print(f"RMSE: {rmse}")

In [None]:
# Should match the value returned by model.score(x, y) in Step 4
r2 = r2_score(y, y_pred)
r2

## Step 7: Predict new data with your model

In [None]:
# input() returns a string, so convert it to float before building the array
new_data = np.array([float(input("Input the number: "))])
predicted_val = model.predict(new_data.reshape(-1, 1))
print("The predicted value is", predicted_val[0])
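The same call also works for several new inputs at once, without interactive input. A minimal sketch follows; the values 5, 20 and 70 are arbitrary examples, and the names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same data as in Step 2, under hypothetical names
x_demo = np.array([10, 15, 25, 35, 45, 60]).reshape((-1, 1))
y_demo = np.array([5, 20, 22, 30, 25, 44])
demo_model = LinearRegression().fit(x_demo, y_demo)

# New inputs must be two-dimensional too: one row per value, one column
new_x = np.array([5, 20, 70]).reshape((-1, 1))
new_pred = demo_model.predict(new_x)

for value, pred in zip(new_x.ravel(), new_pred):
    print(f"x = {value}: predicted y = {pred:.2f}")
```

Note that 5 and 70 lie outside the range of the training inputs, so those predictions are extrapolations and should be treated with more caution.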