### Simple Linear Regression Model
#### x = np.array([1, 2, 3, 4, 5, 6]).reshape(-1, 1)
#### y = np.array([6,1,9,5,17,12])
- Build a linear regression model for y in terms of x.
- Print out the slope and intercept.
- For all the values of x, what are the predicted values of y?
- Draw a scatter plot for x and y.
- Add the line of best fit to the plot.
- Find R squared from the linear model (model.score(X,y))

In [None]:
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error

X = np.array([1, 2, 3, 4, 5, 6]).reshape(-1, 1)
y = np.array([6,1,9,5,17,12])
print("X shape: ", X.shape)
print("y shape: ", y.shape)

model = LinearRegression()
model.fit(X, y)

print("Slope (a): ", model.coef_)
print("Intercept (b): ", model.intercept_)

y_hat = model.predict(X)
print("Predicted value of y for X: ", y_hat, sep='\n')

plt.scatter(X, y, color='blue', label='Actual')
plt.plot(X, y_hat, color='green', label='Predicted', linewidth=3)
plt.xlabel('x')
plt.ylabel('y')
plt.legend(loc='upper left')
plt.title('Predicted and Actual Values of y')
plt.show()
# plt.savefig('plots/p2predictedActual.png')

print("R squared: ", model.score(X, y))

### Calculate parameters without sklearn.linear_model.LinearRegression
#### For the regression line y = a x + b, and x and y given above, calculate the values of a and b using the following equations.
- $a = \dfrac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$
- $b = \dfrac{\sum y - a\sum x}{n}$
##### a is the slope and b is the intercept in the linear regression model. The values of a and b should be the same as found by LinearRegression fit() function.
##### Use x*y to multiply the corresponding elements of x and y.
##### Use sum (or np.sum) to sum the elements of an array.
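
As a worked check of the formulas for a and b, the sums for the six data points can be computed by hand. The arithmetic below is a sketch derived from the x and y given above, with n = 6.

```latex
\begin{aligned}
n &= 6,\quad \sum x = 21,\quad \sum y = 50,\quad \sum xy = 212,\quad \sum x^2 = 91 \\
a &= \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}
   = \frac{6(212) - (21)(50)}{6(91) - 21^2} = \frac{222}{105} \approx 2.114 \\
b &= \frac{\sum y - a\sum x}{n}
   = \frac{50 - \tfrac{222}{105}(21)}{6} = \frac{5.6}{6} \approx 0.933
\end{aligned}
```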

In [None]:
X = np.array([1, 2, 3, 4, 5, 6]).reshape(-1, 1)
y = np.array([6,1,9,5,17,12])

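# Flatten X to 1-D so that x*y multiplies element-wise
# (a (6,1) array times a (6,) array would broadcast to a 6x6 matrix)
x = X.ravel()
n = len(x)
print(n)

a_numerator = n*np.sum(x*y) - np.sum(x)*np.sum(y)
a_denominator = n*np.sum(np.square(x)) - np.square(np.sum(x))
a = a_numerator/a_denominator
print("Slope (a): ", a)

b_numerator = np.sum(y) - a*np.sum(x)
b = b_numerator/n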
print("Intercept (b): ", b)
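
As an independent cross-check of the hand calculation, a degree-1 polynomial fit with NumPy should return the same slope and intercept. This is a minimal sketch and not part of the original exercise: `np.polyfit` is swapped in for `LinearRegression` here, and the variable names are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])       # 1-D, so x * y is element-wise
y = np.array([6, 1, 9, 5, 17, 12])
n = len(x)

# Closed-form least-squares estimates of slope (a) and intercept (b)
a = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
b = (np.sum(y) - a * np.sum(x)) / n

# np.polyfit with deg=1 returns the coefficients as [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)

print("By hand:", a, b)
print("polyfit:", slope, intercept)
```

Both lines should print the same pair of values up to floating-point rounding.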

#### x = np.array([1, 2, 3, 4, 5, 6]).reshape(-1, 1)
#### y = np.array([6,1,9,5,17,12])
- Calculate R squared by hand
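- Find R squared from the linear model (model.score(X, y))
- Calculate RMSE by hand
- Calculate RMSE using mean_squared_error and setting the parameter squared to False. (Newer scikit-learn releases may no longer accept this parameter; taking the square root of the returned MSE gives the same result.)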
##### Use np.square to square an array.
##### Use np.mean to get the mean of an array.
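
For reference, the by-hand quantities are usually defined as follows, where $\hat{y}_i$ is the model's prediction for $x_i$ and $\bar{y}$ is the mean of y. The names SSres and SStot match the variables used in the code cells of this notebook.

```latex
\begin{aligned}
SS_{res} &= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad
SS_{tot} = \sum_{i=1}^{n} (y_i - \bar{y})^2 \\
R^2 &= 1 - \frac{SS_{res}}{SS_{tot}}, \qquad
RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
\end{aligned}
```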

### From the model

In [None]:
import math

X = np.array([1, 2, 3, 4, 5, 6]).reshape(-1, 1)
y = np.array([6,1,9,5,17,12])

model = LinearRegression()
model.fit(X, y)

r_squared = model.score(X, y)
print('Coefficient of determination (R^2):', r_squared)
# sqrt gives |r|; the sign of r matches the sign of the slope (positive here)
print('Correlation coefficient (r):', math.sqrt(r_squared))
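
For a simple linear regression with an intercept, R squared equals the square of the Pearson correlation between x and y, so the correlation can also be computed directly. The sketch below assumes NumPy only and uses `np.corrcoef`, which keeps the sign of r, whereas a square root of R squared is always non-negative. The least-squares line is fitted with `np.polyfit` here rather than `LinearRegression`.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([6, 1, 9, 5, 17, 12])

# Pearson correlation: off-diagonal entry of the 2x2 correlation matrix
r = np.corrcoef(x, y)[0, 1]

# R squared from a least-squares line fitted with np.polyfit
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept
ss_res = np.sum((y - y_hat) ** 2)        # residual sum of squares
ss_tot = np.sum((y - np.mean(y)) ** 2)   # total sum of squares
r_squared = 1 - ss_res / ss_tot

print("r:", r)
print("r squared:", r ** 2)
print("R squared:", r_squared)
```

The last two printed values should agree up to floating-point rounding.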

### By Hand

In [None]:
y_hat = model.predict(X)
print('predicted values:', y_hat)

errors = y_hat-y

# Residual sum of squares (sum of squared errors)
SSres = sum(np.square(errors))
# print(SSres)

# Total sum of squares (squared deviations of y from its mean)
SStot = sum(np.square(y-np.mean(y)))
# print(SStot)

# R squared
rSquared = 1 - (SSres/SStot)
print("Calculated R squared", rSquared)
print("Using sklearn.linear_model.LinearRegression:", model.score(X,y))

#### Calculate RMSE by hand

In [None]:
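from sklearn.metrics import mean_squared_error

rmse = math.sqrt(np.mean(np.square(errors)))
print("Calculated RMSE: ", rmse)
# mean_squared_error returns the MSE, so take the square root to get the RMSE
# (older scikit-learn versions also accept mean_squared_error(y, y_hat, squared=False))
print("RMSE using mean_squared_error(): ", math.sqrt(mean_squared_error(y, y_hat)))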