### Quantifying Regression

The goal is to find a way to quantify the error in the model.

In [2]:
# Dependencies
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression



In [3]:
# Generate some random data suited to a linear regression model

X, y = make_regression(n_samples=20, n_features=1, random_state=0, noise=4, bias=100.0)
print(X.shape)
print(y.shape)

(20, 1)
(20,)


In [4]:
# create the model and fit it
model = LinearRegression()
model.fit(X, y)

LinearRegression()

### Quantifying our Model: Finding our 'Loss'

* Mean Squared Error (MSE)
* R2 Score

In [5]:
# We will quantify how well our model does using r2 and MSE
from sklearn.metrics import mean_squared_error, r2_score

# First, use the fitted model to predict
predicted = model.predict(X)

# Score the prediction with MSE and R2
mse = mean_squared_error(y, predicted)
r2 = r2_score(y, predicted)

print("Mean Squared Error: {}".format(mse))
print("R-Squared: {}".format(r2))

Mean Squared Error: 11.933040779746149
R-Squared: 0.903603363418708


### What is Mean Squared Error? (technical)

NOTE: In "(y, predicted)", y holds our actual (provided) values and predicted is the result of model.predict(X).

Mean Squared Error - for each sample, subtract the predicted value from the actual value y and square the difference. Then find the mean of all those squares. The result is the average squared difference between the actual and predicted values.
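To make the steps concrete, here is a minimal sketch that computes MSE by hand with NumPy and compares it against `mean_squared_error`. The arrays `y_true` and `y_pred` are made-up values for illustration, not the notebook's data.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical actual and predicted values, chosen only for illustration
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 7.0, 10.0])

# Step 1: subtract predicted from actual
residuals = y_true - y_pred

# Step 2: square each difference
squared = residuals ** 2

# Step 3: take the mean of the squares
mse_manual = squared.mean()

print(mse_manual)                           # 0.375
print(mean_squared_error(y_true, y_pred))   # same value from sklearn
```

Because the differences are squared, one large miss contributes far more to the score than several small ones.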

### What is R Squared?

R2 - the proportion of the variance in the actual values that the regression captured
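One common way to write this is R2 = 1 - (sum of squared residuals) / (total sum of squares around the mean of y). The sketch below computes it by hand and checks it against `r2_score`. The arrays `y_true` and `y_pred` are made-up values for illustration, not the notebook's data.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical actual and predicted values, chosen only for illustration
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 7.0, 10.0])

# Variation the model failed to explain: squared residuals
ss_res = np.sum((y_true - y_pred) ** 2)

# Total variation in the actual values around their mean
ss_tot = np.sum((y_true - y_true.mean()) ** 2)

r2_manual = 1 - ss_res / ss_tot

print(r2_manual)                  # approximately 0.925
print(r2_score(y_true, y_pred))   # same value from sklearn
```

A model that always predicts the mean of y scores 0 under this definition, and a model that does worse than that can score below 0.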

### Bottom Line:

A "good" MSE score will be close to zero while a "good" R squared score will be close to 1.

R squared is the default scoring for the Sklearn regression models: it is what their score method returns.

The "best" way to test your model is to input your own made-up values and check how accurate the predictions are / how big of an error you receive.
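A rough sketch of that idea follows. It rebuilds the same data with `coef=True`, which asks `make_regression` to also return the true slope, so the made-up inputs have a known noise-free answer to compare against. The inputs in `X_new` and the names `true_coef`, `y_expected`, and `new_mse` are assumptions introduced for this illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Same settings as above, plus coef=True to also get the true coefficient
X, y, true_coef = make_regression(
    n_samples=20, n_features=1, random_state=0,
    noise=4, bias=100.0, coef=True,
)
model = LinearRegression().fit(X, y)

# Made-up inputs the model has not seen (hypothetical values)
X_new = np.array([[-1.0], [0.0], [0.5], [2.0]])

# Noise-free values the generator would produce: bias + slope * x
slope = np.ravel(true_coef)[0]
y_expected = 100.0 + slope * X_new[:, 0]

# Compare the model's predictions with the expected values
y_new_pred = model.predict(X_new)
new_mse = mean_squared_error(y_expected, y_new_pred)

print("Predicted:", y_new_pred)
print("Expected: ", y_expected)
print("MSE on made-up values:", new_mse)
```

With real data the true relationship is unknown, so the usual substitute is to hold back some labeled samples and score the model on those instead.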

In [7]:
# Continue...

# model.score gives R2
model.score(X, y)

0.903603363418708