# Least squares example 4

In this example, we will use least squares to fit a linear model to noisy data. We will then inspect the
fit in three ways:

1. We will calculate R² and the root mean squared error (RMSE).
2. We will plot the measured $y$ vs. the predicted $\hat{y}$ from the fitted model.
3. We will plot the residuals.

In [None]:
import numpy as np
from matplotlib import pyplot as plt
from sklearn.datasets import make_regression  # A function for generating data for regression
from sklearn.linear_model import LinearRegression
%matplotlib notebook
# First, we will generate some data:
intercept = 10
X, y, coef = make_regression(
    n_samples=100, n_features=1, coef=True, noise=50, bias=intercept, random_state=1234
)
# Let us also calculate the underlying (noise-free) model:
y_true = intercept + coef * X
print('True coefficients:', coef)
print(y.shape, X.shape)

# Plot the generated points:
fig, ax = plt.subplots()
ax.scatter(X, y, label='Data points ("measurements")')
# Plot the true line:
ax.plot(X, y_true, color='k', label='True line', lw=3)
ax.set(xlabel='x', ylabel='y', title='Generated data')
ax.legend();

## Performing least squares fitting

In [None]:
model = LinearRegression(fit_intercept=True)
model.fit(X, y)
print(f'Intercept: {model.intercept_:.3g}')
print(f'Coefficients: {model.coef_}')
# Use the model to calculate y-values for the given X-values:
y_predict = model.predict(X)
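
To make the fitting step less of a black box, here is a minimal sketch of a least squares fit done directly with NumPy. It uses its own small, made-up data set (the names `x_demo`, `y_demo` and the true values 10 and 3 are assumptions for illustration, not the data generated above). It solves the problem with `np.linalg.lstsq` and compares the result with the closed-form solution for a single variable.

```python
import numpy as np

# Hypothetical stand-in data (not the data generated above): y = 10 + 3x + noise
rng = np.random.default_rng(0)
x_demo = rng.normal(size=100)
y_demo = 10 + 3 * x_demo + rng.normal(scale=0.5, size=100)

# Design matrix with a column of ones, so the first coefficient is the intercept
A = np.column_stack([np.ones_like(x_demo), x_demo])

# Find b that minimises the sum of squared residuals ||A b - y||²
b, *_ = np.linalg.lstsq(A, y_demo, rcond=None)
intercept_demo, slope_demo = b

# For a single variable, the solution also has a closed form
slope_closed = np.cov(x_demo, y_demo, bias=True)[0, 1] / np.var(x_demo)
intercept_closed = y_demo.mean() - slope_closed * x_demo.mean()

print(f'lstsq:       intercept = {intercept_demo:.3g}, slope = {slope_demo:.3g}')
print(f'closed form: intercept = {intercept_closed:.3g}, slope = {slope_closed:.3g}')
```

Both routes should give the same intercept and slope, close to the values used to generate the data.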

## Calculating R²

The [coefficient of determination](https://en.wikipedia.org/wiki/Coefficient_of_determination) (R²) is given by

\begin{equation}
R^{2}=1-\frac{SS_{\text{res}}}{SS_{\text{tot}}}
\end{equation}

where $SS_{\text{res}}$ is the sum of squared residuals $e_{i}=y_{i}-\hat{y}_{i}$ and $SS_{\text{tot}}$ is the total sum of squares around the mean $\bar{y}$ of the measured values:

\begin{equation}
SS_{\text{res}}=\sum _{i}(y_{i}-\hat{y}_{i})^{2}=\sum _{i}e_{i}^{2}, \quad SS_{\text{tot}}=\sum _{i}(y_{i}-{\bar {y}})^{2}
\end{equation}

Effectively, we are comparing our model to just using the average value of $y$:
* If R² is 0, then our prediction is no better than just predicting the average value of $y$ for all points.
* If R² is 1, then the error term (sum of squared residuals) is 0. This means that we predict $y$ perfectly.
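
The two limiting cases can be checked on a few made-up numbers. The sketch here uses a hypothetical helper, `r_squared`, and an illustrative array, `y_small`; neither is part of the example itself. It also shows a third case: predictions that are worse than the mean give a negative R².

```python
import numpy as np


def r_squared(y_obs, y_pred):
    """Hypothetical helper: R² = 1 - SS_res / SS_tot."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return 1 - ss_res / ss_tot


y_small = np.array([1.0, 2.0, 4.0, 7.0])  # Made-up values for illustration

# Predicting the mean for every point: SS_res equals SS_tot
r2_mean = r_squared(y_small, np.full_like(y_small, y_small.mean()))
# Perfect prediction: SS_res is zero
r2_perfect = r_squared(y_small, y_small)
# Predictions in reversed order: worse than predicting the mean
r2_worse = r_squared(y_small, y_small[::-1])

print(f'Mean prediction:    R² = {r2_mean:.3g}')
print(f'Perfect prediction: R² = {r2_perfect:.3g}')
print(f'Reversed values:    R² = {r2_worse:.3g}')
```

For a least squares fit with an intercept, evaluated on the data it was fitted to, R² stays between 0 and 1. Negative values can appear when a model is evaluated on other data or was not fitted by least squares.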

In [None]:
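y_hat = model.predict(X)  # Same values as y_predict above

SSR = np.sum((y - y_hat)**2)  # Sum of squared residuals, SS_res
SST = np.sum((y - y.mean())**2)  # Total sum of squares, SS_tot

rsquared = 1 - SSR/SST

# Root mean squared error:
rmse = np.sqrt(np.mean((y - y_hat)**2))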

print(f'R² = {rsquared:.3g}')
print(f'RMSE = {rmse:.3g}')
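
For reference, the root mean squared error (RMSE) calculated above is commonly defined as follows, where $n$ (a symbol introduced here) is the number of data points:

\begin{equation}
\text{RMSE}=\sqrt{\frac{1}{n}\sum _{i}(y_{i}-\hat{y}_{i})^{2}}=\sqrt{\frac{SS_{\text{res}}}{n}}
\end{equation}

Unlike R², the RMSE has the same units as $y$, so it can be read as a typical size of the residuals.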

In [None]:
# Alternative calculation:
from sklearn.metrics import r2_score, mean_squared_error

rsquared_ = r2_score(y, y_hat)
rmse_ = np.sqrt(mean_squared_error(y, y_hat))
print(f'R² = {rsquared_:.3g}')
print(f'RMSE = {rmse_:.3g}')

In [None]:
# Since we have 1D data, we can plot the predicted y vs x:
fig, ax = plt.subplots()
ax.scatter(X, y, label='Data points ("measurements")')
ax.plot(X, y_true, color='k', label='True line', lw=2)
ax.plot(X, y_hat, color='darkorange', label=f'Predicted by model (R²: {rsquared:.3g})', lw=3)
ax.set(xlabel='x', ylabel='y', title='Generated data + linear fit')
ax.legend();

## Plotting y vs ŷ

A plot of y vs. ŷ is especially useful when the model has many variables, since we then cannot plot the predicted y as a
function of all of them. Points that lie close to the line y = ŷ are predicted well:

In [None]:
fig, ax = plt.subplots()
ax.scatter(y_hat, y)
ax.set(xlabel='ŷ', ylabel='y', title=f'Predicted vs. measured y (R²: {rsquared:.3g})')
# Add the y = ŷ line, with limits taken from the data:
lims = [min(y.min(), y_hat.min()), max(y.max(), y_hat.max())]
ax.plot(lims, lims, lw=2, color='k', ls=':')
ax.set_aspect('equal', adjustable='box')
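
To illustrate the case with many variables, here is a minimal sketch with five variables. The data, the coefficients in `true_coef` and all names ending in `_multi` are made up for illustration and are not part of the example above. The fit is done with `np.linalg.lstsq`, and the result is still summarised in a single 2D plot.

```python
import numpy as np
from matplotlib import pyplot as plt

# Hypothetical data with five variables (not the data set used above)
rng = np.random.default_rng(1)
X_multi = rng.normal(size=(200, 5))
true_coef = np.array([4.0, -2.0, 0.5, 3.0, 1.0])  # Made-up coefficients
y_multi = 10 + X_multi @ true_coef + rng.normal(scale=2.0, size=200)

# Least squares fit, with a column of ones for the intercept
A = np.column_stack([np.ones(len(X_multi)), X_multi])
b, *_ = np.linalg.lstsq(A, y_multi, rcond=None)
y_hat_multi = A @ b

# One plot of y vs. ŷ summarises the fit, however many variables there are
fig, ax = plt.subplots()
ax.scatter(y_hat_multi, y_multi)
# Take the limits of the y = ŷ line from the data
lims = [min(y_multi.min(), y_hat_multi.min()), max(y_multi.max(), y_hat_multi.max())]
ax.plot(lims, lims, lw=2, color='k', ls=':')
ax.set(xlabel='ŷ', ylabel='y', title='Predicted vs. measured y (5 variables)')
ax.set_aspect('equal', adjustable='box')
```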

## Plotting the residuals

If the model is good, the residuals should look like random noise: scattered around zero, with no trend or pattern as ŷ changes. We can check this visually by plotting the residuals against ŷ.

In [None]:
residuals = y - y_hat

fig, ax = plt.subplots()
ax.scatter(y_hat, residuals)
ax.set(xlabel='ŷ', ylabel='y - ŷ', title=f'Residuals (R²: {rsquared:.3g})')
# Add y=0 line
ax.axhline(y=0, lw=2, color='k', ls=':');
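
To see what residuals look like when they are not just noise, here is a minimal sketch that fits a straight line to curved data. The data set and the helper `fit_residuals` are hypothetical and not part of the example above. The residuals of the straight-line fit still carry the x² term that the model is missing, while adding that term to the model removes the pattern.

```python
import numpy as np

# Hypothetical curved data (not the data set used above): y = 1 + 2x + 3x² + noise
rng = np.random.default_rng(2)
x_curve = rng.uniform(-3, 3, size=200)
y_curve = 1 + 2 * x_curve + 3 * x_curve**2 + rng.normal(scale=1.0, size=200)


def fit_residuals(A, y_obs):
    """Hypothetical helper: least squares fit of y_obs on the columns of A, returns y - ŷ."""
    b, *_ = np.linalg.lstsq(A, y_obs, rcond=None)
    return y_obs - A @ b


ones = np.ones_like(x_curve)
# Straight-line model: intercept + x
res_line = fit_residuals(np.column_stack([ones, x_curve]), y_curve)
# Quadratic model: intercept + x + x²
res_quad = fit_residuals(np.column_stack([ones, x_curve, x_curve**2]), y_curve)

# Correlation between the residuals and the x² term
corr_line = np.corrcoef(res_line, x_curve**2)[0, 1]
corr_quad = np.corrcoef(res_quad, x_curve**2)[0, 1]
print(f'Straight line: corr(residuals, x²) = {corr_line:.2f}')
print(f'Quadratic:     corr(residuals, x²) = {corr_quad:.2f}')
```

A strong correlation for the straight-line model indicates structure left in the residuals. In a residual plot, this would show up as a clear curve instead of a random scatter around zero.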