### R-squared (R²)

R-squared, also known as the coefficient of determination, measures the proportion of the variance in the dependent variable that is predictable from the independent variable(s). It provides an indication of the goodness of fit of the model.

#### Mathematical Formula

\[ R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}} \]

Where:
- ${ SS_{\text{res}} }$ (Sum of Squares of Residuals) = ${ \sum (y_i - \hat{y}_i)^2 }$
- ${ SS_{\text{tot}} }$ (Total Sum of Squares) = ${ \sum (y_i - \bar{y})^2 }$
- ${ y_i }$ are the actual values
- ${ \hat{y}_i }$ are the predicted values
- ${ \bar{y} }$ is the mean of the actual values

#### Explanation
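1. **Total Sum of Squares (SS_tot)**: Measures the total variation of the actual values ${ y_i }$ around their mean ${ \bar{y} }$. It equals the squared error of a baseline model that always predicts the mean.
2. **Sum of Squares of Residuals (SS_res)**: Measures the variation the model leaves unexplained, i.e. the sum of squared differences between actual and predicted values.
3. **R-squared Value**: The closer ${ R^2 }$ is to 1, the better the model explains the variation in the dependent variable. A value of 0 means the model does no better than always predicting ${ \bar{y} }$, and the value becomes negative when the model fits worse than that baseline.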
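
To make the scale of ${ R^2 }$ concrete, the following is a minimal sketch that evaluates the formula on three sets of predictions: perfect ones, a constant prediction equal to the mean, and predictions that run in the wrong direction. The helper name `r_squared` and the five values in `y` are made up for illustration and are not part of the sample dataset.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SS_res / SS_tot for two equal-length sequences."""
    y_mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1 - ss_res / ss_tot


y = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical values, chosen for easy arithmetic

perfect = r_squared(y, y)                 # predictions equal the actual values
mean_only = r_squared(y, [3.0] * len(y))  # always predict the mean of y
reversed_fit = r_squared(y, y[::-1])      # predictions run the wrong way

print(perfect, mean_only, reversed_fit)   # prints: 1.0 0.0 -3.0
```

Here ${ SS_{\text{tot}} = 10 }$, so the three residual sums of 0, 10 and 40 give ${ R^2 }$ values of 1, 0 and -3 respectively.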

In [3]:
# Sample dataset
data = {
  'Actual': [3, -0.5, 2, 7, 4.2, 5.1, 8, 6.7, 3.5, 1.3],
  'Predicted': [2.5, 0.0, 2, 8, 4, 5.5, 7.8, 6.4, 3.6, 1.8]
}

In [4]:
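# Plain-Python implementation of R² (no external libraries)

def r_square(y_true, y_pred):
  n = len(y_true)
  y_mean = sum(y_true) / n

  # Total sum of squares: squared deviations of the actual values from their mean
  ss_tot = 0
  for num in y_true:
    ss_tot += (num - y_mean) ** 2

  # Residual sum of squares: squared differences between actual and predicted values
  ss_res = 0
  for i in range(n):
    ss_res += (y_true[i] - y_pred[i]) ** 2

  # Note: ss_tot is 0 (division by zero) when all actual values are identical
  return 1 - (ss_res / ss_tot)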

In [5]:
r_square(data['Actual'], data['Predicted'])

0.9684866030367455
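
As a check on this output, the two sums can be worked out by hand from the sample dataset. The mean of the actual values is ${ \bar{y} = 40.3 / 10 = 4.03 }$, and the squared residuals are listed in the same order as the data:

\[ SS_{\text{res}} = 0.25 + 0.25 + 0 + 1 + 0.04 + 0.16 + 0.04 + 0.09 + 0.01 + 0.25 = 2.09 \]

\[ SS_{\text{tot}} = \sum (y_i - 4.03)^2 = 66.321 \]

\[ R^2 = 1 - \frac{2.09}{66.321} \approx 0.9685 \]

This agrees with the value returned by `r_square`.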