# R Squared

R-squared (R²), also called the coefficient of determination, measures the proportion of the variability in the dependent variable that the independent variables explain.
For example, if R² = 0.6, we can say that 'x' explains 60% of the variation in 'y'.
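
For reference, the usual definition compares the model's squared errors with those of a baseline that always predicts the mean. The symbols below ($y_i$ for actual values, $\hat{y}_i$ for predictions, $\bar{y}$ for the mean of the actual values) are notation introduced here, not taken from the notebook:

$$
R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}}
    = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
$$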

However, one of its limitations is that, for a least-squares model evaluated on its training data, it never decreases when more variables are added to the model, even if those variables are not meaningful. This can give a false impression of model improvement.
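
A minimal sketch of this effect, assuming an ordinary least-squares fit with an intercept and R² measured on the training data. The synthetic data, the seed, and the helper name `train_r2` are illustrative choices, not part of the original notebook:

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, chosen arbitrarily
n = 50
x = rng.normal(size=n)              # a predictor that really drives y
y = 3.0 * x + rng.normal(size=n)    # target = signal + noise
noise = rng.normal(size=n)          # a predictor unrelated to y by construction


def train_r2(X, y):
    """Fit OLS with an intercept and return R-squared on the training data."""
    X1 = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)   # least-squares solution
    ss_res = np.sum((y - X1 @ coef) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot


r2_base = train_r2(x.reshape(-1, 1), y)
r2_with_noise = train_r2(np.column_stack([x, noise]), y)

print(f"R-squared with x only:      {r2_base:.4f}")
print(f"R-squared with x and noise: {r2_with_noise:.4f}")
```

The second value should be at least as large as the first, because the larger model can always reproduce the smaller one by setting the extra coefficient to zero.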

# Adjusted R Squared

Adjusted R-squared, on the other hand, accounts for the number of predictors in the model. It only increases if the new variable improves the model more than would be expected by chance, and it decreases if the added variable does not help enough to justify the extra parameter.
So, adjusted R-squared is especially useful when comparing models with different numbers of predictors, such as during feature selection. It helps guard against overfitting by penalizing unnecessary complexity.
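
The penalty can be sketched as a small helper built on the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. The function name `adjusted_r2` and the R² values compared below are hypothetical, chosen only to show the behaviour:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors."""
    if n - k - 1 <= 0:
        # The formula is undefined without spare degrees of freedom.
        raise ValueError("need n > k + 1 observations")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical comparison: a fourth predictor lifts R-squared only from
# 0.800 to 0.801 on 30 observations.
before = adjusted_r2(0.800, n=30, k=3)
after = adjusted_r2(0.801, n=30, k=4)

print(f"Adjusted R-squared with 3 predictors: {before:.4f}")
print(f"Adjusted R-squared with 4 predictors: {after:.4f}")
```

Here the plain R² rises slightly, but the adjusted value falls, which is the signal that the extra predictor is not pulling its weight.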

In [1]:
from sklearn.metrics import r2_score
import numpy as np

In [2]:
# Sample actual and predicted values
y_true = np.array([100, 150, 200, 250, 300])
y_pred = np.array([110, 140, 195, 260, 310])

# Calculate R-squared
r_squared = r2_score(y_true, y_pred)

# Manually compute Adjusted R-squared
n = len(y_true)         # number of observations
k = 2                   # number of predictors used in the model (example)
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R-squared: {r_squared:.4f}")
print(f"Adjusted R-squared: {adjusted_r_squared:.4f}")


R-squared: 0.9830
Adjusted R-squared: 0.9660
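
As a check, the printed values can be reproduced by hand from the sample arrays. Here $SS_{\text{res}}$ is the sum of squared differences between `y_true` and `y_pred`, and $SS_{\text{tot}}$ is the sum of squared deviations of `y_true` from its mean of 200 (this notation is introduced for the check and does not appear in the code):

$$
\begin{aligned}
SS_{\text{res}} &= 10^2 + 10^2 + 5^2 + 10^2 + 10^2 = 425 \\
SS_{\text{tot}} &= 100^2 + 50^2 + 0^2 + 50^2 + 100^2 = 25000 \\
R^2 &= 1 - \frac{425}{25000} = 0.983 \\
R^2_{\text{adj}} &= 1 - (1 - 0.983)\,\frac{5 - 1}{5 - 2 - 1} = 1 - 0.017 \times 2 = 0.966
\end{aligned}
$$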
