# Interpreting Coefficients and Adjusted R²

## Unstandardized vs Standardized Coefficients
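- Unstandardized coefficients are expressed in the original units of the variables: the expected change in the outcome for a one-unit increase in the predictor, holding the other predictors constant.
- Standardized coefficients are expressed in standard deviation units: the expected change in the outcome, in standard deviations, for a one standard deviation increase in the predictor.
- Standardized coefficients are useful when predictors are measured on different scales.
- They stay the same when a variable's units change (for example, pounds to kilograms).
- This allows a direct comparison of the effect sizes, and so the relative importance, of different predictors in the same model.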

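Example:

Consider age and weight as predictors of blood pressure (illustrative values).
- Unstandardized coefficients:
  - Age coefficient: 0.5 (each additional year is associated with a 0.5 mmHg increase in blood pressure, holding weight constant)
  - Weight coefficient: 0.2 (each additional pound is associated with a 0.2 mmHg increase in blood pressure, holding age constant)
- Standardized coefficients:
  - Age coefficient: 0.6 (a one standard deviation increase in age is associated with a 0.6 standard deviation increase in blood pressure, so age has the stronger effect)
  - Weight coefficient: 0.4 (a one standard deviation increase in weight is associated with a 0.4 standard deviation increase in blood pressure, so weight has the weaker effect)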
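
The link between the two kinds of coefficient can be sketched in a few lines. The block below is a minimal illustration, assuming the small age/weight/blood pressure sample used in the regression cell of this notebook rather than the illustrative values above. It rescales each unstandardized slope by `sd(x) / sd(y)`; the variable names are chosen for this sketch only.

```python
import numpy as np

# Sample data borrowed from the regression cell in this notebook.
age = np.array([25, 32, 47, 51, 62, 22, 37, 44, 57, 63], dtype=float)
weight = np.array([55, 60, 72, 80, 85, 58, 65, 70, 78, 90], dtype=float)
bp = np.array([120, 125, 135, 140, 155, 118, 130, 138, 150, 160], dtype=float)

# Ordinary least squares on the raw variables (intercept column first).
X = np.column_stack([np.ones_like(age), age, weight])
b, *_ = np.linalg.lstsq(X, bp, rcond=None)

# Rescale each slope: beta_j = b_j * sd(x_j) / sd(y).
beta_age = b[1] * age.std() / bp.std()
beta_weight = b[2] * weight.std() / bp.std()

print(f"unstandardized: age={b[1]:.4f}, weight={b[2]:.4f}")
print(f"standardized:   age={beta_age:.4f}, weight={beta_weight:.4f}")
```

Rescaling the slopes this way should give the same values as refitting the model on z-scored variables, as long as the same standard deviation convention is used for the predictors and the outcome.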


## p-value 

- Helps determine whether the relationship between an independent variable and the dependent variable is statistically significant.
- It is the probability of observing a coefficient at least this far from zero if the true coefficient were zero.
- A low p-value (typically < 0.05) indicates that the predictor is significantly associated with the outcome variable.
- A high p-value means the observed association could plausibly be due to random chance. It does not prove that the predictor has no effect.
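
As a rough sketch of where a coefficient's p-value comes from: the coefficient is divided by its standard error to get a t-statistic, which is then compared against a t distribution with n − k degrees of freedom (k counts the intercept). The block assumes the sample data from the regression cell of this notebook; the variable names and the use of `scipy.stats` are choices made for this illustration.

```python
import numpy as np
from scipy import stats

# Sample data borrowed from the regression cell in this notebook.
age = np.array([25, 32, 47, 51, 62, 22, 37, 44, 57, 63], dtype=float)
weight = np.array([55, 60, 72, 80, 85, 58, 65, 70, 78, 90], dtype=float)
bp = np.array([120, 125, 135, 140, 155, 118, 130, 138, 150, 160], dtype=float)

X = np.column_stack([np.ones_like(age), age, weight])
n, k = X.shape  # k = number of estimated coefficients, intercept included

# Least-squares estimates and residual variance.
b, *_ = np.linalg.lstsq(X, bp, rcond=None)
resid = bp - X @ b
sigma2 = resid @ resid / (n - k)

# Standard errors from the diagonal of sigma^2 * (X'X)^-1.
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

# Two-sided p-values from the t distribution.
t_stats = b / se
p_values = 2 * stats.t.sf(np.abs(t_stats), df=n - k)

for name, t, p in zip(["const", "Age", "Weight"], t_stats, p_values):
    print(f"{name:<7} t={t:7.3f}  p={p:.6f}")
```

These are the quantities that `model.pvalues` reports in statsmodels for an OLS fit.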

## R - Multiple correlation coefficient

- Measures the correlation between the dependent variable and all independent variables combined. Equivalently, it is the correlation between the observed values and the values predicted by the model.
- R ranges from 0 to 1, where 0 indicates no correlation and 1 indicates perfect correlation.
- A higher R value indicates a stronger relationship between the predictors and the outcome variable.
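
One way to see what R measures is to compute it as the Pearson correlation between the observed outcome and the model's fitted values. The sketch below assumes the sample data from the regression cell of this notebook and a model that includes an intercept; the variable names are illustrative.

```python
import numpy as np

# Sample data borrowed from the regression cell in this notebook.
age = np.array([25, 32, 47, 51, 62, 22, 37, 44, 57, 63], dtype=float)
weight = np.array([55, 60, 72, 80, 85, 58, 65, 70, 78, 90], dtype=float)
bp = np.array([120, 125, 135, 140, 155, 118, 130, 138, 150, 160], dtype=float)

# Fit the model and compute the fitted (predicted) values.
X = np.column_stack([np.ones_like(age), age, weight])
b, *_ = np.linalg.lstsq(X, bp, rcond=None)
fitted = X @ b

# Multiple R = correlation between observed and fitted values.
R = np.corrcoef(bp, fitted)[0, 1]
print(f"R = {R:.4f}")
```

For an OLS model with an intercept, this value agrees with the square root of R².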

## R² - Coefficient of determination

- Represents the proportion of variance in the dependent variable that can be explained by the independent variables.
- R² ranges from 0 to 1, where 0 means the model explains none of the variance and 1 means it explains all the variance.
- A higher R² value indicates a better fit of the model to the data.
- Example: an R² of 0.75 means that 75% of the variance in the dependent variable is explained by the independent variables.
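
The "proportion of variance explained" can be computed directly as 1 − SS_res / SS_tot. The helper `r_squared` below is a hypothetical name for this sketch, and the data is assumed to be the sample from the regression cell of this notebook.

```python
import numpy as np

def r_squared(y, fitted):
    """Proportion of the variance in y explained by the fitted values."""
    ss_res = np.sum((y - fitted) ** 2)      # variation left unexplained
    ss_tot = np.sum((y - np.mean(y)) ** 2)  # total variation around the mean
    return 1 - ss_res / ss_tot

# Sample data borrowed from the regression cell in this notebook.
age = np.array([25, 32, 47, 51, 62, 22, 37, 44, 57, 63], dtype=float)
weight = np.array([55, 60, 72, 80, 85, 58, 65, 70, 78, 90], dtype=float)
bp = np.array([120, 125, 135, 140, 155, 118, 130, 138, 150, 160], dtype=float)

X = np.column_stack([np.ones_like(age), age, weight])
b, *_ = np.linalg.lstsq(X, bp, rcond=None)

r2 = r_squared(bp, X @ b)
print(f"R² = {r2:.4f}")
```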

## Adjusted R²

- Adjusted R² is a modified version of R² that takes into account the number of predictors in the model.
- R² never decreases when a predictor is added, so with multiple predictors it can rise even if the new predictors do not genuinely improve the model.
- Adjusted R² penalizes the addition of irrelevant predictors, providing a more accurate measure of model fit.
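
The penalty can be written as Adjusted R² = 1 − (1 − R²)(n − 1) / (n − p − 1), where n is the number of observations and p the number of predictors (intercept excluded). The function name `adjusted_r2` is a hypothetical helper for this sketch; the inputs are the R² and sample size from the regression in this notebook.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R² for n observations and p predictors (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

r2 = 0.9665328450472042  # R² reported in this notebook's output
adj = adjusted_r2(r2, n=10, p=2)
print(f"Adjusted R² = {adj:.6f}")

# With the same R² but more predictors, the adjusted value drops.
print(f"With p=5:     {adjusted_r2(r2, n=10, p=5):.6f}")
```

The first value matches the `model.rsquared_adj` figure in the output of this notebook.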

```python
# Python code to find these parameters using scikit-learn and statsmodels
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
import statsmodels.api as sm

# ----------------------------
# SAMPLE DATA
# ----------------------------
data = {
    'Age': [25, 32, 47, 51, 62, 22, 37, 44, 57, 63],
    'Weight': [55, 60, 72, 80, 85, 58, 65, 70, 78, 90],
    'BloodPressure': [120, 125, 135, 140, 155, 118, 130, 138, 150, 160]
}

df = pd.DataFrame(data)

# ----------------------------
# UNSTANDARDIZED REGRESSION
# ----------------------------
X = df[['Age', 'Weight']]
y = df['BloodPressure']

# statsmodels does not add an intercept automatically
X_const = sm.add_constant(X)
model = sm.OLS(y, X_const).fit()

print("\n--- Unstandardized Coefficients ---")
print(model.params)

print("\n--- p-values ---")
print(model.pvalues)

# R (multiple correlation coefficient)
R = np.sqrt(model.rsquared)
print("\nR:", R)

# R²
print("R²:", model.rsquared)

# Adjusted R²
print("Adjusted R²:", model.rsquared_adj)


# ----------------------------
# STANDARDIZED COEFFICIENTS
# ----------------------------
scaler = StandardScaler()

# Convert predictors and outcome to z-scores (mean 0, standard deviation 1)
X_scaled = scaler.fit_transform(X)
y_scaled = scaler.fit_transform(y.values.reshape(-1, 1))

X_scaled_const = sm.add_constant(X_scaled)
model_std = sm.OLS(y_scaled, X_scaled_const).fit()

# Order of the printed array: intercept (about 0 after centering), Age, Weight
print("\n--- Standardized Coefficients (Beta coefficients) ---")
print(model_std.params)
```



Output:

```
--- Unstandardized Coefficients ---
const     83.958182
Age        0.733211
Weight     0.292855
dtype: float64

--- p-values ---
const     0.000583
Age       0.046590
Weight    0.462203
dtype: float64

R: 0.9831240232275906
R²: 0.9665328450472042
Adjusted R²: 0.9569708007749768

--- Standardized Coefficients (Beta coefficients) ---
[3.33066907e-16 7.46954567e-01 2.40779552e-01]
```
