# R-squared (R²) Calculation

## Introduction

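**R-squared (R²)**, also known as the coefficient of determination, measures how well predicted values approximate the actual values. It indicates the proportion of the variance in the dependent variable that is predictable from the independent variable(s). A score of 1 means the predictions match the actual values exactly, a score of 0 means the model does no better than always predicting the mean, and negative scores are possible when it does worse.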

This notebook demonstrates:
1. Manual implementation of the R² calculation.
2. Testing the implementation with a sample dataset.
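
As a quick illustration of how to read the score, the sketch below uses a compact helper (`r2_sketch`, a hypothetical name that is not part of this notebook) to check three reference cases: perfect predictions, always predicting the mean, and predictions that do worse than the mean. The sample values are made up for illustration.

```python
def r2_sketch(y_true, y_pred):
    """Compact R² helper for illustration only (hypothetical name)."""
    mean_y = sum(y_true) / len(y_true)
    tss = sum((y - mean_y) ** 2 for y in y_true)
    rss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1 - rss / tss


actual = [1, 2, 3, 4, 5]  # illustrative values; their mean is 3

perfect = r2_sketch(actual, [1, 2, 3, 4, 5])    # RSS is 0
mean_only = r2_sketch(actual, [3, 3, 3, 3, 3])  # RSS equals TSS
worse = r2_sketch(actual, [5, 4, 3, 2, 1])      # RSS exceeds TSS

print(perfect, mean_only, worse)  # prints: 1.0 0.0 -3.0
```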

---

## Function Definition

The `r2_score` function:
- Takes two lists: `y_true` (actual values) and `y_pred` (predicted values).
- Calculates the R² score by:
  - Computing the **Total Sum of Squares (TSS)**.
  - Computing the **Residual Sum of Squares (RSS)**.
  - Substituting these values into the formula: \( R^2 = 1 - \frac{\text{RSS}}{\text{TSS}} \).
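
For reference, the two sums can be written out as follows, using \( y_i \) for the actual values, \( \hat{y}_i \) for the predicted values, \( \bar{y} \) for the mean of the actual values, and \( n \) for the number of observations (this symbol notation is introduced here for illustration and does not appear elsewhere in the notebook):

\[
\text{TSS} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad
\text{RSS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
\]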

The function validates that both lists have the same length and raises a `ValueError` if they differ.

---

In [1]:
def r2_score(y_true, y_pred):
    """
    Calculate R-squared (R²) manually.

    Parameters:
    y_true (list): Actual values
    y_pred (list): Predicted values

    Returns:
    float: R-squared value
    """
    # Ensure y_true and y_pred are of the same length
    if len(y_true) != len(y_pred):
        raise ValueError("The length of y_true and y_pred must be the same.")
    
    # Calculate mean of the actual values
    mean_y_true = sum(y_true) / len(y_true)
    
    # Calculate total sum of squares (TSS)
    tss = 0
    for y in y_true:
        tss += (y - mean_y_true) ** 2
    
    # Calculate residual sum of squares (RSS)
    rss = 0
    for actual, predicted in zip(y_true, y_pred):
        rss += (actual - predicted) ** 2
    
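    # R² is undefined when y_true is constant, because TSS is then zero
    if tss == 0:
        raise ValueError("R² is undefined when all y_true values are identical.")

    # Calculate R-squared
    r2 = 1 - (rss / tss)
    return r2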
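
The length check can be exercised directly. The sketch below repeats a condensed copy of `r2_score` so that it runs on its own, then calls it with lists of different lengths. The input values are made up for illustration.

```python
def r2_score(y_true, y_pred):
    """Condensed copy of the notebook's function, so this example runs standalone."""
    if len(y_true) != len(y_pred):
        raise ValueError("The length of y_true and y_pred must be the same.")
    mean_y_true = sum(y_true) / len(y_true)
    tss = sum((y - mean_y_true) ** 2 for y in y_true)
    rss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1 - (rss / tss)


# Three actual values but only two predictions (illustrative values)
try:
    r2_score([10, 20, 30], [12, 18])
    message = None
except ValueError as error:
    message = str(error)

print(message)  # prints: The length of y_true and y_pred must be the same.
```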

In [2]:
# Example dataset
y_true = [10, 20, 30, 40, 50]
y_pred = [12, 18, 35, 38, 55]

# Calculate R²
r2 = r2_score(y_true, y_pred)
print(f"R-squared (R²) Score: {r2}")

R-squared (R²) Score: 0.938
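
The reported score can be checked by hand. The mean of `y_true` is 30, so the deviations from the mean are -20, -10, 0, 10, 20 and the residuals `y_true - y_pred` are -2, 2, -5, 2, -5:

\[
\begin{aligned}
\text{TSS} &= (-20)^2 + (-10)^2 + 0^2 + 10^2 + 20^2 = 1000 \\
\text{RSS} &= (-2)^2 + 2^2 + (-5)^2 + 2^2 + (-5)^2 = 62 \\
R^2 &= 1 - \frac{62}{1000} = 0.938
\end{aligned}
\]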
