# Assignment 1 - Part 1: Frisch-Waugh-Lovell (FWL) Theorem
## Math (3 points)

This notebook contains the mathematical proof and numerical verification of the Frisch-Waugh-Lovell theorem implemented in Julia.

The FWL theorem is a fundamental result in econometrics that shows how to isolate the effect of specific variables by "partialling out" the effects of other variables.

Julia's `LinearAlgebra` standard library dispatches dense matrix operations to BLAS/LAPACK, which makes it a convenient language for implementing and checking matrix-based econometric results such as this one.

## Load Required Packages

In [None]:
using LinearAlgebra
using Random
using Printf
using Statistics
using Plots

# Set default plot backend
gr()

## Mathematical Proof of the FWL Theorem

The FWL theorem states that the OLS estimate of β₁ in the regression of y on [X₁ X₂] equals the estimate obtained from the following residual-on-residual procedure (assuming [X₁ X₂] has full column rank):

1. Regress y on X₂ and obtain the residuals ỹ = M_{X₂}y, where M_{X₂} = I - X₂(X₂'X₂)⁻¹X₂'
2. Regress X₁ on X₂ and obtain the residuals X̃₁ = M_{X₂}X₁
3. Regress ỹ on X̃₁. The resulting coefficient vector equals β̂₁ from the full regression.

Formally, we need to show that: β̂₁ = (X̃₁'X̃₁)⁻¹X̃₁'ỹ
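
As a cross-check on the partitioned-inverse argument used in the proof cell, the same result can be sketched directly from the two blocks of the normal equations. This is an alternative route added for intuition, not the derivation the notebook relies on; it assumes X₂'X₂ and X₁'M_{X₂}X₁ are invertible.

```latex
\begin{aligned}
X_1'X_1\hat\beta_1 + X_1'X_2\hat\beta_2 &= X_1'y \\
X_2'X_1\hat\beta_1 + X_2'X_2\hat\beta_2 &= X_2'y \\
\Rightarrow\quad \hat\beta_2 &= (X_2'X_2)^{-1}X_2'\,(y - X_1\hat\beta_1) \\
\Rightarrow\quad X_1'M_{X_2}X_1\,\hat\beta_1 &= X_1'M_{X_2}\,y \\
\Rightarrow\quad \hat\beta_1 &= (X_1'M_{X_2}X_1)^{-1}X_1'M_{X_2}\,y
  = (\tilde X_1'\tilde X_1)^{-1}\tilde X_1'\tilde y
\end{aligned}
```

The fourth line follows by substituting the expression for β̂₂ into the first equation and collecting terms in I - X₂(X₂'X₂)⁻¹X₂'. The last equality uses the symmetry and idempotency of M_{X₂}.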

In [None]:
"""
Print a step-by-step proof of the Frisch-Waugh-Lovell theorem.
"""
function fwl_theorem_proof()
    println("=== FRISCH-WAUGH-LOVELL THEOREM PROOF ===\n")
    
    println("Mathematical Proof:")
    println("==================")
    println()
    println("Consider the linear regression model:")
    println("y = X₁β₁ + X₂β₂ + u")
    println()
    println("Where:")
    println("- y is an n×1 vector of outcomes")
    println("- X₁ is an n×k₁ matrix of regressors of interest")
    println("- X₂ is an n×k₂ matrix of control variables")
    println("- u is an n×1 vector of errors")
    println()
    
    println("Step 1: Full regression")
    println("The full regression in matrix form is:")
    println("y = [X₁ X₂][β₁; β₂] + u = Xβ + u")
    println()
    println("The OLS estimator is:")
    println("β̂ = (X'X)⁻¹X'y")
    println()
    println("Partitioning X'X and X'y:")
    println("X'X = [X₁'X₁  X₁'X₂]")
    println("      [X₂'X₁  X₂'X₂]")
    println()
    println("X'y = [X₁'y]")
    println("      [X₂'y]")
    println()
    
    println("Step 2: Using the partitioned inverse formula")
    println("For a partitioned matrix [A B; C D] with D and S = A - BD⁻¹C invertible,")
    println("the first block row of the inverse is [S⁻¹  -S⁻¹BD⁻¹].")
    println()
    println("Applying this with A = X₁'X₁, B = X₁'X₂, C = X₂'X₁, D = X₂'X₂:")
    println("β̂₁ = S⁻¹X₁'y - S⁻¹BD⁻¹X₂'y")
    println("   = [(X₁'X₁ - X₁'X₂(X₂'X₂)⁻¹X₂'X₁)]⁻¹[X₁'y - X₁'X₂(X₂'X₂)⁻¹X₂'y]")
    println()
    
    println("Step 3: Factoring out the projection matrix")
    println("Let M_{X₂} = I - X₂(X₂'X₂)⁻¹X₂' (the annihilator matrix)")
    println("Note that M_{X₂} is idempotent: M_{X₂}M_{X₂} = M_{X₂}")
    println("And symmetric: M_{X₂}' = M_{X₂}")
    println()
    println("Then:")
    println("X₁'X₁ - X₁'X₂(X₂'X₂)⁻¹X₂'X₁ = X₁'[I - X₂(X₂'X₂)⁻¹X₂']X₁ = X₁'M_{X₂}X₁")
    println("X₁'y - X₁'X₂(X₂'X₂)⁻¹X₂'y = X₁'[I - X₂(X₂'X₂)⁻¹X₂']y = X₁'M_{X₂}y")
    println()
    
    println("Step 4: Final form")
    println("Therefore:")
    println("β̂₁ = (X₁'M_{X₂}X₁)⁻¹X₁'M_{X₂}y")
    println()
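    println("Let X̃₁ = M_{X₂}X₁ and ỹ = M_{X₂}y")
    println("By symmetry and idempotency, X̃₁'X̃₁ = X₁'M_{X₂}'M_{X₂}X₁ = X₁'M_{X₂}X₁")
    println("and X̃₁'ỹ = X₁'M_{X₂}'M_{X₂}y = X₁'M_{X₂}y")
    println("Then: β̂₁ = (X̃₁'X̃₁)⁻¹X̃₁'ỹ")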
    println()
    println("This shows that β̂₁ from the full regression equals the OLS coefficient")
    println("from regressing the residuals ỹ on the residuals X̃₁.")
    println()
    println("Q.E.D.")
    println()
end

# Display the mathematical proof
fwl_theorem_proof()

## Numerical Verification

Now let's verify the FWL theorem numerically using simulated data. We'll generate data with known parameters and compare the results from:
1. Full regression: y ~ [X₁ X₂]
2. FWL two-step procedure: residuals of y on residuals of X₁
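
The simulation below draws X₁ and X₂ independently, so omitting X₂ would barely move the estimate of β₁. The sketch that follows uses a deliberately correlated pair of regressors to show where the theorem matters: the short regression of y on x₁ alone drifts away from the full-regression coefficient, while the residual-on-residual estimate still matches it. All names (`n_demo`, `x1`, `x2`, `y_demo`, `resid_on_x2`), the seed, and the coefficients are illustrative assumptions, not part of the assignment.

```julia
using LinearAlgebra, Random

Random.seed!(123)                           # illustrative seed, not from the notebook
n_demo = 500
x2 = randn(n_demo)                          # control variable
x1 = 0.8 .* x2 .+ 0.6 .* randn(n_demo)      # regressor of interest, correlated with x2
y_demo = 1.0 .* x1 .+ 2.0 .* x2 .+ randn(n_demo)

# Full regression of y on [x1 x2]; the first coefficient is the estimate of β₁
b_full = hcat(x1, x2) \ y_demo

# Short regression of y on x1 alone (omits the control)
b_short = dot(x1, y_demo) / dot(x1, x1)

# FWL: residualize y and x1 on x2, then regress residual on residual
resid_on_x2(v) = v .- x2 .* (dot(x2, v) / dot(x2, x2))
y_t, x1_t = resid_on_x2(y_demo), resid_on_x2(x1)
b_fwl = dot(x1_t, y_t) / dot(x1_t, x1_t)

println("full:  ", b_full[1])
println("FWL:   ", b_fwl)
println("short: ", b_short)
```

The gap between `b_short` and `b_full[1]` is the usual omitted-variable effect; FWL removes it by construction.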

In [None]:
"""
Simulate data for the numerical verification of the FWL theorem and
return the design matrices, outcome, dimensions, and true parameters.
"""
function numerical_verification()
    println("=== NUMERICAL VERIFICATION ===\n")
    
    # Set random seed for reproducibility
    Random.seed!(42)
    
    # Generate data
    n = 1000  # Sample size
    k1 = 2    # Number of variables of interest
    k2 = 3    # Number of control variables
    
    # Generate X1, X2, and error term
    X1 = randn(n, k1)
    X2 = randn(n, k2)
    u = randn(n, 1)
    
    # True parameters
    beta1_true = [1.5; 2.0]
    beta2_true = [0.5; -1.0; 0.8]
    
    # Generate y
    y = X1 * beta1_true + X2 * beta2_true + u
    
    @printf("Sample size: %d\n", n)
    @printf("X1 dimensions: (%d, %d) (variables of interest)\n", size(X1)...)
    @printf("X2 dimensions: (%d, %d) (control variables)\n", size(X2)...)
    @printf("True β₁: [%.1f, %.1f]\n", beta1_true...)
    @printf("True β₂: [%.1f, %.1f, %.1f]\n", beta2_true...)
    println()
    
    return X1, X2, y, n, k1, k2, beta1_true, beta2_true
end

# Generate the data
X1, X2, y, n, k1, k2, beta1_true, beta2_true = numerical_verification();

### Method 1: Full Regression

First, let's estimate the full regression model with all variables using Julia's efficient linear algebra operations.
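
The cell below solves the normal equations (X'X)β = X'y. A common alternative in Julia is to apply `\` to the rectangular design matrix directly, which solves the least-squares problem through a QR factorization and avoids squaring the condition number of X. The sketch compares the two on stand-in data; `A_demo`, `b_demo`, and the seed are illustrative assumptions.

```julia
using LinearAlgebra, Random

Random.seed!(7)                                   # illustrative seed
A_demo = randn(200, 5)                            # stand-in for X_full
b_demo = A_demo * [1.5, 2.0, 0.5, -1.0, 0.8] + randn(200)

beta_normal = (A_demo' * A_demo) \ (A_demo' * b_demo)   # normal equations
beta_qr     = A_demo \ b_demo                           # least squares via QR

println(maximum(abs.(beta_normal - beta_qr)))
```

For a well-conditioned design like this one the two agree to rounding error; the QR route is mainly a safeguard when regressors are nearly collinear.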

In [None]:
# Method 1: Full regression
X_full = hcat(X1, X2)
beta_full = (X_full' * X_full) \ (X_full' * y)
beta1_full = beta_full[1:k1]

println("Method 1: Full regression")
@printf("β̂₁ from full regression: [%.6f, %.6f]\n", beta1_full...)
println()

# Display the full coefficient vector
println("All coefficients from full regression:")
for i in 1:length(beta_full)
    var_name = i <= k1 ? "β₁[$i]" : "β₂[$(i-k1)]"
    @printf("%s: %.6f\n", var_name, beta_full[i])
end
println()

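# Calculate R-squared (uses the centered total sum of squares; the model has no intercept,
# so treat the value as indicative only)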
y_pred_full = X_full * beta_full
sst = sum((y .- mean(y)).^2)
sse = sum((y .- y_pred_full).^2)
r_squared = 1 - sse/sst

@printf("R-squared: %.4f\n", r_squared)
println()

### Method 2: FWL Two-Step Procedure

Now let's implement the FWL two-step procedure:
1. Residualize y and X₁ with respect to X₂
2. Regress the residualized y on the residualized X₁
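
The cell below forms the n×n matrices P_X2 and M_X2 explicitly, which is useful for checking their properties. When only the residuals are needed, they can be computed as A - X₂(X₂ \ A) without ever building an n×n matrix. The sketch shows that both routes give the same residuals; `residualize`, `Z_demo`, and `V_demo` are hypothetical names introduced here.

```julia
using LinearAlgebra, Random

# Hypothetical helper: residuals from regressing each column of A on Z,
# computed without materializing the n×n annihilator matrix.
residualize(A, Z) = A - Z * (Z \ A)

Random.seed!(11)                        # illustrative seed
Z_demo = randn(300, 3)                  # stand-in for X₂
V_demo = randn(300, 2)                  # stand-in for X₁ (or y)

V_tilde = residualize(V_demo, Z_demo)

# Reference computation with the explicit annihilator matrix
M_Z = I - Z_demo * ((Z_demo' * Z_demo) \ Z_demo')
println(maximum(abs.(V_tilde - M_Z * V_demo)))   # difference between the two routes
println(maximum(abs.(Z_demo' * V_tilde)))        # residuals are orthogonal to Z
```

The helper needs O(n·k₂) memory instead of O(n²), which becomes relevant once n grows beyond a few thousand.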

In [None]:
# Method 2: FWL two-step procedure

# Step 1: Regress y on X2 and get residuals
P_X2 = X2 * ((X2' * X2) \ X2')  # projection onto col(X2); solves instead of forming an explicit inverse
M_X2 = I - P_X2
y_tilde = M_X2 * y

# Step 2: Regress X1 on X2 and get residuals
X1_tilde = M_X2 * X1

# Step 3: Regress y_tilde on X1_tilde
beta1_fwl = (X1_tilde' * X1_tilde) \ (X1_tilde' * y_tilde)

println("Method 2: FWL two-step procedure")
println("Step 1: Residualize y on X₂")
println("Step 2: Residualize X₁ on X₂")
println("Step 3: Regress residuals")
@printf("β̂₁ from FWL method: [%.6f, %.6f]\n", beta1_fwl...)
println()

# Show some properties of the projection matrices
println("Properties of projection matrices:")
@printf("Rank of P_X2: %d (should equal k2 = %d)\n", rank(P_X2), k2)
@printf("Rank of M_X2: %d (should equal n - k2 = %d)\n", rank(M_X2), n - k2)
@printf("Trace of P_X2: %.0f (should equal k2 = %d)\n", tr(P_X2), k2)
@printf("Trace of M_X2: %.0f (should equal n - k2 = %d)\n", tr(M_X2), n - k2)
println()

# Check idempotency
idempotent_P = maximum(abs.(P_X2 * P_X2 - P_X2))
idempotent_M = maximum(abs.(M_X2 * M_X2 - M_X2))
@printf("P_X2 idempotency check (max |P²-P|): %.2e\n", idempotent_P)
@printf("M_X2 idempotency check (max |M²-M|): %.2e\n", idempotent_M)
println()

### Comparison and Verification

Let's check if both methods produce identical results (within numerical precision).
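
Floating-point results from two algebraically equivalent computations rarely match bit for bit, so the comparison has to use a tolerance. The cell below uses a hand-written threshold of 1e-10; Julia's built-in `isapprox` expresses the same idea. A small sketch with made-up vectors (`est_a` and `est_b` are illustrative, not the notebook's estimates):

```julia
est_a = [1.5, 2.0]
est_b = est_a .+ 1e-13                        # same values up to rounding-sized noise

println(est_a == est_b)                       # exact comparison
println(isapprox(est_a, est_b))               # default relative tolerance
println(isapprox(est_a, est_b; atol=1e-10))   # explicit absolute tolerance
```

An exact `==` test would reject the pair even though the discrepancy is far below anything econometrically meaningful.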

In [None]:
# Check if they are equal (within numerical precision)
difference = abs.(beta1_full - beta1_fwl)
max_diff = maximum(difference)

println("Verification:")
@printf("Maximum absolute difference: %.2e\n", max_diff)
@printf("Are they equal (within 1e-10)? %s\n", max_diff < 1e-10)
println()

# Show element-wise differences
println("Element-wise differences:")
for i in 1:k1
    @printf("β₁[%d]: Full = %.8f, FWL = %.8f, Diff = %.2e\n", 
            i, beta1_full[i], beta1_fwl[i], difference[i])
end
println()

# Store results for summary
results = Dict(
    "beta1_full" => beta1_full,
    "beta1_fwl" => beta1_fwl,
    "max_difference" => max_diff,
    "r_squared" => r_squared
);

### Visualization

Let's create some plots to visualize the relationship between the original and residualized variables using Julia's Plots.jl.
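
One caveat when reading the residualized scatter plots: with two columns in X₁, the bivariate slope of ỹ against a single column of X̃₁ is generally not exactly the corresponding full-regression coefficient, because the other column of X₁ has not been partialled out. To obtain an added-variable plot whose slope equals the coefficient, residualize on all remaining regressors. The sketch below checks this on stand-in data; `W1`, `W2`, `resp`, and the seed are illustrative assumptions.

```julia
using LinearAlgebra, Random

Random.seed!(5)                          # illustrative seed
W1 = randn(400, 2)                       # stand-in for X₁
W2 = randn(400, 3)                       # stand-in for X₂
resp = W1 * [1.5, 2.0] + W2 * [0.5, -1.0, 0.8] + randn(400)

coef_full = hcat(W1, W2) \ resp          # full-regression coefficients

# Partial out everything except the first column of W1
others = hcat(W1[:, 2], W2)
rx = W1[:, 1] - others * (others \ W1[:, 1])
ry = resp - others * (others \ resp)

slope = dot(rx, ry) / dot(rx, rx)        # slope of the added-variable scatter
println(slope, " vs ", coef_full[1])
```

Plotting `rx` against `ry` would give the added-variable plot for the first regressor.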

In [None]:
# Create visualizations
p1 = scatter(X1[:, 1], y[:, 1], alpha=0.6, color=:blue, 
            xlabel="X1[,1]", ylabel="y", 
            title="Original Data: y vs X1[,1]",
            markersize=2, legend=false)

p2 = scatter(X1_tilde[:, 1], y_tilde[:, 1], alpha=0.6, color=:green,
            xlabel="X1_tilde[,1]", ylabel="y_tilde", 
            title="Residualized Data: y_tilde vs X1_tilde[,1]",
            markersize=2, legend=false)

p3 = scatter(X1[:, 2], y[:, 1], alpha=0.6, color=:blue,
            xlabel="X1[,2]", ylabel="y", 
            title="Original Data: y vs X1[,2]",
            markersize=2, legend=false)

p4 = scatter(X1_tilde[:, 2], y_tilde[:, 1], alpha=0.6, color=:green,
            xlabel="X1_tilde[,2]", ylabel="y_tilde", 
            title="Residualized Data: y_tilde vs X1_tilde[,2]",
            markersize=2, legend=false)

# Combine plots
plot(p1, p2, p3, p4, layout=(2,2), size=(800, 600))

### Matrix Properties Verification

Let's verify some important properties of projection matrices in Julia.
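
For reference, the values these checks should report follow from idempotency and the cyclic property of the trace. This is a short sketch assuming X₂ has full column rank k₂:

```latex
\begin{aligned}
P_{X_2}^2 = P_{X_2} &\;\Rightarrow\; \lambda^2 = \lambda \;\Rightarrow\; \lambda \in \{0, 1\} \\
\operatorname{tr}(P_{X_2}) &= \operatorname{tr}\!\big(X_2 (X_2'X_2)^{-1} X_2'\big)
  = \operatorname{tr}\!\big((X_2'X_2)^{-1} X_2'X_2\big)
  = \operatorname{tr}(I_{k_2}) = k_2 \\
\operatorname{tr}(M_{X_2}) &= \operatorname{tr}(I_n) - \operatorname{tr}(P_{X_2}) = n - k_2
\end{aligned}
```

Because a symmetric idempotent matrix has only eigenvalues 0 and 1, its rank equals its trace, which is why the rank and trace checks report the same numbers.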

In [None]:
# Verify projection matrix properties
println("=== PROJECTION MATRIX PROPERTIES ===")
println()

# 1. Symmetry
P_symmetric = maximum(abs.(P_X2 - P_X2'))
M_symmetric = maximum(abs.(M_X2 - M_X2'))
@printf("P_X2 symmetry check (max |P-P'|): %.2e\n", P_symmetric)
@printf("M_X2 symmetry check (max |M-M'|): %.2e\n", M_symmetric)

# 2. Idempotency (already checked above)
@printf("P_X2 idempotency check (max |P²-P|): %.2e\n", idempotent_P)
@printf("M_X2 idempotency check (max |M²-M|): %.2e\n", idempotent_M)

# 3. Complementarity
complementary = maximum(abs.(P_X2 + M_X2 - I))
@printf("Complementarity check (max |P+M-I|): %.2e\n", complementary)

# 4. Orthogonality
orthogonal = maximum(abs.(P_X2 * M_X2))
@printf("Orthogonality check (max |PM|): %.2e\n", orthogonal)

# 5. Eigenvalue properties
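# Symmetric(...) guards against rounding-level asymmetry and guarantees real eigenvalues
P_eigenvals = eigvals(Symmetric(P_X2))
M_eigenvals = eigvals(Symmetric(M_X2))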

println("\nEigenvalue properties:")
@printf("P_X2 eigenvalues close to 0 or 1: %s\n", 
        all(abs.(P_eigenvals) .< 1e-10 .|| abs.(P_eigenvals .- 1) .< 1e-10))
@printf("M_X2 eigenvalues close to 0 or 1: %s\n", 
        all(abs.(M_eigenvals) .< 1e-10 .|| abs.(M_eigenvals .- 1) .< 1e-10))

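# Report success only if every check above actually passed the tolerance
props_ok = max(P_symmetric, M_symmetric, idempotent_P, idempotent_M, complementary, orthogonal) < 1e-10
println(props_ok ? "\n✅ All projection matrix properties verified!" :
                   "\n❌ Some projection matrix checks exceeded the 1e-10 tolerance.")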

### Summary of Results

Let's create a comprehensive summary table of all our results.

In [None]:
# Create results summary
println("\n=== RESULTS SUMMARY ===")
println()

println("Method Comparison:")
println("------------------")
@printf("%-25s %12s %12s\n", "Method", "β₁[1]", "β₁[2]")
println("-" ^ 49)
@printf("%-25s %12.6f %12.6f\n", "True Values", beta1_true...)
@printf("%-25s %12.6f %12.6f\n", "Full Regression", beta1_full...)
@printf("%-25s %12.6f %12.6f\n", "FWL Method", beta1_fwl...)

println("\nVerification Statistics:")
println("------------------------")
@printf("Maximum difference: %.2e\n", results["max_difference"])
@printf("Methods match (within 1e-10): %s\n", results["max_difference"] < 1e-10)
@printf("R-squared: %.4f\n", results["r_squared"])

println("\nMatrix Dimensions:")
println("------------------")
@printf("Sample size (n): %d\n", n)
@printf("Variables of interest (k1): %d\n", k1)
@printf("Control variables (k2): %d\n", k2)
@printf("X1 dimensions: %s\n", size(X1))
@printf("X2 dimensions: %s\n", size(X2))
@printf("y dimensions: %s\n", size(y))

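if results["max_difference"] < 1e-10
    println("\n✅ FWL Theorem verification SUCCESSFUL!")
    println("The full regression and FWL two-step procedure produce identical results (within 1e-10).")
else
    println("\n❌ FWL Theorem verification FAILED: the estimates differ by more than 1e-10.")
end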

### Performance Comparison

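Let's compare the computational cost of both methods using Julia's `@time` macro. Note that the first execution of each block includes just-in-time compilation time, so run the cell twice (or use `@btime` from BenchmarkTools.jl) before comparing the numbers.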
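
One way to keep compilation out of the measurement with only the standard library is to wrap each estimator in a function, call it once as a warm-up, and time a second call with `@elapsed`. The sketch below does this on stand-in data; `full_beta1`, `fwl_beta1`, `Xa`, `Xb`, `yy`, and the seed are hypothetical names and values, and the FWL variant here residualizes without forming M_X2, so its timing is not comparable to the cell below.

```julia
using LinearAlgebra, Random

# Hypothetical helpers wrapping the two estimators so they are compiled once
full_beta1(y, X1, X2) = (hcat(X1, X2) \ y)[1:size(X1, 2)]
function fwl_beta1(y, X1, X2)
    y_t  = y  - X2 * (X2 \ y)       # residualize without forming the n×n matrix
    X1_t = X1 - X2 * (X2 \ X1)
    return X1_t \ y_t
end

Random.seed!(3)                     # illustrative seed
Xa, Xb = randn(1000, 2), randn(1000, 3)
yy = Xa * [1.5, 2.0] + Xb * [0.5, -1.0, 0.8] + randn(1000)

# Warm-up calls trigger compilation; their timings are discarded
full_beta1(yy, Xa, Xb); fwl_beta1(yy, Xa, Xb)

t_full = @elapsed full_beta1(yy, Xa, Xb)
t_fwl  = @elapsed fwl_beta1(yy, Xa, Xb)
println("full: ", t_full, " s, FWL: ", t_fwl, " s")
```

Single timings at this problem size are noisy, so repeated runs give a more trustworthy picture.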

In [None]:
# Performance comparison
println("=== PERFORMANCE COMPARISON ===")
println()

println("Timing full regression method:")
@time begin
    X_full_perf = hcat(X1, X2)
    beta_full_perf = (X_full_perf' * X_full_perf) \ (X_full_perf' * y)
    beta1_full_perf = beta_full_perf[1:k1]
end

println("\nTiming FWL method:")
@time begin
    P_X2_perf = X2 * inv(X2' * X2) * X2'
    M_X2_perf = I - P_X2_perf
    y_tilde_perf = M_X2_perf * y
    X1_tilde_perf = M_X2_perf * X1
    beta1_fwl_perf = (X1_tilde_perf' * X1_tilde_perf) \ (X1_tilde_perf' * y_tilde_perf)
end

println("\nNote: as implemented here, the FWL method is more expensive than the full regression")
println("because it builds the dense n×n matrix M_X2. It pays off when X₂ can be partialled out")
println("cheaply (e.g. demeaning for fixed effects) or when the residualized variables are reused.")

## Conclusion

We have successfully:

1. **Provided a complete mathematical proof** of the Frisch-Waugh-Lovell theorem using partitioned matrix algebra
2. **Numerically verified** the theorem using simulated data with Julia's efficient linear algebra operations
3. **Demonstrated** that both the full regression and the FWL two-step procedure produce identical estimates (within machine precision)
4. **Verified projection matrix properties** including symmetry, idempotency, and complementarity
5. **Visualized** the relationship between original and residualized variables
6. **Compared computational performance** of both approaches

### Julia-Specific Advantages:
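- **Performance**: Just-in-time compilation gives compiled-code speed for numerical loops, and dense linear algebra is dispatched to BLAS/LAPACK
- **Syntax**: Unicode identifiers and operators such as `\` and `'` keep the code close to the matrix notation of the proof
- **Linear Algebra**: The `LinearAlgebra` standard library provides the solvers, `rank`, `tr`, `eigvals`, and the identity `I` used above
- **Memory Efficiency**: `I` is a lazy `UniformScaling` object, although `M_X2` itself is still a dense n×n matrix in this notebook
- **Ecosystem**: Packages such as GLM.jl and FixedEffectModels.jl cover standard regression workflows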

The FWL theorem is a powerful tool in econometrics that allows us to:
- Isolate the effect of specific variables by "partialling out" control variables
- Understand the mechanics of multiple regression
- Implement efficient computational methods for large datasets
- Gain intuition about what multiple regression coefficients actually measure
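
A familiar special case illustrates the points above: when X₂ is a single column of ones, residualizing on X₂ is just demeaning, so the slope from a regression with an intercept equals the slope from regressing demeaned y on demeaned x. The sketch below checks this; `x_obs`, `z_obs`, and the seed are illustrative assumptions.

```julia
using LinearAlgebra, Statistics, Random

Random.seed!(9)                                  # illustrative seed
x_obs = randn(250)
z_obs = 3.0 .+ 2.0 .* x_obs .+ randn(250)        # outcome with a nonzero intercept

# Regression with an explicit intercept column
coef_int = hcat(ones(250), x_obs) \ z_obs        # [intercept, slope]

# FWL with X₂ = column of ones: residualizing on a constant is demeaning
xd = x_obs .- mean(x_obs)
zd = z_obs .- mean(z_obs)
slope_demeaned = dot(xd, zd) / dot(xd, xd)

println(coef_int[2], " vs ", slope_demeaned)
```

The same idea underlies the within transformation used for fixed-effects models, where group-wise demeaning replaces a large block of dummy variables.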

### Key Insights:
- The projection matrix M_{X₂} removes the linear association with control variables
- The residualized variables contain only the variation orthogonal to the controls
- The FWL coefficient captures the relationship between y and X₁ after "controlling for" X₂
- This provides the theoretical foundation for interpreting multiple regression coefficients

**This completes Part 1 of Assignment 1 in Julia.**