In [21]:
import pandas as pd
import numpy as np

from sklearn.linear_model import LogisticRegression
from scipy.spatial import distance


## Question 1 & 2

Using homework_8.1.csv, find the average treatment effect (ATE) with inverse probability weighting (IPW). Then include your code and a written explanation of your work (mentioning any choices or strategies you made in writing the code) in your homework reflection.

Here are some steps to follow: 

* Estimate the propensity scores using logistic regression. Fit the model so that the Z values predict $X$.

* Use the model to predict the propensity scores (e.g., using predict_proba if you are using sklearn).

* Calculate inverse probability weights ($\frac{1}{P}$ for $X = 1$ and $\frac{1}{1 - P}$ for $X = 0$).

* Estimate the average treatment effect (the Y difference between $X = 1$ and $X = 0$, using the appropriate weights for each).
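
For reference, the last two steps can be written out as below. This is a sketch of the normalized (Hájek-style) weighted difference, which is an assumption here because the steps do not say whether the weighted means should be normalized. $P_i$ denotes the fitted propensity score of unit $i$, and $w_i$ is notation introduced only for this illustration.

$$
w_i = \frac{X_i}{P_i} + \frac{1 - X_i}{1 - P_i},
\qquad
\widehat{\mathrm{ATE}} = \frac{\sum_{i:\,X_i = 1} w_i Y_i}{\sum_{i:\,X_i = 1} w_i} - \frac{\sum_{i:\,X_i = 0} w_i Y_i}{\sum_{i:\,X_i = 0} w_i}
$$

Dividing each weighted sum by the sum of its weights keeps each group mean on the scale of $Y$ even when a few weights are large.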



In [5]:
df = pd.read_csv('homework_8.1.csv')

In [9]:
# Step 1: Estimate propensity scores P(X=1 | Z)
logit = LogisticRegression()
logit.fit(df[['Z']], df['X'])
propensity_scores = logit.predict_proba(df[['Z']])[:, 1]  # probability of X=1



In [10]:
# Step 2: Calculate inverse probability weights
df['weights'] = np.where(df['X'] == 1, 1 / propensity_scores, 1 / (1 - propensity_scores))


In [12]:
# Step 3: Compute weighted average outcomes
treated = df[df['X'] == 1]
control = df[df['X'] == 0]
# Weighted means
mean_treated = np.sum(treated['Y'] * treated['weights']) / np.sum(treated['weights'])
mean_control = np.sum(control['Y'] * control['weights']) / np.sum(control['weights'])

In [13]:
# Step 4: Estimate Average Treatment Effect (ATE)
ate = mean_treated - mean_control

print(f"Average Treatment Effect (ATE) using IPW: {ate:.4f}")

Average Treatment Effect (ATE) using IPW: 2.2743


In [19]:
# The first three propensity scores
propensity_scores_first_three = propensity_scores[:3]
print("Propensity scores for the first three items:", propensity_scores_first_three)

Propensity scores for the first three items: [0.84011371 0.58464597 0.71108245]


## Question 3 & 4

Using homework_8.2.csv, match all treated items to the single nearest untreated item using the Mahalanobis distance. (Do this with replacement — the same untreated item can be used again.) 

* Use the Mahalanobis function from scipy.spatial.distance 

* For the inverse covariance matrix, use all $Z_1$ values and all $Z_2$ values, make them into a $2 \times N$ matrix, find its $2 \times 2$ covariance, and invert.
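
The matching rule described above can also be sketched in vectorized form. This is a minimal illustration on synthetic data (the arrays `Z_treated` and `Z_control` are hypothetical, not taken from homework_8.2.csv), using `scipy.spatial.distance.cdist` with `metric="mahalanobis"` instead of calling `mahalanobis` pair by pair. It is one possible approach, not necessarily the one the assignment expects.

```python
import numpy as np
from scipy.spatial import distance

# Synthetic covariates (hypothetical data, not from homework_8.2.csv)
rng = np.random.default_rng(0)
Z_treated = rng.normal(size=(5, 2))   # 5 treated units, columns Z1 and Z2
Z_control = rng.normal(size=(8, 2))   # 8 untreated units

# Inverse covariance from all Z1 and Z2 values (treated and untreated together)
Z_all = np.vstack([Z_treated, Z_control])
inv_cov = np.linalg.inv(np.cov(Z_all, rowvar=False))

# Pairwise Mahalanobis distances: rows are treated units, columns are controls
dist_matrix = distance.cdist(Z_treated, Z_control, metric="mahalanobis", VI=inv_cov)

# Matching with replacement: each treated unit independently takes its nearest control
nearest_control = dist_matrix.argmin(axis=1)
nearest_distance = dist_matrix.min(axis=1)
```

Because each row of `dist_matrix` is minimized on its own, the same control can be chosen by several treated units, which is what matching with replacement means.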

In [22]:
# Load data
df = pd.read_csv('homework_8.2.csv')

In [23]:
# Split treated and control
treated = df[df['X'] == 1].reset_index(drop=True)
control = df[df['X'] == 0].reset_index(drop=True)

# Extract covariates
covariates = ['Z1', 'Z2']

In [24]:
# Compute inverse covariance matrix (2 x 2) using all Z1, Z2
Z_all = df[covariates].values
cov_matrix = np.cov(Z_all, rowvar=False)
inv_cov_matrix = np.linalg.inv(cov_matrix)

In [25]:
# For each treated, find nearest control
matches = []
for i, treated_row in treated.iterrows():
    treated_point = treated_row[covariates].values
    dists = control[covariates].apply(
        lambda row: distance.mahalanobis(treated_point, row.values, inv_cov_matrix), axis=1)
    
    nearest_idx = dists.idxmin()
    nearest_control = control.loc[nearest_idx]
    nearest_distance = dists[nearest_idx]
    
    matches.append({
        'treated_index': treated_row.name,
        'treated_Z1': treated_row['Z1'],
        'treated_Z2': treated_row['Z2'],
        'control_index': nearest_idx,
        'control_Z1': nearest_control['Z1'],
        'control_Z2': nearest_control['Z2'],
        'mahalanobis_distance': nearest_distance
    })


In [28]:
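# Build the matches table here, in treated-row order, so it aligns with treated['Y']
matches_df = pd.DataFrame(matches)
# Get Y values for treated and matched controls
treated_Ys = treated['Y']
matched_control_Ys = control.loc[matches_df['control_index'], 'Y'].values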

# Calculate differences
diffs = treated_Ys.values - matched_control_Ys

# Calculate ATE
ate = np.mean(diffs)

print(f"\nEstimated ATE (Average Treatment Effect) from matched pairs: {ate:.4f}")


Estimated ATE (Average Treatment Effect) from matched pairs: 3.4377


In [33]:
matches_df = pd.DataFrame(matches)
# Sort by Mahalanobis distance, largest first, so the worst matches appear at the top
matches_df = matches_df.sort_values(by='mahalanobis_distance', ascending=False)
print("\nMatches DataFrame:")
print(matches_df.head())


Matches DataFrame:
     treated_index  treated_Z1  treated_Z2  control_index  control_Z1  \
241            241    2.696224    0.538155            217    1.519995   
429            429    2.594425    2.893138             77    1.929532   
352            352    2.497200    2.639007             43    1.895889   
27              27   -0.028182    3.142793            117   -0.637437   
147            147    2.303917    2.464578            481    1.325014   

     control_Z2  mahalanobis_distance  
241   -1.282208              1.383005  
429    3.355691              1.326817  
352    1.095807              1.164020  
27     1.683363              1.089164  
147    1.169978              1.053704  


## Reflection Questions

Include the code you used to solve the two coding quiz problems and write about the obstacles / challenges / insights you encountered while solving them.

### Obstacles / Challenges
* **Understanding the data structure**
    The first challenge was making sure the correct variables were used: separating treated from control units and identifying the covariates Z1 and Z2 for the distance calculations.
* **Mahalanobis distance implementation**
    Mahalanobis distance needs the inverse covariance matrix of the covariates. Computing it from all the data (not just the treated or the control rows) and passing it correctly as the third argument of the scipy function required careful attention.
* **Matching with replacement logic**
    It was important to allow controls to be re-used across treated units, rather than doing a one-to-one matching without replacement, which is more complex.
* **Keeping track of indices after filtering**
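    Filtering a dataframe keeps the original index labels, while .reset_index(drop=True) replaces them with fresh positions. Because the code resets the index on the treated and control subsets, the reported indices are positions within those subsets, so matching back to the original dataframe required careful handling to avoid misreporting which control matched.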
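
The index behavior mentioned in the last point can be seen in a small example. This is a sketch on a hypothetical toy dataframe (`df_toy` and the column name `original_index` are made up for illustration), showing one way to keep the original row labels available after resetting.

```python
import pandas as pd

# Hypothetical toy data, not the homework file
df_toy = pd.DataFrame({"X": [1, 0, 0, 1, 0], "Y": [5.0, 1.0, 2.0, 6.0, 3.0]})

# Filtering keeps the original row labels
control_original = df_toy[df_toy["X"] == 0]
print(list(control_original.index))   # [1, 2, 4]

# reset_index(drop=True) discards those labels and renumbers from 0
control_reset = control_original.reset_index(drop=True)
print(list(control_reset.index))      # [0, 1, 2]

# Keeping the old labels as a column makes it possible to map back later
control_tracked = control_original.reset_index().rename(
    columns={"index": "original_index"}
)
print(control_tracked)
```

With the `original_index` column in place, a matched control at position 2 of the subset can be traced back to row 4 of the full dataframe.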



### Insights

* **Propensity score weighting is sensitive to model specification**
    I learned that good propensity scores depend on correctly specifying the model of X given the covariates. A poorly specified model can produce scores near 0 or 1, which lead to extreme weights and unstable ATE estimates.

* **Common support matters for causal inference**
    Finding the treated units with the worst matches (largest Mahalanobis distance) reveals where the treated and control groups do not overlap well. Overlap (common support) is a key assumption for causal interpretation.

* **Mahalanobis distance gives a multidimensional notion of similarity**
    Unlike matching on raw (Euclidean) differences, Mahalanobis distance accounts for the variance of each covariate and the correlation between them, which makes it better suited to matching on multiple covariates.
* **Careful data handling matters as much as the math**
    Much of the challenge was not the formulas but handling the dataframe operations cleanly, especially tracking row indices, merging results, and ensuring each distance was paired with the right rows.
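
One way to check the first two insights in practice is to summarize the weights before trusting the estimate. The helper below is a hypothetical sketch (the name `weight_diagnostics` and the `eps=0.05` threshold are assumptions chosen for illustration, not part of the assignment). It reports the largest weight, the Kish effective sample size, and how many units have a propensity score close to 0 or 1.

```python
import numpy as np

def weight_diagnostics(propensity, treatment, eps=0.05):
    """Summarize IPW weights and count units with propensity outside [eps, 1 - eps].

    eps is an illustrative threshold, not a recommended default.
    """
    propensity = np.asarray(propensity, dtype=float)
    treatment = np.asarray(treatment)
    # Same weighting rule as in the IPW steps: 1/P if treated, 1/(1-P) otherwise
    weights = np.where(treatment == 1, 1 / propensity, 1 / (1 - propensity))
    # Kish effective sample size: (sum of weights)^2 / sum of squared weights
    effective_n = weights.sum() ** 2 / (weights ** 2).sum()
    outside = (propensity < eps) | (propensity > 1 - eps)
    return {
        "max_weight": float(weights.max()),
        "effective_n": float(effective_n),
        "n_outside": int(outside.sum()),
    }

# Hypothetical scores: two well-behaved units and two with poor overlap
p = np.array([0.5, 0.5, 0.98, 0.02])
x = np.array([1, 0, 0, 1])
print(weight_diagnostics(p, x))
```

An effective sample size far below the number of rows suggests that a handful of units dominate the weighted means, which is the instability described above.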