In [2]:
! pip install gurobipy

Collecting gurobipy
  Downloading gurobipy-12.0.1-cp311-cp311-win_amd64.whl.metadata (16 kB)
Downloading gurobipy-12.0.1-cp311-cp311-win_amd64.whl (11.2 MB)
   ---------------------------------------- 0.0/11.2 MB ? eta -:--:--
   ---------------------------------------- 0.1/11.2 MB 1.7 MB/s eta 0:00:07
    --------------------------------------- 0.2/11.2 MB 2.4 MB/s eta 0:00:05
   -- ------------------------------------- 0.6/11.2 MB 4.3 MB/s eta 0:00:03
   --- ------------------------------------ 1.0/11.2 MB 5.4 MB/s eta 0:00:02
   ---- ----------------------------------- 1.3/11.2 MB 5.5 MB/s eta 0:00:02
   ---- ----------------------------------- 1.4/11.2 MB 4.9 MB/s eta 0:00:02
   ------ --------------------------------- 1.8/11.2 MB 5.7 MB/s eta 0:00:02
   ------- -------------------------------- 2.0/11.2 MB 5.4 MB/s eta 0:00:02
   -------- ------------------------------- 2.5/11.2 MB 6.1 MB/s eta 0:00:02
   --------- ------------------------------ 2.6/11.2 MB 6.1 MB/s eta 0:00:02
   

In [1]:
import time
from itertools import product, combinations

import numpy as np
import pandas as pd
import gurobipy as gb
from sklearn.linear_model import LinearRegression


# WLS credentials
WLSACCESSID = 'ccc2c36a-db14-4956-b2e3-60adc45e9957'
WLSSECRET = '1e0e3dbf-7933-44dc-8f81-e0482ded7ac8'
LICENSEID = 2586688

# Create the Gurobi environment with parameters
env = gb.Env(empty=True)  # Start with an empty environment
env.setParam('WLSACCESSID', WLSACCESSID)
env.setParam('WLSSECRET', WLSSECRET)
env.setParam('LICENSEID', LICENSEID)
env.start() 


Set parameter WLSAccessID
Set parameter WLSSecret
Set parameter LicenseID to value 2586688
Academic license 2586688 - for non-commercial use only - registered to ru___@ucsd.edu


<gurobipy.Env, Parameter changes: WLSAccessID=(user-defined), WLSSecret=(user-defined), LicenseID=2586688>

In [2]:
df = pd.read_csv('GA_features.csv')
df.columns

Index(['Unnamed: 0.2', 'Unnamed: 0', 'Unnamed: 0.1', 'county', 'tweets',
       'contribution', 'n_poll', 'frac_unem', 'frac_votes', 'total_votes',
       ...
       'frac_voted_A', 'frac_voted_B', 'frac_voted_C', 'frac_voted_D',
       'frac_voted_E', 'frac_voted_F', 'frac_voted_G', 'total_registers',
       'latitude', 'longitude'],
      dtype='object', length=125)

# Explanation of Dataset Features

### General Information
- **`Unnamed: 0`**: Likely an index column automatically generated during data import. If not meaningful, it can be dropped.
- **`latitude`**: Latitude of the school, used for geographic analysis.
- **`longitude`**: Longitude of the school, used for geographic analysis.

---

### Input Features (`X`)
These represent characteristics of schools that may affect student outcomes:
- **`ap_ib`**: Indicator or count of students enrolled in Advanced Placement (AP) or International Baccalaureate (IB) programs. Higher values indicate better academic resources or rigor.
- **`calculus`**: Indicator or count of students enrolled in Calculus courses, which may act as a proxy for advanced math preparation.
- **`counselors`**: Number of counselors available at the school, potentially influencing college readiness and student support.
- **`frpl_rate`**: Percentage of students eligible for Free or Reduced-Price Lunch (FRPL), a socioeconomic indicator where higher values suggest greater economic disadvantage.

---

### Outcome Variables (`y`)
These represent the target outcomes or results that the model aims to improve:
- **`frac_sat_act`**: Fraction of students who took the SAT or ACT, a measure of college readiness.
- **`n_sat_act`**: Count of students who took the SAT or ACT.

---

### Demographic-Specific Counts (`n_*`)
These represent the **count of students** in specific demographic categories:
- **By Gender and Category (e.g., `n_A_m`, `n_A_f`)**:
  - `n_A_m`: Number of male students in demographic category A.
  - `n_A_f`: Number of female students in demographic category A.
  - Categories B through G follow the same format.
- **Aggregated by Category (e.g., `n_A`)**:
  - Total number of students in category A, regardless of gender.
  - Categories B through G are aggregated similarly.

---

### Demographic-Specific Fractions (`frac_*`)
These represent the **proportion of students** in specific demographic categories:
- **By Gender and Category (e.g., `frac_A_m`, `frac_A_f`)**:
  - `frac_A_m`: Fraction of male students in demographic category A.
  - `frac_A_f`: Fraction of female students in demographic category A.
  - Categories B through G follow the same format.
- **Aggregated by Category (e.g., `frac_A`)**:
  - Total fraction of students in category A, regardless of gender.
  - Categories B through G are aggregated similarly.

---

### SAT/ACT-Specific Counts and Fractions
These measure SAT/ACT participation within specific demographic categories:

#### Counts (`n_sat_act_*`):
- **By Gender and Category** (e.g., `n_sat_act_A_m`):
  - Count of male students in category A who took the SAT/ACT.
- **Aggregated by Category** (e.g., `n_sat_act_A`):
  - Total count of students in category A who took the SAT/ACT.
- Categories B through G follow the same format.

#### Fractions (`frac_sat_act_*`):
- **By Gender and Category** (e.g., `frac_sat_act_A_m`):
  - Fraction of male students in category A who took the SAT/ACT.
- **Aggregated by Category** (e.g., `frac_sat_act_A`):
  - Total fraction of students in category A who took the SAT/ACT.
- Categories B through G follow the same format.

---

### Other Features
- **`total_students`**: Total number of students in the school, regardless of demographic categories. Useful for normalizing counts or computing participation rates.

---

### Summary of Feature Groups

| **Feature Group**             | **Description**                                               |
|-------------------------------|-------------------------------------------------------------|
| General Information           | School index, latitude, longitude                           |
| Input Features                | Academic resources, socioeconomic data (`ap_ib`, `frpl_rate`) |
| Outcome Variables             | SAT/ACT participation metrics                               |
| Demographic Counts (`n_*`)    | Counts of students by demographic category and gender       |
| Demographic Fractions (`frac_*`) | Proportions of students by demographic category and gender  |
| SAT/ACT Counts and Fractions  | Participation counts and fractions by demographic group     |

---


In [3]:
# Define Constants
SOCIAL_CATEGORIES = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
BUDGET = 100
TAU_VALUES = [0.566, None]  # Define fairness constraints for optimization

# Data Preparation
df = pd.read_csv('GA_features.csv')
# X_columns = ['frpl_rate', 'calculus', 'ap_ib', 'counselors']
X_columns = ['frac_unem', 'n_poll', 'contribution', 'tweets']
count_columns = [f'registered_{category}' for category in SOCIAL_CATEGORIES]
frac_columns = [f'frac_registered_{category}' for category in SOCIAL_CATEGORIES]

X = df[X_columns]
A_frac = df[frac_columns]
y_train = df['frac_votes'].values

neighbor_distance_matrix = np.load('distance_matrix.npy')
neighbor_index_matrix = np.load('index_matrix.npy')

contribution = X['contribution'].values
n_poll = X['n_poll'].values
tweets = X['tweets'].values
n = len(X)

In [None]:
X

In [None]:
# AP_IB = X['ap_ib'].values
# COUNSELORS = X['counselors'].values
# FRPL = np.ones_like(X['frpl_rate'].values)
# A_FRAC = df[frac_columns]
# A_MATRIX = A_FRAC.values


# NEIGHBOR_INDEX_MATRIX = np.load('neighbor_index_matrix.npy')
# NEIGHBOR_DISTANCE_MATRIX = np.load('neighbor_distance_matrix.npy')
# NUM_SCHOOLS = X.shape[0]
# # weight_df = pd.read_csv('params_7_disagg.csv', index_col=0)
# # WEIGHT_MATRIX = weight_df.values

# #possible intervention - column represents neighbours
# NUM_NEIGHBORS = NEIGHBOR_INDEX_MATRIX.shape[1]
# intervention_sample_spaces = [(0, 1)] * NUM_NEIGHBORS
# POSSIBLE_INTERVENTIONS_MATRIX = np.array(list(
#     product(*intervention_sample_spaces)
# ))
# NUM_POSSIBLE_INTERVENTIONS = POSSIBLE_INTERVENTIONS_MATRIX.shape[0]

# BUDGET = 100

# NUM_CATEGORIES = 28
# CATEGORIES = list(range(NUM_CATEGORIES))
# CATEGORY_PAIRS = list(combinations(CATEGORIES, 2))

# DEMOGRAPHIC_COUNTERFACTUALS = [0, 1]
# NUM_COUNTERFACTUALS = len(DEMOGRAPHIC_COUNTERFACTUALS)

# TOTAL_STUDENTS = df['total_students'].values
# R_COUNTS = df[count_columns].values
# R_COUNTS_TOTAL = R_COUNTS.sum(axis=0)

# CALCULUS = X['calculus']
# A_DIMENSION = A_MATRIX.shape[1]

# WHETHER_OR_NOT_CALCULUS_GIVEN_INTERFERENCE = np.max(
#     NEIGHBOR_DISTANCE_MATRIX * CALCULUS.values, axis=1)

In [4]:
# Define Constants
SOCIAL_CATEGORIES = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
BUDGET = 100
TAU_VALUES = [0.566, None]  # Define fairness constraints for optimization

# Data Preparation
df = pd.read_csv('GA_features.csv')

# Define relevant columns
X_columns = ['frac_unem', 'n_poll', 'contribution', 'tweets']
count_columns = [f'registered_{category}' for category in SOCIAL_CATEGORIES]
frac_columns = [f'frac_registered_{category}' for category in SOCIAL_CATEGORIES]

# Extract features and targets
X = df[X_columns]
A_frac = df[frac_columns]
y_train = df['frac_votes'].values

# Prepare other required matrices and values
CALCULUS = X['frac_unem'].values  # Assuming 'frac_unem' represents calculus data
COUNSELORS = X['n_poll'].values
FRPL = np.ones_like(X['contribution'].values)
A_MATRIX = A_frac.values
TOTAL_R = df['total_registers'].values
R_COUNTS = df[count_columns].values
R_COUNTS_TOTAL = R_COUNTS.sum(axis=0)

# Load neighborhood matrices
NEIGHBOR_INDEX_MATRIX = np.load('index_matrix.npy')
NEIGHBOR_DISTANCE_MATRIX = np.load('distance_matrix.npy')

# Calculate dimensions and possible interventions
NUM_SCHOOLS = X.shape[0]
NUM_NEIGHBORS = NEIGHBOR_INDEX_MATRIX.shape[1]
intervention_sample_spaces = [(0, 1)] * NUM_NEIGHBORS
POSSIBLE_INTERVENTIONS_MATRIX = np.array(list(
    product(*intervention_sample_spaces)
))
NUM_POSSIBLE_INTERVENTIONS = POSSIBLE_INTERVENTIONS_MATRIX.shape[0]

# Define demographic counterfactuals
DEMOGRAPHIC_COUNTERFACTUALS = [0, 1]
NUM_COUNTERFACTUALS = len(DEMOGRAPHIC_COUNTERFACTUALS)

# Interference effect calculation (example)
WHETHER_OR_NOT_CALCULUS_GIVEN_INTERFERENCE = np.max(
    NEIGHBOR_DISTANCE_MATRIX * CALCULUS[:, None], axis=1
)

# Additional features
contribution = X['contribution'].values
n_poll = X['n_poll'].values
tweets = X['tweets'].values
n = len(X)


In [None]:
# neighbor_index_matrix = np.load('neighbor_index_matrix.npy')
# neighbor_distance_matrix = np.load('neighbor_distance_matrix.npy')

# # Print basic information
# print("Neighbor Index Matrix:")
# print("Shape:", neighbor_index_matrix.shape)
# print("Content (First 5 Rows):\n", neighbor_index_matrix[:5])

# print("\nNeighbor Distance Matrix:")
# print("Shape:", neighbor_distance_matrix.shape)
# print("Content (First 5 Rows):\n", neighbor_distance_matrix[:5])

In [8]:
neighbor_distance_matrix.sum()

180.03710752209687

In [5]:
# Find and analyze differences
diff_indices = np.where(neighbor_distance_matrix != neighbor_distance_matrix.T)
for i, j in zip(diff_indices[0], diff_indices[1]):
    print(f"M[{i}, {j}] = {neighbor_distance_matrix[i, j]}, M[{j}, {i}] = {neighbor_distance_matrix[j, i]}")


M[3, 43] = 0.0, M[43, 3] = 0.01944789998479488
M[3, 64] = 0.0, M[64, 3] = 0.018502561336574925
M[3, 124] = 0.0, M[124, 3] = 0.016939410396047995
M[5, 67] = 0.0, M[67, 5] = 0.032337242186982404
M[7, 109] = 0.02828219683566139, M[109, 7] = 0.0
M[7, 114] = 0.0, M[114, 7] = 0.02416468780808177
M[8, 141] = 0.02599353807160201, M[141, 8] = 0.0
M[9, 76] = 0.0, M[76, 9] = 0.02736538389201286
M[10, 101] = 0.0, M[101, 10] = 0.03229806923148114
M[11, 86] = 0.0, M[86, 11] = 0.02617989520909695
M[11, 157] = 0.0, M[157, 11] = 0.02301055534989381
M[12, 23] = 0.0, M[23, 12] = 0.020616915300489147
M[12, 147] = 0.0, M[147, 12] = 0.022240357163258322
M[13, 34] = 0.02352673796726905, M[34, 13] = 0.0
M[13, 36] = 0.026657819795958757, M[36, 13] = 0.0
M[17, 84] = 0.03449541102608217, M[84, 17] = 0.0
M[17, 101] = 0.0, M[101, 17] = 0.032597171867988245
M[17, 125] = 0.03274595872522721, M[125, 17] = 0.0
M[18, 29] = 0.0, M[29, 18] = 0.028271956228138797
M[19, 97] = 0.01479858237149285, M[97, 19] = 0.0
M[20, 81] 

In [6]:
# # Calculate adjusted features for regression model
# def compute_adjusted_features(feature_values, A_frac, neighbor_distance_matrix):
#     max_neighbor_influence = np.max(neighbor_distance_matrix * feature_values.T, axis=1).reshape(n, 1)
#     return A_frac * max_neighbor_influence

# a_max_Sij_Pj = compute_adjusted_features(ap_ib, A_frac, neighbor_distance_matrix)
# a_max_Sij_Cj = compute_adjusted_features(calculus, A_frac, neighbor_distance_matrix)
# a_Fj = A_frac * counselors.reshape(n, 1)

# # Combine features for regression model
# X_train = np.concatenate((a_max_Sij_Pj, a_max_Sij_Cj, a_Fj, A_frac), axis=1)

# # Train linear regression model
# linmod = LinearRegression(fit_intercept=False).fit(X_train, y_train)
# model_weights = linmod.coef_
# param_dims = len(SOCIAL_CATEGORIES)

# # Extract regression weights
# weight_dict = {
#     'alpha': model_weights[param_dims:param_dims*2],
#     'beta': model_weights[:param_dims],
#     'gamma': model_weights[param_dims*2:param_dims*3],
#     'theta': model_weights[-param_dims:]
# }
# params = pd.DataFrame(weight_dict)

# ALPHA, BETA, GAMMA, THETA = (params['alpha'].values, params['beta'].values, 
#                              params['gamma'].values, params['theta'].values)

# ALPHA, BETA, GAMMA, THETA

NameError: name 'ap_ib' is not defined

In [7]:
# Define updated features
X_columns = ['frac_unem', 'n_poll', 'contribution', 'tweets']

# Extract updated features and targets
X = df[X_columns]
frac_unem = X['frac_unem'].values
n_poll = X['n_poll'].values
contribution = X['contribution'].values
tweets = X['tweets'].values

# Calculate adjusted features for regression model
def compute_adjusted_features(feature_values, A_frac, neighbor_distance_matrix):
    # Calculate maximum neighbor influence scaled by distance
    max_neighbor_influence = np.max(neighbor_distance_matrix * feature_values[:, None], axis=1).reshape(n, 1)
    return A_frac * max_neighbor_influence

# Compute adjusted features using the updated columns
a_max_Sij_frac_unem = compute_adjusted_features(frac_unem, A_frac, neighbor_distance_matrix)
a_max_Sij_n_poll = compute_adjusted_features(n_poll, A_frac, neighbor_distance_matrix)
a_max_Sij_contribution = compute_adjusted_features(contribution, A_frac, neighbor_distance_matrix)
a_max_Sij_tweets = compute_adjusted_features(tweets, A_frac, neighbor_distance_matrix)

# Combine features for regression model
X_train = np.concatenate((a_max_Sij_frac_unem, a_max_Sij_n_poll, 
                          a_max_Sij_contribution, a_max_Sij_tweets, A_frac), axis=1)

# Train linear regression model
linmod = LinearRegression(fit_intercept=False).fit(X_train, y_train)
model_weights = linmod.coef_

# Define parameter dimensions based on social categories
param_dims = len(SOCIAL_CATEGORIES)

# Extract regression weights for the updated features
# ALPHA: frac_unem
# BETA: n_poll
# GAMMA: contribution
# DELTA: tweets
# THETA: A_frac
weight_dict = {
    'alpha': model_weights[:param_dims],
    'beta': model_weights[param_dims:param_dims*2],
    'gamma': model_weights[param_dims*2:param_dims*3],
    'delta': model_weights[param_dims*3:param_dims*4],
    'theta': model_weights[param_dims*4:]
}

# Store weights in a DataFrame
params = pd.DataFrame(weight_dict, index=SOCIAL_CATEGORIES)

# Extract weight vectors
ALPHA = params['alpha'].values
BETA = params['beta'].values
GAMMA = params['gamma'].values
DELTA = params['delta'].values
THETA = params['theta'].values

# View results
print("ALPHA (frac_unem):", ALPHA)
print("BETA (n_poll):", BETA)
print("GAMMA (contribution):", GAMMA)
print("DELTA (tweets):", DELTA)
print("THETA (A_frac):", THETA)


ALPHA (frac_unem): [  0.58228815   0.80293196 -23.93764618   0.13683626  35.05485125
  -6.12616903  -9.43794085]
BETA (n_poll): [-0.15566318 -0.11416816 -1.14465908  1.05640318 39.34117276  1.90360031
 -0.14928243]
GAMMA (contribution): [-2.64687684e-09  2.45915475e-08  4.80094343e-08 -1.42543061e-07
 -6.36376246e-06  6.33192583e-08  4.45895417e-08]
DELTA (tweets): [ 0.00393742 -0.00113047  0.01636672  0.00101458  0.80113931 -0.02424243
 -0.02411203]
THETA (A_frac): [  0.59310839   0.68230219   2.56738399  -0.2678885  -18.10622622
   1.09305964   1.35503998]


In [12]:
params.to_csv('results/params.csv', index=False)

In [13]:
# # Optimization Helpers
# def calculate_expected_impact(index, intervention_array, demographic_vector):
#     nearest_neighbors = neighbor_index_matrix[index, :]
#     neighbor_distances = neighbor_distance_matrix[index, nearest_neighbors]

#     calculus_term = np.dot(demographic_vector, ALPHA) * np.max(neighbor_distances * intervention_array)
#     ap_ib_term = np.dot(demographic_vector, BETA) * np.max(neighbor_distances * ap_ib[nearest_neighbors])
#     counselors_term = np.dot(demographic_vector, GAMMA) * counselors[index]
#     race_term = np.dot(demographic_vector, THETA)

#     impact = calculus_term + ap_ib_term + counselors_term + race_term
#     return max(min(impact, 1), 0)

In [14]:
df.columns

Index(['Unnamed: 0.2', 'Unnamed: 0', 'Unnamed: 0.1', 'county', 'tweets',
       'contribution', 'n_poll', 'frac_unem', 'frac_votes', 'total_votes',
       ...
       'frac_voted_A', 'frac_voted_B', 'frac_voted_C', 'frac_voted_D',
       'frac_voted_E', 'frac_voted_F', 'frac_voted_G', 'total_registers',
       'latitude', 'longitude'],
      dtype='object', length=125)

In [8]:
# Optimization Helper
def expected_impact_i(index, intervention_array):
    """
    Calculate the expected impact for a given index, intervention array, and demographic vector.
    """
    racial_prop = ['frac_registered_A', 'frac_registered_B', 'frac_registered_C', 'frac_registered_D', 
                       'frac_registered_E', 'frac_registered_F', 'frac_registered_G']
    demographic_vector = df.loc[i,  racial_prop]
    
    # Get nearest neighbors and distances for the given index
    nearest_neighbors = neighbor_index_matrix[index, :]
    neighbor_distances = neighbor_distance_matrix[index, nearest_neighbors]
    
    # Compute terms for each feature using the revised features and weights
    frac_unem_term = np.dot(demographic_vector, ALPHA) * np.max(neighbor_distances * intervention_array)
    n_poll_term = np.dot(demographic_vector, BETA) * np.max(neighbor_distances * n_poll[nearest_neighbors])
    contribution_term = np.dot(demographic_vector, GAMMA) * np.max(neighbor_distances * contribution[nearest_neighbors])
    tweets_term = np.dot(demographic_vector, DELTA) * np.max(neighbor_distances * tweets[nearest_neighbors])
    demographic_term = np.dot(demographic_vector, THETA)

    # Calculate total impact
    impact = frac_unem_term + n_poll_term + contribution_term + tweets_term + demographic_term
    
    # Clamp impact between 0 and 1
    return max(min(impact, 1), 0)

In [9]:
def expected_impact(intervention_array):
        impact = 0
        for i in range(len(intervention_array)):
            impact += expected_impact_i(i, intervention_array[i])
        return impact

In [10]:
def calculate_all_possible_impacts(index, POSSIBLE_INTERVENTIONS_MATRIX):
    possible_impacts = np.empty(len(POSSIBLE_INTERVENTIONS_MATRIX))
    for k, intervention_array in enumerate(POSSIBLE_INTERVENTIONS_MATRIX):
        possible_impacts[k] = expected_impact_i(index, intervention_array)
    return possible_impacts

In [11]:
# Optimization Routine
def optimize_interventions(tau_value, A_frac, POSSIBLE_INTERVENTIONS_MATRIX):
    print(f'Running optimization for tau={tau_value}')
    model = gb.Model(env=env)

    interventions = model.addVars(n, vtype=gb.GRB.BINARY, name="interventions")
    model.addConstr(sum(interventions.values()) <= BUDGET, "budget_constraint")

    def add_auxiliary_constraints(index):
        demographic_vector = A_frac.values[index, :]
        factual_impacts = calculate_all_possible_impacts(index, POSSIBLE_INTERVENTIONS_MATRIX)

        auxiliary_vars = model.addVars(
            len(factual_impacts), obj=factual_impacts, vtype=gb.GRB.CONTINUOUS
        )
        model.update()

        for j, intervention in enumerate(POSSIBLE_INTERVENTIONS_MATRIX):
            for k, neighbor in enumerate(neighbor_index_matrix[index]):
                if intervention[k] == 1:
                    model.addConstr(auxiliary_vars[j] <= interventions[neighbor])
                else:
                    model.addConstr(auxiliary_vars[j] <= 1 - interventions[neighbor])
        model.addConstr(sum(auxiliary_vars.values()) == 1)

        if tau_value is not None:
            for group_idx in range(A_frac.shape[1]):
                group_impact_diff = calculate_all_possible_impacts(index, POSSIBLE_INTERVENTIONS_MATRIX) - factual_impacts
                model.addConstr(
                    sum(auxiliary_vars[j] * group_impact_diff[j] for j in range(len(factual_impacts))) <= tau_value
                )

    for index in range(n):
        add_auxiliary_constraints(index)

    model.setObjective(model.getObjective(), gb.GRB.MAXIMIZE)
    model.optimize()

    if model.status == gb.GRB.OPTIMAL:
        return np.array([interventions[i].X for i in range(n)]).astype(bool)
    else:
        raise RuntimeError("Optimization failed.")

# # Run optimization for each tau value
# for tau_value in TAU_VALUES:
#     try:
#         optimal_interventions = optimize_interventions(tau_value, A_frac, POSSIBLE_INTERVENTIONS_MATRIX)
#         print(f"Optimal interventions: {np.where(optimal_interventions)}")
#     except RuntimeError as e:
#         print(f"Optimization failed for tau={tau_value}: {e}")


In [None]:
BUDGET = 130
TAU_VALUES = [0.43, 0.75, None]  # Define fairness constraints for optimization

SAVE_DIR = "results/"
import os
if not os.path.exists(SAVE_DIR):
    os.makedirs(SAVE_DIR)

impact_results = []

# Run optimization for each tau value
for tau_value in TAU_VALUES:
    try:
        optimal_interventions = optimize_interventions(tau_value, A_frac, POSSIBLE_INTERVENTIONS_MATRIX)
        impact = expected_impact(optimal_interventions)
        impact_results.append(impact)
        print(f"Optimal interventions: {np.where(optimal_interventions)}")
        print(f"impact: {impact}")
        
        # Save results
        file_name = os.path.join(SAVE_DIR, f"budget{BUDGET}_tau_{tau_value}.npy")
        np.save(file_name, optimal_interventions)
    except RuntimeError as e:
        print(f"Optimization failed for tau={tau_value}: {e}")


pd.Series(impact_results).to_csv('results/impact_results.csv', index=False, header=False)


Running optimization for tau=0.43
Gurobi Optimizer version 12.0.1 build v12.0.1rc0 (win64 - Windows 11.0 (26100.2))

CPU model: 13th Gen Intel(R) Core(TM) i5-13500H, instruction set [SSE2|AVX|AVX2]
Thread count: 12 physical cores, 16 logical processors, using up to 16 threads

Academic license 2586688 - for non-commercial use only - registered to ru___@ucsd.edu
Optimize a model with 62329 rows, 10335 columns and 132447 nonzeros
Model fingerprint: 0xc18b63ad
Variable types: 10176 continuous, 159 integer (159 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [6e-01, 1e+00]
  Bounds range     [1e+00, 1e+00]
  RHS range        [4e-01, 1e+02]
Found heuristic solution: objective 108.7659237
Presolve removed 1113 rows and 0 columns
Presolve time: 0.37s
Presolved: 61216 rows, 10335 columns, 132447 nonzeros
Variable types: 10176 continuous, 159 integer (159 binary)

Deterministic concurrent LP optimizer: primal and dual simplex (primal and dual model)
Showing 

In [None]:
optimal_interventions

In [None]:
expected_impact(optimal_interventions)