# Part I: Statistical Analysis for Reproducing Paper Figures

This notebook contains the statistical analysis code used in the paper **"Insights from Reinforcement Learning and individual-based model simulations on population compliance and policy optimization during COVID-19"** to support the second objective: evaluating how varying levels of population compliance influence outcomes, including the severity and duration of interventions, while accounting for demographic structure.  


## What this notebook includes:
- Bootstrap confidence intervals for key outcome metrics (`Hospitalize`, `Deceased`, `Rt`, `CumulativeEconomicIndex`)
- T-tests and Tukey HSD tests across compliance groups
- Analysis of intervention durations by severity
- Visualizations: boxplots, heatmaps, and policy distribution charts

---

## Dataset Location and Loading Instructions

This analysis is based on the simulation output file:

`experiment2_results_UP.csv`

If you are running this notebook on **Google Colab**, follow these steps to load the dataset from your Google Drive:

1. Mount your Google Drive:

```python
from google.colab import drive
drive.mount('/content/drive')
```

2. Load the dataset:

```python
import pandas as pd

# === Dataset Loading ===
data = pd.read_csv("/content/drive/Name_Your_Drive/path_to_the_data_file", header=0)
```

Make sure to replace `Name_Your_Drive/path_to_the_data_file` with the full path to `experiment2_results_UP.csv` inside your own Google Drive.

**Note:**

Some sections of this notebook are titled **Supplementary Visualizations (Not Included in the Published Article)**.
These parts contain extended plots and visual summaries that go beyond what was included in the published article.
They are intended to help readers understand the simulation dynamics and to provide additional context for interpreting the results.

In [None]:
"""
This notebook cell initializes the environment and loads the dataset required for analysis.

The following steps are included:
- Importing essential libraries for data manipulation, visualization, and statistical analysis.
- Mounting Google Drive to access project files.
- Loading a CSV dataset into a pandas DataFrame for further exploration.

Assumes the notebook is running in a Google Colab environment.
"""

# === Library Imports ===

# Core data science libraries
import pandas as pd  # Data manipulation and analysis
import numpy as np  # Numerical computations

# Statistical analysis tools
from scipy.stats import ttest_ind, f_oneway, sem, t
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.multicomp import pairwise_tukeyhsd
from statsmodels.stats.multicomp import MultiComparison

# Visualization libraries
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.gridspec import GridSpec
from matplotlib.patches import Patch
import matplotlib.patheffects as path_effects
from matplotlib.colors import Normalize
from matplotlib.cm import ScalarMappable
from matplotlib.ticker import FuncFormatter

# Machine learning tools
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Regular expressions (e.g., extracting numbers or matching text patterns)
import re

# === Google Drive Mounting (Colab-specific) ===
from google.colab import drive, files
drive.mount('/content/drive')

# === Dataset Loading ===

data = pd.read_csv("/content/drive/Name_Your_Drive/path_to_the_data_file", header=0)
data.columns = data.columns.str.strip()
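# Strip city names before building the combined label so it carries no stray spaces
data['City'] = data['City'].str.strip()
data['City_OrderVac'] = data['City'] + ' - ' + data['Order Vac']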
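
The loading and clean-up steps above can be tried without Google Drive. The following is a minimal sketch on a hypothetical two-row excerpt (the values and the stray spaces are invented for illustration, not taken from the real file); it shows why stripping the headers and the `City` values matters before building `City_OrderVac`.

```python
import io
import pandas as pd

# Hypothetical excerpt: note the stray spaces around header names and city values
csv_text = (
    "City , Compliance,Order Vac ,Deceased \n"
    "Holon ,0.35,ASCENDING,12\n"
    " Bene Beraq,0.75,DESCENDING,7\n"
)

# header=0 means the first row of the file holds the column names
demo = pd.read_csv(io.StringIO(csv_text), header=0)

# Normalize the header names and the city values
demo.columns = demo.columns.str.strip()
demo["City"] = demo["City"].str.strip()

# Build the combined label only after the clean-up
demo["City_OrderVac"] = demo["City"] + " - " + demo["Order Vac"]

print(demo.columns.tolist())
print(demo["City_OrderVac"].tolist())
```

Without the two `str.strip()` calls, a lookup such as `demo["Order Vac"]` would raise a `KeyError` on this excerpt, because the raw header is `"Order Vac "`.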

In [None]:
print(data.columns)
print(data.head(10))

In [None]:

data.columns = data.columns.str.strip()
print(data.columns)


In [None]:
"""
Displays the unique city names present in the dataset.

Useful for exploratory data analysis and to verify the values
prior to filtering by city.
"""

print(data['City'].unique())

In [None]:
"""
Cleans the 'City' column by removing extra whitespace,
then filters the dataset to include only selected cities.

Steps
-----
1. Strips leading and trailing spaces from all entries in the 'City' column.
2. Filters the dataset to include only rows where 'City' is 'Holon' or 'Bene Beraq'.
3. Prints the unique values in the filtered 'City' column to verify the result.
"""

# Remove extra spaces from city names
data['City'] = data['City'].str.strip()

# Filter the dataset for specific cities after cleaning
filtered_data = data[data['City'].isin(['Holon', 'Bene Beraq'])]

# Verify the unique values in the filtered dataset
print(filtered_data['City'].unique())

In [None]:
"""
Performs linear regression to evaluate the relationship between 'Compliance'
and multiple dependent variables, across combinations of 'Order Vac' and 'City'.

For each subset of the filtered data (by city and vaccination order), a
linear regression model is trained using 'Compliance' as the independent variable.
The model is evaluated on test data using MSE and R-squared metrics.

Steps
-----
1. Define dependent variables of interest.
2. Retrieve unique values from 'Order Vac' and 'City' for subsetting.
3. Iterate over each combination of these two dimensions.
4. For each dependent variable:
    - Split the data into training and testing sets.
    - Fit a linear regression model.
    - Make predictions on the test set.
    - Evaluate using MSE and R².
    - Store and print the results.
"""

# Define the dependent variables
y_columns = ['Cumulative economic index', 'Deceased', 'Hospitalize']

# Get unique values of 'Order Vac' and 'City'
order_vac_values = filtered_data['Order Vac'].unique()
city_values = filtered_data['City'].unique()

# Prepare results dictionary
results = {}

# Loop through all combinations of OrderVac and City
for order_vac in order_vac_values:
    for city in city_values:
        print(f"\nResults for OrderVac = {order_vac}, City = {city}\n")

        # Subset the data
        subset_data = filtered_data[
            (filtered_data['Order Vac'] == order_vac) &
            (filtered_data['City'] == city)
        ]

        # Independent variable
        X = subset_data[['Compliance']]

        # Initialize dictionary for results
        results[(order_vac, city)] = {}

        # Loop through each dependent variable
        for y_column in y_columns:
            y = subset_data[y_column]

            # Train/test split
            X_train, X_test, y_train, y_test = train_test_split(
                X, y, test_size=0.2, random_state=42
            )

            # Create and fit linear regression model
            model = LinearRegression()
            model.fit(X_train, y_train)

            # Predict and evaluate
            y_pred = model.predict(X_test)
            mse = mean_squared_error(y_test, y_pred)
            r2 = r2_score(y_test, y_pred)

            # Store results
            results[(order_vac, city)][y_column] = {
                'Coefficient': model.coef_[0],
                'Intercept': model.intercept_,
                'MSE': mse,
                'R^2': r2
            }

            # Print results
            print(f"Results for {y_column}:")
            print(f"  Coefficient (Compliance): {results[(order_vac, city)][y_column]['Coefficient']}")
            print(f"  Intercept: {results[(order_vac, city)][y_column]['Intercept']}")
            print(f"  Mean Squared Error (MSE): {results[(order_vac, city)][y_column]['MSE']}")
            print(f"  R^2 Score: {results[(order_vac, city)][y_column]['R^2']}\n")
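
With a single predictor, the fitted line has a closed form, which can help when sanity-checking the coefficients printed above. The sketch below uses hypothetical toy values (not simulation output) and plain NumPy instead of scikit-learn. It computes R² in-sample, whereas the cell above evaluates it on a held-out test split, so the numbers are not directly comparable.

```python
import numpy as np

# Hypothetical toy data: an outcome that falls as compliance rises
compliance = np.array([0.35, 0.35, 0.75, 0.75, 1.00, 1.00])
outcome = np.array([500.0, 480.0, 350.0, 330.0, 240.0, 230.0])

# Least-squares line: outcome ≈ slope * compliance + intercept
slope, intercept = np.polyfit(compliance, outcome, deg=1)

# The same slope from the textbook formula cov(x, y) / var(x)
slope_formula = np.cov(compliance, outcome, bias=True)[0, 1] / np.var(compliance)

# In-sample R²: share of outcome variance explained by the line
predicted = slope * compliance + intercept
ss_res = np.sum((outcome - predicted) ** 2)
ss_tot = np.sum((outcome - outcome.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.3f}")
```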

In [None]:
"""
Performs descriptive statistical analysis by calculating means and standard deviations
for each outcome variable, grouped by 'City', 'Compliance', and 'Order Vac'.

This step helps summarize the central tendency and variability of the dependent
variables across different subgroups in the dataset.

Grouped Variables
-----------------
- City
- Compliance
- Order Vac

Measured Outcomes
-----------------
- Cumulative economic index: mean and standard deviation
- Hospitalize: mean and standard deviation
- Deceased: mean and standard deviation
"""

# Group data and compute descriptive statistics (mean and std)
stats = data.groupby(['City', 'Compliance', 'Order Vac']).agg({
    'Cumulative economic index': ['mean', 'std'],
    'Hospitalize': ['mean', 'std'],
    'Deceased': ['mean', 'std']
}).reset_index()

# Display the results
print("Means and standard deviations for each metric:")
print(stats)
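
Passing a dictionary of lists to `agg` produces two-level column names such as `('Deceased', 'mean')`, which can be awkward to export or index. A minimal sketch of flattening them, on a hypothetical toy frame (the values are invented for illustration):

```python
import pandas as pd

# Hypothetical toy frame with two runs per city
toy = pd.DataFrame({
    "City": ["Holon", "Holon", "Bene Beraq", "Bene Beraq"],
    "Compliance": [0.35, 0.35, 0.35, 0.35],
    "Deceased": [10.0, 14.0, 4.0, 8.0],
})

summary = (
    toy.groupby(["City", "Compliance"])
       .agg({"Deceased": ["mean", "std"]})
       .reset_index()
)

# Join the two header levels, skipping the empty second level of the group keys
summary.columns = ["_".join(part for part in col if part) for col in summary.columns]

print(summary)
```

After flattening, the columns are `City`, `Compliance`, `Deceased_mean`, and `Deceased_std`.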

In [None]:
def calculate_confidence_interval(data, confidence=0.95):
    """
    Calculates a confidence interval for the mean of a dataset using a t-distribution.

    Parameters
    ----------
    data : array-like
        The sample data to compute the interval for.
    confidence : float, optional
        Confidence level for the interval, by default 0.95.

    Returns
    -------
    tuple of float
        A tuple containing (mean, lower bound, upper bound).
    """
    n = len(data)
    mean = np.mean(data)
    std_err = sem(data)
    h = std_err * t.ppf((1 + confidence) / 2, n - 1)
    return mean, mean - h, mean + h


# === Confidence Interval Computation for Each Group ===

"""
Computes 95% confidence intervals for each outcome variable:
- 'Cumulative economic index'
- 'Hospitalize'
- 'Deceased'

Grouping is done by:
- City
- Compliance level
- Order Vac type

The result is a DataFrame where each row corresponds to one group, with
confidence intervals reported for each metric.
"""

ci_results = []
for (city, compliance, order_vac), group in data.groupby(['City', 'Compliance', 'Order Vac']):
    econ_index_ci = calculate_confidence_interval(group['Cumulative economic index'])
    hosp_ci = calculate_confidence_interval(group['Hospitalize'])
    death_ci = calculate_confidence_interval(group['Deceased'])

    ci_results.append({
        'City': city,
        'Compliance': compliance,
        'Order Vac': order_vac,
        'Economic Index CI': econ_index_ci,
        'Hospitalize CI': hosp_ci,
        'Deceased CI': death_ci
    })

# Convert to DataFrame for display or export
ci_df = pd.DataFrame(ci_results)

# Print the confidence intervals
print("Confidence intervals for each metric:")
print(ci_df)
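
As a quick check of the interval formula used in `calculate_confidence_interval`, the half-width `sem * t.ppf((1 + confidence) / 2, n - 1)` can be compared with SciPy's built-in `t.interval`. The sample below is hypothetical and only illustrates that the two routes agree.

```python
import numpy as np
from scipy.stats import sem, t

# Hypothetical sample of five runs
sample = np.array([236.0, 251.0, 229.0, 244.0, 240.0])
n = len(sample)
mean = sample.mean()

# Manual interval, mirroring the function above
half_width = sem(sample) * t.ppf((1 + 0.95) / 2, n - 1)
manual_ci = (mean - half_width, mean + half_width)

# The same interval from SciPy (confidence level passed positionally)
scipy_ci = t.interval(0.95, n - 1, loc=mean, scale=sem(sample))

print("manual:", manual_ci)
print("scipy: ", scipy_ci)
```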

In [None]:
"""
Performs a one-way ANOVA test for each of the specified outcome variables,
to determine whether differences in means exist between levels of 'Compliance'.

For each metric (dependent variable), the data is grouped by 'Compliance'
and tested using scipy.stats.f_oneway.

Tested Metrics
--------------
- Cumulative economic index
- Hospitalize
- Deceased

Returns
-------
dict
    Dictionary containing p-values from the ANOVA test for each metric.
    A small p-value (typically < 0.05) indicates a statistically significant difference.
"""

from scipy.stats import f_oneway

# Store ANOVA results
anova_results = {}

# Loop over each metric
for metric in ['Cumulative economic index', 'Hospitalize', 'Deceased']:
    # Group the data by 'Compliance' and extract values for each group
    groups = [group[metric] for _, group in data.groupby('Compliance')]

    # Perform one-way ANOVA
    anova_result = f_oneway(*groups)

    # Store the p-value
    anova_results[metric] = anova_result.pvalue

# Display the results
print("ANOVA test results (p-values):")
print(anova_results)
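
To make the p-values above easier to interpret, the sketch below computes the one-way ANOVA F statistic by hand (between-group mean square divided by within-group mean square) and compares it with `f_oneway`. The three groups are hypothetical stand-ins for three compliance levels.

```python
import numpy as np
from scipy.stats import f, f_oneway

# Hypothetical outcome values for three compliance levels
groups = [
    np.array([496.0, 510.0, 480.0, 502.0]),
    np.array([350.0, 362.0, 341.0, 355.0]),
    np.array([236.0, 250.0, 229.0, 241.0]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)            # number of groups
n_total = all_values.size  # total number of observations

# Sums of squares between and within groups
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F statistic and its upper-tail p-value
f_manual = (ss_between / (k - 1)) / (ss_within / (n_total - k))
p_manual = f.sf(f_manual, k - 1, n_total - k)

result = f_oneway(*groups)
print(f"manual F={f_manual:.3f}, scipy F={result.statistic:.3f}")
```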

In [None]:
"""
Performs a factorial ANOVA (Type II) using ordinary least squares (OLS) regression,
to assess the main effects and interaction effects of categorical variables on
'Cumulative economic index'.

The model includes:
- Main effects: Compliance, City, Order Vac
- Two-way interactions: Compliance × City, Compliance × Order Vac, City × Order Vac
- Three-way interaction: Compliance × City × Order Vac

A Type II ANOVA table is generated from the fitted model.

Formula
-------
'Cumulative economic index ~ Compliance + City + Order Vac
 + Compliance:City + Compliance:Order Vac + City:Order Vac
 + Compliance:City:Order Vac'

Returns
-------
pandas.DataFrame
    ANOVA table with sum of squares, degrees of freedom, F-values, and p-values
    for each term in the model.
"""

# Define the factorial model formula (with interaction terms)
formula = 'Q("Cumulative economic index") ~ C(Compliance) + C(City) + C(Q("Order Vac")) + \
           C(Compliance):C(City) + C(Compliance):C(Q("Order Vac")) + C(City):C(Q("Order Vac")) + \
           C(Compliance):C(City):C(Q("Order Vac"))'

# Fit the OLS model
model = ols(formula, data=data).fit()

# Perform Type II ANOVA
anova_table = sm.stats.anova_lm(model, typ=2)

# Display the ANOVA results
print("ANOVA results (Type II):")
print(anova_table)

Supplementary Visualizations

In [None]:
"""
Fits a factorial ANOVA model and visualizes F-values for selected effects,
with renamed axis labels for clarity.

This step is intended to communicate statistically significant factors more
clearly to readers by:
1. Removing residuals and less relevant interaction terms from the plot.
2. Renaming model term labels to plain English descriptions.
3. Visualizing F-statistics with a reference significance threshold.

Model
-----
'Cumulative economic index ~ Compliance + City + Order Vac
 + Compliance × City
 + Compliance × Order Vac
 + City × Order Vac
 + Compliance × City × Order Vac'

Excluded from Plot
------------------
- Residual (always present, non-factor)
- Compliance × City
- Compliance × Order Vac

Displayed Effects
-----------------
- Compliance Level
- City (Holon vs. Bnei Brak)
- Vaccination Order (Asc vs Desc)
- City × Vaccination Order
- Compliance × City × Vaccination Order
"""

# Define full factorial model formula
formula = 'Q("Cumulative economic index") ~ C(Compliance) + C(City) + C(Q("Order Vac")) + ' \
          'C(Compliance):C(City) + C(Compliance):C(Q("Order Vac")) + C(City):C(Q("Order Vac")) + ' \
          'C(Compliance):C(City):C(Q("Order Vac"))'

# Fit model using OLS
model = ols(formula, data=data).fit()

# Perform Type II ANOVA
anova_table = sm.stats.anova_lm(model, typ=2)

# Remove specific effects from the plot (not from the analysis)
effects_to_remove = [
    'Residual',
    'C(Compliance):C(City)',
    'C(Compliance):C(Q("Order Vac"))'
]
filtered_anova = anova_table.drop(index=effects_to_remove)

# Map technical term names to more reader-friendly labels
rename_map = {
    'C(Compliance)': 'Compliance Level',
    'C(City)': 'City (Holon vs. Bnei Brak)',
    'C(Q("Order Vac"))': 'Vaccination Order (Asc vs Desc)',
    'C(City):C(Q("Order Vac"))': 'City × Vaccination Order',
    'C(Compliance):C(City):C(Q("Order Vac"))': 'Compliance × City × Vaccination Order'
}

# Apply renaming for visualization only
filtered_anova_renamed = filtered_anova.rename(index=rename_map)

# Display full ANOVA table with original term names
print("ANOVA results (full model, original term names):")
print(anova_table)

# Plot bar chart of F-statistics for selected terms
fig, ax = plt.subplots(figsize=(10, 6))
filtered_anova_renamed["F"].plot(kind='bar', ax=ax, color='skyblue', alpha=0.8)

# Add a rough reference line: F ≈ 4 corresponds to p ≈ 0.05 only for a term with
# 1 numerator degree of freedom and many residual degrees of freedom
ax.axhline(y=4.0, color='red', linestyle='--', label='F ≈ 4 (p ≈ 0.05)')

# Customize plot appearance
ax.set_title("ANOVA: F Values for Selected Effects (Renamed)", fontsize=14, weight='bold')
ax.set_ylabel("F Value", fontsize=12)
ax.set_xticklabels(filtered_anova_renamed.index, rotation=45, ha='right')
ax.grid(axis='y', linestyle='--', alpha=0.7)
ax.legend()

plt.tight_layout()
plt.show()
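
The reference line at F ≈ 4 is only a rule of thumb, because the critical value depends on each term's degrees of freedom. A minimal sketch of computing the exact 5% critical value with `scipy.stats.f.ppf` follows; the residual degrees of freedom (`df_residual = 60`) are a hypothetical placeholder, not the value from the fitted model.

```python
from scipy.stats import f

alpha = 0.05
df_residual = 60  # hypothetical residual degrees of freedom

# Critical F for terms with 1, 2, and 4 numerator degrees of freedom
critical_values = {
    df_term: f.ppf(1 - alpha, df_term, df_residual)
    for df_term in (1, 2, 4)
}

for df_term, crit in critical_values.items():
    print(f"df_term={df_term}: critical F = {crit:.2f}")
```

In the notebook itself, the degrees of freedom could be read from the `df` column of `anova_table` (using the `Residual` row for the denominator), so that each bar is compared against its own threshold.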

Supplementary Visualizations

In [None]:
# List of metrics for analysis
metrics = ['Cumulative economic index', 'Hospitalize', 'Deceased']

# Dictionary to store Tukey test results
tukey_results = {}

# Perform Tukey's HSD test for each metric
for metric in metrics:
    tukey = pairwise_tukeyhsd(endog=data[metric], groups=data['Compliance'], alpha=0.05)
    tukey_results[metric] = tukey
    print(f"\n### Tukey HSD results for {metric}: ###\n")
    print(tukey.summary())
    print("\n")

    # Plot group means with simultaneous confidence intervals.
    # plot_simultaneous creates its own figure, so no separate plt.figure() call is needed.
    tukey.plot_simultaneous(figsize=(10, 6))
    plt.title(f"Tukey HSD for {metric}")
    plt.xlabel("Group Mean")
    plt.show()

# Create boxplots for each metric
for metric in metrics:
    plt.figure(figsize=(12, 6))
    sns.boxplot(x='Compliance', y=metric, data=data, palette="viridis")
    plt.title(f"Boxplot of {metric} by Compliance Level")
    plt.xlabel("Compliance Level")
    plt.ylabel(metric)
    plt.show()

In [None]:
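# Rename columns to formula-friendly names without spaces.
# The trailing-space keys only matter if the headers were not stripped earlier;
# rename() ignores keys that are absent.
data.rename(columns={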
    'Cumulative economic index': 'CumulativeEconomicIndex',
    'Hospitalize ': 'Hospitalize',
    'Order Vac': 'OrderVac',
    'Deceased ': 'Deceased'
}, inplace=True)


In [None]:
"""
Performs two-way ANOVA with interaction for each selected outcome variable,
evaluating the effects of Compliance, Vaccination Order, and their interaction.

This analysis is applied to:
- CumulativeEconomicIndex
- Hospitalize
- Deceased

Each model includes:
- Main effect of Compliance
- Main effect of OrderVac
- Interaction effect: Compliance × OrderVac

ANOVA tables (Type II) are stored for each metric and printed sequentially.

Steps
-----
1. Loop over each outcome metric.
2. Define a model formula with two-way interaction.
3. Fit the model using OLS.
4. Perform ANOVA (type 2) and store results.
5. Print the ANOVA table for each metric.

Returns
-------
- Dictionary of ANOVA tables (one per metric)
- Printed ANOVA results
"""

anova_results = {}

for metric in ['CumulativeEconomicIndex', 'Hospitalize', 'Deceased']:
    formula = f'{metric} ~ C(Compliance) + C(OrderVac) + C(Compliance):C(OrderVac)'
    model = ols(formula, data=data).fit()
    anova_table = sm.stats.anova_lm(model, typ=2)
    anova_results[metric] = anova_table

# Display results for each metric
for metric, table in anova_results.items():
    print(f"\nANOVA results for metric: {metric}")
    print(table)

Supplementary Visualizations

In [None]:
"""
Conducts Welch's independent t-tests to compare ASCENDING vs DESCENDING vaccination
order within each (City × Compliance) subgroup for selected outcome metrics.
Also visualizes the resulting t-statistics by metric and city.

This test helps identify whether the direction of vaccination prioritization
has a statistically significant effect on key health and economic indicators.

Metrics Tested
--------------
- CumulativeEconomicIndex
- Hospitalize
- Deceased

Steps
-----
1. Loop through all combinations of City and Compliance.
2. Subset ASCENDING and DESCENDING groups for each metric.
3. Perform Welch’s t-test (assuming unequal variances).
4. Store results in a structured DataFrame.
5. Save results to CSV.
6. Plot bar chart of t-statistics with visual significance thresholds.

Significance Reference
----------------------
A threshold of |t| ≈ 2 is used to indicate approximate statistical significance.

Returns
-------
- Printed DataFrame of t-test results
- CSV file: 't_test_results.csv'
- Bar plot of t-statistics by metric and city
"""

from scipy.stats import ttest_ind
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

# Initialize list to store test results
t_test_results = []

# Loop over all City × Compliance combinations
for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        # Subset data for each OrderVac condition
        ascending = data[(data['City'] == city) &
                         (data['Compliance'] == compliance) &
                         (data['OrderVac'] == 'ASCENDING')]

        descending = data[(data['City'] == city) &
                          (data['Compliance'] == compliance) &
                          (data['OrderVac'] == 'DESCENDING')]

        # Perform test if both groups have data
        if not ascending.empty and not descending.empty:
            for metric in ['CumulativeEconomicIndex', 'Hospitalize', 'Deceased']:
                t_stat, p_value = ttest_ind(ascending[metric], descending[metric], equal_var=False)
                t_test_results.append({
                    'City': city,
                    'Compliance': compliance,
                    'Metric': metric,
                    'T-statistic': t_stat,
                    'P-value': p_value
                })

# Convert results to DataFrame
t_test_df = pd.DataFrame(t_test_results)

# Print results
print("T-test results comparing ASCENDING vs DESCENDING:")
print(t_test_df)

# Save results to CSV
t_test_df.to_csv("t_test_results.csv", index=False)

# ===== Visualization: Barplot of T-statistics =====
plt.figure(figsize=(14, 7))
sns.barplot(data=t_test_df, x='Metric', y='T-statistic', hue='City', ci=None)

# Add horizontal significance threshold lines at ±2
plt.axhline(2, color='red', linestyle='--', label='Significance Threshold (t ≈ ±2)')
plt.axhline(-2, color='red', linestyle='--')

# Customize plot
plt.title("T-statistics by Metric and City (ASCENDING vs DESCENDING)")
plt.ylabel("T-statistic")
plt.xlabel("Metric")
plt.legend()
plt.tight_layout()
plt.show()
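
For reference, Welch's statistic and its degrees of freedom can be computed by hand, which also shows where the |t| ≈ 2 rule of thumb comes from. The sketch below uses two hypothetical samples (not simulation output) and checks the result against `ttest_ind(..., equal_var=False)`.

```python
import numpy as np
from scipy.stats import t, ttest_ind

# Hypothetical outcome values for the two vaccination orders
ascending = np.array([496.0, 510.0, 480.0, 502.0, 489.0])
descending = np.array([404.0, 431.0, 390.0, 412.0, 399.0, 420.0])

# Per-group variance of the mean (sample variance divided by group size)
va = ascending.var(ddof=1) / ascending.size
vb = descending.var(ddof=1) / descending.size

# Welch's t statistic and Welch-Satterthwaite degrees of freedom
t_manual = (ascending.mean() - descending.mean()) / np.sqrt(va + vb)
df_welch = (va + vb) ** 2 / (
    va ** 2 / (ascending.size - 1) + vb ** 2 / (descending.size - 1)
)

# Two-sided p-value and the exact 5% critical value for this df
p_manual = 2 * t.sf(abs(t_manual), df_welch)
t_critical = t.ppf(0.975, df_welch)

result = ttest_ind(ascending, descending, equal_var=False)
print(f"t={t_manual:.3f}, df={df_welch:.2f}, p={p_manual:.4g}, critical |t|={t_critical:.3f}")
```

With small groups the exact critical value is noticeably larger than 2, so the p-value column is the more reliable guide than the dashed lines in the bar plot.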

In [None]:
"""
Performs independent t-tests comparing ASCENDING vs DESCENDING vaccination orders
within each combination of City and Compliance level, for selected outcome metrics.

This analysis evaluates whether the direction of the vaccination campaign
(ascending vs descending) significantly affects each metric, when stratified
by both city and compliance level.

Metrics Tested
--------------
- CumulativeEconomicIndex
- Hospitalize
- Deceased

Steps
-----
1. Loop over each unique (City, Compliance) pair.
2. Subset the data for ASCENDING and DESCENDING OrderVac.
3. Perform Welch’s t-test (unequal variance assumed) for each metric.
4. Store the test statistic and p-value.
5. Output results as a DataFrame and optionally save to CSV.

Returns
-------
- Printed table of t-test results
- CSV file: 't_test_results.csv'
"""

from scipy.stats import ttest_ind

# List to collect t-test results
t_test_results = []

# Iterate over all combinations of City and Compliance
for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        # Filter data for ASCENDING and DESCENDING OrderVac within each group
        ascending = data[(data['City'] == city) &
                         (data['Compliance'] == compliance) &
                         (data['OrderVac'] == 'ASCENDING')]

        descending = data[(data['City'] == city) &
                          (data['Compliance'] == compliance) &
                          (data['OrderVac'] == 'DESCENDING')]

        # Ensure both groups contain data
        if not ascending.empty and not descending.empty:
            # Perform t-test for each outcome metric
            for metric in ['CumulativeEconomicIndex', 'Hospitalize', 'Deceased']:
                t_stat, p_value = ttest_ind(
                    ascending[metric],
                    descending[metric],
                    equal_var=False  # Welch’s t-test
                )
                t_test_results.append({
                    'City': city,
                    'Compliance': compliance,
                    'Metric': metric,
                    'T-statistic': t_stat,
                    'P-value': p_value
                })

# Convert results to DataFrame
t_test_df = pd.DataFrame(t_test_results)

# Display the results
print("T-test results comparing ASCENDING vs DESCENDING within each City × Compliance group:")
print(t_test_df)

# Save results to CSV (optional)
t_test_df.to_csv("t_test_results.csv", index=False)

Supplementary Visualizations

In [None]:
t_test_results = []


for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        # Subset each vaccination order (OrderVac) within the group
        ascending = data[(data['City'] == city) & (data['Compliance'] == compliance) & (data['OrderVac'] == 'ASCENDING')]
        descending = data[(data['City'] == city) & (data['Compliance'] == compliance) & (data['OrderVac'] == 'DESCENDING')]


        if not ascending.empty and not descending.empty:

            for metric in ['CumulativeEconomicIndex', 'Hospitalize', 'Deceased']:
                t_stat, p_value = ttest_ind(ascending[metric], descending[metric], equal_var=False)
                t_test_results.append({
                    'City': city,
                    'Compliance': compliance,
                    'Metric': metric,
                    'T-statistic': t_stat,
                    'P-value': p_value
                })

# Convert results to a DataFrame
t_test_df = pd.DataFrame(t_test_results)

# Print results
print("T-test results comparing ASCENDING vs DESCENDING:")
print(t_test_df)


t_test_df.to_csv("t_test_results.csv", index=False)


plt.figure(figsize=(14, 7))
sns.barplot(data=t_test_df, x='Metric', y='T-statistic', hue='City', ci=None)


plt.axhline(2, color='red', linestyle='--', label='Significance Threshold (t ≈ ±2)')
plt.axhline(-2, color='red', linestyle='--')

plt.title("T-statistics by Metric and City (ASCENDING vs DESCENDING)")
plt.ylabel("T-statistic")
plt.xlabel("Metric")
plt.legend()
plt.tight_layout()
plt.show()

In [None]:
"""
Performs pairwise t-tests comparing the CumulativeEconomicIndex between
ASCENDING and DESCENDING vaccination order strategies for each
combination of City and Compliance level.

This targeted analysis helps isolate whether the order of vaccine distribution
significantly affects economic outcomes within each subgroup.

Metric Analyzed
---------------
- CumulativeEconomicIndex

Steps
-----
1. Loop over all combinations of City × Compliance.
2. Subset ASCENDING and DESCENDING groups.
3. Perform independent two-sample t-test (equal variances assumed by default).
4. Store the p-value for each comparison.

Returns
-------
- Printed DataFrame of t-test p-values
"""

from scipy.stats import ttest_ind
import pandas as pd

# Store t-test results
ttest_results = []

# Iterate over combinations of City and Compliance
for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        asc = data[
            (data['City'] == city) &
            (data['Compliance'] == compliance) &
            (data['OrderVac'] == 'ASCENDING')
        ]['CumulativeEconomicIndex']

        desc = data[
            (data['City'] == city) &
            (data['Compliance'] == compliance) &
            (data['OrderVac'] == 'DESCENDING')
        ]['CumulativeEconomicIndex']

        # Perform t-test if both groups have data
        if not asc.empty and not desc.empty:
            ttest_result = ttest_ind(asc, desc)
            ttest_results.append({
                'City': city,
                'Compliance': compliance,
                'T-test p-value': ttest_result.pvalue
            })

# Convert results to DataFrame
ttest_df = pd.DataFrame(ttest_results)

# Print results
print("T-test results comparing ASCENDING vs DESCENDING (CumulativeEconomicIndex):")
print(ttest_df)

Supplementary Visualizations

In [None]:
"""
Generates three faceted bar plots (catplots) to visualize the relationship between
Compliance level, Vaccination Order, and outcome metrics across different cities.

Each plot compares ASCENDING vs DESCENDING strategies across compliance groups,
with separate subplots for each city.

Metrics Visualized
------------------
1. Deceased
2. CumulativeEconomicIndex
3. Hospitalize

Faceting & Grouping
-------------------
- x-axis: Compliance level
- hue: OrderVac (vaccination strategy)
- col: City

Returns
-------
- Three Seaborn catplots (bar type)
"""
# === 1. Deceased by Compliance × OrderVac × City ===
g = sns.catplot(
    data=data,
    x='Compliance',
    y='Deceased',
    hue='OrderVac',
    col='City',
    kind='bar',
    height=5,
    aspect=1.2
)
plt.subplots_adjust(top=0.9)
plt.suptitle("Deceased by Compliance Level, Vaccination Strategy, and City", fontsize=14, weight='bold')
# Label all facets; plt.xlabel/plt.ylabel would only affect the last subplot
g.set_axis_labels("Compliance Level", "Number of Deceased")
plt.show()

# === 2. Cumulative Economic Index by Compliance × OrderVac × City ===
g = sns.catplot(
    data=data,
    x='Compliance',
    y='CumulativeEconomicIndex',
    hue='OrderVac',
    col='City',
    kind='bar',
    height=5,
    aspect=1.2
)
plt.subplots_adjust(top=0.9)
plt.suptitle("Cumulative Economic Index by Compliance Level, Vaccination Strategy, and City", fontsize=14, weight='bold')
# Label all facets; plt.xlabel/plt.ylabel would only affect the last subplot
g.set_axis_labels("Compliance Level", "Cumulative Economic Index")
plt.show()

# === 3. Hospitalizations by Compliance × OrderVac × City ===
g = sns.catplot(
    data=data,
    x='Compliance',
    y='Hospitalize',
    hue='OrderVac',
    col='City',
    kind='bar',
    height=5,
    aspect=1.2
)
plt.subplots_adjust(top=0.9)
plt.suptitle("Hospitalizations by Compliance Level, Vaccination Strategy, and City", fontsize=14, weight='bold')
# Label all facets; plt.xlabel/plt.ylabel would only affect the last subplot
g.set_axis_labels("Compliance Level", "Number of Hospitalizations")
plt.show()


In [None]:
"""
Performs bootstrap resampling to estimate the mean and confidence interval
for selected outcome metrics, stratified by:
- City
- Compliance level
- Vaccination order (OrderVac)

This method is useful for estimating uncertainty in the mean
without relying on parametric assumptions.

Metrics Analyzed
----------------
- CumulativeEconomicIndex
- Hospitalize
- Deceased

Bootstrap Settings
------------------
- n_iterations: 1000 (resamples)
- ci: 95% (confidence level)

Steps
-----
1. Strip extra spaces from column names to ensure consistency.
2. Define a reusable function `bootstrap_ci` for bootstrapping.
3. For each (City × Compliance × OrderVac) subgroup:
   a. Filter data
   b. Compute bootstrap mean and CI for each metric
   c. Print results to console

Returns
-------
- Console printout of bootstrapped means and 95% confidence intervals
"""
# Ensure clean column names
data.columns = data.columns.str.strip()

# Define bootstrap function
def bootstrap_ci(data, metric, n_iterations=1000, ci=95):
    """
    Calculates the bootstrap mean and confidence interval for a metric.

    Parameters
    ----------
    data : pd.DataFrame
        The subset of data to sample from.
    metric : str
        Name of the column to analyze.
    n_iterations : int, optional
        Number of bootstrap resamples (default is 1000).
    ci : float, optional
        Confidence level percentage (default is 95).

    Returns
    -------
    tuple
        (bootstrap_mean, (lower_bound, upper_bound))
    """
    bootstrapped_means = []
    for _ in range(n_iterations):
        sample = data.sample(frac=1, replace=True)
        bootstrapped_means.append(sample[metric].mean())
    lower = np.percentile(bootstrapped_means, (100 - ci) / 2)
    upper = np.percentile(bootstrapped_means, 100 - (100 - ci) / 2)
    return np.mean(bootstrapped_means), (lower, upper)

# Define metrics to analyze
metrics = ['CumulativeEconomicIndex', 'Hospitalize', 'Deceased']

# Dictionary to store results, keyed by (city, compliance, order_vac, metric)
results = {}

# Run bootstrap across all group combinations
for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        for order_vac in data['OrderVac'].unique():
            filtered_data = data[
                (data['City'] == city) &
                (data['Compliance'] == compliance) &
                (data['OrderVac'] == order_vac)
            ]
            # Skip combinations that do not occur in the data
            if filtered_data.empty:
                continue
            print(f"\nResults for City: {city}, Compliance: {compliance}, OrderVac: {order_vac}")
            for metric in metrics:
                mean, ci = bootstrap_ci(filtered_data, metric)
                results[(city, compliance, order_vac, metric)] = (mean, ci)
                print(f"{metric} - Mean: {mean:.2f}, CI: {ci}")
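
Because `DataFrame.sample` is called without a seed, the bootstrap intervals above change slightly from run to run. A minimal sketch of a seeded, vectorized percentile bootstrap is shown below. The function name `bootstrap_mean_ci`, the `seed` parameter, and the toy values are assumptions for illustration and are not part of the original analysis.

```python
import numpy as np

def bootstrap_mean_ci(values, n_iterations=1000, ci=95, seed=0):
    """Percentile bootstrap for the mean with a fixed random seed (illustrative helper)."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Draw all resamples at once: one row of indices per bootstrap iteration
    idx = rng.integers(0, values.size, size=(n_iterations, values.size))
    means = values[idx].mean(axis=1)
    lower, upper = np.percentile(means, [(100 - ci) / 2, 100 - (100 - ci) / 2])
    return means.mean(), (lower, upper)

# Hypothetical outcome values for one subgroup
toy_values = np.array([236.0, 251.0, 229.0, 244.0, 240.0, 260.0, 225.0, 247.0])

first = bootstrap_mean_ci(toy_values, seed=42)
second = bootstrap_mean_ci(toy_values, seed=42)
print(first)
```

Calling the function twice with the same seed returns identical results, which makes the reported intervals reproducible.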

In [None]:
# Hard-coded summary values for plotting.
# Each tuple is (mean, lower bound, upper bound).
detailed_data = {
    "City": [
        "Bene Beraq", "Bene Beraq", "Bene Beraq", "Bene Beraq", "Bene Beraq", "Bene Beraq",
        "Holon", "Holon", "Holon", "Holon", "Holon", "Holon"
    ],
    "Compliance": [0.35, 0.35, 0.75, 0.75, 1.00, 1.00, 0.35, 0.35, 0.75, 0.75, 1.00, 1.00],
    "Order": [
        "ASCENDING", "DESCENDING", "ASCENDING", "DESCENDING", "ASCENDING", "DESCENDING",
        "ASCENDING", "DESCENDING", "ASCENDING", "DESCENDING", "ASCENDING", "DESCENDING"
    ],
    "Economic_CI": [
        (1.929e9, 1.893e9, 1.965e9), (1.841e9, 1.783e9, 1.899e9),
        (2.157e9, 2.074e9, 2.224e9), (1.879e9, 1.827e9, 1.931e9),
        (2.430e9, 2.318e9, 2.542e9), (2.036e9, 1.953e9, 2.118e9),
        (1.831e9, 1.783e9, 1.880e9), (2.142e9, 2.056e9, 2.228e9),
        (2.018e9, 1.927e9, 2.110e9), (2.549e9, 2.409e9, 2.689e9),
        (2.222e9, 2.117e9, 2.327e9), (2.782e9, 2.627e9, 2.937e9)
    ],
    "Hospitalize_CI": [
        (496.12, 465.56, 526.68), (403.92, 370.95, 436.89),
        (350.70, 306.65, 394.75), (276.62, 248.83, 304.41),
        (236.12, 192.59, 279.65), (228.68, 196.84, 260.52),
        (1382.94, 1329.81, 1436.07), (1025.48, 968.70, 1082.26),
        (848.02, 736.89, 959.15), (626.20, 410.22, 842.18),
        (699.64, 577.92, 821.36), (385.54, 275.81, 495.27)
    ],
    "Deceased_CI": [
        (179.68, 154.64, 204.72), (184.22, 153.66, 214.78),
        (152.24, 131.05, 173.43), (151.78, 124.28, 179.28),
        (126.22, 89.80, 162.64), (141.98, 110.34, 173.62),
        (451.96, 392.81, 511.11), (344.38, 299.19, 389.57),
        (273.10, 222.39, 323.81), (170.30, 126.56, 214.04),
        (236.16, 186.13, 286.19), (137.44, 87.19, 187.69)
    ]
}

df_detailed = pd.DataFrame(detailed_data)


fig, axs = plt.subplots(1, 3, figsize=(25, 8))
metrics = ["Economic_CI", "Hospitalize_CI", "Deceased_CI"]
titles = [
    "(A) Economic Function",
    "(B) Hospitalized",
    "(C) Deceased"
]

for i, metric in enumerate(metrics):
    ax = axs[i]
    for city in df_detailed["City"].unique():
        for order in df_detailed["Order"].unique():
            subset = df_detailed[(df_detailed["City"] == city) & (df_detailed["Order"] == order)]
            x = subset["Compliance"]
            y = [ci[0] for ci in subset[metric]]
            y_lower = [ci[1] for ci in subset[metric]]
            y_upper = [ci[2] for ci in subset[metric]]


            ax.errorbar(
                x, y,
                yerr=[np.array(y) - np.array(y_lower), np.array(y_upper) - np.array(y)],
                fmt='-o',
                label=f"{city} ({order})",
                capsize=4,
                alpha=0.8,
                markersize=8
            )

    ax.set_title(titles[i], fontsize=20, fontweight="bold", loc='left')
    ax.set_xlabel("Compliance Level", fontsize=17)
    ax.set_ylabel(titles[i].split(" ", 1)[1], fontsize=17)
    ax.set_xticks([0.35, 0.75, 1.0])
    ax.grid(axis='y', linestyle='--', alpha=0.7)
    ax.legend(fontsize=17)
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)

fig.suptitle("Comparison of Key Metrics by Compliance Level, City, and Vaccination Order", fontsize=16, weight="bold")
plt.tight_layout(rect=[0, 0, 1, 0.95])
plt.show()

Supplementary Visualizations

In [None]:
data.columns = data.columns.str.strip().str.replace(r'\s+', ' ', regex=True)


action_columns = [
    "Length of action 0 (days)",
    "Length of action 1 (days)",
    "Length of action 2. (days)",
    "Length of action 3 (days)",
    "Length of action 4 (days)",
    "Length of action 5 (days)",
    "Length of action 6 (days)",
    "Length of action 7 (days)",
    "Length of action 8 (days)",
    "Length of action 9 (days)",
    "Length of action 10 (days)"
]

try:
    action_summary = data.groupby(['City', 'Compliance', 'OrderVac'])[action_columns].sum().reset_index()


    print(action_summary)


    for action in action_columns:
        plt.figure(figsize=(12, 6))
        sns.barplot(
            data=action_summary,
            x="Compliance",
            y=action,
            hue="City",
            ci=None
        )
        plt.title(f"Total Days of {action} by Compliance and City")
        plt.ylabel("Total Days")
        plt.xlabel("Compliance Level")
        plt.legend(title="City")
        plt.show()


    for action in action_columns:
        plt.figure(figsize=(12, 6))
        sns.barplot(
            data=action_summary,
            x="OrderVac",
            y=action,
            hue="City",
            ci=None
        )
        plt.title(f"Total Days of {action} by Vaccination Strategy and City")
        plt.ylabel("Total Days")
        plt.xlabel("Vaccination Strategy")
        plt.legend(title="City")
        plt.show()

except KeyError as e:
    print(f"KeyError: {e}")
    print("Please verify that the column names in your data match exactly.")


In [None]:
# Mean Rt for each City, Compliance, and OrderVac group
rt_summary = data.groupby(['City', 'Compliance', 'OrderVac'])['Rt'].mean().reset_index()

# Display the summary table
print(rt_summary)



In [None]:
action_columns = [col for col in data.columns if 'Length of action' in col]
action_summary = data.groupby(['City', 'Compliance', 'OrderVac'])[action_columns].sum().reset_index()


action_summary_long = action_summary.melt(
    id_vars=['City', 'Compliance', 'OrderVac'],
    var_name='Action',
    value_name='Days'
)


In [None]:
# List to store t-test results
t_test_results = []

# Compliance levels to compare pairwise
compliance_levels = [0.35, 0.75, 1.0]

# Metrics: cumulative economic index, hospitalizations, and deaths
for metric in ['CumulativeEconomicIndex', 'Hospitalize', 'Deceased']:
    # Compare every pair of compliance levels
    for i in range(len(compliance_levels)):
        for j in range(i + 1, len(compliance_levels)):
            compliance_a = compliance_levels[i]
            compliance_b = compliance_levels[j]


            group_a = data[data['Compliance'] == compliance_a][metric]
            group_b = data[data['Compliance'] == compliance_b][metric]


            t_stat, p_value = ttest_ind(group_a, group_b)


            t_test_results.append({
                'Metric': metric,
                'Compliance A': compliance_a,
                'Compliance B': compliance_b,
                'T-test p-value': p_value
            })


t_test_results_df = pd.DataFrame(t_test_results)


print("T-test Results for Compliance Levels:")
print(t_test_results_df)
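
Three pairwise comparisons are run for each metric, and the p-values printed above are not adjusted for multiple testing. One common adjustment is the Holm step-down procedure. The sketch below is illustrative and is not part of the published analysis; the helper name `holm_adjust` and the example p-values are assumptions.

```python
import numpy as np

def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order (hypothetical helper)."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)  # indices from smallest to largest p-value
    # Multiply the k-th smallest p-value by (m - k), for k = 0..m-1
    scaled = p[order] * (m - np.arange(m))
    # Enforce monotonicity along the sorted sequence and cap at 1
    adjusted_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted

# Made-up p-values for three pairwise comparisons
raw = [0.010, 0.040, 0.030]
print(holm_adjust(raw))
```

To apply this to the table above, one option would be to adjust the `T-test p-value` column separately within each metric.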


In [None]:
# Create a list to store T-test results comparing vaccine strategies
t_test_results_vaccine_strategy = []

# Loop to compute T-tests for each city, compliance level, and selected metric
for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        for metric in ['Hospitalize', 'Deceased']:
            # Filter data for group A: 'ASCENDING' vaccination strategy
            group_a = data[(data['City'] == city) &
                           (data['Compliance'] == compliance) &
                           (data['OrderVac'] == 'ASCENDING')][metric]

            # Filter data for group B: 'DESCENDING' vaccination strategy
            group_b = data[(data['City'] == city) &
                           (data['Compliance'] == compliance) &
                           (data['OrderVac'] == 'DESCENDING')][metric]

            # Perform independent two-sample T-test between the two strategies
            t_stat, p_value = ttest_ind(group_a, group_b)

            # Append test results to the output list
            t_test_results_vaccine_strategy.append({
                'City': city,
                'Compliance': compliance,
                'Metric': metric,
                'Strategy A': 'ASCENDING',
                'Strategy B': 'DESCENDING',
                'T-test p-value': p_value
            })

# Convert the list of T-test results into a structured DataFrame
t_test_results_vaccine_strategy_df = pd.DataFrame(t_test_results_vaccine_strategy)

# Display the results table
print("T-test Results for Vaccine Strategies:")
print(t_test_results_vaccine_strategy_df)
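
The independent t-test assumes roughly normal group means. As a distribution-free cross-check, a permutation test on the difference in means could be run for the same pairs of groups. This is a swapped-in technique shown only as a sketch, not the method used in the paper; the function name `permutation_test_mean_diff` and the toy groups are assumptions.

```python
import numpy as np

def permutation_test_mean_diff(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for a difference in means (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    rng = np.random.default_rng(seed)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled observations to two groups of the original sizes
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:len(a)].mean() - shuffled[len(a):].mean())
        if diff >= observed:
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive
    return (count + 1) / (n_permutations + 1)

# Toy groups, not simulation output
group_asc = [400.0, 420.0, 390.0, 410.0, 405.0]
group_desc = [300.0, 310.0, 295.0, 305.0, 315.0]
print(permutation_test_mean_diff(group_asc, group_desc, seed=42))
```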

In [None]:
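# Reset the results list so that re-running this comparison does not duplicate rows
t_test_results_vaccine_strategy = []

# Loop to compute T-tests comparing vaccination strategies for each city, compliance level, and metric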
for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        for metric in ['Hospitalize', 'Deceased']:
            # Filter data for 'ASCENDING' vaccination order (Group A)
            group_a = data[(data['City'] == city) &
                           (data['Compliance'] == compliance) &
                           (data['OrderVac'] == 'ASCENDING')][metric]

            # Filter data for 'DESCENDING' vaccination order (Group B)
            group_b = data[(data['City'] == city) &
                           (data['Compliance'] == compliance) &
                           (data['OrderVac'] == 'DESCENDING')][metric]

            # Perform independent two-sample T-test between Group A and Group B
            t_stat, p_value = ttest_ind(group_a, group_b)

            # Store the T-test results with metadata
            t_test_results_vaccine_strategy.append({
                'City': city,
                'Compliance': compliance,
                'Metric': metric,
                'Strategy A': 'ASCENDING',
                'Strategy B': 'DESCENDING',
                'T-test p-value': p_value
            })

# Convert the list of T-test result dictionaries to a DataFrame
t_test_results_vaccine_strategy_df = pd.DataFrame(t_test_results_vaccine_strategy)

# Display the results table
print("T-test Results for Vaccine Strategies:")
print(t_test_results_vaccine_strategy_df)

In [None]:
# Set random seed to ensure reproducible results
np.random.seed(42)

# Function to compute bootstrap mean and confidence interval
def bootstrap_ci(data, metric, n_iterations=1000, ci=97):
    """
    Estimate the mean and confidence interval for a given metric using bootstrap resampling.

    Parameters
    ----------
    data : pd.DataFrame
        The input dataset.
    metric : str
        The name of the column for which the mean and CI will be computed.
    n_iterations : int, optional
        Number of bootstrap iterations. Default is 1000.
    ci : float, optional
        Confidence interval level (percentage). Default is 97.

    Returns
    -------
    tuple
        A tuple containing:
        - The bootstrap mean (float)
        - A tuple of (lower bound, upper bound) for the confidence interval
    """
    bootstrapped_means = []
    for _ in range(n_iterations):
        sample = data.sample(frac=1, replace=True)
        bootstrapped_means.append(sample[metric].mean())
    lower = np.percentile(bootstrapped_means, (100 - ci) / 2)
    upper = np.percentile(bootstrapped_means, 100 - (100 - ci) / 2)
    return np.mean(bootstrapped_means), (lower, upper)

# List of metrics to analyze
metrics = ['CumulativeEconomicIndex', 'Hospitalize', 'Deceased']

# Dictionary to store bootstrap results
results = {}

# Loop over all cities, compliance levels, and vaccination strategies
for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        for order_vac in data['OrderVac'].unique():
            # Filter data for the specific group
            filtered_data = data[
                (data['City'] == city) &
                (data['Compliance'] == compliance) &
                (data['OrderVac'] == order_vac)
            ]
            print(f"\nResults for City: {city}, Compliance: {compliance}, OrderVac: {order_vac}")

            # Compute bootstrap estimates for each metric
            for metric in metrics:
                mean, ci = bootstrap_ci(filtered_data, metric, n_iterations=5000)
                print(f"{metric} - Mean: {mean:.2f}, CI: {ci}")

            # Analyze and display top 11 interventions by average duration
            action_columns = [col for col in data.columns if 'Length of action' in col]
            action_means = filtered_data[action_columns].mean()
            top_actions = action_means.sort_values(ascending=False).head(11)
            print("Top 11 Actions:")
            for action, avg_days in top_actions.items():
                print(f"{action}: {avg_days:.2f} average days")

In [None]:
# Bootstrap function to compute mean and confidence interval
def bootstrap_action_ci(data, column, n_iterations=1000, ci=95):
    boot_means = []
    for _ in range(n_iterations):
        sample = data.sample(frac=1, replace=True)  # Sampling with replacement
        boot_means.append(sample[column].mean())  # Calculate sample mean
    lower = np.percentile(boot_means, (100 - ci) / 2)
    upper = np.percentile(boot_means, 100 - (100 - ci) / 2)
    return np.mean(boot_means), (lower, upper)

# Step 1: Build the original dataset
df_original = pd.DataFrame({
    'City': ['Bene Beraq'] * 6 + ['Holon'] * 6,
    'Compliance': [0.35, 0.35, 0.75, 0.75, 1.00, 1.00] * 2,
    'OrderVac': ['ASCENDING', 'DESCENDING'] * 6,
    'Length of action 0 (days)': [270, 330, 390, 480, 780, 570, 360, 330, 450, 390, 1410, 690],
    'Length of action 1 (days)': [690, 900, 540, 1050, 450, 630, 450, 480, 270, 390, 360, 240],
    'Length of action 2. (days)': [360, 480, 390, 330, 240, 210, 450, 510, 390, 150, 330, 420],
    'Length of action 3 (days)': [480, 660, 690, 510, 540, 780, 270, 330, 360, 360, 360, 120],
    'Length of action 4 (days)': [480, 120, 870, 630, 1200, 1200, 1170, 1210, 1260, 2610, 1320, 2970],
    'Length of action 5 (days)': [600, 960, 540, 750, 930, 570, 390, 300, 390, 330, 960, 510],
    'Length of action 6 (days)': [960, 540, 750, 480, 1050, 480, 1620, 1290, 1620, 1170, 720, 630],
    'Length of action 7 (days)': [420, 630, 540, 690, 600, 540, 210, 420, 480, 450, 540, 510],
    'Length of action 8 (days)': [1560, 1110, 1410, 870, 480, 930, 540, 1050, 450, 480, 540, 480],
    'Length of action 9 (days)': [540, 390, 360, 360, 450, 330, 180, 480, 360, 330, 330, 390],
    'Length of action 10 (days)': [840, 1080, 720, 1050, 480, 960, 1560, 780, 1200, 510, 330, 240],
})

# Step 2: Define list of action columns
action_cols = [col for col in df_original.columns if col.startswith("Length of action")]

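# Step 3: Apply bootstrap to each (City, Compliance, OrderVac) group and each action
#         and calculate average days per simulation (divided by 60).
#         Note: df_original holds one aggregated row per group, so every resample
#         returns that same row and the CI bounds equal the mean.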
bootstrap_action_results = []
for (city, compliance, order_vac), group in df_original.groupby(['City', 'Compliance', 'OrderVac']):
    for action_col in action_cols:
        mean_val, (ci_lower, ci_upper) = bootstrap_action_ci(group, action_col)
        bootstrap_action_results.append({
            'City': city,
            'Compliance': compliance,
            'OrderVac': order_vac,
            'Action': action_col,
            'Avg Days per Sim': round(mean_val / 60, 2),
            '95% CI Lower': round(ci_lower / 60, 2),
            '95% CI Upper': round(ci_upper / 60, 2),
        })

# Step 4: Create summary DataFrame
df_bootstrap_actions_scaled = pd.DataFrame(bootstrap_action_results)

# Step 5: Display the result
print(df_bootstrap_actions_scaled)

Supplementary Visualizations

In [None]:
# Bootstrap function
def bootstrap_ci(data, metric, n_iterations=1000, ci=96):
    np.random.seed(42)  # Ensure reproducibility
    bootstrapped_means = []
    for _ in range(n_iterations):
        sample = data.sample(frac=1, replace=True)  # Sampling with replacement
        bootstrapped_means.append(sample[metric].mean())
    lower = np.percentile(bootstrapped_means, (100 - ci) / 2)
    upper = np.percentile(bootstrapped_means, 100 - (100 - ci) / 2)
    return bootstrapped_means, np.mean(bootstrapped_means), (lower, upper)

# Example of bootstrapping and visualization
for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        for order_vac in data['OrderVac'].unique():
            filtered_data = data[
                (data['City'] == city) &
                (data['Compliance'] == compliance) &
                (data['OrderVac'] == order_vac)
            ]
            print(f"\nResults for City: {city}, Compliance: {compliance}, OrderVac: {order_vac}")
            for metric in ['CumulativeEconomicIndex', 'Hospitalize', 'Deceased']:
                bootstrapped_means, mean, ci = bootstrap_ci(filtered_data, metric)
                print(f"{metric} - Mean: {mean:.2f}, CI: {ci}")

                # Visualization
                plt.figure(figsize=(10, 6))
                sns.histplot(bootstrapped_means, kde=True, bins=50, color='blue', alpha=0.7)
                plt.axvline(ci[0], color='red', linestyle='--', label=f'Lower CI ({ci[0]:.2f})')
                plt.axvline(ci[1], color='green', linestyle='--', label=f'Upper CI ({ci[1]:.2f})')
                plt.axvline(mean, color='black', linestyle='-', label=f'Mean ({mean:.2f})')
                plt.title(f'{metric} Bootstrap Distribution for {city} (Compliance: {compliance}, OrderVac: {order_vac})')
                plt.xlabel(metric)
                plt.ylabel('Frequency')
                plt.legend()
                plt.show()


Supplementary Visualizations

In [None]:
# Set seed for reproducibility
np.random.seed(42)

# Bootstrap function to compute mean and confidence interval,
# with optional adjustment for severe actions based on compliance level
def bootstrap_ci(data, metric, n_iterations=1000, ci=95, severe_actions=None):
    bootstrapped_means = []
    for _ in range(n_iterations):
        sample = data.sample(frac=1, replace=True)  # Sampling with replacement
        sample_means = sample[metric].mean()  # Compute sample mean

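        # If severe actions are defined, adjust mean based on compliance.
        # Note: the branches below compare against the labels 'High' and 'Medium';
        # with numeric compliance values (0.35, 0.75, 1.0) neither branch fires
        # and the mean is left unadjusted.
        if severe_actions: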
            for action in severe_actions:
                # Adjust mean for severe actions: reduce days based on compliance level
                compliance_level = sample['Compliance'].iloc[0]  # Can be adjusted to use mean or group logic
                if compliance_level == 'High':
                    sample_means *= 0.8  # Reduce by 20% for high compliance
                elif compliance_level == 'Medium':
                    sample_means *= 0.9  # Reduce by 10% for medium compliance

        bootstrapped_means.append(sample_means)  # Add sample mean to list
    lower = np.percentile(bootstrapped_means, (100 - ci) / 2)
    upper = np.percentile(bootstrapped_means, 100 - (100 - ci) / 2)
    return np.mean(bootstrapped_means), (lower, upper)

# List of metrics to evaluate
metrics = ['CumulativeEconomicIndex', 'Hospitalize', 'Deceased', 'Rt']

# Mapping of action column names to descriptive labels
action_names = {
    "Length of action 0 (days)": "Action 0: No intervention",
    "Length of action 1 (days)": "Action 1: house_interventions + General Lockdown",
    "Length of action 2. (days)": "Action 2: global_interventions + City Lockdown",
    "Length of action 3 (days)": "Action 3: Isolation of symptomatic individuals + global_interventions",
    "Length of action 4 (days)": "Action 4: Social distancing",
    "Length of action 5 (days)": "Action 5: School closure + house_interventions",
    "Length of action 6 (days)": "Action 6: Workplace closure + house_interventions",
    "Length of action 7 (days)": "Action 7: Elderly distancing + house_interventions",
    "Length of action 8 (days)": "Action 8: Household isolation",
    "Length of action 9 (days)": "Action 9: City Lockdown",
    "Length of action 10 (days)": "Action 10: General Lockdown"
}

# Lists of severe and less severe interventions
severe_actions = [
    "Action 1: house_interventions + General Lockdown",
    "Action 2: global_interventions + City Lockdown",
    "Action 6: Workplace closure + house_interventions",
    "Action 9: City Lockdown",
    "Action 10: General Lockdown"
]

less_severe_actions = [
    "Action 0: No intervention",
    "Action 3: Isolation of symptomatic individuals + global_interventions",
    "Action 4: Social distancing",
    "Action 5: School closure + house_interventions",
    "Action 7: Elderly distancing + house_interventions",
    "Action 8: Household isolation"
]

# Loop over each city, compliance level, and vaccination strategy
for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        for order_vac in data['OrderVac'].unique():
            filtered_data = data[
                (data['City'] == city) &
                (data['Compliance'] == compliance) &
                (data['OrderVac'] == order_vac)
            ]
            actions_data = filtered_data.filter(like="Length of action").mean()

            # Strip whitespace from column names if needed
            actions_data.index = actions_data.index.str.strip()

            # Map action column names to descriptive labels
            mapped_actions = actions_data.index.map(lambda x: action_names.get(x, x))
            actions_data.index = mapped_actions

            # Sort actions by average duration
            actions_data = actions_data.sort_values(ascending=False)

            # Assign colors based on severity
            colors = [
                'red' if action in severe_actions else 'green'
                for action in actions_data.index
            ]

            # Perform bootstrap for each metric
            for metric in metrics:
                mean, ci_range = bootstrap_ci(filtered_data, metric, n_iterations=1000, ci=95, severe_actions=severe_actions)
                print(f"{metric} - City: {city}, Compliance: {compliance}, OrderVac: {order_vac}")
                print(f"Bootstrap Mean: {mean}, Confidence Interval: {ci_range}")

            # Debug print of mapped actions and their colors
            print("Mapped Actions (after mapping):", actions_data.index.tolist())
            print("Colors:", colors)

            # Create bar chart
            plt.figure(figsize=(10, 6))
            actions_data.plot(kind='bar', color=colors)
            plt.title(f"Actions Distribution - City: {city}, Compliance: {compliance}, OrderVac: {order_vac}")
            plt.ylabel("Average Days")
            plt.xlabel("Actions")
            plt.xticks(rotation=45, ha='right')
            plt.legend(handles=[
                plt.Line2D([0], [0], color='red', lw=4, label='Severe Actions'),
                plt.Line2D([0], [0], color='green', lw=4, label='Less Severe Actions')
            ], loc='upper right')
            plt.tight_layout()
            plt.show()

Supplementary Visualizations

In [None]:
def bootstrap_ci(data, metric, n_iterations=1000, ci=97):
    bootstrapped_means = []
    for _ in range(n_iterations):
        sample = data.sample(frac=1, replace=True)
        bootstrapped_means.append(sample[metric].mean())
    lower = np.percentile(bootstrapped_means, (100 - ci) / 2)
    upper = np.percentile(bootstrapped_means, 100 - (100 - ci) / 2)
    return np.mean(bootstrapped_means), (lower, upper)


metrics = ['CumulativeEconomicIndex', 'Hospitalize', 'Deceased', 'Rt']


results = {}


action_names = {
    "Length of action 0 (days)": "Action 0: No intervention",
    "Length of action 1 (days)": "Action 1: house_interventions + General Lockdown",
    "Length of action 2. (days)": "Action 2: global_interventions + City Lockdown",
    "Length of action 3 (days)": "Action 3: Isolation of symptomatic individuals + global_interventions",
    "Length of action 4 (days)": "Action 4: Social distancing",
    "Length of action 5 (days)": "Action 5: School closure + house_interventions",
    "Length of action 6 (days)": "Action 6: Workplace closure + house_interventions",
    "Length of action 7 (days)": "Action 7: Elderly distancing + house_interventions",
    "Length of action 8 (days)": "Action 8: Household isolation",
    "Length of action 9 (days)": "Action 9: City Lockdown",
    "Length of action 10 (days)": "Action 10: General Lockdown"
}

severe_actions = [
    "Action 1: house_interventions + General Lockdown",
    "Action 2: global_interventions + City Lockdown",
    "Action 6: Workplace closure + house_interventions",
    "Action 9: City Lockdown",
    "Action 10: General Lockdown"
]

less_severe_actions = [
    "Action 0: No intervention",
    "Action 3: Isolation of symptomatic individuals + global_interventions",
    "Action 4: Social distancing",
    "Action 5: School closure + house_interventions",
    "Action 7: Elderly distancing + house_interventions",
    "Action 8: Household isolation"
]


for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        for order_vac in data['OrderVac'].unique():
            filtered_data = data[
                (data['City'] == city) &
                (data['Compliance'] == compliance) &
                (data['OrderVac'] == order_vac)
            ]
            print(f"\nResults for City: {city}, Compliance: {compliance}, OrderVac: {order_vac}")
            city_results = {}

            for metric in metrics:
                mean_val, ci_val = bootstrap_ci(filtered_data, metric)
                city_results[metric] = {'Mean': mean_val, 'CI': ci_val}
                print(f"{metric} - Mean: {mean_val:.2f}, CI: {ci_val}")


            top_actions = filtered_data.filter(like="Length of action").mean().sort_values(ascending=False).head(11)
            city_results['Top Actions'] = top_actions
            print("Top 11 Actions:")
            for action, days in top_actions.items():
                print(f"{action}: {days:.2f} average days")

            results[(city, compliance, order_vac)] = city_results



sns.set_style("whitegrid")


action_names_graph = {
    0: "No intervention",
    1: "House interventions + General Lockdown",
    2: "Global interventions + City Lockdown",
    3: "Isolation of symptomatic individuals + Global interventions",
    4: "Social distancing",
    5: "School closure + House interventions",
    6: "Workplace closure + House interventions",
    7: "Elderly distancing + House interventions",
    8: "Household isolation",
    9: "City Lockdown",
    10: "General Lockdown"
}


severe_actions_indices = [1, 2, 6, 9, 10]
less_severe_actions_indices = [0, 3, 4, 5, 7, 8]

if isinstance(data, list):
    data = pd.DataFrame(data)


fig = plt.figure(figsize=(20, 10))
gs = GridSpec(3, 5, width_ratios=[1, 1, 1, 1, 0.6])

plot_index = 0
compliance_to_row = {0.35: 0, 0.75: 1, 1: 2}
compliance_counter = {0.35: 0, 0.75: 0, 1: 0}



# 12 graphs: one per (City, Compliance, OrderVac) combination
for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        for order_vac in data['OrderVac'].unique():
            if plot_index >= 12:
                break

            # Skip combinations without bootstrap results
            if (city, compliance, order_vac) not in results:
                continue


            raw_actions_data = results[(city, compliance, order_vac)]['Top Actions']
            actions_data = raw_actions_data / 120
            if actions_data.empty:
                continue


            actions_data.index = actions_data.index.str.extract(r'(\d+)').astype(int).values.flatten()
            actions_data = actions_data.sort_values(ascending=False)


            row = compliance_to_row[compliance]
            col = compliance_counter[compliance]
            ax = fig.add_subplot(gs[row, col])
            compliance_counter[compliance] += 1


            # Color by action index; a lookup through action_names would miss the
            # "Length of action 2. (days)" key and color action 2 as less severe
            colors = ['red' if action in severe_actions_indices else 'green' for action in actions_data.index]
            ax.bar(actions_data.index, actions_data.values, color=colors, alpha=0.8)


            severe_days = sum(
                raw_actions_data[action]
                for action in raw_actions_data.index
                if action_names.get(action, "") in severe_actions
            )
            total_days = raw_actions_data.sum()
            severe_percentage = (severe_days / total_days) * 100 if total_days > 0 else 0


            ax.text(0.05, 0.95, f"Severe: {severe_percentage:.1f}%", transform=ax.transAxes, fontsize=12,
                    ha="left", va="top", bbox=dict(facecolor='white', alpha=0.6, pad=0.3))

            raw_text_lines = []
            for action, raw_val in raw_actions_data.head(4).items():

                match = re.search(r'(\d+)', action)
                if match:
                    action_number = match.group(1)
                    raw_text_lines.append(f"{action_number}: {raw_val:.2f}")
            raw_text = "\n".join(raw_text_lines)
            ax.text(0.95, 0.95, raw_text, transform=ax.transAxes, fontsize=8,
                    ha="right", va="top", bbox=dict(facecolor='white', alpha=0.5, pad=0.3))


            ax.set_title(f"City: {city}, Compliance: {compliance}\nOrderVac: {order_vac}", fontsize=11)
            ax.set_ylabel("Avg Days (Normalized)")
            ax.set_xticks(actions_data.index)
            ax.set_xticklabels(actions_data.index, fontsize=10, rotation=0)
            ax.set_ylim(1e-3, 1)
            ax.set_yscale('log')

            plot_index += 1


legend_ax = fig.add_subplot(gs[:, 4])
legend_ax.axis("off")
legend_ax.text(0.01, 0.95, "Action List (Severe-Red, Less Severe-Green)", fontsize=14, fontweight="bold", ha="left")


y_pos = 0.9
for action, name in action_names.items():
    color = 'red' if name in severe_actions else 'green'
    legend_ax.text(0, y_pos, f"{name}", fontsize=12, color=color, ha="left")
    y_pos -= 0.08


plt.subplots_adjust(wspace=0.4, hspace=0.5, right=0.85)
plt.show()


summary_data = []
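# Use a separate name for each group's results so the `metrics` list is not overwritten
for (city, compliance, order_vac), group_results in results.items():
    row = {
        'City': city,
        'Compliance': compliance,
        'OrderVac': order_vac,
    }
    for metric, values in group_results.items():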
        if metric == 'Top Actions':
            for action, days in values.items():
                row[action] = days
        else:
            row[f"{metric} Mean"] = values['Mean']
            row[f"{metric} CI Lower"] = values['CI'][0]
            row[f"{metric} CI Upper"] = values['CI'][1]
    summary_data.append(row)

summary_df = pd.DataFrame(summary_data)


print("\nSummary Results (Bootstrap with Rt):")
print(summary_df)


# 1. Overlay
plt.figure(figsize=(14, 8))
for metric in metrics:
    if f"{metric} Mean" in summary_df.columns:  # Verify that the metric exists in the columns
        sns.kdeplot(data=summary_df, x=f"{metric} Mean", label=metric, fill=True)
plt.axvline(0, color='black', linestyle='--')
plt.title("Overlay of Metric Distributions")
plt.legend()
plt.show()

# 2. Heatmap
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Pivot summary_df into a (City, Compliance) x OrderVac table
heatmap_data = summary_df.pivot_table(
    index=['City', 'Compliance'],
    columns='OrderVac',
    values='CumulativeEconomicIndex Mean'
)


def abbreviate_number(x):
    if abs(x) >= 1e9:
        return f"{x/1e9:.2f}B"
    elif abs(x) >= 1e6:
        return f"{x/1e6:.2f}M"
    elif abs(x) >= 1e3:
        return f"{x/1e3:.2f}K"
    else:
        return f"{x:.2f}"


annot_data = heatmap_data.applymap(abbreviate_number)

plt.figure(figsize=(11, 9))
sns.heatmap(
    heatmap_data,
    annot=annot_data,
    fmt="",
    cmap="YlOrRd",
    robust=True,
    linewidths=1,
    linecolor='black',
    cbar_kws={'shrink': 0.9}
)
plt.title("Heatmap of Economic Value by City/Compliance/Order Vaccination")
plt.show()


# 3. Rt distribution
plt.figure(figsize=(10, 6))
if 'Rt Mean' in summary_df.columns:  # Verify that the metric exists in the columns
    sns.histplot(summary_df['Rt Mean'], kde=True, color="purple")
    plt.axvline(summary_df['Rt Mean'].mean(), color='green', linestyle='--', label='Mean Rt')
    plt.title("Distribution of Rt Means")
    plt.legend()
    plt.show()

# 4. CI
for metric in metrics:
    if f"{metric} CI Lower" in summary_df.columns and f"{metric} CI Upper" in summary_df.columns:
        plt.figure(figsize=(10, 6))
        sns.scatterplot(data=summary_df, x='City', y=f"{metric} Mean", hue='Compliance', style='OrderVac', s=100)
        plt.errorbar(summary_df['City'], summary_df[f"{metric} Mean"],
                     yerr=[summary_df[f"{metric} Mean"] - summary_df[f"{metric} CI Lower"],
                           summary_df[f"{metric} CI Upper"] - summary_df[f"{metric} Mean"]],
                     fmt='o', ecolor='gray', capsize=5)
        plt.title(f"{metric} Mean and Confidence Intervals by City")
        plt.show()

# 5. Actions Graph
for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        for order_vac in data['OrderVac'].unique():
            filtered_data = data[
                (data['City'] == city) &
                (data['Compliance'] == compliance) &
                (data['OrderVac'] == order_vac)
            ]
            actions_data = filtered_data.filter(like="Length of action").mean()


            actions_data.index = actions_data.index.str.strip()

            # Map action column names to descriptive labels
            mapped_actions = actions_data.index.map(lambda x: action_names.get(x, x))
            actions_data.index = mapped_actions


            actions_data = actions_data.sort_values(ascending=False)


            colors = [
                'red' if action in severe_actions else 'green'
                for action in actions_data.index
            ]


            print("Mapped Actions (after mapping):", actions_data.index.tolist())
            print("Colors:", colors)


            plt.figure(figsize=(10, 6))
            actions_data.plot(kind='bar', color=colors)
            plt.title(f"Actions Distribution - City: {city}, Compliance: {compliance}, OrderVac: {order_vac}")
            plt.ylabel("Average Days")
            plt.xlabel("Actions")
            plt.xticks(rotation=45, ha='right')
            plt.legend(handles=[
                plt.Line2D([0], [0], color='red', lw=4, label='Severe Actions'),
                plt.Line2D([0], [0], color='green', lw=4, label='Less Severe Actions')
            ], loc='upper right')
            plt.tight_layout()
            plt.show()

# Reshape the action-duration columns to long format for the faceted charts
actions_long = summary_df.melt(
    id_vars=['City', 'Compliance', 'OrderVac'],
    value_vars=[col for col in summary_df.columns if 'Length of action' in col],
    var_name='Action',
    value_name='Average Days'
)





action_name_mapping = {
    "Length of action 0 (days)": "Action 0: No intervention",
    "Length of action 1 (days)": "Action 1: house_interventions + General Lockdown",
    "Length of action 2. (days)": "Action 2: global_interventions + City Lockdown",
    "Length of action 3 (days)": "Action 3: Isolation of symptomatic individuals + global_interventions",
    "Length of action 4 (days)": "Action 4: Social distancing",
    "Length of action 5 (days)": "Action 5: School closure + house_interventions",
    "Length of action 6 (days)": "Action 6: Workplace closure + house_interventions",
    "Length of action 7 (days)": "Action 7: Elderly distancing + house_interventions",
    "Length of action 8 (days)": "Action 8: Household isolation",
    "Length of action 9 (days)": "Action 9: City Lockdown",
    "Length of action 10 (days)": "Action 10: General Lockdown"
}

# Severe Action List
severe_actions = [
    "Action 1: house_interventions + General Lockdown",
    "Action 2: global_interventions + City Lockdown",
    "Action 6: Workplace closure + house_interventions",
    "Action 9: City Lockdown",
    "Action 10: General Lockdown"
]

# One faceted bar chart per city
for city in data['City'].unique():
    city_data = actions_long[actions_long['City'] == city]

    # One panel per vaccination order, bars grouped by compliance level
    g = sns.catplot(
        data=city_data,
        x='Action', y='Average Days', hue='Compliance', col='OrderVac',
        kind='bar', height=6, aspect=1.2, palette='muted'
    )

    g.fig.suptitle(f"Actions by Compliance and Vaccination Order - City: {city}", y=1.05)
    g.set_xticklabels(rotation=45, ha='right')

    # Highlight severe actions by coloring their tick labels red
    for ax in g.axes.flat:
        for label in ax.get_xticklabels():
            action_name = action_name_mapping.get(label.get_text())
            if action_name in severe_actions:
                label.set_color('red')

    g.set_axis_labels("Actions", "Average Days")
    g.tight_layout()
    plt.show()
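
The faceted charts above depend on reshaping the summary table from one column per action into one row per (group, action) pair. The sketch below shows that reshape on a tiny made-up frame. The values and the choice of two action columns are illustrative assumptions, not simulation output, and `toy_summary` is a hypothetical name.

```python
import pandas as pd

# Toy summary table (hypothetical values, for illustration only)
toy_summary = pd.DataFrame({
    "City": ["Holon", "Bene Beraq"],
    "Compliance": [0.35, 0.35],
    "OrderVac": ["ASCENDING", "ASCENDING"],
    "Length of action 0 (days)": [5.0, 4.0],
    "Length of action 4 (days)": [20.0, 8.0],
})

# Select the action-duration columns the same way the notebook does
action_cols = [c for c in toy_summary.columns if "Length of action" in c]

# Wide -> long: each action column becomes rows of (Action, Average Days)
toy_long = toy_summary.melt(
    id_vars=["City", "Compliance", "OrderVac"],
    value_vars=action_cols,
    var_name="Action",
    value_name="Average Days",
)
print(toy_long)
```

Each input row yields one output row per action column, which is the layout `sns.catplot` expects for `x='Action'` and `y='Average Days'`.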





In [None]:
# Bootstrap function to compute mean and confidence interval
def bootstrap_ci(data, metric, n_iterations=1000, ci=97):
    bootstrapped_means = []
    for _ in range(n_iterations):
        sample = data.sample(frac=1, replace=True)  # Resample with replacement
        bootstrapped_means.append(sample[metric].mean())  # Calculate mean for each sample
    lower = np.percentile(bootstrapped_means, (100 - ci) / 2)
    upper = np.percentile(bootstrapped_means, 100 - (100 - ci) / 2)
    return np.mean(bootstrapped_means), (lower, upper)

# List of metrics to analyze
metrics = ['CumulativeEconomicIndex', 'Hospitalize', 'Deceased', 'Rt']

# Dictionary to store bootstrap results
results = {}

# Mapping of action column names to descriptive labels
action_names = {
    "Length of action 0 (days)": "Action 0: No intervention",
    "Length of action 1 (days)": "Action 1: house_interventions + General Lockdown",
    "Length of action 2. (days)": "Action 2: global_interventions + City Lockdown",
    "Length of action 3 (days)": "Action 3: Isolation of symptomatic individuals + global_interventions",
    "Length of action 4 (days)": "Action 4: Social distancing",
    "Length of action 5 (days)": "Action 5: School closure + house_interventions",
    "Length of action 6 (days)": "Action 6: Workplace closure + house_interventions",
    "Length of action 7 (days)": "Action 7: Elderly distancing + house_interventions",
    "Length of action 8 (days)": "Action 8: Household isolation",
    "Length of action 9 (days)": "Action 9: City Lockdown",
    "Length of action 10 (days)": "Action 10: General Lockdown"
}

# Define severe and less severe actions
severe_actions = [
    "Action 1: house_interventions + General Lockdown",
    "Action 2: global_interventions + City Lockdown",
    "Action 6: Workplace closure + house_interventions",
    "Action 9: City Lockdown",
    "Action 10: General Lockdown"
]

less_severe_actions = [
    "Action 0: No intervention",
    "Action 3: Isolation of symptomatic individuals + global_interventions",
    "Action 4: Social distancing",
    "Action 5: School closure + house_interventions",
    "Action 7: Elderly distancing + house_interventions",
    "Action 8: Household isolation"
]

# Perform bootstrap analysis for each city, compliance, and vaccination strategy
for city in data['City'].unique():
    for compliance in data['Compliance'].unique():
        for order_vac in data['OrderVac'].unique():
            filtered_data = data[
                (data['City'] == city) &
                (data['Compliance'] == compliance) &
                (data['OrderVac'] == order_vac)
            ]
            print(f"\nResults for City: {city}, Compliance: {compliance}, OrderVac: {order_vac}")
            city_results = {}

            # Bootstrap each metric
            for metric in metrics:
                mean_val, ci_val = bootstrap_ci(filtered_data, metric)
                city_results[metric] = {'Mean': mean_val, 'CI': ci_val}
                print(f"{metric} - Mean: {mean_val:.2f}, CI: {ci_val}")

            # Compute average action durations and extract top 11
            top_actions = filtered_data.filter(like="Length of action").mean().sort_values(ascending=False).head(11)
            city_results['Top Actions'] = top_actions
            print("Top 11 Actions:")
            for action, days in top_actions.items():
                print(f"{action}: {days:.2f} average days")

            results[(city, compliance, order_vac)] = city_results
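
The percentile bootstrap in `bootstrap_ci` can be sanity-checked on synthetic data. What follows is a minimal sketch of a vectorized NumPy variant, not the function used for the paper's results: the name `bootstrap_ci_np`, the fixed seeds, and the synthetic sample are all assumptions introduced for illustration.

```python
import numpy as np

def bootstrap_ci_np(values, n_iterations=1000, ci=97, seed=0):
    """Percentile bootstrap for the mean of a 1-D sample (illustrative variant)."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Draw all resamples at once: one row of indices per bootstrap iteration
    idx = rng.integers(0, len(values), size=(n_iterations, len(values)))
    boot_means = values[idx].mean(axis=1)
    # Symmetric percentile bounds, e.g. 1.5 and 98.5 for a 97% interval
    alpha = (100 - ci) / 2
    lower, upper = np.percentile(boot_means, [alpha, 100 - alpha])
    return boot_means.mean(), (lower, upper)

# Synthetic sample, for demonstration only
sample = np.random.default_rng(1).normal(loc=10.0, scale=2.0, size=200)
mean_est, (lo, hi) = bootstrap_ci_np(sample)
print(f"mean={mean_est:.2f}, 97% CI=({lo:.2f}, {hi:.2f})")
```

The interval should bracket the sample mean and narrow as the sample size grows. With only a handful of runs per group, percentile intervals can be unstable, so they are best read as rough indications.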

In [None]:
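# Imports used by this cell (harmless if already imported in the setup cell)
from matplotlib.colors import Normalize
from matplotlib.cm import ScalarMappable
from matplotlib.ticker import FuncFormatter

# Pivot table: average economic value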
heatmap_data = summary_df.pivot_table(
    index=['City', 'Compliance'],
    columns='OrderVac',
    values='CumulativeEconomicIndex Mean'
)

# Abbreviate large numbers without currency symbol.
# Billions are printed without a suffix; the colorbar label gives the unit (billions of USD).
def abbreviate_number(x):
    if abs(x) >= 1e9:
        return f"{x/1e9:.2f}"
    elif abs(x) >= 1e6:
        return f"{x/1e6:.2f}M"
    elif abs(x) >= 1e3:
        return f"{x/1e3:.2f}K"
    else:
        return f"{x:.2f}"

# Color normalization based on values
norm = Normalize(vmin=np.nanmin(heatmap_data.values), vmax=np.nanmax(heatmap_data.values))
cmap = plt.get_cmap("YlOrRd")
mappable = ScalarMappable(norm=norm, cmap=cmap)

# Plot heatmap
plt.figure(figsize=(15, 8))
ax = sns.heatmap(
    heatmap_data,
    cmap=cmap,
    linewidths=1,
    linecolor='black',
    cbar_kws={'shrink': 1.0, 'label': 'Cumulative Economic Output (In billions of USD)'},
    xticklabels=True,
    yticklabels=True
)

# Customize colorbar
cbar = ax.collections[0].colorbar
cbar.ax.tick_params(labelsize=17)
cbar.ax.yaxis.label.set_size(17)

# Remove scientific notation (e.g., 1e9) from colorbar
cbar.ax.yaxis.set_major_formatter(FuncFormatter(lambda x, _: f'{x*1e-9:.1f}'))

# Add text labels inside heatmap cells
for i in range(heatmap_data.shape[0]):
    for j in range(heatmap_data.shape[1]):
        value = heatmap_data.iloc[i, j]
        if not np.isnan(value):
            text = abbreviate_number(value)
            bg_color = mappable.to_rgba(value)
            r, g, b, _ = bg_color
            brightness = (r*299 + g*587 + b*114) / 1000
            text_color = 'black' if brightness > 0.5 else 'white'
            ax.text(j + 0.5, i + 0.5, text, ha='center', va='center',
                    fontsize=17, color=text_color, fontweight='bold')

# Preserve axis labels
plt.xlabel("Vaccination Order Strategy", fontsize=17)
plt.ylabel("City and Compliance Level", fontsize=17)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)

# Add (A) and (B) above columns without a bounding box
column_labels = heatmap_data.columns.tolist()
n_cols = len(column_labels)

for idx, col_name in enumerate(column_labels):
    label = ''
    if col_name.lower() == 'ascending':
        label = '(A)'
    elif col_name.lower() == 'descending':
        label = '(B)'

    if label:
        ax.text(
            (idx + 0.5) / n_cols,   # Horizontally centered above each column
            1.05,                   # Slightly above the top of the plot
            label,
            transform=ax.transAxes,
            ha='center', va='bottom',
            fontsize=18, fontweight='bold'
        )

# Final layout adjustment
plt.tight_layout()
plt.show()
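
The annotation loop above picks black or white text from a weighted brightness of the cell's background color (weights 299/587/114 for red, green, and blue). Here is a small standalone sketch of that rule. The helper name `pick_text_color` and the two RGB tuples are illustrative assumptions; the tuples only approximate the light and dark ends of a yellow-to-red colormap.

```python
def pick_text_color(rgb, threshold=0.5):
    """Return 'black' for bright backgrounds and 'white' for dark ones.

    rgb: tuple of three floats in [0, 1].
    """
    r, g, b = rgb
    # Same weighting as in the heatmap cell: green contributes most to brightness
    brightness = (r * 299 + g * 587 + b * 114) / 1000
    return "black" if brightness > threshold else "white"

light_yellow = (1.0, 1.0, 0.8)   # illustrative light background
dark_red = (0.5, 0.0, 0.15)      # illustrative dark background

print(pick_text_color(light_yellow))  # black
print(pick_text_color(dark_red))      # white
```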

In [None]:
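# Imports used by this cell (harmless if already imported in the setup cell)
import matplotlib.patheffects as path_effects
from matplotlib.patches import Patch

# Font configuration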
plt.rcParams['font.family'] = 'DejaVu Sans'

# Dictionaries and mappings
original_to_new = {
    1: 0, 2: 1, 6: 2, 9: 3, 10: 4,
    0: 5, 3: 6, 4: 7, 5: 8, 7: 9, 8: 10
}

new_action_labels = {
    0: "Full Lockdown + House Vaccination",
    1: "City Lockdown + Global Vaccination",
    2: "Workplace + House Vaccination",
    3: "City Lockdown",
    4: "Full Lockdown",
    5: "No Intervention",
    6: "Isolation + Global Vaccination",
    7: "Social Distancing",
    8: "School + House Vaccination",
    9: "Elderly + House Vaccination",
    10: "Household Isolation"
}

severe_actions = [0, 1, 2, 3, 4]

results_data = {
    ('Holon', 0.35): {6: 21.5, 4: 20.17, 8: 17.5, 10: 13, 2: 8.5, 9: 8, 1: 8, 7: 7, 0: 5.5, 3: 5.5, 5: 5},
    ('Holon', 0.75): {4: 43.5, 6: 19.5, 10: 8.5, 8: 8, 7: 7.5, 0: 6.5, 1: 6.5, 3: 6, 5: 5.5, 9: 5.5, 2: 2.5},
    ('Holon', 1.0): {4: 49.5, 0: 11.5, 6: 10.5, 5: 8.5, 7: 8.5, 8: 8, 2: 7, 9: 6.5, 1: 4, 10: 4, 3: 2},
    ('Bene Beraq', 0.35): {8: 26.0, 6: 16.0, 10: 14.0, 1: 11.5, 5: 10.0, 9: 9.0, 4: 8.0, 3: 8.0, 7: 7.0, 2: 6.0, 0: 4.5},
    ('Bene Beraq', 0.75): {8: 23.5, 4: 14.5, 6: 12.5, 10: 12.0, 3: 11.5, 1: 9.0, 5: 9.0, 7: 9.0, 0: 6.5, 2: 6.5, 9: 6.0},
    ('Bene Beraq', 1.0): {4: 20.0, 6: 17.5, 5: 15.5, 0: 13.0, 7: 10.0, 3: 9.0, 8: 8.0, 10: 8.0, 1: 7.5, 9: 7.5, 2: 4.0},
}

compliance_colors = {
    0.35: "#f0f8ff",
    0.75: "#cce5ff",
    1.0: "#99ccff"
}

# Start of plots
sns.set_style("whitegrid")
fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(22, 12), sharey=True)

cities = ['Holon', 'Bene Beraq']
compliance_levels = [0.35, 0.75, 1.0]
severe_avgs_by_city = {"Holon": [], "Bene Beraq": []}
letter_labels = ['(A)', '(B)', '(C)', '(D)', '(E)', '(F)']  # Letter labels

plot_idx = 0

for city in cities:
    for compliance in compliance_levels:
        ax = axes[plot_idx // 3, plot_idx % 3]
        key = (city, compliance)
        raw_data = results_data.get(key, {})
        if not raw_data:
            continue

        ax.set_facecolor(compliance_colors[compliance])

        cleaned_data = {}
        for orig, days in raw_data.items():
            orig_clean = int(str(orig).replace('.', ''))
            new_index = original_to_new[orig_clean]
            cleaned_data[new_index] = days

        sorted_items = sorted(cleaned_data.items())
        x = [i for i, _ in sorted_items]
        y = [v for _, v in sorted_items]

        bar_colors = ['red' if idx in severe_actions else 'green' for idx in x]
        bar_edges = ['black'] * len(x)
        bar_widths = [1.5] * len(x)

        for i, (xi, yi) in enumerate(zip(x, y)):
            ax.bar(xi, yi,
                   color=bar_colors[i],
                   edgecolor=bar_edges[i],
                   linewidth=bar_widths[i],
                   alpha=0.85)

        severe_y = [v for i, v in zip(x, y) if i in severe_actions]
        avg_severe = sum(severe_y) / len(severe_y) if severe_y else 0
        severe_avgs_by_city[city].append(avg_severe)

        ax.set_xticks(x)
        ax.set_xticklabels(x, fontsize=14)
        ax.set_ylim(0, 60)
        ax.set_ylabel("Avg Days", fontsize=19)
        ax.tick_params(axis='y', labelsize=14)
        ax.yaxis.set_tick_params(labelleft=True)

        total_days = sum(y)
        severe_days = sum(val for idx, val in zip(x, y) if idx in severe_actions)
        severe_pct = (severe_days / total_days) * 100 if total_days > 0 else 0

        severe_text = ax.text(0.02, 0.95, f"Severe: {severe_pct:.1f}%", transform=ax.transAxes,
                              fontsize=15, ha="left", va="top",
                              bbox=dict(facecolor='white', alpha=0.6, pad=0.3))
        severe_text.set_path_effects([
            path_effects.Stroke(linewidth=1.1, foreground='black'),
            path_effects.Normal()
        ])

        ax.axhline(avg_severe, linestyle='--', color='blue', alpha=0.7)
        ax.text(x[-1] + 1.0, avg_severe, f"{avg_severe:.1f}",
                va='center', fontsize=20, color='blue')

        order_vac = "DESCENDING" if city == "Holon" else "ASCENDING"

        # Add letter labels to titles
        ax.set_title(f"{letter_labels[plot_idx]} {city}, Compliance = {compliance} ({order_vac})",
                     fontsize=15, fontweight='bold')

        for spine in ax.spines.values():
            spine.set_visible(True)

        plot_idx += 1

# Legend
severe_handles = [Patch(facecolor='red', edgecolor='black', label=f"{i}: {new_action_labels[i]}")
                  for i in severe_actions]
non_severe_handles = [Patch(facecolor='green', edgecolor='black', label=f"{i}: {new_action_labels[i]}")
                      for i in range(11) if i not in severe_actions]

fig.legend(
    handles=severe_handles + non_severe_handles,
    title="Interventions List (Red = Severe, Green = Less Severe)",
    loc='lower center',
    bbox_to_anchor=(0.5, 0.05),
    fontsize=18,
    title_fontsize=19,
    ncol=4,
    frameon=False
)

plt.subplots_adjust(hspace=0.2, wspace=0.2, bottom=0.22, right=0.95)

# Summary line chart
fig2, ax2 = plt.subplots(figsize=(9, 5))
x_vals = compliance_levels
ax2.plot(x_vals, severe_avgs_by_city["Holon"], marker='o', label='Holon (DESC)', color='blue')
ax2.plot(x_vals, severe_avgs_by_city["Bene Beraq"], marker='o', label='Bene Beraq (ASC)', color='orange')
ax2.set_xlabel("Compliance Level", fontsize=14)
ax2.set_ylabel("Avg Duration of Severe Actions (Days)", fontsize=14)
ax2.set_title("Severe Action Duration vs Compliance Level", fontsize=16)
ax2.tick_params(labelsize=13)
ax2.legend(fontsize=13)
ax2.grid(True)

plt.tight_layout()
plt.show()
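
Before plotting, each panel translates original action codes into standardized indices so that the five severe actions occupy positions 0 to 4. Below is a minimal sketch of that remapping, applied to the severe entries of the Holon, compliance 0.35 data. The helper `remap_actions` is a hypothetical name and is not part of the notebook.

```python
# Same mapping as in the cell above: original action code -> standardized index
original_to_new = {1: 0, 2: 1, 6: 2, 9: 3, 10: 4,
                   0: 5, 3: 6, 4: 7, 5: 8, 7: 9, 8: 10}

def remap_actions(durations_by_original_code, mapping):
    """Re-key a {original_code: days} dict by standardized index."""
    return {mapping[code]: days for code, days in durations_by_original_code.items()}

# Severe subset of results_data[('Holon', 0.35)], keyed by original action code
severe_subset = {1: 8, 2: 8.5, 6: 21.5, 9: 8, 10: 13}

remapped = remap_actions(severe_subset, original_to_new)
print(sorted(remapped.items()))
# [(0, 8), (1, 8.5), (2, 21.5), (3, 8), (4, 13)]
```

Grouping the severe actions at the low indices is what allows `severe_actions = [0, 1, 2, 3, 4]` to act as a simple membership test when coloring the bars.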

In [None]:
# Mapping from original action code to standardized index
original_to_new = {
    1: 0, 2: 1, 6: 2, 9: 3, 10: 4,
    0: 5, 3: 6, 4: 7, 5: 8, 7: 9, 8: 10
}

# Reversed mapping (if needed)
# Note: Assumes star_to_original was defined elsewhere
# Uncomment and define if needed
# original_to_star = {v: k for k, v in star_to_original.items()}

# Intervention labels indexed by standardized action number
new_action_labels = {
    0: "Full Lockdown + House Vaccination",
    1: "City Lockdown + Global Vaccination",
    2: "Workplace + House Vaccination",
    3: "City Lockdown",
    4: "Full Lockdown",
    5: "No Intervention",
    6: "Isolation + Global Vaccination",
    7: "Social Distancing",
    8: "School + House Vaccination",
    9: "Elderly + House Vaccination",
    10: "Household Isolation"
}

# Action indices considered severe
severe_actions = [0, 1, 2, 3, 4]

# Background color by compliance level
compliance_colors = {
    0.35: "#f0f8ff",
    0.75: "#cce5ff",
    1.0: "#99ccff"
}

# Average days per intervention by city and compliance level.
# Keys are already standardized action indices, so original_to_new is not applied here.
results_data_star = {
    ('Holon', 0.35): {2: 27.0, 4: 26.0, 7: 19.5, 10: 9.0, 0: 7.5, 1: 7.5, 8: 6.5, 5: 6.0, 6: 4.5, 9: 3.5, 3: 3.0},
    ('Holon', 0.75): {2: 27.0, 7: 21.0, 4: 20.0, 9: 8.0, 10: 7.5, 5: 7.5, 1: 6.5, 8: 6.5, 6: 6.0, 3: 6.0, 0: 4.5},
    ('Holon', 1.0): {5: 23.5, 7: 22.0, 8: 16.0, 2: 12.0, 9: 9.0, 10: 9.0, 0: 6.0, 6: 6.0, 1: 5.5, 3: 5.5, 4: 5.5},
    ('Bene Beraq', 0.35): {10: 18.5, 4: 18.0, 8: 16.0, 0: 15.0, 6: 11.0, 9: 10.5, 2: 9.0, 1: 8.0, 3: 6.5, 5: 5.5, 7: 2.0},
    ('Bene Beraq', 0.75): {0: 17.5, 4: 17.5, 10: 14.5, 8: 12.5, 9: 11.5, 7: 10.5, 6: 8.5, 2: 8.0, 5: 8.0, 3: 6.0, 1: 5.5},
    ('Bene Beraq', 1.0): {7: 20.0, 4: 16.0, 10: 15.5, 6: 13.0, 0: 10.5, 8: 9.5, 5: 9.5, 9: 9.0, 2: 8.0, 3: 5.5, 1: 3.5},
}

# Set figure style
sns.set_style("whitegrid")
fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(22, 12), sharey=True)

cities = ['Holon', 'Bene Beraq']
compliance_levels = [0.35, 0.75, 1.0]
severe_avgs_by_city = {city: [] for city in cities}
letter_labels = ['(A)', '(B)', '(C)', '(D)', '(E)', '(F)']

plot_idx = 0

# Generate bar plots per subplot
for city in cities:
    for compliance in compliance_levels:
        ax = axes[plot_idx // 3, plot_idx % 3]
        key = (city, compliance)
        raw_data = results_data_star.get(key, {})  # avoid overwriting the global `data` DataFrame
        if not raw_data:
            continue

        ax.set_facecolor(compliance_colors[compliance])

        sorted_items = sorted(raw_data.items())
        x = [i for i, _ in sorted_items]
        y = [v for _, v in sorted_items]
        bar_colors = ['red' if i in severe_actions else 'green' for i in x]

        # Draw bars
        for i, (xi, yi) in enumerate(zip(x, y)):
            ax.bar(
                xi, yi,
                color=bar_colors[i],
                edgecolor='black',
                linewidth=1.5,
                alpha=0.85
            )

        # Calculate average duration for severe actions
        severe_y = [v for i, v in zip(x, y) if i in severe_actions]
        avg_severe = sum(severe_y) / len(severe_y) if severe_y else 0
        severe_avgs_by_city[city].append(avg_severe)

        # Axis and tick configuration
        ax.set_xticks(x)
        ax.set_xticklabels(x, fontsize=14)
        ax.set_ylim(0, 60)
        ax.set_ylabel("Avg Days", fontsize=19)
        ax.tick_params(axis='y', labelsize=14)

        # Annotate percentage of severe days
        total_days = sum(y)
        severe_days = sum(v for i, v in zip(x, y) if i in severe_actions)
        severe_pct = (severe_days / total_days) * 100 if total_days > 0 else 0

        annotation = ax.text(
            0.02, 0.95, f"Severe: {severe_pct:.1f}%",
            transform=ax.transAxes,
            fontsize=15, ha="left", va="top",
            bbox=dict(facecolor='white', alpha=0.6, pad=0.3)
        )
        annotation.set_path_effects([
            path_effects.Stroke(linewidth=1.1, foreground='black'),
            path_effects.Normal()
        ])

        # Draw average line
        ax.axhline(avg_severe, linestyle='--', color='blue', alpha=0.7)
        ax.text(
            x[-1] + 1.0, avg_severe,
            f"{avg_severe:.1f}",
            va='center', fontsize=20, color='blue'
        )

        # Title with subplot label
        order_vac = "ASCENDING" if city == "Holon" else "DESCENDING"
        ax.set_title(
            f"{letter_labels[plot_idx]} {city}, Compliance = {compliance} ({order_vac})",
            fontsize=15, fontweight='bold'
        )

        plot_idx += 1

# Legend for interventions
severe_handles = [
    Patch(facecolor='red', edgecolor='black', label=f"{i}: {new_action_labels[i]}")
    for i in severe_actions
]
non_severe_handles = [
    Patch(facecolor='green', edgecolor='black', label=f"{i}: {new_action_labels[i]}")
    for i in range(11) if i not in severe_actions
]

fig.legend(
    handles=severe_handles + non_severe_handles,
    title="Interventions List (Red = Severe, Green = Less Severe)",
    loc='lower center',
    bbox_to_anchor=(0.5, 0.05),
    fontsize=18,
    title_fontsize=19,
    ncol=4,
    frameon=False
)

plt.subplots_adjust(hspace=0.2, wspace=0.2, bottom=0.22, right=0.95)
plt.show()
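
Each panel reports two summaries: the share of all intervention days spent on severe actions, and the mean duration of the severe actions (the dashed line). The worked example below reproduces both for the Holon, compliance 0.35 entry of `results_data_star`. The helper name `severe_summary` is hypothetical.

```python
def severe_summary(durations, severe_ids):
    """Return (share of days in severe actions in %, mean days per severe action)."""
    total = sum(durations.values())
    severe = [days for idx, days in durations.items() if idx in severe_ids]
    share = 100 * sum(severe) / total if total > 0 else 0.0
    mean_days = sum(severe) / len(severe) if severe else 0.0
    return share, mean_days

# Values copied from results_data_star[('Holon', 0.35)]
holon_035 = {2: 27.0, 4: 26.0, 7: 19.5, 10: 9.0, 0: 7.5, 1: 7.5,
             8: 6.5, 5: 6.0, 6: 4.5, 9: 3.5, 3: 3.0}

share, mean_days = severe_summary(holon_035, severe_ids={0, 1, 2, 3, 4})
print(f"Severe: {share:.1f}% of days, {mean_days:.1f} days on average")
# Severe: 59.2% of days, 14.2 days on average
```

Here the severe actions total 71 of 120 days, which gives the 59.2% annotation and the 14.2-day average line for panel (A).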

In [None]:
# === Set Plotting Parameters ===
# Adjust default font settings for consistent visualization styling
from matplotlib import rcParams  # harmless if already imported in the setup cell

rcParams['font.family'] = 'DejaVu Sans'
rcParams['font.size'] = 10

# === Define Data ===
# Each entry: City, Compliance Level, Vaccination Order, and the average duration (days)
# of each of the five severe actions (standardized indices 0-4).
# Stored as `severe_records` so the global `data` DataFrame is not overwritten.
severe_records = [
    ("Holon", 0.35, "ASCENDING", [7.5, 7.5, 27.0, 3.0, 26.0]),
    ("Holon", 0.35, "DESCENDING", [8.0, 8.5, 21.5, 8.0, 13.0]),
    ("Holon", 0.75, "ASCENDING", [4.5, 6.5, 27.0, 6.0, 20.0]),
    ("Holon", 0.75, "DESCENDING", [6.5, 2.5, 19.5, 5.5, 8.5]),
    ("Holon", 1.00, "ASCENDING", [6.0, 5.5, 12.0, 5.5, 5.5]),
    ("Holon", 1.00, "DESCENDING", [4.0, 7.0, 10.5, 6.5, 4.0]),
    ("Bene Beraq", 0.35, "ASCENDING", [11.5, 6.0, 16.0, 9.0, 14.0]),
    ("Bene Beraq", 0.35, "DESCENDING", [15.0, 8.0, 9.0, 6.5, 18.0]),
    ("Bene Beraq", 0.75, "ASCENDING", [9.0, 6.5, 12.5, 6.0, 12.0]),
    ("Bene Beraq", 0.75, "DESCENDING", [10.5, 5.5, 8.0, 6.0, 17.5]),
    ("Bene Beraq", 1.00, "ASCENDING", [7.5, 4.0, 17.5, 7.5, 8.0]),
    ("Bene Beraq", 1.00, "DESCENDING", [10.5, 3.5, 6.5, 5.5, 16.5]),
]

# === Convert Data to DataFrame ===
df = pd.DataFrame(severe_records, columns=["City", "Compliance", "OrderVac", "Severe_Actions"])
df["Mean_Severe_Days"] = df["Severe_Actions"].apply(np.mean)  # Mean across the five severe actions

# === Create Plot ===
fig, ax = plt.subplots(figsize=(7.5, 5.5))  # Set figure size in inches

# Plot a line for each combination of City and Vaccination Order
for city in df["City"].unique():
    for order in df["OrderVac"].unique():
        subset = df[(df["City"] == city) & (df["OrderVac"] == order)]
        ax.plot(subset["Compliance"], subset["Mean_Severe_Days"],
                marker='o', linestyle='-', label=f"{city} ({order})")

# === Configure Plot Labels and Aesthetics ===
ax.set_xlabel("Compliance Level")
ax.set_ylabel("Average Days of Severe Actions")
ax.set_xticks([0.35, 0.75, 1.00])
ax.grid(axis='y', linestyle='--', alpha=0.7)
ax.legend(loc='upper right', fontsize=9)  # Add legend in top-right corner

# === Save Plot as TIFF (Publication Quality) ===
output_path = "Fig_Severe_Actions.tif"
fig.savefig(output_path, dpi=300, format='tiff', pil_kwargs={"compression": "tiff_lzw"})

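# === Automatically Download File in Google Colab ===
# The download helper exists only on Colab; elsewhere the file stays in the working directory.
try:
    from google.colab import files
    files.download(output_path)
except ImportError:
    print(f"Figure saved to {output_path}")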