## Colorized Iowa 2D Pareto fronts
This notebook explores the tradeoffs between districting criteria using a dataset of 5,000 county-level Iowa districting plans generated with GerryChain's random ReCom algorithm under a population tolerance of ±0.2%. This bound is ten times tighter than in previous runs, which used a population tolerance of ±2%.
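
For readers new to the term in the title: when every criterion is being minimized, a plan lies on the Pareto front if no other plan is at least as good on every criterion and strictly better on at least one. The following is a minimal sketch of that definition in plain Python. The function names `dominates` and `pareto_front` and the toy values are illustrative assumptions, and this is not the implementation used by the `pareto` module imported in the cells that follow.

```python
def dominates(a, b, keys):
    """Return True if plan `a` dominates plan `b` (lower is better on every key)."""
    no_worse = all(a[k] <= b[k] for k in keys)
    strictly_better = any(a[k] < b[k] for k in keys)
    return no_worse and strictly_better


def pareto_front(plans, keys):
    """Return the plans that no other plan dominates (quadratic-time sketch)."""
    return [p for p in plans
            if not any(dominates(q, p, keys) for q in plans)]


# Toy values, made up for illustration only.
plans = [
    {'cut_edges': 0.30, 'mms_abs': 0.02},
    {'cut_edges': 0.25, 'mms_abs': 0.05},
    {'cut_edges': 0.35, 'mms_abs': 0.04},  # dominated by the first plan
    {'cut_edges': 0.20, 'mms_abs': 0.08},
]
front = pareto_front(plans, ['cut_edges', 'mms_abs'])
```

In this toy example the third plan is worse than the first on both criteria, so it is the only one left off the front.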

It continues our earlier exploration of these tradeoffs with a new sample. The dataset has the following columns:
- `cut_edges`: Percent of cut edges (relative to total edges)
- `pop_pct`: Average percent population deviation across districts
- `egs`: Efficiency gap (2000 Presidential election)
- `mms`: Mean-median score (2000 Presidential election)
- `polpop`: Polsby-Popper score
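
As a rough guide to what these columns measure, the sketch below implements common textbook definitions of the scores. It is an illustration under stated assumptions: the function names are hypothetical, the sign conventions (median minus mean; first party's wasted votes minus the second's) are one common choice, and the values in the dataset may have been computed with different conventions or aggregation.

```python
import math
from statistics import mean, median


def cut_edges_pct(n_cut, n_total):
    """Percentage of dual-graph edges whose endpoints lie in different districts."""
    return 100 * n_cut / n_total


def mean_median(shares):
    """Median minus mean of one party's vote share across districts."""
    return median(shares) - mean(shares)


def efficiency_gap(votes_a, votes_b):
    """(Wasted votes of party A - wasted votes of party B) / total votes."""
    wasted_a = wasted_b = 0.0
    for a, b in zip(votes_a, votes_b):
        needed = (a + b) / 2  # votes needed to win the district
        if a > b:
            wasted_a += a - needed  # winner wastes its surplus
            wasted_b += b           # loser wastes every vote
        else:
            wasted_a += a
            wasted_b += b - needed
    return (wasted_a - wasted_b) / (sum(votes_a) + sum(votes_b))


def polsby_popper(area, perimeter):
    """Compactness of one district: 4 * pi * area / perimeter**2 (1.0 for a circle)."""
    return 4 * math.pi * area / perimeter ** 2
```

For example, `polsby_popper(1, 4)` for a unit square gives π/4, and a circle of any radius scores 1.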

In [None]:
%config InlineBackend.figure_formats = ['svg']

import pandas as pd
import pareto
import matplotlib.pyplot as plt

plt.style.use('ggplot')
prefix = 'results/IA_counties_run_3_recom_tight_5000'

In [None]:
data = pd.read_csv('data/IA_counties_run_3_recom_tight_5000.csv')
data['pop_dev_pct_abs'] = abs(data['pop_pct'])  # Absolute average population deviation
data['pop_dev_pct_squared'] = data['pop_pct']**2
data['mms_abs'] = abs(data['mms'])
collection = pareto.ParetoCollection(updaters=list(data.columns))

In [None]:
collection.add(data.to_dict(orient='records'))

In [None]:
data.columns

In [None]:
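def plot_colorized_front(x_col, y_col, color_col, maxima=False, cmap='inferno'):
    """Scatter all plans (x_col vs. y_col, colored by color_col) and highlight the front.

    The front is requested over all three columns, so highlighted points
    need not be non-dominated in the 2D projection alone.
    """
    front = collection.front([x_col, y_col, color_col], maxima=maxima)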
    x = [plan[x_col] for plan in collection.points]
    y = [plan[y_col] for plan in collection.points]
    color = [plan[color_col] for plan in collection.points]
    vmin = min(color)
    vmax = max(color)
    pareto_x = [plan[x_col] for plan in front]
    pareto_y = [plan[y_col] for plan in front]
    pareto_color = [plan[color_col] for plan in front]
    plt.scatter(x, y, c=color, marker='.', cmap=cmap,
                vmin=vmin, vmax=vmax)
    front_type = 'maxima' if maxima else 'minima'
    plt.scatter(pareto_x, pareto_y, c=pareto_color, cmap=cmap,
                vmin=vmin, vmax=vmax,
                label=f'Pareto front ({front_type})')
    plt.colorbar(label=color_col)  # Label the color scale with its column name
    plt.legend()

In [None]:
plt.figure(figsize=(8, 6))
plot_colorized_front('mms_abs', 'cut_edges', 'pop_pct')
plt.xlabel('Absolute mean-median score')
plt.ylabel('% of cut edges')
plt.title('Abs. MMS vs. cut edges in Iowa (colorized by pop. dev.)')
# Figure size is already set by plt.figure above; savefig does not take figsize.
plt.savefig(f'{prefix}/mms_cut_edges_population_deviation.png', dpi=300)
plt.show()