Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel $\rightarrow$ Restart) and then **run all cells** (in the menubar, select Cell $\rightarrow$ Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [1]:
NAME = ""
COLLABORATORS = ""

---

In [5]:
!pip install geojson
!pip install shapely
!pip install PyShp
!pip install networkx
!pip install osmnx
!pip install pandas

Collecting osmnx
  Downloading osmnx-2.0.6-py3-none-any.whl.metadata (4.9 kB)
Collecting geopandas>=1.0.1 (from osmnx)
  Downloading geopandas-1.0.1-py3-none-any.whl.metadata (2.2 kB)
Collecting pandas>=1.4 (from osmnx)
  Downloading pandas-2.3.3-cp39-cp39-win_amd64.whl.metadata (19 kB)
Collecting requests>=2.27 (from osmnx)
  Downloading requests-2.32.5-py3-none-any.whl.metadata (4.9 kB)
Collecting pyogrio>=0.7.2 (from geopandas>=1.0.1->osmnx)
  Downloading pyogrio-0.11.1-cp39-cp39-win_amd64.whl.metadata (5.4 kB)
Collecting pyproj>=3.3.0 (from geopandas>=1.0.1->osmnx)
  Downloading pyproj-3.6.1-cp39-cp39-win_amd64.whl.metadata (31 kB)
Collecting pytz>=2020.1 (from pandas>=1.4->osmnx)
  Downloading pytz-2025.2-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas>=1.4->osmnx)
  Downloading tzdata-2025.2-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting certifi (from pyogrio>=0.7.2->geopandas>=1.0.1->osmnx)
  Downloading certifi-2025.11.12-py3-none-any.whl.metadat

In [6]:
import geojson
import pandas as pd
import numpy as np
import networkx as nx
import time
import csv
import ast
import shapefile as shp
from shapely.geometry import Polygon,shape,MultiPolygon
import shapely.ops
import warnings
import math
from collections import deque
warnings.simplefilter(action='ignore', category=pd.errors.PerformanceWarning)

### Helper functions

In [7]:
def isDistrictContiguous(district_num, assignment, contiguity_list, print_isolates=False, ignore_list=None):
    ## input:
    ## district_num: the district number
    ## assignment: the assignment from precinct to district
    ## contiguity_list: the list of neighbors for each precinct, from the csv file
    ## print_isolates: if True, print the precincts with no neighbor inside the district
    ## ignore_list: precincts to exclude from the check (optional)
    contiguity_list.columns = ['Precinct','Neighbors']
    district_graph = nx.Graph() #creates an empty undirected graph
    district_nodes = assignment[assignment['District']==district_num]['GEOID20'].tolist()
    for i in (ignore_list or []):
        try:
            district_nodes.remove(i)
        except ValueError:
            pass
    district_graph.add_nodes_from(district_nodes)
    district_node_set = set(district_nodes) # set membership is much faster than scanning the list
    for precinct in district_nodes:
        neighbors = ast.literal_eval(contiguity_list[contiguity_list['Precinct']==precinct]['Neighbors'].values.tolist()[0])
        # needed to convert string to list because the csv encodes the list as a string
        for neighbor in neighbors:
            if neighbor in district_node_set:
                district_graph.add_edge(precinct,neighbor)
    if(print_isolates):
        print(list(nx.isolates(district_graph)))
    return nx.is_connected(district_graph)

In [8]:
def getDistrictPopulations(assignment,data_file, num_district):
    population = {}
    for i in range (1,num_district+1):
        population[i] = data_file[data_file['GEOID20'].isin(assignment[assignment['District']==i]['GEOID20'])]['Total_2020_Total'].sum()
    return population

In [9]:
def getDistrictShape(district_id, assignment, boundaries):
    list_precincts = assignment[assignment['District']==district_id]['GEOID20']
    precinct_shapes = []
    for i in list_precincts:
        if shape(boundaries[i]).geom_type == 'Polygon':
            precinct_shapes.append(Polygon(shape(boundaries[i])))
        elif shape(boundaries[i]).geom_type == 'MultiPolygon':
            precinct_shapes.append(MultiPolygon(shape(boundaries[i])))      
    district_shape = shapely.ops.unary_union(precinct_shapes)
    #print(district_shape)
    return district_shape

In [10]:
def pp_compactness(geom): # Polsby-Popper
    p = geom.length
    a = geom.area    
    return (4*np.pi*a)/(p*p)

def box_reock_compactness(geom): # Reock on a rectangle bounding box
    a = geom.area 
    bb = geom.bounds # bounds gives you the minimum bounding box (rectangle)
    bba = abs(bb[0]-bb[2])*abs(bb[1]-bb[3])
    return a/bba
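
To get a feel for the two compactness scores, here is a small sketch on hand-made shapes. It does not use Shapely: the helper names (`ring_area`, `ring_perimeter`, `polsby_popper`, `box_reock`) are illustrative only, and the area is computed with the shoelace formula, which is an assumption of this sketch rather than what the notebook's helpers do internally.

```python
import math

def ring_area(points):
    """Shoelace formula for a simple polygon given as a list of (x, y) vertices."""
    n = len(points)
    twice_area = sum(
        points[i][0] * points[(i + 1) % n][1] - points[(i + 1) % n][0] * points[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2

def ring_perimeter(points):
    """Sum of the edge lengths, closing the ring back to the first vertex."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

def polsby_popper(points):
    """4*pi*area / perimeter^2: close to 1 for round shapes, close to 0 for stringy ones."""
    return 4 * math.pi * ring_area(points) / ring_perimeter(points) ** 2

def box_reock(points):
    """Area of the shape divided by the area of its axis-aligned bounding box."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return ring_area(points) / ((max(xs) - min(xs)) * (max(ys) - min(ys)))

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
strip = [(0, 0), (10, 0), (10, 0.1), (0, 0.1)]          # long, thin rectangle
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]

for name, pts in [("square", square), ("strip", strip), ("L", l_shape)]:
    print(name, round(polsby_popper(pts), 3), round(box_reock(pts), 3))
```

The unit square scores pi/4 on Polsby-Popper and 1 on the bounding-box measure. The thin strip also fills its bounding box completely, yet its Polsby-Popper score is close to 0, so the two measures can disagree and it is worth reporting both.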

# This Notebook will help you get started on NJ
The data is on Canvas. Upload it to your Google Drive first (if using Colab) or to your local filesystem (if using Jupyter).

### This is the current assignment of precinct to congressional districts (12 of them for NJ)
Note that the map shown in DRA is slightly different. This is because some precincts are split in the real assignment, and some additional precincts are created to handle special situations such as prisoners and overseas citizens. You can ignore this for the class project and just use the data and functions provided.

In [11]:
nj_current_assignment = pd.read_csv('Map_Data/precinct-assignments-congress-nj.csv')
nj_current_assignment

Unnamed: 0,GEOID20,District
0,34033020001,2
1,34033001501,2
2,34033042009,2
3,34033000703,2
4,34033042001,2
...,...,...
6356,34039045302,7
6357,34039045303,7
6358,34039045404,7
6359,34039045801,7


### This is the current demographic and voter data
The data has many attributes that list voters by demographic group and party in different elections. See the data dictionary on Canvas for details. For this recitation we keep only the votes from the 2020 presidential election and the total 2020 population counts. You can use additional columns (e.g., governor's election results, voting age population (VAP) counts, or the composite Dem/Rep score).

In [12]:
nj_precinct_data = pd.read_csv('Map_Data/precinct-data-congress-nj.csv')
keepcolumns = ['GEOID20','District','Total_2020_Pres','Dem_2020_Pres','Rep_2020_Pres','Total_2020_Total','White_2020_Total','Hispanic_2020_Total','Black_2020_Total','Asian_2020_Total','Native_2020_Total','Pacific_2020_Total']
nj_precinct_data = nj_precinct_data[keepcolumns]
nj_precinct_data

Unnamed: 0,GEOID20,District,Total_2020_Pres,Dem_2020_Pres,Rep_2020_Pres,Total_2020_Total,White_2020_Total,Hispanic_2020_Total,Black_2020_Total,Asian_2020_Total,Native_2020_Total,Pacific_2020_Total
0,34001005101,2,876,393,472,1240,946,128,102,66,24,0
1,34001005102,2,852,450,388,1913,1331,211,286,84,38,4
2,34001005103,2,1206,517,672,1760,1375,177,78,106,20,2
3,34001005201,2,828,348,469,1311,906,168,150,64,50,5
4,34001005202,2,868,579,282,1892,537,336,598,450,25,1
...,...,...,...,...,...,...,...,...,...,...,...,...
6356,34041115002,7,606,182,418,737,714,11,3,8,0,0
6357,34041115003,7,617,187,418,934,820,60,26,10,14,0
6358,34041115004,7,478,160,308,697,602,66,16,4,5,0
6359,34041115005,7,592,201,381,930,831,47,27,11,12,0
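
As a quick illustration of how these columns can be used, the sketch below sums the presidential votes per district and computes a two-party Democratic share. It uses only the nine rows visible in the table above, typed in by hand, so the numbers describe those nine precincts and not whole districts. The names `rows`, `totals` and `dem_share` are made up for this example.

```python
from collections import defaultdict

# (GEOID20, District, Total_2020_Pres, Dem_2020_Pres, Rep_2020_Pres), copied from the table above
rows = [
    (34001005101, 2, 876, 393, 472),
    (34001005102, 2, 852, 450, 388),
    (34001005103, 2, 1206, 517, 672),
    (34001005201, 2, 828, 348, 469),
    (34001005202, 2, 868, 579, 282),
    (34041115002, 7, 606, 182, 418),
    (34041115003, 7, 617, 187, 418),
    (34041115004, 7, 478, 160, 308),
    (34041115005, 7, 592, 201, 381),
]

# Accumulate vote counts per district
totals = defaultdict(lambda: {"Total": 0, "Dem": 0, "Rep": 0})
for _, district, total, dem, rep in rows:
    totals[district]["Total"] += total
    totals[district]["Dem"] += dem
    totals[district]["Rep"] += rep

# Two-party share: Democratic votes over Democratic plus Republican votes
dem_share = {d: t["Dem"] / (t["Dem"] + t["Rep"]) for d, t in totals.items()}
for d in sorted(dem_share):
    print(d, totals[d], round(dem_share[d], 3))
```

On the full data frame, the same aggregation would typically be written with a pandas `groupby` on the `District` column of an assignment joined to `nj_precinct_data`.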


### This is the precinct boundary data (uses shapely)

This data represents the geography of the precincts. It is needed to test for contiguity, or for any partitioning method based on geography. The boundaries are read from a shapefile and handled as Shapely geometries. Each precinct is represented as a set of points that are connected to create the precinct shape (in long/lat coordinates). Shapely geometric functions can be used to compare the shapes. These can be quite inefficient to run, so I am also providing a pre-computed index that lists, for each precinct, the precincts that are contiguous to it. You can see the code to generate the index in Contiguity.ipynb.

To manipulate the shapes, cast them into Shapely Polygons (see example below) and you can use the Polygon properties and functions: https://shapely.readthedocs.io/en/stable/reference/shapely.Polygon.html#shapely.Polygon

In [13]:
shpfile = 'Map_Data/nj_vtd_2020_bound/nj_vtd_2020_bound.shp'
dbffile = 'Map_Data/nj_vtd_2020_bound/nj_vtd_2020_bound.dbf'
shxfile = 'Map_Data/nj_vtd_2020_bound/nj_vtd_2020_bound.shx'


shpfile = shp.Reader(shp=shpfile, shx=shxfile, dbf=dbffile)
nj_precinct_boundaries={}
for sr in shpfile.iterShapeRecords():
    geom = sr.shape # get geo bit
    rec = sr.record # get db fields
    nj_precinct_boundaries[rec[3]]=geom

### This is the precinct contiguity data

This uses the contiguity index I pre-computed with Contiguity.ipynb, which is stored in Contiguity_nj.csv.

In [14]:
nj_contiguity = pd.read_csv('Contiguity_nj.csv', header=None)
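
The helper functions parse the neighbor lists with `ast.literal_eval` because the CSV stores each list as a string. The sketch below shows that step on a tiny in-memory CSV. The precinct IDs (101 to 104) and the exact quoting are made up for illustration; the real file may be laid out slightly differently.

```python
import ast
import csv
import io

# Hypothetical two-column file: precinct ID, then its neighbor list encoded as a string
toy_csv = io.StringIO(
    '101,"[102, 103]"\n'
    '102,"[101]"\n'
    '103,"[101]"\n'
    '104,"[]"\n'
)

neighbors = {}
for precinct, raw_list in csv.reader(toy_csv):
    # literal_eval turns the string "[102, 103]" into the Python list [102, 103]
    neighbors[int(precinct)] = ast.literal_eval(raw_list)

# Precincts with no neighbors can never be reached by a flood fill
isolated = [p for p, ns in neighbors.items() if not ns]

# Sanity check: if a lists b as a neighbor, b should list a as well
symmetric = all(p in neighbors[n] for p, ns in neighbors.items() for n in ns)

print(neighbors)
print("isolated:", isolated, "symmetric:", symmetric)
```

Building a dictionary like this once is usually much faster than filtering the data frame for every lookup.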

In [15]:
for i in range(1,13):
    print("District "+str(i)+" "+str(isDistrictContiguous(i, nj_current_assignment, nj_contiguity)))

District 1 True
District 2 True
District 3 True
District 4 True
District 5 True
District 6 True
District 7 True
District 8 True
District 9 True
District 10 True
District 11 True
District 12 True


In [16]:
#Compactness of the current assignment
for district in range(1,13):
    print("D"+str(district)+" PP : "+str(pp_compactness(getDistrictShape(district,nj_current_assignment,nj_precinct_boundaries))))
    print("D"+str(district)+" BR : "+str(box_reock_compactness(getDistrictShape(district,nj_current_assignment,nj_precinct_boundaries))))
    

D1 PP : 0.41768102211569347
D1 BR : 0.45075813446417157
D2 PP : 0.2632665176502347
D2 BR : 0.38278056561756263
D3 PP : 0.2280937682959879
D3 BR : 0.38134920642809134
D4 PP : 0.24812480573284196
D4 BR : 0.5390173747196018
D5 PP : 0.2410116694999733
D5 BR : 0.36320426268176653
D6 PP : 0.14677124653853732
D6 BR : 0.32853496486220907
D7 PP : 0.20246375771704353
D7 BR : 0.44049249082841035
D8 PP : 0.11227347882175574
D8 BR : 0.36670629634952656
D9 PP : 0.1683197710884751
D9 BR : 0.29705227593212374
D10 PP : 0.12061263370064262
D10 BR : 0.34528827672774703
D11 PP : 0.22236600778446886
D11 BR : 0.5557086439792166
D12 PP : 0.1620092442171186
D12 BR : 0.38520439164401626


In [17]:
# District Population of the current assignment
print(getDistrictPopulations(nj_current_assignment,nj_precinct_data, 12))

{1: np.int64(775340), 2: np.int64(778354), 3: np.int64(778489), 4: np.int64(767834), 5: np.int64(774454), 6: np.int64(778516), 7: np.int64(785173), 8: np.int64(800074), 9: np.int64(766863), 10: np.int64(746178), 11: np.int64(769523), 12: np.int64(768196)}


# A simple geographical redistricting strategy

We can create a simple geographical map, like we did for NH. In this case we have 12 districts, so let's split the state in half East/West and into six bands North/South.
New Jersey's bounding box is (-75.559614,38.928519,-73.893979,41.357423) (https://anthonylouisdagostino.com/bounding-boxes-for-all-us-states/)
So let's start by splitting the state approximately through the middle longitude (-74.72): everything west of longitude -74.72 is in odd Districts, everything east is in even Districts. We will use the precinct centroids to assign them. Then we will divide each half by latitude on the ranges (38.92, 39.3, 39.7, 40.1, 40.5, 40.9, 41.35).
Import the map into DRA to look at it.

In [18]:
nj_longlat_assignment = nj_current_assignment.copy()
nj_longlat_assignment['District'] = 0
for index, row in nj_longlat_assignment.iterrows():
    try:
        if shape(nj_precinct_boundaries[row['GEOID20']]).geom_type == 'Polygon':
            centroid = Polygon(shape(nj_precinct_boundaries[row['GEOID20']])).centroid
        elif shape(nj_precinct_boundaries[row['GEOID20']]).geom_type == 'MultiPolygon':
            centroid = MultiPolygon(shape(nj_precinct_boundaries[row['GEOID20']])).centroid
        else:
            print(shape(nj_precinct_boundaries[row['GEOID20']]).geom_type)
            continue  # no usable centroid: leave this precinct unassigned (District 0)
        if centroid.x <= -74.72:
            if centroid.y <= 39.3:
                nj_longlat_assignment.iloc[index,nj_longlat_assignment.columns.get_loc('District')] = 1
            elif centroid.y <= 39.7:
                nj_longlat_assignment.iloc[index,nj_longlat_assignment.columns.get_loc('District')] = 3
            elif centroid.y <= 40.1:
                nj_longlat_assignment.iloc[index,nj_longlat_assignment.columns.get_loc('District')] = 5
            elif centroid.y <= 40.5:
                nj_longlat_assignment.iloc[index,nj_longlat_assignment.columns.get_loc('District')] = 7
            elif centroid.y <= 40.9:
                nj_longlat_assignment.iloc[index,nj_longlat_assignment.columns.get_loc('District')] = 9
            else:
                nj_longlat_assignment.iloc[index,nj_longlat_assignment.columns.get_loc('District')] = 11
        else:
            if centroid.y <= 39.3:
                nj_longlat_assignment.iloc[index,nj_longlat_assignment.columns.get_loc('District')] = 2
            elif centroid.y <= 39.7:
                nj_longlat_assignment.iloc[index,nj_longlat_assignment.columns.get_loc('District')] = 4
            elif centroid.y <= 40.1:
                nj_longlat_assignment.iloc[index,nj_longlat_assignment.columns.get_loc('District')] = 6
            elif centroid.y <= 40.5:
                nj_longlat_assignment.iloc[index,nj_longlat_assignment.columns.get_loc('District')] = 8
            elif centroid.y <= 40.9:
                nj_longlat_assignment.iloc[index,nj_longlat_assignment.columns.get_loc('District')] = 10
            else:
                nj_longlat_assignment.iloc[index,nj_longlat_assignment.columns.get_loc('District')] = 12
    except KeyError: 
        pass
#print(nj_longlat_assignment)
nj_longlat_assignment.to_csv('Recitation maps/nj_longlat_map.csv',index=False)
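
The nested `if`/`elif` chain above maps a centroid to a district number. As a compact alternative sketch, the same mapping can be written with `bisect`. The function name `grid_district` and the constants `MID_LON` and `LAT_CUTS` are made up for this example; the thresholds are the ones used in the cell above.

```python
from bisect import bisect_left

MID_LON = -74.72                           # east/west split
LAT_CUTS = [39.3, 39.7, 40.1, 40.5, 40.9]  # boundaries between the six latitude bands

def grid_district(lon, lat):
    """Return the district (1..12) for a centroid at (lon, lat)."""
    # bisect_left puts a latitude equal to a cut into the lower band,
    # which matches the `<=` comparisons in the original chain
    band = bisect_left(LAT_CUTS, lat)      # 0 (south) .. 5 (north)
    west = lon <= MID_LON
    return 2 * band + (1 if west else 2)   # odd districts west, even districts east

print(grid_district(-75.0, 39.0))   # south-west corner
print(grid_district(-74.0, 40.3))   # east side, fourth band from the south
```

Changing the number of bands then only requires editing `LAT_CUTS`.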

In [19]:
#Compactness of this Longlat assignment
for district in range(1,13):
    print("D"+str(district)+" PP : "+str(pp_compactness(getDistrictShape(district,nj_longlat_assignment,nj_precinct_boundaries))))
    print("D"+str(district)+" BR : "+str(box_reock_compactness(getDistrictShape(district,nj_longlat_assignment,nj_precinct_boundaries))))
    

D1 PP : 0.24444322502174795
D1 BR : 0.39921574154934536
D2 PP : 0.21732375139633123
D2 BR : 0.3033805826444085
D3 PP : 0.3012927570664991
D3 BR : 0.7281613087250183
D4 PP : 0.3019375759906228
D4 BR : 0.6388831361756292
D5 PP : 0.29897783517755583
D5 BR : 0.5136294328843819
D6 PP : 0.22596255307162377
D6 BR : 0.5362987383285662
D7 PP : 0.21866681223528045
D7 BR : 0.4337413755961903
D8 PP : 0.3513739185552307
D8 BR : 0.82797734690796
D9 PP : 0.37261194943216025
D9 BR : 0.7269867982048943
D10 PP : 0.28764401412152446
D10 BR : 0.6900841778845721
D11 PP : 0.34638282192061104
D11 BR : 0.47137706735305646
D12 PP : 0.318631303145351
D12 BR : 0.5828294919378438


In [20]:
# District Population of this longlat assignment
print(getDistrictPopulations(nj_longlat_assignment,nj_precinct_data, 12))


{1: np.int64(80406), 2: np.int64(22492), 3: np.int64(317218), 4: np.int64(274411), 5: np.int64(1124427), 6: np.int64(563593), 7: np.int64(244052), 8: np.int64(1471840), 9: np.int64(230117), 10: np.int64(3848094), 11: np.int64(61026), 12: np.int64(1051318)}


# Now create your own redistricting maps
Remember to check for contiguity, and to ensure that the populations of the districts are balanced (which is not the case in the example above).
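
One way to quantify how unbalanced a map is: compare each district to the ideal population (total population divided by the number of districts). The sketch below does this for the longitude/latitude map, using the populations printed above. The function name `population_deviation` is made up for this example.

```python
# District populations printed above for the longitude/latitude map
longlat_pops = {1: 80406, 2: 22492, 3: 317218, 4: 274411, 5: 1124427, 6: 563593,
                7: 244052, 8: 1471840, 9: 230117, 10: 3848094, 11: 61026, 12: 1051318}

def population_deviation(pops):
    """Return the ideal population and each district's relative deviation from it."""
    ideal = sum(pops.values()) / len(pops)
    return ideal, {d: (p - ideal) / ideal for d, p in pops.items()}

ideal, deviation = population_deviation(longlat_pops)
worst = max(deviation, key=lambda d: abs(deviation[d]))
print(f"ideal = {ideal:.0f}, worst district = {worst} ({deviation[worst]:+.0%})")
```

A deviation of 0 means the district is exactly at the ideal size. The same function can be applied to the dictionary returned by `getDistrictPopulations` for any map you create.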

## Fair Map Seeding Algorithm

This section implements the seeding algorithm for the fair map. The goal is to select 12 initial seed precincts (one per district) that:
- Are geographically spread out across the state
- Have populations close to the average precinct population
- Take demographic trends into account and try to pair demographic groups
- Are not biased towards any political party
- Are located in areas that allow for compact district growth

The final population of each district should be around NJ population / 12, so the seed population of each district should be about 10% of NJ population / 12.
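
To illustrate the first criterion only (geographic spread), here is a minimal sketch of greedy farthest-point selection on toy coordinates. It ignores population, demographics and party, so it is not the seeding function defined below. The point labels and the function name `farthest_point_seeds` are made up for this example.

```python
import math

def farthest_point_seeds(points, k):
    """Pick k ids from `points` (id -> (x, y)) that are spread far apart.

    Start with the point closest to the mean position, then repeatedly add the
    point whose distance to its nearest already-chosen seed is largest.
    """
    ids = sorted(points)
    center = (sum(points[i][0] for i in ids) / len(ids),
              sum(points[i][1] for i in ids) / len(ids))
    seeds = [min(ids, key=lambda i: math.dist(points[i], center))]
    while len(seeds) < k:
        remaining = [i for i in ids if i not in seeds]
        seeds.append(max(
            remaining,
            key=lambda i: min(math.dist(points[i], points[s]) for s in seeds),
        ))
    return seeds

toy_points = {"a": (0, 0), "b": (10, 1), "c": (0, 9), "d": (10, 10), "e": (5, 4)}
seeds = farthest_point_seeds(toy_points, 4)
print(seeds)
```

On these toy points the central point `e` is chosen first and the corners follow. With real precinct centroids, plain Euclidean distance on longitude/latitude is only a rough proxy for distance on the ground.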


In [165]:
def get_precinct_centroid(geoid, boundaries):
    """Get the centroid coordinates for a single precinct."""
    try:
        if shape(boundaries[geoid]).geom_type == 'Polygon':
            centroid = Polygon(shape(boundaries[geoid])).centroid
        elif shape(boundaries[geoid]).geom_type == 'MultiPolygon':
            centroid = MultiPolygon(shape(boundaries[geoid])).centroid
        else:
            return None
        return (centroid.x, centroid.y)  # (longitude, latitude)
    except Exception:  # includes KeyError for precincts missing from the shapefile
        return None

def get_district_centroid(district_id, assignment, boundaries):
    """Get the centroid coordinates for a district (all precincts in that district)."""
    district_precincts = assignment[assignment['District'] == district_id]['GEOID20'].tolist()
    precinct_centroids = []
    for geoid in district_precincts:
        centroid = get_precinct_centroid(geoid, boundaries)
        if centroid:
            precinct_centroids.append(centroid)
    
    if not precinct_centroids:
        return None
    
    # Calculate average centroid (simple approach)
    avg_lon = sum(c[0] for c in precinct_centroids) / len(precinct_centroids)
    avg_lat = sum(c[1] for c in precinct_centroids) / len(precinct_centroids)
    return (avg_lon, avg_lat)

def get_precinct_neighbors(geoid, contiguity_df):
    """Get the list of neighboring precincts for a given precinct."""
    df = contiguity_df.copy()
    df.columns = ['Precinct', 'Neighbors']
    try:
        neighbors_str = df[df['Precinct'] == geoid]['Neighbors'].values[0]
        return ast.literal_eval(neighbors_str)
    except (KeyError, IndexError, ValueError):
        return []


total_nj_population = nj_precinct_data['Total_2020_Total'].sum()
avg_district_population = total_nj_population / 12
advisedSeedPopulation = avg_district_population * 0.10


def select_population_balanced_seeds(precinct_data, boundaries, contiguity_df, num_districts=12):
    """Pick num_districts seed precincts, trading off geographic spread against population."""
    
    # Build coordinates for every precinct from shapefile centroids
    coords = {}
    for geoid in precinct_data['GEOID20']:
        centroid = get_precinct_centroid(geoid, boundaries)
        if centroid is not None:
            coords[geoid] = centroid  # (lon, lat)

    # Filter precinct_data to only those with valid centroids
    precinct_data = precinct_data[precinct_data['GEOID20'].isin(coords.keys())].copy()
    
    # Prepare candidate seeds: keep precincts whose population is within 50% of the average
    avg_precinct_pop = precinct_data['Total_2020_Total'].mean()
    pop_low = avg_precinct_pop * 0.5
    pop_high = avg_precinct_pop * 1.5

    def is_good_seed_candidate(row):
        geoid = row['GEOID20']
        pop = row['Total_2020_Total']
        if pop <= 0:
            return False
        neighbors = get_precinct_neighbors(geoid, contiguity_df)
        if len(neighbors) == 0:
            return False
        if not (pop_low <= pop <= pop_high):
            return False
        return True
    
    candidates = precinct_data[precinct_data.apply(is_good_seed_candidate, axis=1)].copy()
    if candidates.empty:
        # fallback to any with neighbors
        candidates = precinct_data[precinct_data['GEOID20'].map(
            lambda g: len(get_precinct_neighbors(g, contiguity_df)) > 0
        )].copy()

    # Add lon/lat columns
    candidates['lon'] = candidates['GEOID20'].map(lambda g: coords[g][0])
    candidates['lat'] = candidates['GEOID20'].map(lambda g: coords[g][1])

    # Calculate center of candidates for initial seed pick
    mean_lon = candidates['lon'].mean()
    mean_lat = candidates['lat'].mean()

    def euclidean(p1, p2):
        return math.sqrt((p1[0] - p2[0])**2 + (p1[1] - p2[1])**2)

    seeds = []
    assigned_pop = 0
    total_pop = candidates['Total_2020_Total'].sum()
    ideal_seed_pop = total_pop / num_districts

    # Select the first seed: closest to center
    candidates['dist_to_center'] = ((candidates['lon'] - mean_lon)**2 + 
                                    (candidates['lat'] - mean_lat)**2).pow(0.5)
    first_seed = candidates.sort_values('dist_to_center').iloc[0]
    seeds.append(first_seed['GEOID20'])
    assigned_pop += first_seed['Total_2020_Total']
    candidates = candidates[candidates['GEOID20'] != first_seed['GEOID20']]

    # Now select remaining seeds balancing distance and population:
    while len(seeds) < num_districts:
        # Compute for each candidate:
        # - min distance to existing seeds (to spread geographically)
        # - difference of candidate pop to ideal_seed_pop (to balance pop)

        weight_pop = 0.9
        weight_dist = 1 - weight_pop
        max_dist = candidates['lon'].max() - candidates['lon'].min()  # rough max lon range as proxy for max distance
        max_pop_diff = candidates['Total_2020_Total'].max() - candidates['Total_2020_Total'].min()
        def score_candidate(row):
            dist_to_seed = min(euclidean(coords[row['GEOID20']], coords[s]) for s in seeds)
            pop_diff = abs(row['Total_2020_Total'] - ideal_seed_pop)

            dist_norm = dist_to_seed / max_dist if max_dist > 0 else 0
            pop_norm = pop_diff / max_pop_diff if max_pop_diff > 0 else 0

            return weight_dist * dist_norm - weight_pop * pop_norm


        candidates['score'] = candidates.apply(score_candidate, axis=1)
        next_seed = candidates.sort_values('score', ascending=False).iloc[0]

        seeds.append(next_seed['GEOID20'])
        assigned_pop += next_seed['Total_2020_Total']
        candidates = candidates[candidates['GEOID20'] != next_seed['GEOID20']]

    return seeds


In [None]:
from collections import deque
import ast
import math

def round_robin_balanced(
        assignment,
        contiguity_df,
        seed_geoids,
        precinct_data,
        district_ids=None,
    ):
    """
    Round-robin flood fill that preserves contiguity and tries to balance populations.
    Returns the new assignment, the population of each district, and the ideal district population.
    """

    # --- PREP ---
    assignment = assignment.copy()
    assignment['District'] = 0

    if district_ids is None:
        K = len(seed_geoids)
        district_ids = list(range(1, K+1))
    else:
        K = len(district_ids)

    # neighbor map
    cont = contiguity_df.copy()
    cont.columns = ['Precinct', 'Neighbors']
    neigh_map = {}
    for p, ns in zip(cont['Precinct'], cont['Neighbors']):
        try:
            neigh_map[p] = ast.literal_eval(ns)
        except (ValueError, SyntaxError, TypeError):  # malformed or missing neighbor list
            neigh_map[p] = []

    # population lookup
    pop_map = dict(zip(
        precinct_data["GEOID20"],
        precinct_data["Total_2020_Total"]
    ))

    total_pop = sum(pop_map.values())
    ideal_pop = total_pop / K  # ideal district size

    # --- STATE ---
    assigned = set()
    district_pop = {d: 0 for d in district_ids}

    # queues: but now neighbors inside each queue will be sorted by pop
    queues = {d: deque() for d in district_ids}

    # seed assignments
    for d, seed in zip(district_ids, seed_geoids):
        assignment.loc[assignment['GEOID20'] == seed, "District"] = d
        assigned.add(seed)
        queues[d].append(seed)
        district_pop[d] += pop_map.get(seed, 0)

    total_precincts = len(assignment)

    # ---------- MAIN LOOP -------------
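    while len(assigned) < total_precincts:
        # Stop when no district can grow any further (e.g. unreachable precincts remain);
        # without this check the loop would never terminate in that case
        if not any(queues.values()):
            break
        # Prioritize districts most under ideal population
        sorted_districts = sorted(district_ids, key=lambda d: ideal_pop - district_pop[d], reverse=True)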

        for d in sorted_districts:

            if len(assigned) >= total_precincts:
                break

            if not queues[d]:
                continue

            curr = queues[d].popleft()
            neighbors = [nb for nb in neigh_map.get(curr, []) if nb not in assigned]
            if not neighbors:
                continue

            deficit = ideal_pop - district_pop[d]
            pop_weight = 3.0  #weight value

            def score_neighbor(g):
                pop = pop_map.get(g, 0)
                if deficit > 0:
                    return pop * pop_weight -deficit
                else:
                    return -pop * pop_weight - deficit

            neighbors_sorted = sorted(neighbors, key=score_neighbor, reverse=True)

            # If d is below ideal_pop pick large precincts first  
            # If d is above ideal_pop pick small precincts first

            # assign neighbors in that priority order
            for nb in neighbors_sorted:
                if nb in assigned:
                    continue

                assignment.loc[assignment['GEOID20'] == nb, "District"] = d
                assigned.add(nb)
                queues[d].append(nb)
                district_pop[d] += pop_map.get(nb, 0)

    return assignment, district_pop, ideal_pop
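
To see the region-growing idea in isolation, here is a simplified sketch on a toy graph. It is a variant, not the function above: it claims one precinct per turn instead of all unassigned neighbors, and always lets the least populated district move next. The names `grow_districts`, `toy_neighbors` and `toy_population` are made up for this example.

```python
from collections import deque

def grow_districts(neighbors, population, seeds):
    """Grow one district per seed; each claimed node touches a node already in the district."""
    district_of = {seed: d for d, seed in enumerate(seeds, start=1)}
    frontier = {d: deque([seed]) for d, seed in enumerate(seeds, start=1)}
    totals = {d: population[seed] for d, seed in enumerate(seeds, start=1)}

    while any(frontier.values()):
        # The least populated district that can still grow moves next
        d = min((k for k in frontier if frontier[k]), key=lambda k: totals[k])
        node = frontier[d][0]
        free = [n for n in neighbors[node] if n not in district_of]
        if not free:
            frontier[d].popleft()  # nothing left to claim around this node
            continue
        claimed = free[0]
        district_of[claimed] = d
        totals[d] += population[claimed]
        frontier[d].append(claimed)
    return district_of, totals

# Six precincts in a row (0-1-2-3-4-5), 10 people each, seeds at both ends
toy_neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
toy_population = {n: 10 for n in toy_neighbors}
district_of, totals = grow_districts(toy_neighbors, toy_population, seeds=[0, 5])
print(district_of, totals)
```

Because every claimed precinct is adjacent to one already in the same district, each district is contiguous by construction. The loop ends when every frontier is empty, so precincts that cannot be reached from any seed stay unassigned instead of causing an endless loop.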


In [210]:
seeds = select_population_balanced_seeds(nj_precinct_data, nj_precinct_boundaries, nj_contiguity, num_districts=12)
balanced_assignment, pops, target = round_robin_balanced(
    nj_current_assignment,
    nj_contiguity,
    seeds,
    nj_precinct_data
)
for d in range(1, 13):
    print(d, pops[d], " vs ideal ", target)
    print("Contiguous? ", isDistrictContiguous(d, balanced_assignment, nj_contiguity))
balanced_assignment.to_csv("district_assignment_output.csv", index=False)

1 637439  vs ideal  774082.8333333334
Contiguous?  True
2 533174  vs ideal  774082.8333333334
Contiguous?  True
3 1227340  vs ideal  774082.8333333334
Contiguous?  True
4 633440  vs ideal  774082.8333333334
Contiguous?  True
5 798547  vs ideal  774082.8333333334
Contiguous?  True
6 901595  vs ideal  774082.8333333334
Contiguous?  True
7 897636  vs ideal  774082.8333333334
Contiguous?  True
8 1045843  vs ideal  774082.8333333334
Contiguous?  True
9 576731  vs ideal  774082.8333333334
Contiguous?  True
10 651095  vs ideal  774082.8333333334
Contiguous?  True
11 822556  vs ideal  774082.8333333334
Contiguous?  True
12 563598  vs ideal  774082.8333333334
Contiguous?  True
