# Redistricting Lab

**Objectives**
 - Introduce students to congressional redistricting and gerrymandering
 - Introduce students to a new integer programming formulation
 - Expose students to measures of compactness and the efficiency gap
 - Give students experience creating congressional districts for different objectives

**Reading:** This lab is based on real congressional redistricting research done by Wes Gurnee and David Shmoys. More information about this research can be found at [fairmandering.org](https://www.fairmandering.org). A more complete description of the algorithm can be found [here](https://www.fairmandering.org/algorithm.html).

**Brief description:** <font color='red'> Update when complete.</font>

<font color='blue'> <b>Solutions are shown blue.</b> </font> <br>
<font color='red'> <b>Instructor comments are shown in red.</b> </font>

In [None]:
# imports -- don't forget to run this cell!
from redistricting import *
from ortools.linear_solver import pywraplp as OR
from bokeh.io import output_notebook, show
output_notebook()

## Part I: Redistricting, Gerrymandering, and the Efficiency Gap

Every 10 years in the US, following the census, the process of *reapportionment* determines how the 435 representatives in the House of Representatives will be distributed among the 50 states. Depending on population shifts, a state may lose or gain a representative. Each state is divided into electoral districts, one for each representative. Hence, after reapportionment, states *redistrict* to redraw the electoral district boundaries. These boundaries are drawn subject to a few federal constraints:

- Districts must have roughly the same population (total state population / number of representatives).
- They must be compact and contiguous.
- You cannot divide minorities across multiple districts (cracking) or have a district in which they are the majority (packing).

You may have noticed that there is no restriction on packing or cracking political parties. These two strategies are common in a phenomenon known as *gerrymandering*. Gerrymandering is the process of redrawing these electoral district boundaries to give one political party an advantage over another. In the diagram below, you can see an example of both cracking and packing.

<img src="images/gerrymandering.png" width=600 height=300 />

*Taken from [The Washington Post](https://www.washingtonpost.com/news/wonk/wp/2015/03/01/this-is-the-best-explanation-of-gerrymandering-you-will-ever-see/)*

Each district should have 10 squares. On the right, the two U-shaped districts are packed with blue squares (9/10). The remaining blue squares are then cracked by splitting them among the remaining districts. Even though red is the minority, gerrymandering allows them to have a majority of the districts.

To address this, some states have introduced restrictions against gerrymandering. However, these are hard to police. How can you prove that someone gerrymandered? We will look at one fairness metric that has received a lot of attention called the *efficiency gap*. Essentially, the efficiency gap tracks two types of "wasted" votes: lost votes for the losing candidate and excess votes (above 50%) for the winning candidate.

Taking diagram (2) as an example: There are 4 lost votes for red in each of the districts for a total of 20 wasted red votes. There is one excess vote for blue in each district (blue got 6 votes but 50% of the votes is 5) for a total of 5 wasted blue votes.

**Q:** Consider each square in diagram (3) above a vote. How many wasted votes does blue have? How many are lost votes and how many are excess votes? How does this compare to red?

**A:** <font color='blue'> There are 4 lost votes in each of the 3 red-winning districts and 4 excess votes in each of the 2 blue-winning districts for a total of 20 wasted votes. Conversely, red has 2 lost votes and 3 excess votes for a total of 5 wasted votes. There are many more wasted blue votes. </font>

Now that we know the total wasted votes for each party, we can compute the efficiency gap as follows:

$$\text{Efficiency Gap} = \frac{\text{Total Wasted Blue Votes} - \text{Total Wasted Red Votes}}{\text{Total Votes}}$$

**Q:** Continuing with our example, what is the efficiency gap?

**A:** <font color='blue'> The efficiency gap is $(20 - 5) / 50 = 0.30$.</font>

You should have found the efficiency gap to be $0.30$. This has the following interpretation: the red party won $30\%$ more seats than they should have. $30\%$ of $5$ is $1.5$. They should have won 2 seats but they managed to win an additional seat. If everything were perfectly fair, the efficiency gap would be 0. However, perfect equality is impossible. The authors of the efficiency gap, Stephanopoulos and McGhee, propose $\pm2\%$ for congressional redistricting.
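
The wasted-vote bookkeeping above can be sketched in a few lines of Python. This is only an illustration: the helper names `wasted_votes` and `efficiency_gap` are hypothetical (they are not part of the lab's `redistricting` module), the district vote counts are read off the description of diagrams (2) and (3), and the handling of ties is an assumption.

```python
def wasted_votes(blue, red):
    """Return (wasted_blue, wasted_red) for one district.

    Following the convention used in this lab, the winner wastes every
    vote above 50% of the district total and the loser wastes all votes.
    """
    half = (blue + red) / 2
    if blue > red:
        return blue - half, red
    if red > blue:
        return blue, red - half
    return 0, 0  # assumption: in a tie, neither side is counted as wasting votes


def efficiency_gap(districts):
    """Efficiency gap of a plan given a list of (blue_votes, red_votes) pairs."""
    wasted_blue = sum(wasted_votes(b, r)[0] for b, r in districts)
    wasted_red = sum(wasted_votes(b, r)[1] for b, r in districts)
    total_votes = sum(b + r for b, r in districts)
    return (wasted_blue - wasted_red) / total_votes


# Diagram (2): blue wins all five districts 6-4.
diagram_2 = [(6, 4)] * 5
# Diagram (3): two districts packed 9-1 for blue, three districts won 6-4 by red.
diagram_3 = [(9, 1)] * 2 + [(4, 6)] * 3

print(efficiency_gap(diagram_2))  # -0.3 (favors blue)
print(efficiency_gap(diagram_3))  # 0.3 (favors red)
```

With this sign convention, a positive gap means more blue votes were wasted, so the plan favors red.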

## Part II: Redistricting With Integer Programming

Let's start by considering a small redistricting example. In this example, we want to divide this region into 5 districts. Hence, each district will be of size 3.

In [None]:
show(grid_plot(small_example))

<font color='red'> If students have seen partitioning or a similar set system before, we could liken this to that.</font>

Now, let's start thinking about how we can formulate this problem as an integer program. One idea would be to generate all of the possible feasible districts of size 3 and then choose 5 of them that "fit together". Let's see how many possible feasible districts there are!

In [None]:
plot_feasible_districts(small_example,3)
print('There are %d feasible districts of size 3.' % (len(feasible_districts_on_grid(small_example,3)[0])))

There are fewer than 50 feasible districts! That is a number we can work with.
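
To get a feel for where such a count comes from, here is a minimal brute-force sketch. It assumes the region is a 3x5 grid of blocks and that "feasible" means only that the blocks of a district are connected through shared edges; both are assumptions, and the lab's `feasible_districts_on_grid` may apply different rules. The function names are hypothetical.

```python
from itertools import combinations


def is_connected(cells):
    """True if the (row, col) cells form one edge-connected piece."""
    cells = set(cells)
    start = next(iter(cells))
    seen = {start}
    stack = [start]
    while stack:
        r, c = stack.pop()
        for neighbor in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if neighbor in cells and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(cells)


def connected_districts(n_rows, n_cols, size):
    """Enumerate every connected set of `size` blocks on an n_rows x n_cols grid."""
    blocks = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    return [d for d in combinations(blocks, size) if is_connected(d)]


print(len(connected_districts(3, 5, 3)))  # 46 on the assumed 3x5 grid
```

Checking every combination is only workable for tiny grids, since the number of combinations grows very quickly with the grid and district size.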

**Q:** What should be the decision variables of our integer program? How do we interpret their value?

**A:** <font color='blue'> There is a binary variable $x_j$ for each district $j$. If $x_j$ is 1, it is used in the plan; otherwise, it is not. </font>

**Q:** Clearly define what it means for districts to "fit together".

**A:** <font color='blue'> A set of districts fits together if each block appears in exactly one of the districts.</font>

It would be helpful to keep track of which blocks appear in which districts. Let's say we have a set of blocks $i \in B$ and a set of feasible districts $j \in D$. We will have a matrix $A$ where $a_{ij} = 1$ if and only if block $i$ is contained within district $j$.

In [None]:
A = grid_district_matrix(small_example, 3)

**Q:** Using the decision variables you defined above, write the constraint(s) that enforce each block appears in exactly one district.

**A:** <font color='blue'> $$\sum_{j \in D} a_{ij}x_j = 1 \quad \forall i \in B$$ </font>

**Q:** We must select exactly $k$ districts to be in our final plan. Write a constraint that enforces this.

**A:** <font color='blue'> $$\sum_{j \in D}  x_j = k$$ </font>
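
As a quick sanity check of these two constraints, the sketch below builds the block-district matrix for a toy 2x2 grid with districts of size 2 and tests whether a 0/1 selection vector describes a valid plan. The grid, the names `toy_districts`, `A_toy`, and `is_valid_plan` are all hypothetical and chosen so they do not clash with the lab's variables.

```python
# Blocks of a 2x2 grid, numbered row by row:  0 1
#                                             2 3
# The four contiguous districts of size 2 (one per column of the matrix).
toy_districts = [{0, 1}, {2, 3}, {0, 2}, {1, 3}]
n_blocks = 4

# a_ij = 1 if and only if block i is contained in district j
A_toy = [[1 if i in d else 0 for d in toy_districts] for i in range(n_blocks)]


def is_valid_plan(A, x, k):
    """Check that every block is covered exactly once and exactly k districts are used."""
    covers_once = all(sum(a_ij * x_j for a_ij, x_j in zip(row, x)) == 1 for row in A)
    uses_k = sum(x) == k
    return covers_once and uses_k


print(is_valid_plan(A_toy, [1, 1, 0, 0], k=2))  # True: top row and bottom row
print(is_valid_plan(A_toy, [1, 0, 1, 0], k=2))  # False: block 0 is covered twice
```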

What should our objective function be? Well, we want to minimize the efficiency gap. First, we will compute the efficiency gap for each of the generated feasible districts. The dataframe below has a row for each of the feasible districts. `tracts` and `tract_coord` give the list of blocks in that district and their location. You can ignore `square_roeck` for now but it is a measure of the compactness for a district.

In [None]:
district_df = create_districts_df(small_example, 3)
display(district_df.head())

Let's look at some example districts and their efficiency gap! The `R : 1, D : 0` gives the number of districts won by each party. Since we are only viewing one district below, the 1 will appear next to the party who won that district.

In [None]:
plot_grid_districts(small_example, district_df, [2])
plot_grid_districts(small_example, district_df, [26])

We want to find the most fair redistricting plan: the one where the average district has an efficiency gap closest to 0. We can achieve this by minimizing the absolute value of the average efficiency gap:

$$ \left| \enspace \frac{\sum_{j \in D}c_jx_j}{k}\enspace \right| $$

where $k$ is the number of districts to be chosen and $c_j$ is the efficiency gap of district $j$. However, since $k$ is a constant, we can simplify this by minimizing

$$ \left| \enspace \sum_{j \in D}c_jx_j\enspace \right|.$$

Putting this all together, we have a complete formulation:

$$\begin{align*}
\min \quad & \left| \enspace \sum_{j \in D} c_jx_j \enspace \right| \\
\text{s.t.} \quad &  \sum_{j \in D} a_{ij}x_j = 1 \quad \forall i \in B & (1)\\
\quad &  \sum_{j \in D}  x_j = k & (2)\\
\quad & x_j \in \{0,1\} \quad \forall j \in D & (3)\\
\end{align*}$$

**Q:** Is this formulation an integer linear program? Why or why not?

**A:** <font color='blue'>No, the objective function is not linear because it has an absolute value.</font>

<font color='red'> Could be good for students to see this trick earlier in the course. <br>**TODO:** Maybe introduce via tetris board completion</font>

It turns out there is a special trick for encoding an absolute value in a linear program. Consider the simpler example of minimizing $|x|$. We can introduce a new variable $w$ and enforce the constraints $x \leq w$ and $x \geq -w$. Now, we can just minimize $w$ as our objective function. Do you see why this works? After introducing this trick into our formulation, we get the following integer linear program (ILP):

$$\begin{align*}
\min \quad & w \\
\text{s.t.} \quad &  \sum_{j \in D} a_{ij}x_j = 1 \quad \forall i \in B & (1)\\
\quad &  \sum_{j \in D}  x_j = k & (2)\\
\quad & \sum_{j \in D} c_jx_j \leq w & (3) \\
\quad & \sum_{j \in D} c_jx_j \geq -w & (4) \\
\quad & x_j \in \{0,1\} \quad \forall j \in D & (5)\\
\end{align*}$$
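
One way to convince yourself that constraints (3) and (4) reproduce the absolute value is to fix the plan total $s = \sum_{j \in D} c_jx_j$ and search for the smallest $w$ that satisfies both constraints. The sketch below does this over a coarse grid of candidate values; the helper names and the sample totals are hypothetical.

```python
def is_feasible(s, w):
    """Constraints (3) and (4) for a fixed plan total s."""
    return s <= w and s >= -w


def smallest_feasible_w(s, candidates):
    """Smallest candidate w that satisfies both constraints for the total s."""
    return min(w for w in candidates if is_feasible(s, w))


# Candidate values -1.00, -0.99, ..., 1.00
candidates = [i / 100 for i in range(-100, 101)]

for s in [-0.40, -0.10, 0.0, 0.25]:
    print(s, smallest_feasible_w(s, candidates), abs(s))
```

In each case the smallest feasible $w$ matches $|s|$: the two constraints force $w \geq s$ and $w \geq -s$, and minimizing pushes $w$ down until one of them is tight.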

Let's write this model up and use it to solve our small redistricting problem!

**Q:** Complete the model below by implementing the $k$ total districts constraint.

In [None]:
def redistrict(k, A, cost, integer=False, opt_type='abs_val', solver='CBC'):
    """A model for solving a congressional redistricting problem.
    
    Args:
        k (int): number of districts in a plan
        A (np.ndarray): binary matrix a_ij = 1 if tract i is in district j
        cost (np.ndarray): cost coefficients of districts
        integer (bool): if True, use binary variables; otherwise solve the LP relaxation
        opt_type (str): {"min", "max", "abs_val"}
        solver (str) : {"CBC", "gurobi"}
    """
    n_tracts, n_columns = A.shape
    TRACTS = range(n_tracts)
    DISTRICTS = range(n_columns)
    
    # define the model
    if solver=='CBC':
        m = OR.Solver('redistrict', OR.Solver.CBC_MIXED_INTEGER_PROGRAMMING)        
    elif solver=='gurobi':
        m = OR.Solver('redistrict', OR.Solver.GUROBI_MIXED_INTEGER_PROGRAMMING)
    else:
        raise ValueError('Invalid solver')

    # decision variables
    x = {} # x[d] is 1 if district d is used, 0 otherwise
    for d in DISTRICTS:
        if integer:
            x[d] = m.IntVar(0, 1, name="x(%s)" % d)
        else:
            x[d] = m.NumVar(0, 1, name="x(%s)" % d)

    # objective function
    if opt_type == 'min':
        m.Minimize(sum(cost[d] * x[d] for d in DISTRICTS))
    elif opt_type == 'max':
        m.Maximize(sum(cost[d] * x[d] for d in DISTRICTS))
    elif opt_type == 'abs_val':
        w = m.NumVar(-k, k, name="w")
        m.Add(sum(cost[d] * x[d] for d in DISTRICTS) <= w, name='absval_pos')
        m.Add(sum(cost[d] * x[d] for d in DISTRICTS) >= -w, name='absval_neg')
        m.Minimize(w)
    else:
        raise ValueError('Invalid optimization type')

    # subject to: each census tract appears in exactly one district
    for t in TRACTS:    
        m.Add(sum(x[d] * A[t,d] for d in DISTRICTS) == 1)

    # subject to: k total districts
    # TODO: Implement this constraint
    
    ### BEGIN SOLUTION
    m.Add(sum(x[d] for d in DISTRICTS) == k)
    ### END SOLUTION

    return m,x

In [None]:
# You do not need to pay attention to this cell but make sure you run it!
def solve(m, solver='CBC'):
    """Solve the model and specify some solver parameters."""
    if solver=='CBC':
        m.SetTimeLimit(2000000)  # time limit in milliseconds
        params = OR.MPSolverParameters()
        params.SetDoubleParam(params.RELATIVE_MIP_GAP, 1e-4)
        status = m.Solve(params)
    elif solver=='gurobi':
        # %g keeps the 1e-4 gap; %d would truncate it to 0
        params_set = m.SetSolverSpecificParametersAsString(
                     '''TimeLimit %d
                        MIPGapAbs %g''' % (2000, 1e-4))
        if params_set:
            print('Gurobi solver parameters set successfully.')
        status = m.Solve()
    else:
        raise ValueError('Invalid solver')
    if status == OR.Solver.OPTIMAL:
        print('Optimal solution found.')
        print('Objective value =', m.Objective().Value())
    else:
        print('Optimal solution was not found - %s' % (status))

In [None]:
m,x = redistrict(5, A, district_df['efficiency_gap'], integer=True, opt_type='abs_val', solver='CBC')
solve(m, solver='CBC')
sol = [i for i, v in x.items() if v.solution_value() > .5]
plot_grid_districts(small_example, district_df, sol)

**Q:** What was the optimal redistricting plan? What was the average efficiency gap?

**A:** <font color='blue'>There were 3 R districts and 2 D. The average efficiency gap was -0.06667. </font>

**Q:** Does this plan seem fair to you? Why or why not?

**A:** <font color='blue'> Yes, the Republican Party had $\frac{9}{15} = 60\%$ of the votes and won $60\%$ of the seats so this seems pretty fair. </font>

What if we wanted to gerrymander for the Republican party? How would our objective function change? One input to this model is `opt_type` which has the following options:
- `min` : minimize the objective function
- `max` : maximize the objective function
- `abs_val` : minimize the absolute value of the objective function

**Q:** Change `opt_type` to gerrymander for the Republican party.

In [None]:
# TODO: Change opt_type to gerrymander for the Republican party.
# m,x = redistrict(5, A, district_df['efficiency_gap'], integer=True, opt_type='XXX', solver='CBC')

### BEGIN SOLUTION
m,x = redistrict(5, A, district_df['efficiency_gap'], integer=True, opt_type='max', solver='CBC')
### END SOLUTION
solve(m, solver='CBC')
sol = [i for i, v in x.items() if v.solution_value() > .5]
plot_grid_districts(small_example, district_df, sol)

**Q:** How many districts did the Republican party win? What was the efficiency gap? Is this what you would expect given the interpretation of the efficiency gap?

**A:** <font color='blue'>The Republican party won 4 districts. The efficiency gap was $0.20$. Given the efficiency gap, we would expect the Republican party to win $5(0.20) = 1$ district more than their fair share which they did! </font>

**Q:** Can you see remnants of packing and cracking in the gerrymandered plan?

**A:** <font color='blue'>Yes, most Democratic voters are packed into one district and the rest are cracked among the remaining districts.</font>

Let's look at a slightly larger example (with 6 districts)! First, we will redistrict fairly.

In [None]:
show(grid_plot(large_example))

In [None]:
# Create the model inputs
A = grid_district_matrix(large_example, 7)
district_df = create_districts_df(large_example, 7)
print('There are %d feasible districts of size 7' % (len(district_df)))

Woah! The number of feasible districts grew by quite a lot! In fact, the number of feasible districts grows exponentially. If you think back to the 10x5 grid from **Part I**, this example has 277,705 feasible districts! <font color='red'> Maybe compute and just give optimal solutions here.</font>

In [None]:
m,x = redistrict(6, A, district_df['efficiency_gap'], integer=True, opt_type='abs_val', solver='CBC')
solve(m, solver='CBC')
sol = [i for i, v in x.items() if v.solution_value() > .5]
plot_grid_districts(large_example, district_df, sol)

**Q:** Change `opt_type` to gerrymander for the Democratic party.

In [None]:
# TODO: Change opt_type to gerrymander for the Democratic party.
# m,x = redistrict(6, A, district_df['efficiency_gap'], integer=True, opt_type='XXX', solver='CBC')
### BEGIN SOLUTION
m,x = redistrict(6, A, district_df['efficiency_gap'], integer=True, opt_type='min', solver='CBC')
### END SOLUTION
solve(m, solver='CBC')
sol = [i for i, v in x.items() if v.solution_value() > .5]
plot_grid_districts(large_example, district_df, sol)

**Q:** How many districts did the Democratic party win? What was the efficiency gap? Is this what you would expect given the interpretation of the efficiency gap?

**A:** <font color='blue'>The Democratic party won 3 districts. The efficiency gap was $-0.38$. Given the efficiency gap, we would expect the Democratic party to win $6(0.38) = 2.28$ districts more than their fair share and they won 2 more!</font>

<font color='red'>**TODO:** Compute optimal redistricting for the 5x10 example.</font>

## Part III: Redistricting Georgia

At scale, it is impractical to fully enumerate all of the feasible districts. Georgia has nearly 2,000 "blocks". Instead, we must generate a relatively small subset of these districts while maintaining that these districts will be compatible. The paper by Wes Gurnee and David Shmoys provides a way for doing just this! We will ignore the details of this algorithm for the sake of this lab and just give you the data. We give you 385 sets of compatible feasible districts.

We skipped a lot of complexity involved in generating a sample tree. For the sake of this lab, this data has been given to us.

First, let's look at some of the data we have. The dataframes below give us some statistics and past election results for each census tract.

In [None]:
tracts = load_tract_shapes()
leaf_nodes = pickle.load(open('data/leaf_nodes.p', 'rb'))
internal_nodes = pickle.load(open('data/internal_nodes.p', 'rb'))
print('There are %d sets and %d total feasible districts.' % (len(internal_nodes[0]['children_ids']), 
                                                              len(leaf_nodes)))

We can view each of these districts using the function below!

In [None]:
# change i (0..215784) to highlight different feasible districts corresponding to the leaf nodes
i = 0
ax = tracts.plot(figsize=(10, 10), color='none', edgecolor='black', lw=.25) 
tracts.iloc[leaf_nodes[i]['area']].plot(ax=ax, color='green', alpha=.3);

Again, we have a dataframe of districts with various statistics for each district. Most importantly, we compute the `efficiency_gap` for each district.

In [None]:
district_df = pd.read_csv(os.path.join('data', 'ga_district_df.csv'))
district_df['efficiency_gap'] = efficiency_gap_coefficients(district_df, .53)
display(district_df.head())

Before running our model, we first need to format our data to fit the model's input. We want to split Georgia into 14 districts. The `partition_map` divides the feasible districts into their 385 compatible sets.

In [None]:
n_districts = 14
A = make_tdm(leaf_nodes, len(tracts))
partition_map = list(make_root_partition_to_leaf_map(leaf_nodes, internal_nodes).items())

Unfortunately, this problem is still large enough that the open-source solver we use in this class (CBC) would take a little over an hour to solve all 385 instances. For that reason, we will solve a small subset of these instances. The function below solves some subset of the instances for given cost coefficients and an optimization type.

In [None]:
def solve_instances(cost, opt_type, index_subset):
    '''Solve a subset of the instances and return a list of each solution.'''
    solutions = []
    for i in index_subset:
        partition_ix, leaf_slice = partition_map[i]
        m,x = redistrict(k=n_districts,
                         A=A[:, leaf_slice],
                         cost=np.array(cost)[leaf_slice],
                         integer=True,
                         opt_type=opt_type,
                         solver='CBC')
        solve(m, solver='CBC')
        opt_cols = [j for j, v in x.items() if v.solution_value() > .5]
        solutions.append({'n_leaves': len(leaf_slice),
                          'solution_ixs': leaf_slice[opt_cols],
                          'optimal_objective': np.array(cost)[leaf_slice][opt_cols]})
    return solutions

We want to minimize the absolute value of the efficiency gap!

In [None]:
solutions = solve_instances(district_df['efficiency_gap'], 'abs_val', [4,5,6])

Now that we have some solutions, we need to pick one of them. The simplest thing to do is pick the one with the efficiency gap closest to 0.
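
The selection rule just described can be sketched as follows. The per-district gaps below are made-up illustrative numbers, not results from the Georgia data, and `fairest_plan` is a hypothetical helper rather than part of the lab's code.

```python
from statistics import mean

# Hypothetical per-district efficiency gaps for three candidate plans
# (illustrative numbers only).
toy_plan_gaps = [
    [0.10, -0.08, 0.02, -0.04],
    [0.12, 0.05, -0.03, 0.06],
    [0.20, -0.02, 0.04, 0.10],
]


def fairest_plan(plan_gaps):
    """Index of the plan whose average efficiency gap is closest to 0."""
    mean_gaps = [mean(gaps) for gaps in plan_gaps]
    return min(range(len(mean_gaps)), key=lambda i: abs(mean_gaps[i]))


print(fairest_plan(toy_plan_gaps))
```

Using the absolute value matters here: a plan with a mean gap of $-0.30$ is far less fair than one with a mean gap of $0.05$, even though $-0.30$ is the smaller number.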

In [None]:
for sol in solutions:
    e_gap = round(district_df.loc[sol['solution_ixs']]['efficiency_gap'].mean(),5)
    print(e_gap)

**Q:** Just by looking at the efficiency gap for each of these solutions, which of these is the most fair? 

**A:** <font color='blue'> The first solution is the most fair because the efficiency gap is almost 0.</font>

**Q:** In the third solution, the efficiency gap is nearly $0.04$. For which party is this an advantage?

**A:** <font color='blue'> This favors the Republican Party. </font>

Let's look at each solution graphically since there are only 3!

In [None]:
for sol in solutions:
    politics_map(tracts, district_df, leaf_nodes, sol, figsize=(6,6));

**Q:** Do these solutions reflect what you would expect from the efficiency gaps?

**A:** <font color='blue'> Yes, both the second and third solution have higher positive efficiency gaps which means more Democratic votes were wasted. We can see more evidence of packing in these solutions.</font>

## Part IV: Redistricting Georgia (Different Objectives)

In **Part III** we minimized the absolute value of the efficiency gap to make things fair. Alternatively, we could use this same tool to gerrymander. For example, if we wanted to gerrymander to favor The Republican Party, we would want to maximize the efficiency gap. Let's do this for the same instances we optimized to be fair.

In [None]:
solutions = solve_instances(district_df['efficiency_gap'], 'max', [4,5,6])

In [None]:
for sol in solutions:
    politics_map(tracts, district_df, leaf_nodes, sol, figsize=(6,6));
    e_gap = round(district_df.loc[sol['solution_ixs']]['efficiency_gap'].mean(),5)
    print(e_gap)

**Q:** How effective was the gerrymandering? How did the most fair solution (6 D, 8 R seats) compare to the solution with the highest efficiency gap here? Based on the interpretation of the efficiency gap, does this make sense?

**A:** <font color='blue'> There are 11 R seats and only 3 D seats. The efficiency gap was ~$20\%$ so we would expect the Republican Party to get $0.20(14) \approx 2.8$ more seats and they got 3 more!</font>

**Q:** How was the Democratic vote suppressed so heavily?

**A:** <font color='blue'> All of the Democratic votes were packed into 3 urban districts.</font>

**Q:** Choose the correct cost coefficients and optimization type to gerrymander for the Democratic Party.

**Note**: We look at a different 3 root partitions here because the previous 3 are hard to gerrymander in the Democratic Party's favor.

In [None]:
# TODO: Uncomment, and gerrymander for The Democratic Party
# solutions = solve_instances(, , [4,110,216])

### BEGIN SOLUTION
solutions = solve_instances(district_df['efficiency_gap'], 'min', [4,110,216])
### END SOLUTION

In [None]:
for sol in solutions:
    politics_map(tracts, district_df, leaf_nodes, sol, figsize=(6,6));
    e_gap = round(district_df.loc[sol['solution_ixs']]['efficiency_gap'].mean(),5)
    print(e_gap)

**Q:** How effective was the gerrymandering? How did the most fair solution (6 D, 8 R seats) compare to the solution with the lowest efficiency gap here? Based on the interpretation of the efficiency gap, does this make sense?

**A:** <font color='blue'> There are 7 R seats and 7 D seats. The efficiency gap was ~$0.03$ so we would expect the Republican Party to get $0.03(14) \approx 0.42$ more seats and they got 1 more!</font>

**Q:** How was the Republican vote suppressed so heavily?

**A:** <font color='blue'> The Republican votes were heavily packed, allowing the Democrats to spread their majority across multiple districts coming into the urban center. </font>

## Bonus I: Generating a Subset of the Districts

In this lab, it quickly became apparent that it would be impossible to enumerate all of the feasible districts for a problem at scale. We waved our hands and provided you with a subset of feasible districts (that were compatible with each other) for the state of Georgia. In this section, we give a brief overview of the algorithm which generated these districts.

<font color='red'> Do census tracts have to reside in one district or was this just your approach?</font>

In our small examples, each district was composed of smaller blocks. Similarly, each state is divided into census tracts with populations of approximately 4,000 for the purpose of taking the census. Below, we load the census tracts for Georgia.

In [None]:
tracts = load_tract_shapes()
tracts.plot(figsize=(10, 10), color='none', edgecolor='black', lw=.25);
print('There are %d census tracts in Georgia' % (len(tracts)))

Depending on the size a district should be (roughly the state's population divided by the number of representatives), we can define a district by some contiguous subset of these census tracts. To keep track of contiguity, we can create an adjacency graph where there is a node for each census tract and an edge between two nodes if and only if those census tracts are geographically adjacent.
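
Given such an adjacency graph, checking whether a set of tracts is contiguous amounts to a graph search restricted to those tracts. The sketch below uses a plain dictionary as a stand-in for the graph; the five-tract layout and the name `is_contiguous` are hypothetical and not taken from the lab's data or code.

```python
from collections import deque

# Hypothetical adjacency graph: five tracts in a row, 0 - 1 - 2 - 3 - 4.
toy_adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}


def is_contiguous(tract_ids, adjacency):
    """True if the tracts form one connected piece of the adjacency graph."""
    tract_ids = set(tract_ids)
    if not tract_ids:
        return False
    start = next(iter(tract_ids))
    seen = {start}
    queue = deque([start])
    while queue:
        tract = queue.popleft()
        for neighbor in adjacency[tract]:
            # Only walk along edges that stay inside the candidate district.
            if neighbor in tract_ids and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == tract_ids


print(is_contiguous([0, 1, 2], toy_adjacency))  # True
print(is_contiguous([0, 2], toy_adjacency))     # False: tract 1 separates them
```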

In [None]:
adjacency_graph = load_graph()
draw_adjacency_graph(tracts, adjacency_graph, figsize=(10, 10));
print('The adjacency graph has %d nodes and %d edges.' % (len(adjacency_graph.nodes), len(adjacency_graph.edges)))

For the state of Georgia, there are nearly 2000 census tracts. Since the number of feasible districts grows exponentially, it would be impossible to enumerate all of Georgia's feasible districts. Instead, we need to find a way to generate a small subset of feasible districts while guaranteeing there is a feasible way to select a subset of them that "fit" together.

The paper by Wes Gurnee and David Shmoys addresses this very issue! They formulated a structured way of generating a large number of compatible feasible districts (though small relative to the total number possible). We won't go into detail about this process but here is a brief overview.

Essentially, the state is divided into some number of smaller regions (partitioned) some number of times (385 in the case of this Georgia example). For each of these partitions, this process is repeated until the size of the regions is appropriate for the size of a district. Here is a small section of a sample tree:

<img src="images/sample_tree.png" width=650 height=300 />

*Taken from [fairmandering.org](https://www.fairmandering.org/algorithm.html)*

The root of this tree is the entire state. In the first layer of the tree, you have the first partitions (which we call root partitions). In the last layer of the tree (also called the leaf nodes), the partitions represent the feasible districts. Just one more piece of vocabulary! If you can follow a path directly down from one node $a$ to another $b$, we say that $a$ is an ancestor of $b$. For example, the middle root partition is an ancestor of every shown leaf. The right and left root partitions are **not** ancestors of any shown leaf.

**B1:** Can two districts that have different root partition ancestors be used in the same plan together? Why or why not?

**A:** <font color='blue'> No, the "puzzle pieces" would not fit!</font>

In **B1**, you should have argued that they would not be compatible. Hence, we solve this optimization problem for each of the root partitions. Each root partition is the ancestor of some subset of the leaves which represent feasible districts. For each root partition, we can use these feasible districts as the input to the ILP we described! In the Georgia example, we had 385 root partitions. We could solve all 385 of them for some objective function (say minimizing the absolute value of the efficiency gap). Then, we could select one of these 385 possible plans based on a different objective function (say maximizing the compactness).
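
The grouping of leaves by root partition can be sketched on a toy tree. This is a simplified stand-in: the dictionary layout, the node names, and `leaves_under` are hypothetical, and the actual structure of `leaf_nodes` and `internal_nodes` in the lab's data may differ.

```python
# Hypothetical toy tree: each key is a node and its value lists that node's children.
# Nodes that never appear as a key are leaves, i.e. feasible districts.
toy_children = {
    'state': ['root_partition_0', 'root_partition_1'],
    'root_partition_0': ['region_a', 'region_b'],
    'root_partition_1': ['district_5', 'district_6'],
    'region_a': ['district_1', 'district_2'],
    'region_b': ['district_3', 'district_4'],
}


def leaves_under(node, children):
    """Return every leaf in the subtree rooted at `node`."""
    if node not in children:  # no children recorded, so the node is a leaf
        return [node]
    leaves = []
    for child in children[node]:
        leaves.extend(leaves_under(child, children))
    return leaves


# One ILP instance per root partition, built only from that partition's leaves.
leaves_by_root = {p: leaves_under(p, toy_children) for p in toy_children['state']}
for root_partition, leaves in leaves_by_root.items():
    print(root_partition, leaves)
```

Each entry of `leaves_by_root` plays the role of one set of compatible feasible districts: the ILP for a root partition only ever sees the columns listed for it.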

## Bonus II: Measures of Compactness

Up to this point, we have focused on optimizing around partisanship. However, we could also optimize around other measures like compactness. In order to do this, we need to decide how to measure compactness. One possible measure is called the Roeck test. 

Let's first explore a similar version of the Roeck test called the square Roeck test. Find the smallest square enclosing the district and then divide the area of the district by the area of that square. The value will always be between 0 and 1 with more compact districts having a value closer to 1. Let's look at some examples!
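
For a district made of unit grid blocks, the measure just described can be sketched as below. This assumes the enclosing square is axis-aligned, so its side is the larger of the district's height and width; the function name is hypothetical and the lab's `square_roeck` column may be computed differently.

```python
def square_roeck_sketch(cells):
    """Area of a grid district divided by the area of its smallest enclosing square.

    `cells` is a list of (row, col) unit blocks. Assumes an axis-aligned square.
    """
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    side = max(height, width)
    return len(cells) / side ** 2


square_block = [(0, 0), (0, 1), (1, 0), (1, 1)]   # 2x2 block
l_shape = [(0, 0), (1, 0), (1, 1)]                # three blocks in an L
long_strip = [(0, c) for c in range(7)]           # seven blocks in a row

print(square_roeck_sketch(square_block))  # 1.0
print(square_roeck_sketch(l_shape))       # 0.75
print(square_roeck_sketch(long_strip))    # 7/49, about 0.143
```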

In [None]:
show(grid_plot(large_example))

In [None]:
A = grid_district_matrix(large_example, 7)
district_df = create_districts_df(large_example, 7)
plot_grid_districts(large_example, district_df, [5263])
plot_grid_districts(large_example, district_df, [10322])
plot_grid_districts(large_example, district_df, [215])

**B2:** Choose the correct cost coefficients and optimization type to maximize the compactness of the districts using the square Roeck test. (Hint: the column `square_roeck` gives the square Roeck measure for each district)

In [None]:
# TODO: maximize compactness with the square Roeck test
# m,x = redistrict(6, A, district_df['XXX'], integer=True, opt_type='XXX', solver='CBC')

### BEGIN SOLUTION
m,x = redistrict(6, A, district_df['square_roeck'], integer=True, opt_type='max', solver='CBC')
### END SOLUTION

solve(m, solver='CBC')
sol = [i for i, v in x.items() if v.solution_value() > .5]
plot_grid_districts(large_example, district_df, sol)

**B3:** Choose the correct cost coefficients and optimization type to *minimize* the compactness of the districts using the square Roeck test. (Hint: the column `square_roeck` gives the square Roeck measure for each district)

In [None]:
# TODO: minimize compactness with the square Roeck test
# m,x = redistrict(6, A, district_df['XXX'], integer=True, opt_type='XXX', solver='CBC')

### BEGIN SOLUTION
m,x = redistrict(6, A, district_df['square_roeck'], integer=True, opt_type='min', solver='CBC')
### END SOLUTION

solve(m, solver='CBC')
sol = [i for i, v in x.items() if v.solution_value() > .5]
plot_grid_districts(large_example, district_df, sol)

Now, let's move to our Georgia example. Now we will use the normal Roeck test. Find the smallest circle enclosing the district and then divide the area of the district by the area of that circle. The value will always be between 0 and 1 with more compact districts having a value closer to 1. Let's look at some examples!

In [None]:
n_districts = 14
district_df = pd.read_csv(os.path.join('data', 'ga_district_df.csv'))
district_df['efficiency_gap'] = efficiency_gap_coefficients(district_df, .53)
A = make_tdm(leaf_nodes, len(tracts))
for i in [40339, 215783, 107817]:
    ax = tracts.plot(figsize=(6, 6), color='none', edgecolor='black', lw=.25) 
    tracts.iloc[leaf_nodes[i]['area']].plot(ax=ax, color='green', alpha=.3);
    print('Roeck measure: %f' % (district_df.loc[i]['roeck']))

**B4:** Choose the correct cost coefficients and optimization type to maximize the compactness of the districts using the Roeck test. (Hint: the column `roeck` gives the Roeck measure for each district)

In [None]:
# TODO: Uncomment, and maximize the compactness of the districts
# solutions = solve_instances(, , [4,5,6])

### BEGIN SOLUTION
solutions = solve_instances(district_df['roeck'], 'max', [4,5,6])
### END SOLUTION

In [None]:
for sol in solutions:
    politics_map(tracts, district_df, leaf_nodes, sol, figsize=(6,6));
    roeck = round(district_df.loc[sol['solution_ixs']]['roeck'].mean(),5)
    print(roeck)

**B5:** Which party does maximizing compactness benefit? Why?

**A:** <font color='blue'> This benefits the Republican Party because compact districts pack the Democratic majority, which is concentrated in small, dense urban areas.</font>