## Plotting
### Used to visualize districting plans, plan demographics, and plan scores.

In [None]:
from gerrytools.scoring import *
from gerrytools.plotting import *
import pandas as pd
import geopandas as gpd
from gerrychain import Graph
import matplotlib.pyplot as plt

In [None]:
plan = gpd.read_file("data/GA_CD_example")

The plotting module provides functions to draw a districting plan, a choropleth of a given demographic within the plan, or a dot density map of the plan.

We'll start with `drawplan`. To use it, you need the plan as a shapefile (on any units) with a dedicated column of district assignments. This function also lets us overlay other geographies on the plan. In this case, we're plotting a plan for the GA Congressional map and overlaying it with Georgia counties.

By default, this function colors the plan with the districtr palette, a list of 33 colors, so a plan with more than 33 districts will repeat colors. To use your own colors, pass the `colors` argument the name of a shapefile column that defines the color for each row.
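
One way to picture the repeats is as cycling through a fixed palette. The sketch below is only an illustration of that idea: `PALETTE` is a hypothetical three-color list and `assign_colors` is a made-up helper, not part of gerrytools, and it makes no claim about how the library assigns colors internally.

```python
# Hypothetical stand-in palette; the districtr palette described above has 33 colors.
PALETTE = ["red", "green", "blue"]


def assign_colors(n_districts, palette=PALETTE):
    """Give each district index a color, wrapping around when the palette runs out."""
    return [palette[i % len(palette)] for i in range(n_districts)]


colors_for_plan = assign_colors(5)
# Districts 0 and 3 end up with the same color because the palette has three entries.
print(colors_for_plan)
```

With a 33-color palette, the same wrap-around would first occur at the 34th district.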

In [None]:
ga_county = gpd.read_file("../docs/source/_static/ga_county.zip")

In [None]:
ga_county.columns

In [None]:
new_plan = plan.dissolve(by="CD")
new_plan = new_plan.reset_index()
new_plan["CD"]

In [None]:
import matplotlib.pyplot as plt
import gerrytools.plotting.colors as colors
import numpy as np


N = len(new_plan)

dists = new_plan.to_crs("EPSG:3857")
dists["CD"] = dists["CD"].astype(int)
dists=dists.sort_values(by="CD")
dists["colorindex"] = list(range(N))
dists["color"] = colors.districtr(N)
dists[["color", "CD", "colorindex"]]

In [None]:
ax = drawplan(plan, assignment="CD", overlays=[ga_county])

We can also draw a choropleth of a certain demographic across the map. Our `choropleth` function takes care of this, so long as you pass a demographic share column.

You can also pass the column of total counts for any demographic share. This function plots the map along with a colorbar, whose label can be changed with the `cbartitle` argument.
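
A district-level share is the district's total for the group divided by the district's total for the base population. The sketch below works through that arithmetic in plain Python on made-up unit records; the `units` data and variable names are hypothetical and only illustrate the aggregation, not the gerrytools API.

```python
from collections import defaultdict

# Hypothetical unit-level records: (district, BVAP, VAP).
units = [
    ("1", 120, 400),
    ("1", 80, 600),
    ("2", 300, 500),
]

bvap_by_district = defaultdict(int)
vap_by_district = defaultdict(int)
for district, bvap, vap in units:
    # Sum numerator and denominator separately before dividing.
    bvap_by_district[district] += bvap
    vap_by_district[district] += vap

# Share = district-level BVAP total / district-level VAP total.
bvap_share = {d: bvap_by_district[d] / vap_by_district[d] for d in vap_by_district}
print(bvap_share)
```

Summing both columns first matters: dividing a single unit's count by the district total would give that unit's contribution, not the district's share.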

In [None]:
plan.columns

In [None]:
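plan["VAP_CD"] = plan.groupby("CD")["VAP"].transform("sum")
plan["TOTPOP_CD"] = plan.groupby("CD")["TOTPOP"].transform("sum")
# District-level share: district BVAP total divided by district VAP total.
plan["BVAP_SHARE_CD"] = plan.groupby("CD")["BVAP"].transform("sum") / plan["VAP_CD"]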

In [None]:
plt.clf()

districts = plan.dissolve(by="CD").reset_index()
districts["CD"] = districts["CD"].astype(int)

choro = choropleth(
    plan, 
    districts=districts,
    assignment="CD",
    demographic_share_col="BVAP_SHARE_CD",
    overlays=[plan], 
    cmap="Purples",
    cbartitle="BVAP Share",
    district_lw=0.1,
    base_lw=2,
    base_linecolor="black",
    numbers=True,
) 

We can also plot scores across an ensemble of plans. Several plot types are built in, including `histogram`, `boxplot`, and `violin`.

In [None]:
import json

with open("./data/ensemble_example.json") as f:
    scores = json.load(f)


We'll start with the `histogram` function. It takes a dictionary containing a list of scores from an ensemble, a list of scores from "citizen" maps, and a list of scores from "proposed" maps. The citizen scores are plotted as a histogram on the same axis as the ensemble scores, while each proposed map appears as a vertical line at its score.

In this example, we'll plot the number of majority Black districts in an ensemble, along with one proposed map. 

When using proposed maps, the `proposed_info` argument must also be used. It takes a dictionary with the keys `names` and `colors`, which hold two lists: the names of the proposed plans and the desired colors for their vertical lines, respectively.
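
To make the expected input shapes concrete, here is a minimal sketch on made-up data. `toy_scores`, `count_majority`, and the proposed plan's name and score are all hypothetical and are not taken from the example files; only the dictionary keys follow the description above.

```python
# Hypothetical ensemble: each entry holds one plan's district-level BVAP shares.
toy_scores = [
    {"BVAP20": [0.55, 0.30, 0.52]},
    {"BVAP20": [0.48, 0.51, 0.20]},
    {"BVAP20": [0.10, 0.45, 0.49]},
]


def count_majority(shares, threshold=0.5):
    """Count districts whose share is strictly above the threshold."""
    return sum(1 for share in shares if share > threshold)


toy_score_dict = {
    "ensemble": [count_majority(plan["BVAP20"]) for plan in toy_scores],
    "citizen": [],
    "proposed": [2],  # one made-up proposed plan with two majority-Black districts
}

# One name and one color per proposed plan, in the same order as "proposed".
toy_proposed_info = {"names": ["Toy proposed plan"], "colors": ["olivedrab"]}
print(toy_score_dict["ensemble"])
```

Each ensemble plan contributes a single number, so the `ensemble` list has one entry per plan.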

In [None]:
scores

In [None]:
score_dict = {"ensemble":[], "citizen":[], "proposed":[]}

In [None]:
for score in scores:
    score_dict["ensemble"].append(len([apb for apb in score["BVAP20"] if apb > 0.5]))

In [None]:
score_dict 

In [None]:
score_dict = {
    "ensemble": [],
    "citizen": [],
    "proposed": [],
}

for score in scores:
    count_majority_black = 0
    for apb in score["BVAP20"]:
        if apb > 0.5:
            count_majority_black += 1

    score_dict["ensemble"].append(count_majority_black)

score_dict


In [None]:

score_dict

In [None]:
plan.columns

In [None]:
condensed_plan = plan[[
    "CD", 
    "BVAP",
    "WVAP",
    "VAP",
    plan.geometry.name
]].dissolve(
    by="CD",
    aggfunc="sum"
)

condensed_plan["BVAP_CD"] = condensed_plan["BVAP"]/condensed_plan["VAP"]
condensed_plan["WVAP_CD"] = condensed_plan["WVAP"]/condensed_plan["VAP"]

In [None]:
# reset_index returns a new frame, so assign it to keep "CD" as a column.
condensed_plan = condensed_plan.reset_index()
condensed_plan

In [None]:
score_dict["proposed"].append(len(condensed_plan[condensed_plan.BVAP_CD > 0.5]))

In [None]:
fig, ax = plt.subplots(1, 1, figsize = (7.5, 5))
hist = histogram(
    ax, 
    score_dict, 
    label = "Num Maj Black Districts", 
    proposed_info={"names": ["GA_CD_example"]}
)

In [None]:
hist.figure

Next we'll look at the `boxplot` function. As with `histogram`, the scores are passed as a dictionary. However, the ensemble scores must already be a list of lists, where each inner list holds the values for one box. This means that, prior to plotting, scores must be sorted and grouped so that boxes are plotted lowest to highest.

Proposed plans will also be a list of lists. Each proposed plan list will be of length one, with one score per box. An example of pre-processing to get scores in both of these formats can be seen below.
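
The regrouping amounts to sorting each plan's scores and then transposing, so that box `i` collects the `i`-th lowest score from every plan. The sketch below shows one way to do this with `zip` on made-up data; `toy_scores` and `toy_proposed` are hypothetical, and it assumes every plan has the same number of districts and that there is a single proposed plan.

```python
# Hypothetical ensemble of two plans with three districts each.
toy_scores = [
    {"BVAP20": [0.55, 0.30, 0.52]},
    {"BVAP20": [0.48, 0.51, 0.20]},
]

# Sort each plan's district shares from lowest to highest.
sorted_plans = [sorted(plan["BVAP20"]) for plan in toy_scores]

# Transpose: box i gathers the i-th lowest share from every plan.
ensemble_boxes = [list(box) for box in zip(*sorted_plans)]

# A single made-up proposed plan contributes one score per box.
toy_proposed = [0.41, 0.25, 0.60]
proposed_boxes = [[share] for share in sorted(toy_proposed)]

toy_boxplot_scores = {
    "ensemble": ensemble_boxes,
    "citizen": [],
    "proposed": proposed_boxes,
}
print(toy_boxplot_scores)
```

Note that `zip` silently truncates to the shortest plan, so it is worth checking that all plans have the same district count before transposing.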

In [None]:
for score in scores:
    print(score)

In [None]:
boxplot_score_dict = {"ensemble": [], "proposed": [], "citizen": []}
first_time = True
for score in scores:
    if first_time:
        for s in sorted(score["BVAP20"]):
            boxplot_score_dict["ensemble"].append([s])
        first_time = False
    else:
        for i, s in enumerate(sorted(score["BVAP20"])):
            boxplot_score_dict["ensemble"][i].append(s)
boxplot_score_dict["proposed"] = [[k] for k in sorted(condensed_plan.BVAP_CD)]

In [None]:
fig, box_ax = plt.subplots(1, 1, figsize = (7.5, 5))
box_plot = boxplot(box_ax, boxplot_score_dict, proposed_info={"names": ["GA_CD_example"], "colors": ["olivedrab"]})

In [None]:
box_plot.figure

Another type of plot, similar to the boxplot, is the violin plot. It shows the same information as a boxplot, but instead of boxes it displays rotated kernel density estimates, which in some instances look like violins!

The `violin` function makes this plot. It takes the same score format as `boxplot`. In this example, we also show other parameters, such as `rotation`, a float that specifies the rotation of the x-axis labels, and `labels`, a list of 2 labels (1 for the x-axis, 1 for the y-axis) to display on the plot.

In [None]:
boxplot_score_dict["proposed"]

In [None]:
fig, violin_ax = plt.subplots(1, 1, figsize = (7.5, 5))
violin_plot = violin(
    violin_ax,
    boxplot_score_dict,
    rotation=45,
    labels=[
        "BVAP Share",
        ""
    ],
    proposed_info={
        "names": ["GA_CD_Example"],
        "colors": ["olivedrab"],
    },
)

In [None]:
violin_plot.figure

Moving away from visualizing ensembles, we can visualize specific scores for individual plans. Both the `sealevel` and `scatterplot` functions can be used to accomplish this. We'll start with `scatterplot`.

This function can be used to compare 2 scores across a plan, an ensemble, etc. 

Here, we'll compare the BVAP share (`BVAP_CD`) and the WVAP share (`WVAP_CD`) in each district of our example plan.

In [None]:
fig, scatter_ax = plt.subplots(1, 1, figsize = (7.5, 5))

In [None]:
plan.columns

In [None]:
condensed_plan

In [None]:
scatter_plot = scatterplot(
    scatter_ax, 
    x=[list(condensed_plan["BVAP_CD"])], 
    y=[list(condensed_plan["WVAP_CD"])], 
    labels = ["BVAP Share", "WVAP Share"]
)

In [None]:
scatter_plot.figure