# Liberty Mutual analysis

Author: Mo Al Elew

**What notebook does/produces:**

Replicates and fact-checks all the data findings used in the publication.

**Approach:**

The general pattern includes:
1. Quote the relevant text
2. Determine asserted figure to reproduce
3. Run the operations to reproduce relevant figure
4. Assert expected value against actual value
5. Print the relevant text with the actual value templated in

Some findings cannot be directly tested using an assertion against a single value. In those cases, I display the relevant data slice, chart, or other presentation.
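
The five steps above can be sketched with the Wayne County comparison that is checked later in this notebook. This is a minimal illustration rather than one of the notebook's own cells: the values 1.46 and 0.71 are copied from the location effects displayed further down, and the variable names are chosen for the example.

```python
# 1. Quote the relevant text
QUOTE = "had a location effect half that of Wayne County"

# 2. Determine the asserted figure to reproduce
ASSERTED_RATIO_MIN = 2

# 3. Run the operations to reproduce the figure
# (values copied from the displayed data; the notebook derives them from gdf)
loc_effect_max = 1.46  # Wayne County
loc_effect_min = 0.71  # Isabella and Midland Counties
actual_ratio = loc_effect_max / loc_effect_min

# 4. Assert the expected value against the actual value
assert actual_ratio > ASSERTED_RATIO_MIN

# 5. Print the relevant text with the actual value templated in
message = f"The maximum effect is {actual_ratio:.2f} times the minimum"
print(message)
```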

In [1]:
import geopandas as gpd
import pandas as pd

# Constants

In [2]:
INSURER = "Liberty Mutual"
DATA_FP = "./outputs/libertymutual_auto_clean.geojson"
PREREFORM_DATA_FP = (
    "../09_pre_reform/liberty_mutual/outputs/libertymutual_auto_clean_gis.geojson"
)
PROJECTED_CRS = "EPSG:3078"


def prptn_to_pct(val, precision=3):
    """Round a proportion to `precision` decimal places, then scale it to a percent."""
    return round(val, precision) * 100
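
A brief usage sketch for `prptn_to_pct`: because rounding happens before scaling, `precision=3` keeps one decimal place of a percent and `precision=2` keeps whole percents. The function is repeated so the block runs on its own, the input proportions are made up for illustration, and `math.isclose` is used because multiplying a rounded float by 100 can leave a tiny floating-point residue.

```python
import math


def prptn_to_pct(val, precision=3):
    return round(val, precision) * 100


one_decimal = prptn_to_pct(0.33649)  # rounds to 0.336 before scaling
whole_percent = prptn_to_pct(0.4712, 2)  # rounds to 0.47 before scaling

assert math.isclose(one_decimal, 33.6)
assert math.isclose(whole_percent, 47.0)
```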

In [3]:
RATE_Q_LABELS = [
    "lowest effect",
    "middle low",
    "median",
    "middle high",
    "highest effect",
]
INCOME_Q_LABELS = [
    "lowest income",
    "middle low",
    "median",
    "middle high",
    "highest income",
]
DENSITY_Q_LABELS = [
    "lowest density",
    "middle low",
    "median",
    "middle high",
    "highest density",
]

QUANTILE_GROUP_BY_COLS = ["black_tot", "white_tot", "tot_pop"]

# Read data

In [4]:
GDF_DATA = gpd.read_file(DATA_FP)
gdf = GDF_DATA.copy()
gdf["tot_pop"] = gdf["total_pop"]

# Highest effect

> Despite fewer units and less dramatic differences in location effect, Michigan’s more populous, diverse, and Black counties still had comparatively higher location effects under Liberty Mutual.

The assertion above is based on Wayne County and Genesee County having the two highest location effects. The two counties also have the two highest Black population percentages in the state.

I display the data sorted by population, Black percentage, and location effect for inspection and confirmation.

In [5]:
gdf.sort_values("total_pop", ascending=False).head(10)

Unnamed: 0,geo_id,geo_name,total_pop,white_pct,black_pct,white_tot,black_tot,median_income,density,generic_location_based_premium,location_effect,geometry,tot_pop
81,26163,"Wayne County, Michigan",1781641,48.7,37.4,866868,666184,57223,0.001024,63266,1.46,"POLYGON ((-82.89881 42.35794, -82.90614 42.356...",1781641
62,26125,"Oakland County, Michigan",1272264,69.9,13.0,889221,166018,92620,0.000542,54217,1.25,"POLYGON ((-83.31682 42.44203, -83.31779 42.442...",1272264
49,26099,"Macomb County, Michigan",878453,76.3,12.3,669918,107772,73876,0.000594,53916,1.24,"POLYGON ((-82.92858 42.45062, -82.92956 42.450...",878453
40,26081,"Kent County, Michigan",657321,71.9,9.2,472300,60789,76247,0.000291,43882,1.01,"POLYGON ((-85.31193 42.94399, -85.31188 42.940...",657321
24,26049,"Genesee County, Michigan",405280,71.2,19.3,288406,78323,58594,0.000241,55573,1.28,"POLYGON ((-83.57105 42.87328, -83.57142 42.873...",405280
80,26161,"Washtenaw County, Michigan",370231,68.6,11.4,254046,42224,84245,0.000198,41060,0.95,"POLYGON ((-83.54373 42.26246, -83.54377 42.261...",370231
69,26139,"Ottawa County, Michigan",296183,82.6,1.6,244537,4651,83932,7e-05,39653,0.92,"POLYGON ((-85.90839 43.20592, -85.90755 43.205...",296183
32,26065,"Ingham County, Michigan",282540,68.1,11.4,192417,32076,62548,0.000195,41520,0.96,"POLYGON ((-84.15028 42.68520, -84.15027 42.685...",282540
38,26077,"Kalamazoo County, Michigan",261426,75.9,10.6,198298,27756,67905,0.000174,40194,0.93,"POLYGON ((-85.52915 42.07070, -85.54049 42.070...",261426
46,26093,"Livingston County, Michigan",194302,93.3,0.6,181205,1108,96135,0.000128,43319,1.0,"POLYGON ((-83.78725 42.42859, -83.78940 42.428...",194302


In [6]:
gdf.sort_values("black_pct", ascending=False).head(10)

Unnamed: 0,geo_id,geo_name,total_pop,white_pct,black_pct,white_tot,black_tot,median_income,density,generic_location_based_premium,location_effect,geometry,tot_pop
81,26163,"Wayne County, Michigan",1781641,48.7,37.4,866868,666184,57223,0.001024,63266,1.46,"POLYGON ((-82.89881 42.35794, -82.90614 42.356...",1781641
24,26049,"Genesee County, Michigan",405280,71.2,19.3,288406,78323,58594,0.000241,55573,1.28,"POLYGON ((-83.57105 42.87328, -83.57142 42.873...",405280
72,26145,"Saginaw County, Michigan",189821,68.3,18.4,129622,34953,56579,9e-05,37883,0.87,"POLYGON ((-83.69851 43.39271, -83.69848 43.391...",189821
10,26021,"Berrien County, Michigan",153938,73.7,13.9,113481,21433,60379,3.8e-05,38608,0.89,"POLYGON ((-86.22460 41.84115, -86.22461 41.840...",153938
62,26125,"Oakland County, Michigan",1272264,69.9,13.0,889221,166018,92620,0.000542,54217,1.25,"POLYGON ((-83.31682 42.44203, -83.31779 42.442...",1272264
60,26121,"Muskegon County, Michigan",175947,75.3,13.0,132454,22958,61347,4.7e-05,43146,1.0,"POLYGON ((-85.90839 43.20592, -85.90810 43.196...",175947
49,26099,"Macomb County, Michigan",878453,76.3,12.3,669918,107772,73876,0.000594,53916,1.24,"POLYGON ((-82.92858 42.45062, -82.92956 42.450...",878453
80,26161,"Washtenaw County, Michigan",370231,68.6,11.4,254046,42224,84245,0.000198,41060,0.95,"POLYGON ((-83.54373 42.26246, -83.54377 42.261...",370231
32,26065,"Ingham County, Michigan",282540,68.1,11.4,192417,32076,62548,0.000195,41520,0.96,"POLYGON ((-84.15028 42.68520, -84.15027 42.685...",282540
38,26077,"Kalamazoo County, Michigan",261426,75.9,10.6,198298,27756,67905,0.000174,40194,0.93,"POLYGON ((-85.52915 42.07070, -85.54049 42.070...",261426


In [7]:
gdf.sort_values("location_effect", ascending=False).head(10)

Unnamed: 0,geo_id,geo_name,total_pop,white_pct,black_pct,white_tot,black_tot,median_income,density,generic_location_based_premium,location_effect,geometry,tot_pop
81,26163,"Wayne County, Michigan",1781641,48.7,37.4,866868,666184,57223,0.001024,63266,1.46,"POLYGON ((-82.89881 42.35794, -82.90614 42.356...",1781641
24,26049,"Genesee County, Michigan",405280,71.2,19.3,288406,78323,58594,0.000241,55573,1.28,"POLYGON ((-83.57105 42.87328, -83.57142 42.873...",405280
43,26087,"Lapeer County, Michigan",88687,90.4,1.1,80153,957,75402,5.2e-05,54387,1.26,"POLYGON ((-83.10289 42.88865, -83.10850 42.888...",88687
62,26125,"Oakland County, Michigan",1272264,69.9,13.0,889221,166018,92620,0.000542,54217,1.25,"POLYGON ((-83.31682 42.44203, -83.31779 42.442...",1272264
49,26099,"Macomb County, Michigan",878453,76.3,12.3,669918,107772,73876,0.000594,53916,1.24,"POLYGON ((-82.92858 42.45062, -82.92956 42.450...",878453
73,26151,"Sanilac County, Michigan",40759,93.0,0.4,37903,173,55740,1e-05,53001,1.22,"POLYGON ((-82.64328 43.16400, -82.64379 43.163...",40759
42,26085,"Lake County, Michigan",12285,81.0,9.9,9949,1212,45946,8e-06,48838,1.13,"POLYGON ((-86.04351 44.16709, -86.04250 44.167...",12285
52,26105,"Mason County, Michigan",29178,90.2,1.1,26322,323,60744,9e-06,48838,1.13,"POLYGON ((-86.04011 43.98967, -86.04008 43.986...",29178
66,26133,"Osceola County, Michigan",23022,92.5,0.9,21299,217,54875,1.6e-05,48838,1.13,"POLYGON ((-85.56455 44.16481, -85.56455 44.164...",23022
17,26035,"Clare County, Michigan",30998,93.8,0.7,29088,212,47816,2.1e-05,48838,1.13,"POLYGON ((-84.78346 43.81465, -84.78752 43.814...",30998


# Prereform

In [8]:
GDF_PREREFORM_DATA = gpd.read_file(PREREFORM_DATA_FP)
GDF_PREREFORM_DATA = GDF_PREREFORM_DATA[GDF_PREREFORM_DATA["geo_id"].notnull()]
GDF_PREREFORM_DATA["black_pct"] = GDF_PREREFORM_DATA["black_pct"].astype(float)
GDF_PREREFORM_DATA["white_pct"] = GDF_PREREFORM_DATA["white_pct"].astype(float)
GDF_PREREFORM_DATA["density"] = (
    GDF_PREREFORM_DATA["tot_pop"] / GDF_PREREFORM_DATA.to_crs(PROJECTED_CRS).area
)

In [9]:
GDF_PREREFORM_DATA["effect_quantile"] = pd.qcut(
    GDF_PREREFORM_DATA["location_effect"],
    q=len(RATE_Q_LABELS),
    labels=RATE_Q_LABELS,
)
GDF_PREREFORM_DATA["income_quantile"] = pd.qcut(
    GDF_PREREFORM_DATA["median_income"],
    q=len(INCOME_Q_LABELS),
    labels=INCOME_Q_LABELS,
)
GDF_PREREFORM_DATA["density_quantile"] = pd.qcut(
    GDF_PREREFORM_DATA["density"],
    q=len(DENSITY_Q_LABELS),
    labels=DENSITY_Q_LABELS,
)

> Before the reform, 34 percent of residents lived in a ZCTA in the top quintile of location effects: This included 74 percent of Black Michiganders and 26 percent of White Michiganders.

I display the table below to fact-check the figures.

In [10]:
gdf_groupby_quantiles = GDF_PREREFORM_DATA.groupby("effect_quantile", observed=False)[
    QUANTILE_GROUP_BY_COLS
].sum()

df_pre_reform_distribution = prptn_to_pct(
    gdf_groupby_quantiles.div(gdf_groupby_quantiles.sum(axis=0), axis=1)
)
df_pre_reform_distribution

effect_quantile,black_tot,white_tot,tot_pop
lowest effect,10.6,25.6,23.8
middle low,5.5,13.5,12.0
median,7.5,18.4,17.1
middle high,2.4,16.3,13.5
highest effect,74.0,26.2,33.6
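
The table above is built in two steps: sum each group's population within each effect quintile, then divide by the group's total so that every column sums to 100 percent. A toy sketch of that calculation follows, using made-up figures and only two of the population columns; the `toy` frame and its values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data: ten areas with a location effect and population counts
toy = pd.DataFrame(
    {
        "location_effect": [0.7, 0.8, 0.9, 0.95, 1.0, 1.05, 1.1, 1.2, 1.3, 1.5],
        "black_tot": [10, 10, 10, 10, 10, 10, 10, 10, 60, 160],
        "tot_pop": [100] * 10,
    }
)
labels = ["lowest effect", "middle low", "median", "middle high", "highest effect"]

# Bin areas into five equal-count groups by location effect
toy["effect_quantile"] = pd.qcut(toy["location_effect"], q=len(labels), labels=labels)

# Sum population per quintile, then divide each column by its own total
totals = toy.groupby("effect_quantile", observed=False)[["black_tot", "tot_pop"]].sum()
shares = totals.div(totals.sum(axis=0), axis=1)
print((shares * 100).round(1))
```

In this toy data the top quintile holds 20 percent of the total population but about 73 percent of the Black population, the same kind of gap the quoted text describes.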


# Postreform rate quantiles

In [11]:
gdf["effect_quantile"] = pd.qcut(
    gdf["location_effect"], q=len(RATE_Q_LABELS), labels=RATE_Q_LABELS
)
gdf["density_quantile"] = pd.qcut(
    gdf["density"], q=len(DENSITY_Q_LABELS), labels=DENSITY_Q_LABELS
)

if "median_income" in gdf.columns:
    gdf["income_quantile"] = pd.qcut(
        gdf["median_income"], q=len(INCOME_Q_LABELS), labels=INCOME_Q_LABELS
    )
else:
    gdf["income_quantile"] = None

## Race

> After the reform, about fifty percent of the state’s residents lived in counties with location effects in the top quintile of the state. This included 76 percent of Black Michiganders and 41 percent of White Michiganders.

In [12]:
gdf_groupby_quantiles = gdf.groupby("effect_quantile", observed=False)[
    QUANTILE_GROUP_BY_COLS
].sum()
column_sums = gdf_groupby_quantiles.sum(axis=0)
df_distribution = prptn_to_pct(gdf_groupby_quantiles.div(column_sums, axis=1), 2)
df_distribution

effect_quantile,black_tot,white_tot,tot_pop
lowest effect,11.0,24.0,22.0
middle low,3.0,6.0,6.0
median,9.0,20.0,18.0
middle high,1.0,9.0,7.0
highest effect,76.0,41.0,47.0


## Stacked chart

First draft of visual

In [13]:
%run ../00_misc/helper-func-notebook.ipynb
stacked_quintile_chart = stacked_race_hbar(df_distribution, "Liberty Mutual")
stacked_quintile_chart.save("../00_misc/charts/liberty_mutual_population_quintile.png")
stacked_quintile_chart

# Largest gap

> The largest gap in location effect was between Wayne County and Midland and Isabella Counties. Midland County and Isabella County, both majority White counties in central Michigan, border each other and had a location effect half that of Wayne County, the state’s most populous and diverse county.

In [14]:
loc_effect_max = gdf["location_effect"].max()
loc_effect_min = gdf["location_effect"].min()

Display the entries for the min and max effect to verify.

In [15]:
gdf[gdf["location_effect"] == gdf["location_effect"].max()]

Unnamed: 0,geo_id,geo_name,total_pop,white_pct,black_pct,white_tot,black_tot,median_income,density,generic_location_based_premium,location_effect,geometry,tot_pop,effect_quantile,density_quantile,income_quantile
81,26163,"Wayne County, Michigan",1781641,48.7,37.4,866868,666184,57223,0.001024,63266,1.46,"POLYGON ((-82.89881 42.35794, -82.90614 42.356...",1781641,highest effect,highest density,middle low


In [16]:
gdf[gdf["location_effect"] == gdf["location_effect"].min()]

Unnamed: 0,geo_id,geo_name,total_pop,white_pct,black_pct,white_tot,black_tot,median_income,density,generic_location_based_premium,location_effect,geometry,tot_pop,effect_quantile,density_quantile,income_quantile
36,26073,"Isabella County, Michigan",64938,84.3,2.4,54720,1584,52638,4.3e-05,30918,0.71,"POLYGON ((-84.76067 43.81469, -84.76015 43.814...",64938,lowest effect,middle high,lowest income
55,26111,"Midland County, Michigan",83503,90.1,1.5,75271,1253,73643,6.1e-05,30918,0.71,"POLYGON ((-84.16800 43.57878, -84.16805 43.576...",83503,lowest effect,middle high,highest income


Verify the max effect is more than double the minimum.

In [17]:
ASSERTED_RATIO_MIN = 2
assert (loc_effect_max / loc_effect_min) > ASSERTED_RATIO_MIN
f"The maximum effect divided by the minimum is {(loc_effect_max / loc_effect_min)} ({loc_effect_max} / {loc_effect_min})"

'The maximum effect divided by the minimum is 2.056338028169014 (1.46 / 0.71)'

# Highest effect compared to median

> Wayne County, one of six counties within the greater Detroit metropolitan area, is about 40 percent Black. Its location effect was 50 percent higher than the statewide median and about 14 percent higher than Genesee County, where we observed the second highest location effect in the state.

We round 1.46 up to 1.5, hence 50 percent higher.

In [18]:
ASSERTED_FIGURE = 1.5
median_rate = gdf["generic_location_based_premium"].median()
max_rate = gdf["generic_location_based_premium"].max()
assert round(loc_effect_max, 1) == ASSERTED_FIGURE
f"The location effect in Wayne County is {round(loc_effect_max, 2)} times the state median ({max_rate} / {median_rate})"

'The location effect in Wayne County is 1.46 times the state median (63266 / 43319.0)'
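
The arithmetic behind "50 percent higher" can be traced by hand. The sketch below assumes, as the printed output suggests, that a county's location effect is its premium divided by the statewide median premium; the two dollar figures are copied from the output above and the variable names are illustrative.

```python
wayne_premium = 63266
median_premium = 43319

# Location effect rounded to two decimals, as it appears in the data
location_effect = round(wayne_premium / median_premium, 2)

# Rounded once more to one decimal, which gives the published figure
rounded_effect = round(location_effect, 1)

# Percent above the median before the second rounding
pct_above_median = round((location_effect - 1) * 100)

print(location_effect, rounded_effect, pct_above_median)
```

The two-decimal figure is about 46 percent above the median; the published "50 percent" comes from rounding 1.46 to one decimal place.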

In [19]:
ASSERTED_FIGURE = 1.14
second_highest_effect = gdf["location_effect"].sort_values(ascending=False).iloc[1]
max_div_second_highest = round(loc_effect_max / second_highest_effect, 2)
assert max_div_second_highest == ASSERTED_FIGURE
f"The location effect in Wayne County is {max_div_second_highest} times ({loc_effect_max} / {second_highest_effect}) the second highest effect."

'The location effect in Wayne County is 1.14 times (1.46 / 1.28) the second highest effect.'

Display the two highest location effects to confirm they belong to Wayne and Genesee Counties.

In [20]:
gdf.sort_values(["location_effect"], ascending=False).iloc[:2]

Unnamed: 0,geo_id,geo_name,total_pop,white_pct,black_pct,white_tot,black_tot,median_income,density,generic_location_based_premium,location_effect,geometry,tot_pop,effect_quantile,density_quantile,income_quantile
81,26163,"Wayne County, Michigan",1781641,48.7,37.4,866868,666184,57223,0.001024,63266,1.46,"POLYGON ((-82.89881 42.35794, -82.90614 42.356...",1781641,highest effect,highest density,middle low
24,26049,"Genesee County, Michigan",405280,71.2,19.3,288406,78323,58594,0.000241,55573,1.28,"POLYGON ((-83.57105 42.87328, -83.57142 42.873...",405280,highest effect,highest density,median


# Appendix

## Population density

In [21]:
gdf_temp = gdf.pivot_table(
    index="effect_quantile", columns="density_quantile", aggfunc="count", observed=False
)["median_income"]
df_density_quintile = round(gdf_temp / gdf_temp.sum(), 2)
df_density_quintile

effect_quantile \ density_quantile,lowest density,middle low,median,middle high,highest density
lowest effect,0.41,0.25,0.06,0.38,0.35
middle low,0.18,0.19,0.24,0.0,0.12
median,0.06,0.06,0.29,0.25,0.24
middle high,0.35,0.25,0.24,0.25,0.06
highest effect,0.0,0.25,0.18,0.12,0.24
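
The cross tab above counts counties in each effect/density cell and then divides by the column totals, so each density quintile's column sums to 1. A toy sketch of that normalization follows, with made-up county labels; `pd.crosstab(..., normalize="columns")` is shown as an equivalent shortcut, not as the notebook's method.

```python
import pandas as pd

# Hypothetical toy data: six counties with an effect and density label each
toy = pd.DataFrame(
    {
        "county": list("abcdef"),
        "effect": ["low", "low", "high", "high", "high", "low"],
        "density": ["sparse", "dense", "dense", "dense", "sparse", "sparse"],
    }
)

# Count counties per cell, then divide each column by its total
counts = toy.pivot_table(
    index="effect", columns="density", values="county", aggfunc="count", fill_value=0
)
shares = counts / counts.sum()

# Equivalent shortcut: crosstab with column normalization
alt = pd.crosstab(toy["effect"], toy["density"], normalize="columns")

print(shares.round(2))
```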


In [22]:
%run ../00_misc/helper-func-notebook.ipynb
df_density_quintile = prptn_to_pct(gdf_temp / gdf_temp.sum(), 5)
density_hbar = stacked_population_density_hbar(
    df_density_quintile, title="Liberty Mutual"
)
density_hbar

In [23]:
gdf_groupby_density_quantiles = gdf.groupby("density_quantile", observed=False)[
    QUANTILE_GROUP_BY_COLS
].sum()
column_sums = gdf_groupby_density_quantiles.sum(axis=0)
df_density_distribution = prptn_to_pct(
    gdf_groupby_density_quantiles.div(column_sums, axis=1), 2
)
df_density_distribution

density_quantile,black_tot,white_tot,tot_pop
lowest density,0.0,3.0,2.0
middle low,0.0,5.0,4.0
median,1.0,8.0,7.0
middle high,5.0,15.0,13.0
highest density,94.0,69.0,74.0


## Effect x density

In [24]:
def pivot_effect_density_quantiles(
    gdf, race_group, race_label=None, calculate_percent=True
):
    gdf_temp = gdf.pivot_table(
        index="effect_quantile",
        columns="density_quantile",
        values=race_group,
        aggfunc="sum",
        observed=False,
    )
    if calculate_percent:
        gdf_temp = prptn_to_pct(gdf_temp / gdf_temp.sum().sum())
    gdf_temp = gdf_temp.reset_index()
    if race_label:
        gdf_temp["race"] = race_label
    else:
        gdf_temp["race"] = race_group
    gdf_temp["insurer"] = INSURER
    return gdf_temp


def join_effect_density_quantiles_pivots(calculate_percent=True):
    gdf_white = pivot_effect_density_quantiles(
        gdf, "white_tot", "White", calculate_percent=calculate_percent
    )
    gdf_black = pivot_effect_density_quantiles(
        gdf, "black_tot", "Black", calculate_percent=calculate_percent
    )
    return pd.concat([gdf_white, gdf_black], ignore_index=True)


gdf_effect_density_quantiles_pivot = join_effect_density_quantiles_pivots(False)
gdf_effect_density_quantiles_pivot.to_csv(
    "./outputs/effect_density_quantiles_pivot_count.csv", index=False
)

## County count and median

From the "Calculating 'Location Effect'" section:

> Next, we repeat this process for the other 82 Michigan counties, then find the median value of all the counties: $43,319

In [25]:
ASSERTED_VALUE = 82 + 1
assert ASSERTED_VALUE == len(gdf)

ASSERTED_VALUE = 43319
median_value = gdf["generic_location_based_premium"].median()
assert ASSERTED_VALUE == median_value

print(
    f"we repeat this process for the other {len(gdf)-1} Michigan counties, then find the median value of all the counties: {median_value}"
)

we repeat this process for the other 82 Michigan counties, then find the median value of all the counties: 43319.0


## Income cross tab

In [26]:
gdf_temp = gdf.pivot_table(
    index="effect_quantile", columns="income_quantile", aggfunc="count", observed=False
)["median_income"]
gdf_temp / gdf_temp.sum()

effect_quantile \ income_quantile,lowest income,middle low,median,middle high,highest income
lowest effect,0.235294,0.25,0.294118,0.25,0.411765
middle low,0.294118,0.0625,0.176471,0.125,0.058824
median,0.0,0.125,0.176471,0.4375,0.176471
middle high,0.352941,0.3125,0.176471,0.125,0.176471
highest effect,0.117647,0.25,0.176471,0.0625,0.176471


In [27]:
%run ../00_misc/helper-func-notebook.ipynb
df_income_quintile = prptn_to_pct(gdf_temp / gdf_temp.sum(), 5)
income_hbar = stacked_income_hbar(df_income_quintile, title="Liberty Mutual")
income_hbar.save("../00_misc/charts/liberty_mutual_income_quintile.png")
income_hbar

## Effect in top quantile

In [28]:
gdf_highest_effects = gdf[gdf["effect_quantile"] == "highest effect"]
highest_quantile_min_effect = gdf_highest_effects["location_effect"].min()
highest_quantile_max_effect = gdf_highest_effects["location_effect"].max()

print(
    f"The location effect in the top quantile ranged from {highest_quantile_min_effect} to {highest_quantile_max_effect}"
)

The location effect in the top quantile ranged from 1.13 to 1.46


## Lowest effect

In [29]:
gdf_min = gdf[gdf["location_effect"] == gdf["location_effect"].min()]
lowest_effect_pct_white = (
    round(gdf_min["white_tot"].sum() / gdf_min["total_pop"].sum(), 3) * 100
)
f"Isabella and Midland Counties have the lowest effect in the state and are {lowest_effect_pct_white}% White."

'Isabella and Midland Counties have the lowest effect in the state and are 87.6% White.'
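
The combined share above pools the two counties' populations rather than averaging their percentages. A minimal sketch with the counts copied from the Isabella and Midland rows displayed earlier shows the difference; the variable names are illustrative.

```python
# (white_tot, total_pop) copied from the displayed rows
isabella = (54720, 64938)
midland = (75271, 83503)

# Pooled share: total White residents over total residents
pooled_white = isabella[0] + midland[0]
pooled_total = isabella[1] + midland[1]
pooled_pct = round(pooled_white / pooled_total * 100, 1)

# Simple average of the two county percentages, for comparison
simple_avg_pct = round((84.3 + 90.1) / 2, 1)

print(pooled_pct, simple_avg_pct)
```

The pooled figure weights Midland County more heavily because it has the larger population, which is why it differs from the simple average.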

## Export data

In [30]:
%run ../00_misc/helper-func-notebook.ipynb
df_export = datawrapper_race_distribution(df_distribution, "Liberty Mutual")
df_export.to_csv("./outputs/liberty_mutual_race_chart_data.csv")

In [31]:
%run ../00_misc/helper-func-notebook.ipynb
df_export = datawrapper_race_distribution(df_density_distribution, "Liberty Mutual")
df_export.to_csv("./outputs/liberty_mutual_race_density_chart_data.csv")

In [32]:
%run ../00_misc/helper-func-notebook.ipynb
df_export = datawrapper_income_distribution(df_income_quintile, "Liberty Mutual")
df_export.to_csv("./outputs/liberty_mutual_income_chart_data.csv")
df_export

income,lowest effect,middle low,median,middle high,highest effect,Insurer
Lowest income,23.529,29.412,0.0,35.294,11.765,Liberty Mutual
Lower income,25.0,6.25,12.5,31.25,25.0,Liberty Mutual
Middle income,29.412,17.647,17.647,17.647,17.647,Liberty Mutual
Higher income,25.0,12.5,43.75,12.5,6.25,Liberty Mutual
Highest incomes,41.176,5.882,17.647,17.647,17.647,Liberty Mutual


In [33]:
%run ../00_misc/helper-func-notebook.ipynb
df_export = datawrapper_pop_density_distribution(df_density_quintile, "Liberty Mutual")
df_export.to_csv("./outputs/liberty_mutual_pop_density_chart_data.csv")
df_export

Population density,lowest effect,middle low,median,middle high,highest effect,Insurer
Lowest density,41.176,17.647,5.882,35.294,0.0,Liberty Mutual
Lower density,25.0,18.75,6.25,25.0,25.0,Liberty Mutual
Middle density,5.882,23.529,29.412,23.529,17.647,Liberty Mutual
Higher density,37.5,0.0,25.0,25.0,12.5,Liberty Mutual
Highest density,35.294,11.765,23.529,5.882,23.529,Liberty Mutual


## Prereform income cross tab

In [34]:
gdf_temp = GDF_PREREFORM_DATA.pivot_table(
    index="effect_quantile", columns="income_quantile", aggfunc="count", observed=False
)["black_pct"]
gdf_temp / gdf_temp.sum()

effect_quantile \ income_quantile,lowest income,middle low,median,middle high,highest income
lowest effect,0.171717,0.22335,0.217172,0.30102,0.238579
middle low,0.191919,0.238579,0.146465,0.183673,0.126904
median,0.191919,0.208122,0.222222,0.178571,0.248731
middle high,0.161616,0.263959,0.282828,0.193878,0.142132
highest effect,0.282828,0.06599,0.131313,0.142857,0.243655


## Prereform density cross tab

In [35]:
gdf_temp = GDF_PREREFORM_DATA.pivot_table(
    index="effect_quantile", columns="density_quantile", aggfunc="count", observed=False
)["black_pct"]
gdf_temp / gdf_temp.sum()

effect_quantile \ density_quantile,lowest density,middle low,median,middle high,highest density
lowest effect,0.318182,0.182741,0.187817,0.319797,0.142132
middle low,0.207071,0.233503,0.218274,0.177665,0.050761
median,0.277778,0.142132,0.238579,0.208122,0.182741
middle high,0.166667,0.411168,0.248731,0.177665,0.040609
highest effect,0.030303,0.030457,0.106599,0.116751,0.583756


## Exclude Wayne County (Detroit)

In [36]:
gdf_exclude_wayne = gdf[gdf["geo_id"] != "26163"]
gdf_groupby_quantiles = gdf_exclude_wayne.groupby("effect_quantile", observed=False)[
    QUANTILE_GROUP_BY_COLS
].sum()
column_sums = gdf_groupby_quantiles.sum(axis=0)
df_distribution = prptn_to_pct(gdf_groupby_quantiles.div(column_sums, axis=1), 2)
df_distribution

effect_quantile,black_tot,white_tot,tot_pop
lowest effect,22.0,27.0,27.0
middle low,6.0,7.0,7.0
median,17.0,22.0,22.0
middle high,2.0,10.0,9.0
highest effect,53.0,33.0,35.0


In [37]:
gdf_exclude_wayne = gdf[gdf["geo_id"] != "26163"].copy()
gdf_exclude_wayne["effect_quantile"] = pd.qcut(
    gdf_exclude_wayne["generic_location_based_premium"],
    q=len(RATE_Q_LABELS),
    labels=RATE_Q_LABELS,
)

gdf_groupby_quantiles = gdf_exclude_wayne.groupby("effect_quantile", observed=False)[
    QUANTILE_GROUP_BY_COLS
].sum()
column_sums = gdf_groupby_quantiles.sum(axis=0)
df_distribution = prptn_to_pct(gdf_groupby_quantiles.div(column_sums, axis=1), 2)
df_distribution

effect_quantile,black_tot,white_tot,tot_pop
lowest effect,21.0,23.0,23.0
middle low,7.0,11.0,11.0
median,9.0,15.0,14.0
middle high,10.0,14.0,14.0
highest effect,53.0,37.0,39.0
