## Background

My wife, Caitlyn Van Heest, and I worked together to create this first-draft interactive map of American Community Survey (ACS) Census blocks (2018 data courtesy of Colorado Information Marketplace). The map shows the index of minority race/ethnicity groups within each block. If you're unfamiliar with indexing, the goal is to identify higher concentrations of some target data relative to an overall concentration. Here, it is used to look at the household composition of each race/ethnicity group without overemphasizing heavily populated blocks.

    Index Example:
    Census block 1234 has 500 total inhabitants and 50 are Black/African American.
    Census block 5678 has 100 total inhabitants and 40 are Black/African American.


    Among these two blocks (600 total inhabitants), 90 identify as Black/African American. The index for block 1234 is (50/500) / (90/600) = 0.10 / 0.15 ≈ 0.67. The index for block 5678 is (40/100) / (90/600) = 0.40 / 0.15 ≈ 2.67. Even though block 1234 houses more people than block 5678, there is a higher concentration of Black/African American households in block 5678. In fact, block 5678 has over two-and-a-half times the mean concentration of Black/African American households among all the blocks in this example.
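
The arithmetic in the example can be reproduced in a few lines of Python. This is only a minimal sketch: the function name `concentration_index` and the dictionary layout are illustrative assumptions, not the code used to prepare the map data, and the reference concentration here is the pooled share across the two example blocks rather than a statewide figure.

```python
def concentration_index(group_count, total_count, ref_group, ref_total):
    """Share of the group in one block divided by its share in the reference area."""
    block_share = group_count / total_count
    reference_share = ref_group / ref_total
    return block_share / reference_share


# Hypothetical blocks from the example: (total inhabitants, Black/African American)
blocks = {"1234": (500, 50), "5678": (100, 40)}

# Pooled totals across both blocks form the reference concentration (90 / 600)
ref_total = sum(total for total, _ in blocks.values())
ref_group = sum(group for _, group in blocks.values())

indices = {
    block_id: concentration_index(group, total, ref_group, ref_total)
    for block_id, (total, group) in blocks.items()
}

for block_id, value in indices.items():
    print(f"block {block_id}: index = {value:.2f}")
```

An index above 1 means the block holds a larger share of the group than the reference area does; an index below 1 means a smaller share.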


Each layer with race/ethnicity index values uses the mean concentration of households per race/ethnicity in Colorado, not the United States at large, as the denominator.
In the map below, a separate layer exists for each available race/ethnicity group, as well as a layer for non-white households (which combines all non-white household counts). Each layer can be toggled using the small layer pane in the upper right corner. **By default, non-white households and healthcare providers that accept Medicaid are the only active layers. Add any combination of layers by using the layer controls in the upper right corner.**
![Toggle button](../../references/map_toggle_button.JPG)
![Layer Toggles](../../references/map_layer_toggle.JPG)

The individual markers throughout the map are healthcare locations that are likely to be providing care for COVID-19 cases and that accept Medicaid. Providers that do not accept Medicaid are listed as a separate layer. Click on a marker to see the name, phone number, and whether the provider accepts Medicaid (y/n).

![Tooltips](../../references/tooltip.JPG)

-------

In [1]:
import geopandas as gpd
import descartes
import folium
from folium.plugins import MarkerCluster
import branca.colormap as cm

## Visualizing Distributions of Population Indices

Visualizing the distributions of indices gives insight into the frequencies of different concentrations of race/ethnicity groups in the state. For example, the distribution of the white household index in Colorado looks like this:

![White household index histogram](../../references/figures/white_index_hist.png)

You can see that the most frequent index values are around 1.3. In other words, the most common kind of block in Colorado has about 1.3 times the statewide average concentration of white households. However, the hundreds of blocks with below-average concentration show that even where white households are underrepresented relative to the state average, they still account for some share of the block's population.
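
To make this reading of a histogram concrete, the sketch below bins a handful of index values and reports the most frequent bin. The values are invented for illustration and are not taken from the ACS data; the helper `bin_counts` and the 0.5-wide bins are likewise assumptions.

```python
from collections import Counter


def bin_counts(values, bin_width):
    """Count how many values fall into each [left, left + bin_width) bin."""
    counts = Counter()
    for value in values:
        left_edge = int(value // bin_width) * bin_width
        counts[left_edge] += 1
    return dict(sorted(counts.items()))


# Hypothetical index values for ten blocks (not real data)
index_values = [0.2, 0.7, 0.9, 1.1, 1.2, 1.3, 1.3, 1.4, 1.4, 1.6]

counts = bin_counts(index_values, bin_width=0.5)
modal_bin = max(counts, key=counts.get)  # left edge of the tallest bar

print(counts)
print(f"most frequent bin starts at {modal_bin}")
```

The tallest bar plays the role of the "most frequent index" described above, while the shorter bars to its left correspond to blocks with below-average concentration.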

In contrast, here are the distributions for the indices of Black/African American, Asian, Hawaiian/Pacific Islander, and Native American race/ethnicities among those same ACS blocks.

![histograms by groups](../../references/figures/hist_by_group.png)

Compared to white Coloradans, each of these groups shows a sharply declining distribution with a long right tail. This signals that most blocks across the state have low concentrations of each of these groups. The high index values in the tail signal that a small number of blocks hold a much larger share of these households.


## Plotting ECDF of Race/Ethnicity Index Values

The empirical cumulative distribution function (ECDF) shows how much of a trait/column/feature's distribution falls at or below a given value. We plotted the ECDF as an additional visual to emphasize the localization of different groups in Colorado. Groups that have high concentrations of households in relatively few blocks will have curves that rise steeply and then approach the limit of 100% very slowly. See Hawaiian/Pacific Islanders, for example. More than 90% of the indexed values are 0, meaning that over 90% of blocks in Colorado have no Hawaiian/Pacific Islander households. Because the average concentration of these households across the state is so low, the blocks that do have them show very large index values (up to 175x the statewide mean concentration). Race/ethnicity groups that are more common across blocks produce a curve that rises more slowly before eventually reaching 100%.

Tips for interpretation:
The y-axis shows the percentage of observations that fall at or below a value. The x-axis shows the index (the concentration of households of the specified group in a block compared to the average concentration in the state).

Example: 90% of blocks have less than ~3x the average concentration of Black/African American households in Colorado.

The overall shape of the curve is important. Curves that rise sharply at or near x=0 imply that a large fraction of blocks have few to no households of the specified group. A curve that rises steadily from bottom left to top right implies a race/ethnicity group that is geographically dispersed and occupies blocks at many different concentrations within the state.
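
As a rough illustration of how the y-axis values are obtained, here is a minimal ECDF sketch using only the standard library. The helper `ecdf` and the ten index values are hypothetical and chosen to mimic a highly localized group; they are not drawn from the ACS data, and this is not necessarily how the figures here were produced.

```python
from bisect import bisect_right


def ecdf(values):
    """Return a function giving the percentage of observations at or below x."""
    ordered = sorted(values)
    n = len(ordered)

    def percent_at_or_below(x):
        # bisect_right counts how many sorted values are <= x
        return 100.0 * bisect_right(ordered, x) / n

    return percent_at_or_below


# Hypothetical index values for ten blocks of a highly localized group (not real data)
index_values = [0, 0, 0, 0, 0, 0, 0, 0.5, 4.0, 25.0]

percent = ecdf(index_values)
print(percent(0))     # share of blocks with no households of the group
print(percent(3))     # share of blocks at or below 3x the reference concentration
print(percent(25.0))  # every block is at or below the maximum
```

With these made-up values the curve jumps to 70% at x=0 and needs the rest of the x-axis to climb the remaining 30%, which is the steep-then-flat shape described above.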


![ecdf by group](../../references/figures/ecdf_by_group.png)

Visualizing these distributions does not, by itself, shed light on any specific inequity. However, very high concentrations of minority communities are often associated with historical practices such as [redlining](https://en.wikipedia.org/wiki/Redlining), and [upward mobility is impacted by one's family geography, especially among Black Americans](https://www.nber.org/papers/w24441.pdf). For the purposes of this analysis, the focus is on geography and access to care among these minority communities. Our hope is that, equipped with this information, individual awareness, giving, and support can be better directed toward communities likely to be facing disparities in health equity.

## Choropleth Mapping




In [2]:
#instantiate the map focused on denver metro but with statewide view
m = folium.Map(location=[39.3324, -105.1420], zoom_start=7)


def add_folium_layer(filepath,
                     filename,
                     target_field,
                     name,
                     gdf_index,
                     json_index,
                     tooltips,
                     map_obj,
                     show=True):
    '''Add a choropleth layer to an existing map from a GeoJSON file.'''
    if len(filepath) == 0 or len(filename) == 0:
        print('no filepath provided or no filename provided')
    else:
        geom = gpd.read_file(filepath + filename)
        # index by the block identifier so rows can be looked up by feature id
        geom = geom.set_index(geom[gdf_index])
        # 8-step color scale spanning the observed range of the target field
        step = cm.linear.PuBuGn_09.to_step(8).scale(geom[target_field].min(),
                                                    geom[target_field].max())
        folium.GeoJson(
            geom.to_json(),
            name=name,
            # cast the feature id to float to match the GeoDataFrame index,
            # then color the block by its value in the target field
            style_function=lambda x: {
                'fillColor': step(geom.loc[float(x[json_index]), target_field]
                                  ),
                'fillOpacity': 0.8,
                'weight': 1.5
            },
            tooltip=folium.features.GeoJsonTooltip(fields=tooltips,
                                                   localize=True),
            show=show).add_to(map_obj)


# one entry per layer:
# (geojson file, index column, layer label, count column for tooltip, shown by default)
layers = [
    ('black_nh_index_high_indexing_geom.geojson', 'black_nh_index',
     'Black/African American High-Indexing Blocks', 'black_nh', False),
    ('asian_nh_index_high_indexing_geom.geojson', 'asian_nh_index',
     'Asian High-Indexing Blocks', 'asian_nh', False),
    ('hawpi_nh_index_high_indexing_geom.geojson', 'hawpi_nh_index',
     'Hawaiian/Pacific Islander High-Indexing Blocks', 'hawpi_nh', False),
    ('ntvam_nh_index_high_indexing_geom.geojson', 'ntvam_nh_index',
     'Native American High-Indexing Blocks', 'ntvam_nh', False),
    ('other_nh_index_high_indexing_geom.geojson', 'other_nh_index',
     '"Other" High-Indexing Blocks', 'other_nh', False),
    ('non_white_nh_index_high_indexing_geom.geojson', 'non_white_nh_index',
     'Non-White Households High-Indexing Blocks', 'non_white_nh', True),
]

for file, target_field, label, pop_field, show in layers:
    #add map layer
    add_folium_layer('../../data/output/', file, target_field, label,
                     'geonum', 'id', [target_field, pop_field, 'pop'], m,
                     show)

# use a clustered view of markers
marker_cluster_medicaid = MarkerCluster(
    name='Healthcare Facilities Accepting Medicaid',
    options={
        'disableClusteringAtZoom': 14
    }).add_to(m)
marker_cluster_non_medicaid = MarkerCluster(
    name='Healthcare Facilities not Accepting Medicaid',
    options={
        'disableClusteringAtZoom': 14
    },
    show=False).add_to(m)

# add a marker for every record in the healthcare facilities likely to address COVID-19
hc_df = gpd.read_file('../../data/output/healthcare.geojson')

def add_facility_markers(facilities, cluster):
    '''add one marker per facility row to the given marker cluster'''
    for _, row in facilities.iterrows():
        # popups are rendered as HTML, so <br> is used for line breaks
        folium.Marker(
            [row['Latitude'], row['Longitude']],
            popup=(f"{row['FAC_NAME']}<br>Phone: {row['PHONE']}"
                   f"<br>MedicaidAccept: {row['MEDICAID']}")
        ).add_to(cluster)


add_facility_markers(hc_df[hc_df['MEDICAID'] == "Y"], marker_cluster_medicaid)
add_facility_markers(hc_df[hc_df['MEDICAID'] == "N"],
                     marker_cluster_non_medicaid)

#add layer toggle
folium.LayerControl().add_to(m)
m
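
The choropleth layers color each block according to which of eight steps its index value falls into, with the steps spanning the minimum to the maximum of the layer's target field. As a rough, library-free sketch of that idea, assuming equal-width steps (the helper `step_bin` is hypothetical and does not attempt to reproduce branca's exact handling of bin edges):

```python
def step_bin(value, vmin, vmax, n_steps=8):
    """Return the 0-based step that `value` falls into between vmin and vmax."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Clamp so out-of-range values land in the first or last step
    clamped = min(max(value, vmin), vmax)
    width = (vmax - vmin) / n_steps
    # The maximum itself belongs to the last step rather than a ninth one
    return min(int((clamped - vmin) / width), n_steps - 1)


# Hypothetical layer whose index values range from 0 to 16
for index_value in (0.0, 1.9, 2.0, 15.9, 16.0, 40.0):
    print(index_value, "->", step_bin(index_value, vmin=0.0, vmax=16.0))
```

One consequence of equal-width steps is that a layer with a long right tail places most blocks in the first step or two, so most of the map appears in the lightest colors.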

### Exporting to HTML


In [3]:
# write the rendered map to a standalone HTML file
m.save("../../data/output/map_html.html")