## Harris County Household and Tank Case Study

### Import Statements

Here we import the libraries and packages used to build the visualizations.

In [1]:
import geopandas as gpd

import cuxfilter
from cuxfilter.layouts import feature_and_five_edge, feature_and_double_base, feature_and_base
import cudf
import numpy as np

import holoviews as hv
import pandas as pd

### Importing Household Distance Data

This is a preprocessed file in which the distance between each household in Harris County and the tanks has already been calculated, in both meters (```distance_m```) and miles (```distance_mi```). The dataframe also records whether each household has children, the age code of the head of household, whether the household includes elderly residents, the latitude and longitude of the tanks and households, the tank type, and the tank diameter.

The ```lat_3857``` and ```lon_3857``` columns hold the coordinates of the points we plot on our cuxfilter dashboard. Rows with a ```distance_category``` of 4 are dropped before plotting.
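The `3857` suffix most likely refers to EPSG:3857 (Web Mercator), the projection used by web map tiles. As a minimal sketch, assuming that reading is correct, the standard spherical formulas below turn longitude and latitude in degrees into Web Mercator meters. The function name `to_web_mercator` and the Houston test point are illustrative assumptions and are not part of the original preprocessing.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used by EPSG:3857


def to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to EPSG:3857 (x, y) in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Approximate coordinates for downtown Houston (illustrative, not from the dataset)
x, y = to_web_mercator(-95.3698, 29.7604)
print(f"x = {x:.0f} m, y = {y:.0f} m")
```

Judging by the sample rows in the output of the next cell, `lat_3857` holds values near -1.06e7, which is the magnitude of the x (easting) value computed here, while `lon_3857` holds values near 3.48e6, which matches y (northing).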

In [2]:
df_harris = pd.read_parquet('/hpc/group/codeplus22-vis/infousa_copy/distances_harris_final.parquet')
df_harris = df_harris[df_harris['distance_category'] != 4]
df_harris

| Unnamed: 0 | has_child | age_code | lat_3857 | lon_3857 | tank_type | diameter | distance_m | distance_mi | distance_category | is_elderly |
|---|---|---|---|---|---|---|---|---|---|---|
| 288 | 2 | I | -1.062712e+07 | 3.477878e+06 | closed_roof_tank | 57.6 | 7996.077996 | 4.968533 | 3 | 2 |
| 291 | 1 | G | -1.062858e+07 | 3.478481e+06 | closed_roof_tank | 57.6 | 8039.808899 | 4.995706 | 3 | 2 |
| 331 | 1 | L | -1.062456e+07 | 3.475956e+06 | closed_roof_tank | 13.2 | 7966.516501 | 4.950164 | 3 | 1 |
| 365 | 2 | M | -1.062815e+07 | 3.478845e+06 | closed_roof_tank | 57.6 | 7582.136264 | 4.711321 | 3 | 1 |
| 376 | 2 | I | -1.062864e+07 | 3.478519e+06 | closed_roof_tank | 57.6 | 8036.693809 | 4.993770 | 3 | 2 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2336530 | 0 | | -1.062105e+07 | 3.488964e+06 | narrow_closed_roof_tank | 4.2 | | 35.000000 | 0 | 0 |
| 2336531 | 0 | | -1.062105e+07 | 3.488936e+06 | narrow_closed_roof_tank | 4.2 | | 35.000000 | 0 | 0 |
| 2336532 | 0 | | -1.062105e+07 | 3.488927e+06 | narrow_closed_roof_tank | 4.8 | | 35.000000 | 0 | 0 |
| 2336533 | 0 | | -1.061700e+07 | 3.488957e+06 | closed_roof_tank | 20.4 | | 35.000000 | 0 | 0 |


### Defining Charts

Below, we define the label maps for the distance, elderly, and children multiselects, along with the list of four colors used for the points on the map.

In [3]:
# Display labels for the coded values in each filter column (0 always marks a tank)
label_map_distance = {0: 'Tank', 1: '0.5 miles away',
                      2: '1 mile away', 3: '5 miles away'}

label_map_elderly = {0: 'Tank', 1: 'Elderly', 2: 'Not Elderly'}

label_map_children = {0: 'Tank', 1: 'Children', 2: 'No Children'}

colors = ['#05c1ff', '#ff0000', '#ff00a4', '#a11aeb']
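
One way to read the distance label map and the color list together is to pair each category code with a color in ascending order. This pairing is an assumption made for illustration: with a linear pixel shade, the chart interpolates aggregated values across the palette, so the rendered color for a given code may not match exactly. The name `legend_guess` is hypothetical.

```python
label_map_distance = {0: 'Tank', 1: '0.5 miles away',
                      2: '1 mile away', 3: '5 miles away'}
colors = ['#05c1ff', '#ff0000', '#ff00a4', '#a11aeb']

# Assumption: the lowest code gets the first color and the highest code the last.
legend_guess = {
    label_map_distance[code]: color
    for code, color in zip(sorted(label_map_distance), colors)
}

for label, color in legend_guess.items():
    print(f"{label:>15}: {color}")
```

If the dashboard legend disagrees with this pairing, the legend is the authority.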

### Transforming to cuxfilter dataframe

We must convert the pandas dataframe into a cuDF dataframe and then wrap it in a cuxfilter DataFrame so that cuxfilter can plot it.

In [4]:
cdf = cudf.DataFrame.from_pandas(df_harris) 

In [5]:
cux_df = cuxfilter.DataFrame.from_dataframe(cdf) 

### Making Cuxfilter Charts

Here, we define the charts. The ```points``` chart is the main map with households and tanks plotted. The points for tanks and households are colored differently by specifying the ```aggregate_col```. We specify the column ```distance_category``` because in this column 0 represents tanks, 1 represents households about 0.5 miles from a tank, 2 represents households about 1 mile away, and 3 represents households up to 5 miles away, matching the labels defined above. Each of these four categories of points gets a different color, as specified by the list of colors above.
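
The preprocessing that produced `distance_category` is not shown in this notebook. The following is a minimal sketch of how such codes could be assigned from `distance_mi`. It assumes cutoffs of 0.5, 1, and 5 miles (taken from the label map defined earlier) and assumes that code 4, which is filtered out when the data is loaded, means farther than 5 miles. The function name and the `is_tank` flag are hypothetical.

```python
from bisect import bisect_left

# Assumed upper bounds, in miles, for category codes 1, 2, and 3
CUTOFFS_MI = [0.5, 1.0, 5.0]


def distance_category(distance_mi, is_tank=False):
    """Return 0 for tanks, otherwise 1-4 depending on the distance band."""
    if is_tank:
        return 0
    # bisect_left counts how many cutoffs are strictly below the distance,
    # so a distance exactly on a cutoff stays in the nearer band.
    return bisect_left(CUTOFFS_MI, distance_mi) + 1


print(distance_category(4.968533))  # a household just under 5 miles from a tank
```

The sample household with a `distance_mi` of 4.968533 carries code 3 in the loaded data, which is consistent with these assumed cutoffs.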

We also define our multiselects and sliders here. The first parameter to each of these functions is the column that the multiselect or slider filters on.

Finally, we define the dashboard, specifying the layout for all of the charts, multiselects, and sliders.

In [6]:
# Judging by the sample values, lat_3857 holds the x (easting) values
# and lon_3857 the y (northing) values in this file.
points = cuxfilter.charts.scatter(
    x='lat_3857', y='lon_3857',
    pixel_shade_type='linear', color_palette=colors,
    aggregate_fn='max', aggregate_col='distance_category',
    tile_provider='CartoDark', legend=True,
    title='Households in Harris County in Close Proximity to Tanks',
    x_range=(-13825798.514061378, -7542228.134036879),
    y_range=(2819963.842141629, 6272600.009501693),
)

distance_category = cuxfilter.charts.multi_select('distance_category', label_map=label_map_distance)

age = cuxfilter.charts.multi_select('is_elderly', label_map=label_map_elderly)

children = cuxfilter.charts.multi_select('has_child', label_map=label_map_children)

distance_slider = cuxfilter.charts.range_slider('distance_mi')
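
The `x_range` and `y_range` values passed to the scatter chart are hard to read as raw numbers. The sketch below assumes they are Web Mercator (EPSG:3857) meters and inverts the standard spherical formulas to express them in degrees. The helper name `from_web_mercator` is hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used by EPSG:3857


def from_web_mercator(x, y):
    """Convert EPSG:3857 (x, y) in meters to (longitude, latitude) in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# Ranges copied from the scatter chart definition
x_range = (-13825798.514061378, -7542228.134036879)
y_range = (2819963.842141629, 6272600.009501693)

west, south = from_web_mercator(x_range[0], y_range[0])
east, north = from_web_mercator(x_range[1], y_range[1])
print(f"longitude: {west:.2f} to {east:.2f}")
print(f"latitude:  {south:.2f} to {north:.2f}")
```

Under these assumptions the ranges span roughly the contiguous United States, so if the chart uses them as its initial extent, the map starts zoomed out well beyond Harris County.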

In [7]:
d = cux_df.dashboard([points, distance_slider], sidebar = [distance_category, age, children], layout = cuxfilter.layouts.feature_and_base, theme = cuxfilter.themes.rapids) 

### Displaying interactive dashboard

Running the commands below displays the dashboard.

In [9]:
d.show()
d.app(sidebar_width=290) # run the dashboard within the notebook cell

Dashboard running at port 54191
