<img src="https://github.com/jupytercon/2020-exactlyallan/raw/master/images/RAPIDS-header-graphic.png" style="width:50%">


# RAPIDS Visualization Guide Notebook
### A Streamlined Guide to RAPIDS Accelerated Visualization and Visual Analytics
This guide walks through using RAPIDS cuDF, cuSpatial, and cuGraph with HoloViews, hvPlot, Datashader, cuxfilter, and Plotly Dash on the publicly available Divvy bike share dataset.

**NOTES and TODO:**
- Base this on the [JupyterCon Notebooks](https://github.com/rapidsai-community/event-notebooks/blob/main/JupyterCon_2020_RAPIDSViz/00%20Index%20and%20Introduction.ipynb), not the cuxfilter tutorial


## Requirements
- System that meets the [RAPIDS system and GPU requirements](https://docs.rapids.ai/install#system-req)


## Dependencies
Use the below to install all the required dependencies via conda:



In [None]:
# NOTE: need to use JupyterLab 3.6.4 (4.0 doesn't work with hvplot)

# imports
import os
from zipfile import ZipFile
from pathlib import Path

import cupy
import cartopy  # NOTE: need to include
import geoviews  # NOTE: need to include

import cudf
import cuspatial
import cugraph
import cuml

from bokeh.models import NumeralTickFormatter
import hvplot.cudf # NOTE: need to include
hvplot.extension('bokeh')
import colorcet
import panel as pn

import cuxfilter


## Load Dataset
The dataset can be downloaded from the [Divvy Bike Share public dataset](https://divvybikes.com/system-data). Use the following script to download the desired date range and load it into a dataframe.


In [None]:
# Define the URL of the Divvy trip data and save dir
S3 = 'https://divvy-tripdata.s3.amazonaws.com/'
DATA_DIR = './data'


In [None]:
# Check dir
Path(DATA_DIR).mkdir(parents=True, exist_ok=True)
'''
# Download the zip files from the URL within date range and unzip
for year in range(2021, 2023):
    for month in range(1, 13):
        file = f'{year}{month:02d}-divvy-tripdata.zip'
        URL = f'{S3}{file}'
        ! wget -P {DATA_DIR} {URL}
     
        with ZipFile(f'{DATA_DIR}/{file}') as zip:
            zip.extractall(f'{DATA_DIR}')
'''
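To download a range that does not line up with whole calendar years, the list of files can be built first and then fed to the download loop. The sketch below is a hypothetical helper (`monthly_trip_urls` is not part of the original notebook) and assumes the `YYYYMM-divvy-tripdata.zip` naming used in the loop above holds for every month requested.

```python
from datetime import date

S3 = "https://divvy-tripdata.s3.amazonaws.com/"  # same bucket URL as above


def monthly_trip_urls(start, end):
    """Return one zip URL per month from `start` to `end` (inclusive).

    Hypothetical helper: the file-name pattern is copied from the download
    loop above and is assumed to hold for every month in the range.
    """
    urls = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        urls.append(f"{S3}{year}{month:02d}-divvy-tripdata.zip")
        # Step to the next month, rolling over into the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return urls


urls = monthly_trip_urls(date(2021, 11, 1), date(2022, 2, 1))
print(len(urls))
print(urls[0])
```

Each URL can then be passed to `wget` or any other downloader in place of the nested year / month loops.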

### cuDF

In [None]:

# Load all csv as dataframes and combine into one cudf
df_array = []

for file in Path(DATA_DIR).rglob('20*.csv'):
    gdf = cudf.read_csv(file)
    df_array.append(gdf)

df = cudf.concat(df_array)

# Check the data
df.reset_index()

## Check and Clean Data

The data seems unreasonably clean, but there are still a few things we can improve. Let's check for blanks and nulls first, then double check the dtypes.


In [None]:
df.isnull().sum()


In [None]:
# Show the rows where 'end_lat' is null
df[df['end_lat'].isnull()]


In [None]:
# drop nulls
df = df.dropna(subset=['end_lat']).reset_index()
df.isnull().sum()

In [None]:
# Replace null values in 'start_station_name' with 'none'
df['start_station_name'] = df['start_station_name'].fillna('none')

# Replace null values in 'end_station_name' with 'none'
df['end_station_name'] = df['end_station_name'].fillna('none')


In [None]:
# Remove any erroneous out-of-area lat / lng values
min_lat = 41.5
max_lat = 42.5
min_lng = -88.0
max_lng = -87.0

df = df[(df['start_lat'] >= min_lat) & (df['start_lat'] <= max_lat) & (df['start_lng'] >= min_lng) & (df['start_lng'] <= max_lng) & 
                (df['end_lat'] >= min_lat) & (df['end_lat'] <= max_lat) & (df['end_lng'] >= min_lng) & (df['end_lng'] <= max_lng)].reset_index(drop=True)

In [None]:
df.dtypes

The 'started_at' and 'ended_at' columns should be proper datetime types.

In [None]:
df['started_at'] = cudf.to_datetime(df['started_at'])
df['ended_at'] = cudf.to_datetime(df['ended_at'])

df

To make things a bit easier, let's break out the date and time into separate columns, assuming we only need to worry about start time.

In [None]:
df['year'] = df['started_at'].dt.year
df['month'] = df['started_at'].dt.month
df['day'] = df['started_at'].dt.day
df['hour'] = df['started_at'].dt.hour

df

Extracting the day of the week would be helpful too.

In [None]:
df['day_of_week'] = df['started_at'].dt.dayofweek

df
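The `dayofweek` values follow the Monday-first convention (Monday is 0, Sunday is 6), which is what the day label maps in this notebook assume. A minimal sketch using only the standard library, with a label dictionary written out here purely for illustration:

```python
from datetime import datetime

# Monday-first labels, written out here for illustration.
DOW = {0: "M", 1: "T", 2: "W", 3: "Th", 4: "F", 5: "Sa", 6: "Su"}

monday = datetime(2021, 1, 4, 8, 30)   # 4 January 2021 was a Monday
sunday = datetime(2021, 1, 10, 17, 0)  # 10 January 2021 was a Sunday

# datetime.weekday() uses the same Monday=0 numbering.
labels = [DOW[d.weekday()] for d in (monday, sunday)]
print(labels)  # ['M', 'Su']
```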

In [None]:
rider_type = df.groupby('member_casual').size().rename("count").reset_index()
rider_type


### hvPlot

In [None]:
rider_type.hvplot.bar(x='member_casual', y='count', title='Total Rider Types', yformatter='%0.0f')

In [None]:
# DOW = {0:'M', 1:'T', 2:'W', 3:'Th', 4:'F', 5:'Sa', 6:'Su'}

day_counts = df.groupby('day_of_week').size().rename('count').reset_index().sort_values('day_of_week')
day_counts.hvplot.bar('day_of_week', 'count', title="Trip starts per Week Day", yformatter="%0.0f")


In [None]:
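# calculate duration in min
df['dur_min'] = (df['ended_at'] - df['started_at'])
# .dt.seconds is only the seconds component (it wraps at one day), so add the whole days back in
df['dur_min'] = (df['dur_min'].dt.days * 1440 + df['dur_min'].dt.seconds / 60).round().astype('float32') #needed for cuML KDE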

df
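One detail worth keeping in mind when turning a timedelta into minutes: the `seconds` attribute holds only the part of the duration left over after whole days are removed. The sketch below shows the difference with the standard library `datetime` module, on the assumption that the dataframe timedelta accessor follows the same semantics; the timestamps are made up.

```python
from datetime import datetime

started = datetime(2021, 6, 1, 9, 0)
ended = datetime(2021, 6, 2, 9, 5)    # one day and five minutes later
elapsed = ended - started

component_min = elapsed.seconds / 60        # seconds component only: 5.0
total_min = elapsed.total_seconds() / 60    # whole duration: 1445.0

print(component_min, total_min)
```

For trips shorter than a day the two values agree, so the distinction only matters for the long tail of multi-day rentals.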

In [None]:
df.hvplot.hist(y='dur_min', bins=20, title="Trip Duration Histogram", yformatter="%0.0f")

In [None]:
# Do some minor cleanup
df = df.drop(['index','ride_id','started_at','ended_at','start_station_id','end_station_id'], axis=1)

### cuML + KDE

In [None]:
# CUML KDE

# start, end, step size
dur_range = cupy.arange(1.0, 200.0, 5.0)

kde = cuml.KernelDensity(kernel='gaussian', bandwidth=1).fit(df['dur_min'])

log_density_values = kde.score_samples(dur_range)
density_values = cupy.exp(log_density_values)

density_df = cudf.DataFrame({'duration': dur_range, 'density': density_values})

density_df.hvplot.line(x='duration', y='density', xlabel='Data', ylabel='Density', title='Duration in min KDE')
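For intuition about what `score_samples` returns, here is a small pure-Python sketch of a Gaussian kernel density estimate. It is not the cuML implementation, only the textbook formula: each sample contributes a Gaussian bump of width `bandwidth`, and the log of the averaged bumps is returned, which is why the cell above applies `exp` before plotting. The durations are made up.

```python
import math


def gaussian_log_density(samples, x, bandwidth=1.0):
    """Log of the Gaussian kernel density estimate at point x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return math.log(norm * total)


durations = [5.0, 6.0, 7.0, 30.0]   # made-up trip durations in minutes
grid = [1.0, 6.0, 11.0, 30.0]       # points at which to evaluate the density

# Exponentiate the log densities to get densities that can be plotted.
density = [math.exp(gaussian_log_density(durations, x)) for x in grid]
print(density)
```

With a bandwidth of 1 minute the estimate is very spiky, so a larger bandwidth may be worth trying if the plotted curve looks noisy.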


In [None]:
df.loc[df['dur_min'].argsort().tail(5)]

In [None]:
trips_by_hour = df.groupby('hour').size().rename('count').reset_index().sort_values('hour')

avg_duration_by_hour = df.groupby('hour')['dur_min'].mean().rename('duration_mean').reset_index().sort_values('hour')

trips_by_hour.hvplot.bar('hour', 'count', title="Trip starts, per hour", yformatter="%0.0f") + avg_duration_by_hour.hvplot.bar('hour', 'duration_mean', title="Trip duration, per hour", yformatter="%0.0f") 

In [None]:
trips_by_hour_300 = df[df['dur_min'] <= 300].groupby('hour').size().rename('count').reset_index().sort_values('hour')

avg_duration_by_hour_300 = df[df['dur_min'] <= 300].groupby('hour')['dur_min'].mean().rename('duration_mean').reset_index().sort_values('hour')

trips_by_hour_300.hvplot.bar('hour', 'count', title="Trip starts, per hour", yformatter="%0.0f") + avg_duration_by_hour_300.hvplot.bar('hour', 'duration_mean', title="Trip duration, per hour", yformatter="%0.0f") 

In [None]:
# group data by day_of_week and hour, count the number of rows in each group
heatmap_data_dw = df.groupby(['day_of_week','hour']).size().rename('count').reset_index()
heatmap_data_dw.hvplot.heatmap(x='day_of_week', y='hour', C='count') 

In [None]:
# group data by month, day_of_week and hour, count the number of rows in each group
heatmap_data_dwm = df.groupby(['month','day_of_week','hour']).size().rename('count').reset_index()
heatmap_data_dwm.hvplot.heatmap(x='day_of_week', y='hour', C='count', groupby='month', widget_location='left_top')

In [None]:
df.hvplot.hexbin(x='start_lng', y='start_lat', cmap=colorcet.bgy, geo=True, tiles="OSM", logz=False, gridsize=150, width=700, height=600) + df.hvplot.hexbin(x='end_lng', y='end_lat', geo=True, cmap=colorcet.bgy, tiles="OSM", logz=False, gridsize=150, width=700, height=600)

Compared against the [Divvy system map](https://account.divvybikes.com/map), the lat / lng values seem to be accurate.
But this seems like a lot of start / stop places, so let's see if we can identify stations.

In [None]:
df['start_station_name'].unique()


In [None]:
df['start_lat'].round(4).unique()


There are clearly many more starting points than stations, so bikes must not have to start and stop at a station. We will have to find a way to bin the start and stop locations into a reasonable number of groups.
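One simple way to see how binning reduces the number of distinct locations is to snap coordinates to a fixed grid by rounding. This is only an illustrative sketch with made-up points, and it is a cruder approach than the k-means clustering used later in this notebook.

```python
from collections import Counter

# Made-up coordinates: three trips near one dock, one trip elsewhere.
points = [
    (41.88012, -87.63001),
    (41.88019, -87.63004),
    (41.87991, -87.62996),
    (41.91500, -87.65000),
]


def grid_bins(coords, decimals):
    """Count points per grid cell after rounding to `decimals` places."""
    return Counter((round(lat, decimals), round(lng, decimals)) for lat, lng in coords)


fine = grid_bins(points, 5)     # every point lands in its own cell
coarse = grid_bins(points, 3)   # nearby points collapse into one cell
print(len(fine), len(coarse))   # 4 2
```

A fixed grid ignores where the stations actually are, which is one reason a clustering method can give more natural groups.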

In [None]:
bike_type = df.groupby('rideable_type').size().rename('count').reset_index()
bike_type.hvplot.bar(x='rideable_type', y='count', title='Total Bike Types', yformatter='%0.0f')

## cuSpatial

In [None]:
# Create a cuSpatial GeoSeries from the latitude and longitude columns
start_points = cuspatial.GeoSeries.from_points_xy(df[['start_lng','start_lat']].interleave_columns().astype("float64"))
end_points = cuspatial.GeoSeries.from_points_xy(df[['end_lng','end_lat']].interleave_columns().astype("float64"))


In [None]:
distances_in_km = cuspatial.haversine_distance(start_points, end_points)
distances_in_km
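To sanity-check a few of the values above, the haversine formula can be written out in plain Python. This is a minimal sketch, not the cuSpatial implementation; the Earth radius constant is an assumption of the sketch and may differ slightly from the one the library uses.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lng1, lat1, lng2, lat2):
    """Great-circle distance in kilometers between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


same_spot = haversine_km(-87.63, 41.88, -87.63, 41.88)
one_degree_north = haversine_km(-87.63, 41.0, -87.63, 42.0)
print(same_spot, round(one_degree_north, 1))
```

A trip that starts and ends at the same coordinates has a distance of zero, which is how round trips show up in the distance column.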

In [None]:
# add the distances back into the dataframe, rounding values to make it more obvious if a trip stopped at the same place it started
dist_m = cudf.Series(distances_in_km).values * 1000
df['dist_m'] = dist_m.round().astype('int32')
df

In [None]:
df.hvplot.hist(y='dist_m', by='rideable_type', bins=80, title="Trip Distance By Type", yformatter="%0.0f") + df[df['dist_m'] > 0].hvplot.hist(y='dist_m', by='rideable_type', bins=80, title="Trip Distance By Type (W/O Returns)", yformatter="%0.0f")

### cuxfilter Crossfilter

In [None]:
# FIX-NOTE: adding extension here explicitly RELOADS bokeh and all plots will work
hvplot.extension('bokeh')

In [None]:
# Specify the charts and widgets to use with the selected columns of data and string maps
cux_df = cuxfilter.DataFrame.from_dataframe(df)

#creating a label map for days of week strings
days_of_week_map = {
    0: 'M',
    1: 'T',
    2: 'W',
    3: 'Th',
    4: 'F',
    5: 'Sa',
    6: 'Su',
    7: 'Unknown'
}


charts = [
    cuxfilter.charts.bar('day', title='Trips per Day'),
    cuxfilter.charts.bar('dist_m', data_points=20 , title='Distance in M'),
    cuxfilter.charts.bar('dur_min', data_points=20 , title='Duration in Min'),
    cuxfilter.charts.bar('day_of_week', x_label_map=days_of_week_map, title='Day of Week'),
    cuxfilter.charts.bar('hour', title='Trips per Hour')
]


widgets = [
            cuxfilter.charts.multi_select('year')
            #cuxfilter.charts.multi_select('member_casual'),
            #cuxfilter.charts.multi_select('rideable_type')
]

# Generate the dashboard and select a layout
d = cux_df.dashboard(charts, sidebar=widgets, layout=cuxfilter.layouts.feature_and_quad_base, theme=cuxfilter.themes.rapids, title='Bike Trips Dashboard')

# Update the yaxis ticker to an easily readable format
for i in charts:
    if hasattr(i.chart, 'yaxis'):
        i.chart.yaxis.formatter = NumeralTickFormatter(format="0,0")



# d.show() opens a separate dashboard, await d.preview() generates an inline image preview, d.app() shows the app inline
d.show()


### hvPlot + Datashader + Panel

In [None]:
start_elec = df[df['rideable_type'] == 'electric_bike'].hvplot.points(x='start_lng', y='start_lat', geo=True, tiles="CartoDark", width=700, height=500, datashade=True, dynspread=True, title="electric starts") 
end_elec = df[df['rideable_type'] == 'electric_bike'].hvplot.points(x='end_lng', y='end_lat', geo=True, tiles="CartoDark", width=700, height=500, datashade=True, dynspread=True, title="electric stops") 
elec_row = pn.Row(start_elec, end_elec)
elec_row 

In [None]:

# Get the unique named end stations (used later for the station count)
end_stations = df[df['end_station_name'] != 'none']
unique_stations = end_stations.drop_duplicates(subset='end_station_name')


In [None]:
# Overlay station points with bike points, using end points since they are more dispersed
raster = df.hvplot.points(x='end_lng', y='end_lat', geo=True, tiles='CartoDark', projection=cartopy.crs.GOOGLE_MERCATOR, hover=True, width=700, height=500, rasterize=True) #note: raster does not aggregate with datashader
station_points = unique_stations.hvplot.points(x='end_lng', y='end_lat', geo=True, tiles=False, projection=cartopy.crs.GOOGLE_MERCATOR, hover=False, width=700, height=500, color='red', alpha=0.5)

raster * station_points

In [None]:
# TEMP
# save to file
df.to_parquet('./data/bike_df_clean.parquet') 


## cuML + Kmeans

In [None]:
df = cudf.read_parquet('./data/bike_df_clean.parquet')

# combine all lat values 
lat_df = cudf.DataFrame()
lat_df['lat'] = cudf.concat([df['start_lat'], df['end_lat']], ignore_index=True)

# combine all lng values
lng_df = cudf.DataFrame()
lng_df['lng'] = cudf.concat([df['start_lng'], df['end_lng']], ignore_index=True)

# create combined lat lng 
combined_lat_lng_df = cudf.concat([lat_df, lng_df], axis=1)

combined_lat_lng_df


In [None]:
# Perform k-means clustering, from the approximate station count with a bit of headroom
kmeans = cuml.cluster.KMeans(n_clusters=unique_stations.shape[0]+20, oversampling_factor=1.5, max_iter=200)
kmeans.fit(combined_lat_lng_df)

# Get the cluster labels
cluster_labels = kmeans.labels_

cluster_labels

In [None]:
# Get the edge list from the computed clusters
half_length = len(cluster_labels) // 2

edge_list_df = cudf.DataFrame({
    'src': cluster_labels[:half_length].reset_index(drop=True).astype('int16'),
    'dst': cluster_labels[half_length:].reset_index(drop=True).astype('int16')
})

edge_list_df
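The split at `half_length` works because the coordinates were concatenated as all start points followed by all end points, in the same trip order. A small sketch with made-up labels shows how the two halves pair up into one edge per trip:

```python
# Made-up cluster labels for three trips: the first half belongs to the start
# points and the second half to the end points, in the same trip order.
start_labels = [4, 4, 9]
end_labels = [9, 2, 4]
cluster_labels = start_labels + end_labels

half_length = len(cluster_labels) // 2

# Pair the i-th start label with the i-th end label to get one edge per trip.
edges = list(zip(cluster_labels[:half_length], cluster_labels[half_length:]))
print(edges)  # [(4, 9), (4, 2), (9, 4)]
```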

In [None]:
# Get the cluster centers or Nodes
node_centers_df = kmeans.cluster_centers_

node_centers_df = node_centers_df.rename(columns={0: 'node_lat', 1: 'node_lng'}).astype('float32')

node_centers_df

In [None]:
# combine
df = cudf.concat([df, edge_list_df], axis=1)


In [None]:
# Verify the clustering worked by overlaying with the previous datashader chart
cluster_map = node_centers_df.hvplot.points(x='node_lng', y='node_lat', geo=True, tiles=False, projection=cartopy.crs.GOOGLE_MERCATOR, hover=False, width=800, height=600, color='blue', alpha=0.8)

raster * cluster_map * station_points  # Note: only specify geo=True once, otherwise the map tiles overlay the data of the second chart

# Note: looks pretty good, within about a 2 block tolerance (purple = good)

In [None]:
# save to file
df.to_parquet('./data/bike_df_clean.parquet') 

## Clean up and Save

In [None]:
# Starting to get a bit messy, let's clean it up
df_dur = df['dur_min'].astype('int16')

df_geo = df[['start_lat','start_lng','end_lat','end_lng']].astype('float32')
df = df.drop(['start_lat','start_lng','end_lat','end_lng','dur_min'], axis=1)
df = cudf.concat([df,df_dur,df_geo], axis=1)

# check dtypes
df.dtypes

### cuxfilter GeoSpatial

In [None]:
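from pyproj import Transformer

# Note: don't run this cell twice, it overwrites the columns and would re-project already projected values
transform_4326_to_3857 = Transformer.from_crs('epsg:4326', 'epsg:3857')

# With the default axis order, EPSG:4326 input is (lat, lng) and the output is (x, y) in meters,
# so after this line 'end_lat' holds the x (easting) and 'end_lng' holds the y (northing) values
df['end_lat'], df['end_lng'] = transform_4326_to_3857.transform(df['end_lat'].values_host, df['end_lng'].values_host)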
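For reference, the EPSG:4326 to EPSG:3857 conversion is the spherical Web Mercator projection, which can be sketched in a few lines of plain Python. This is an illustration of the formula rather than what pyproj does internally, and the function name is hypothetical.

```python
import math

R = 6378137.0  # WGS 84 semi-major axis in meters, as used by EPSG:3857


def to_web_mercator(lat, lng):
    """Project EPSG:4326 degrees to EPSG:3857 meters; returns (x, y)."""
    x = R * math.radians(lng)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


x, y = to_web_mercator(41.88, -87.63)  # roughly downtown Chicago
print(round(x), round(y))
```

The output is in meters, with x coming from the longitude and y from the latitude, which is worth remembering when choosing the x and y columns of a map chart.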

In [None]:
G = cugraph.Graph() 
G.from_cudf_edgelist(edge_list_df, source='src', destination='dst')
edges = G.edges()


In [None]:
ITERATIONS=600
THETA=5.0
OPTIMIZE=True

# Using the previously created edge list, we calculate the FA2 layout positions here
trips_FA_df = cugraph.layout.force_atlas2(G, 
                    max_iter=ITERATIONS,
                    strong_gravity_mode=True,
                    outbound_attraction_distribution=False,
                    lin_log_mode=False,
                    barnes_hut_optimize=OPTIMIZE, 
                    barnes_hut_theta=THETA,
                    verbose=False)

trips_FA_df

In [None]:
graph_df = trips_FA_df.merge(
                df,
                left_on='vertex',
                right_on='dst',
                suffixes=('', '_original'))

graph_df

In [None]:
# FIX-NOTE: adding extension here explicitly RELOADS bokeh and all plots will work
hvplot.extension('bokeh')

In [None]:
# Specifying a graph chart type will use Datashader and its required parameters
cx_df = cuxfilter.DataFrame.load_graph((graph_df, edges))

graph = cuxfilter.charts.graph(
      edge_source='src', 
      edge_target='dst',
      node_x='x',
      node_y='y',
      unselected_alpha=0.2,
      edge_color_palette=['gray', 'black'],
      node_pixel_shade_type='linear',
      edge_transparency=0.2, 
      title='ForceAtlas2 Layout Graph'
  )

scatter = cuxfilter.charts.scatter(
        x='end_lat',
        y='end_lng',
        unselected_alpha=0.1,
        pixel_shade_type='eq_hist',
        tile_provider='CartoDark', 
        title='End Locations'
    )

bar1 = cuxfilter.charts.bar('dur_min', data_points=20 , title='Duration in Min')
bar2 = cuxfilter.charts.bar('hour', title='Trips per Hour')
bar3 = cuxfilter.charts.bar('day_of_week', title='Day of Week')
bar4 = cuxfilter.charts.bar('month', title='Trips per Month')
table1 = cuxfilter.charts.view_dataframe(['start_station_name','end_station_name'], drop_duplicates=True)

layout_array = [[1,1,1,2,2],
                [3,4,5,6,7]]

# Generate the dashboard, select a layout and theme
d = cx_df.dashboard([graph,scatter,bar1,bar2,bar3,bar4,table1], layout_array=layout_array, theme=cuxfilter.themes.rapids, title='Geospatial Trips')

# Update the yaxis ticker to an easily readable format

for i in [bar1, bar2, bar3, bar4]:
    if hasattr(i.chart, 'yaxis'):
        i.chart.yaxis.formatter = NumeralTickFormatter(format="0,0")
        
d.show()


### Plotly Dash

In [None]:
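# Imports for the Dash app
from jupyter_dash import JupyterDash
from dash import dcc, html
from dash.dependencies import Input, Output
import plotly.express as px
import plotly.graph_objects as go

# NOTE: `trips`, `stations`, `day_type_map`, and `time_of_day_map` are used below but are not defined in this notebook yet
app = JupyterDash(__name__)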

app.layout = html.Div([
    html.Div([
        html.H3(["Divvy Bikeshare Chicago"]),
        html.H5(["Total Selected Trips:"]),
        dcc.Loading(
            dcc.Graph(id = 'number', figure = go.Figure(go.Indicator(mode = "number", value = trips.shape[0])),
            style = {
            'height': '250px'
            }),
            color = '#b0bec5'
        ),
        html.H5(["Day of Week:"]),
        dcc.Dropdown(id = 'day', clearable = False, value = '',
            options = [{'label': day_type_map[c],'value': c} for c in day_type_map]
        ),
        html.H5(["Time of Day:"]),
        dcc.Dropdown(id = 'time', clearable = False, value = '',
            options = [{'label': time_of_day_map[c], 'value': c} for c in time_of_day_map]
        )],
        style = {
            'z-index' : '99',
            'position': 'absolute',
            'width': '15%',
            'height': 'calc(100% - 2em)',
            'padding': '1em 2em',
            'background-color': '#aabacc',
            'color': 'rgb(70, 105, 130)',
            'box-shadow': '5px 0px 3px 0px rgba(0,0,0,0.1)'
        }
    ),
    html.Div([
        html.Div([
            html.H5(["Station Importance PageRank(Color) by Trips(Size)"]),
            dcc.Graph(id = 'pagerank_plot',
                config = {'responsive': True, 'modeBarButtonsToRemove': ['select2d', 'lasso2d']}
            )
        ],
        style = {
            'display': 'inline-block',
            'width': '100%',
            'vertical-align': 'top'
        }),
        html.Div([
            html.H5(["Total Trips Per Week (2014-2017)"]),
            dcc.Graph(id = 'all_time_week_bar',
                config = {'responsive': True, 'modeBarButtonsToRemove': ['zoom2d', 'zoomIn2d', 'zoomOut2d']}
            )
        ],
        style = {
            'display': 'inline-block',
            'width': '100%'
        })
    ],
    style = {
        'width': 'calc(80% - 6em)',
        'height': 'auto',
        'margin-left': 'calc(15% + 6em)',
        'padding-top': '2em',
        'display': 'inline-block',
        'vertical-align': 'top',
        'color': '#aabacc'
    })
],
style = {
    'position': 'relative',
    'border-bottom': '2px solid #aabacc'
})


## Define Function to Generate Plots with Plotly Express
Next, let's define the functions to build our two charts and link them to our data:

In [None]:
# Geospatial bubble chart based on Page Rank and Trip data
def get_pagerank_plot(data):
    df = calculate_page_rank(data).to_pandas()
    g = px.scatter_mapbox(df, lat="lat", lon="lon", color="pagerank", size="total_trips",
                          hover_data=["station_name"], mapbox_style="carto-positron",
                          color_continuous_scale=px.colors.cyclical.Edge_r, size_max=15, zoom=10,
                          height=700
                         )
    g.layout['uirevision'] = True
    return g

# Bar chart based on total trips over weeks
def get_week_bar_chart(data):
    all_time_week_df = data.groupby('all_time_week').size().reset_index()
    all_time_week_df.columns = ['week', 'trips']
    g = px.bar(all_time_week_df.to_pandas(), 
               x="week", y='trips', template=dict(layout={'selectdirection': 'h',}), 
               height=300
              )
    g.layout['dragmode']='select'
    g.layout['uirevision'] = True
    return g

## Define Function to Calculate Page Rank
Because Plotly Dash applications are hosted through a Python backend, the web-based charts are able to call custom Python functions. Let's use this feature and the speed of cuGraph to calculate new PageRank scores based on a user's selection:

In [None]:
def calculate_page_rank(data):
    G = cugraph.Graph()
    G.from_cudf_edgelist(data, source='from_station_id', destination='to_station_id')
    data_page = cugraph.pagerank(G)
    return data_page.merge(stations, left_on='vertex', right_on='station_id').drop(columns=['vertex'])
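For readers unfamiliar with PageRank, the idea can be sketched as a plain-Python power iteration over an edge list. This is a toy version for intuition only, not how cuGraph computes it; the damping factor of 0.85 is the commonly used value and is an assumption here, and the trips are made up.

```python
def pagerank(edges, alpha=0.85, max_iter=200, tol=1e-9):
    """Power-iteration PageRank over a list of directed (src, dst) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(max_iter):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - alpha + alpha * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        # Each node passes a damped, equal share of its rank along every out edge.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += alpha * rank[src] / len(out_links[src])
        converged = sum(abs(new_rank[n] - rank[n]) for n in nodes) < tol
        rank = new_rank
        if converged:
            break
    return rank


# Made-up trips between three stations; station 3 receives the most trips.
sample_trips = [(1, 3), (2, 3), (3, 1)]
scores = pagerank(sample_trips)
print(scores)
```

Stations that many trips flow into, especially from other well-visited stations, end up with the highest scores.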

## Define Events and Callbacks
Here we define what happens when a user interacts with chart selections through [Dash callbacks](https://dash.plotly.com/basic-callbacks):

In [None]:
def bar_selection_to_query(selection, column):
    """
    Compute pandas query expression string for selection callback data
    Args:
        selection: selectedData dictionary from Dash callback on a bar trace
        column: Name of the column that the selected bar chart is based on
    Returns:
        String containing a query expression compatible with DataFrame.query. This
        expression will filter the input DataFrame to contain only those rows that
        are contained in the selection.
    """
    point_inds = [p['label'] for p in selection['points']]
    xmin = min(point_inds)  # bin_edges[min(point_inds)]
    xmax = max(point_inds) + 1  # bin_edges[max(point_inds) + 1]
    xmin_op = "<="
    xmax_op = "<="
    return f"{xmin} {xmin_op} {column} and {column} {xmax_op} {xmax}"

# Define callback to update graph, id ties plot code to layout
@app.callback(
    [
        Output('pagerank_plot', 'figure'),
        Output('all_time_week_bar', 'figure'),
        Output('number', 'figure')
    ],
    [
        Input("day", "value"), Input("time", "value"),
        Input("all_time_week_bar", "selectedData")
    ]
)
def update_figure(day, time, selected_weeks):
    query = ['day_type == '+str(day) if day != "" else "", 'time_of_day =='+str(time) if time != "" else ""]
    query_str = ' and '.join([x for x in query if x != ""])
    
    data = trips
    if len(query_str) > 0:
        data = trips.query(query_str)

    week_bar_chart = get_week_bar_chart(data)
    
    if selected_weeks is not None:
        query.append(bar_selection_to_query(selected_weeks, 'all_time_week'))
        query_str = ' and '.join([x for x in query if x != ""])
        if len(query_str) > 0:
            data = trips.query(query_str)
    
    pagerank_plot = get_pagerank_plot(data)
    
    number = go.Figure(go.Indicator(
                mode="number",
                value=data.shape[0]
            ))

    return pagerank_plot, week_bar_chart, number
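To see what `bar_selection_to_query` produces, the sketch below repeats its logic so it can run on its own and feeds it a hypothetical `selectedData` payload. Only the `label` key of each point is read by the helper, so the other keys Dash may send are left out.

```python
def bar_selection_to_query(selection, column):
    """Same logic as the helper above, repeated so the example runs alone."""
    point_inds = [p["label"] for p in selection["points"]]
    xmin = min(point_inds)
    xmax = max(point_inds) + 1
    return f"{xmin} <= {column} and {column} <= {xmax}"


# Hypothetical selectedData payload with three selected bars.
selected = {"points": [{"label": 12}, {"label": 13}, {"label": 14}]}

query_str = bar_selection_to_query(selected, "all_time_week")
print(query_str)  # 12 <= all_time_week and all_time_week <= 15
```

The resulting string is what gets joined with the day and time filters and passed to `query` in the callback.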

## Start the Plotly Dash Visualization
Now that we have defined everything, let's run the application:

In [None]:
# NOTE: If you are running in a JupyterHub environment, run the below command:
# JupyterDash.infer_jupyter_proxy_config()

# NOTE: For Jupyterlab run: 
# app.run_server(mode="jupyterlab")

# NOTE: To run inline with a notebook (NOT recommended): 
# app.run_server(mode="inline")

# NOTE: To run as a separate tab, run the line below and then click on the link (recommended):
app.run_server(debug=False)


## To Do Outline
- cuxfilter FA graph w/ end points 
- cuGraph PageRank leave, PageRank arrive
- Add Plotly Dash chart

Issues:
- JupyterLab needs to be 3.6.4
- FIX: use latest cuxfilter so can filter by string category
- FIX: cuxfilter / hvplot bokeh.js assets bug
- Issue: graph OOMing somehow (bug in cuxfilter precision) + optimized node / edges with unique

## Conclusion