<img src="https://github.com/jupytercon/2020-exactlyallan/raw/master/images/RAPIDS-header-graphic.png" style="width:50%">


# RAPIDS Visualization Guide Notebook
### A Streamlined Guide to RAPIDS Accelerated Visualization and Visual Analytics
This guide walks through using RAPIDS cuDF, cuSpatial, cuML, and cuGraph together with hvPlot, Datashader, cuxfilter, and Plotly Dash on the publicly available [Divvy Bike share dataset](https://divvybikes.com/system-data).

## Requirements
- System that meets the [RAPIDS system and GPU requirements](https://docs.rapids.ai/install#system-req)

## Dependencies
Use the below to install all the required dependencies via conda:



In [None]:
# NOTE: Jupyter lab 3.6.4 and RAPIDS 23.06

# imports
import os
from zipfile import ZipFile
from pathlib import Path
import math

# rapids
import cudf
import cuspatial
import cugraph
import cuml
import cuxfilter
import cupy

# holoviz
from bokeh.models import NumeralTickFormatter
import hvplot.cudf 
hvplot.extension('bokeh')
import colorcet
import panel as pn
import geoviews
import cartopy

# plotly
import plotly.graph_objects as go
import plotly.express as px
from jupyter_dash import JupyterDash
from dash import Dash, html, dcc, callback, Output, Input
from dash.exceptions import PreventUpdate



## Load Dataset
The dataset can be downloaded from the [Divvy Bike Share public dataset](https://divvybikes.com/system-data). Use the following script to download the desired date range and load it into a dataframe.


In [None]:
# Define the URL of the Divvy trip data and save dir
S3 = 'https://divvy-tripdata.s3.amazonaws.com/'
DATA_DIR = './data'


In [None]:
# Check dir
Path(DATA_DIR).mkdir(parents=True, exist_ok=True)

'''
# Download the zip files from the URL within date range and unzip
# NOTE: 2021 + 2022 dataset is over 11M trips, which requires at least a 24GB GPU
for year in range(2021, 2023):
    for month in range(1, 13):
        file = f'{year}{month:02d}-divvy-tripdata.zip'
        URL = f'{S3}{file}'
        ! wget -P {DATA_DIR} {URL}
     
        with ZipFile(f'{DATA_DIR}/{file}') as zip:
            zip.extractall(f'{DATA_DIR}')
'''
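
The monthly archives follow a fixed `YYYYMM-divvy-tripdata.zip` naming pattern, which is what the download loop above relies on. A minimal sketch for previewing the URLs before downloading anything is shown here. The helper name `divvy_zip_urls` is hypothetical, and it only builds strings; it does not check that each archive actually exists on the server.

```python
S3 = 'https://divvy-tripdata.s3.amazonaws.com/'

def divvy_zip_urls(first_year, last_year, base_url=S3):
    """Build the monthly archive URLs for first_year..last_year inclusive (hypothetical helper)."""
    return [
        f'{base_url}{year}{month:02d}-divvy-tripdata.zip'
        for year in range(first_year, last_year + 1)
        for month in range(1, 13)
    ]

urls = divvy_zip_urls(2021, 2022)
print(len(urls))   # number of monthly archives in the range
print(urls[0])     # first archive URL
```

Listing the URLs first makes it easier to trim the date range when GPU memory is limited.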

### cuDF

In [None]:
# Load all csv as dataframes and combine into one cudf
df_array = []

for file in Path(DATA_DIR).rglob('20*.csv'):
    gdf = cudf.read_csv(file)
    df_array.append(gdf)

df = cudf.concat(df_array)

# Check the data
df.reset_index()

## Check and Clean Data

The data seems unreasonably clean, but there are still a few things we can improve. Let's check for blanks and nulls first.

In [None]:
df.isnull().sum()


In [None]:
# Show the rows where 'end_lat' is null
df[df['end_lat'].isnull()]


In [None]:
# drop nulls
df = df.dropna(subset=['end_lat']).reset_index()
df.isnull().sum()

In [None]:
# Replace null values in 'start_station_name' with 'none'
df['start_station_name'] = df['start_station_name'].fillna('none')

# Replace null values in 'end_station_name' with 'none'
df['end_station_name'] = df['end_station_name'].fillna('none')


In [None]:
# Remove any erroneous out-of-area lat/lng values
min_lat = 41.5
max_lat = 42.5
min_lng = -88.0
max_lng = -87.0

df = df[(df['start_lat'] >= min_lat) & (df['start_lat'] <= max_lat) & (df['start_lng'] >= min_lng) & (df['start_lng'] <= max_lng) & 
                (df['end_lat'] >= min_lat) & (df['end_lat'] <= max_lat) & (df['end_lng'] >= min_lng) & (df['end_lng'] <= max_lng)].reset_index()

In [None]:
df.dtypes

The 'started_at' and 'ended_at' columns should be proper datetime types.

In [None]:
df['started_at'] = cudf.to_datetime(df['started_at'])
df['ended_at'] = cudf.to_datetime(df['ended_at'])

df

To make things a bit easier, let's break out the date and time into separate columns, assuming we only need to worry about the start time.

In [None]:
df['year'] = df['started_at'].dt.year
df['month'] = df['started_at'].dt.month
df['day'] = df['started_at'].dt.day
df['hour'] = df['started_at'].dt.hour

df

Extracting the day of the week would be helpful too.

In [None]:
df['day_of_week'] = df['started_at'].dt.dayofweek

df
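
The `day_of_week` values follow the pandas convention, where Monday is 0 and Sunday is 6. As a CPU-side sanity check, the same breakdown can be sketched with the standard library. The function name and the sample timestamp below are made up for illustration, and the `'%Y-%m-%d %H:%M:%S'` format is an assumption about how the CSV stores times.

```python
from datetime import datetime

def split_start_time(timestamp):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string and break it into its parts."""
    started = datetime.strptime(timestamp, '%Y-%m-%d %H:%M:%S')
    return {
        'year': started.year,
        'month': started.month,
        'day': started.day,
        'hour': started.hour,
        'day_of_week': started.weekday(),  # Monday=0 ... Sunday=6
    }

parts = split_start_time('2021-01-23 16:14:19')  # made-up timestamp, a Saturday
print(parts)
```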

In [None]:
rider_type = df.groupby('member_casual').size().rename("count").reset_index()
rider_type


### hvPlot

In [None]:
rider_type.hvplot.bar(x='member_casual', y='count', title='Total Rider Types', yformatter='%0.0f')

In [None]:
# DOW = {0:'M', 1:'T', 2:'W', 3:'Th', 4:'F', 5:'Sa', 6:'Su'}

day_counts = df.groupby('day_of_week').size().rename('count').reset_index().sort_values('day_of_week')
day_counts.hvplot.bar('day_of_week', 'count', title="Trip starts per Week Day", yformatter="%0.0f")


In [None]:
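# Calculate trip duration in minutes
df['dur_min'] = (df['ended_at'] - df['started_at'])
# .dt.seconds excludes whole days, so add the .dt.days component to cover multi-day trips
df['dur_min'] = (df['dur_min'].dt.days * 1440 + df['dur_min'].dt.seconds / 60).round().astype('float32') #needed for cuML KDE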

df
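
When working with durations, keep in mind that a timedelta's `seconds` field covers only the part of the interval below one day; whole days are stored in a separate field. The standard-library sketch below illustrates the difference on a made-up trip that lasts one day and thirty minutes. The function name `duration_minutes` is hypothetical.

```python
from datetime import datetime, timedelta

def duration_minutes(started_at, ended_at):
    """Return the full trip duration in whole minutes, including whole days."""
    elapsed = ended_at - started_at
    return round(elapsed.total_seconds() / 60)

start = datetime(2021, 6, 1, 8, 0, 0)          # made-up trip start
end = start + timedelta(days=1, minutes=30)    # made-up trip end

elapsed = end - start
seconds_only = elapsed.seconds // 60   # the days component is ignored here
full = duration_minutes(start, end)    # the days component is included here
print(seconds_only, full)
```

The two numbers agree for any trip shorter than 24 hours and diverge only for multi-day rentals.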

In [None]:
df.hvplot.hist(y='dur_min', bins=20, title="Trip Duration Histogram", yformatter="%0.0f")

In [None]:
# Do some minor cleanup
df = df.drop(['index','ride_id','started_at','ended_at','start_station_id','end_station_id'], axis=1).reset_index()

### cuML + KDE

In [None]:
# CUML KDE

# start, end, step size
dur_range = cupy.arange(1.0, 200.0, 5.0)

kde = cuml.KernelDensity(kernel='gaussian', bandwidth=3).fit(df['dur_min'])

log_density_values = kde.score_samples(dur_range)
density_values = cupy.exp(log_density_values)

density_df = cudf.DataFrame({'duration': dur_range, 'density': density_values})

density_df.hvplot.line(x='duration', y='density', xlabel='Data', ylabel='Density', title='Duration in min KDE')
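
For intuition about what the KDE cell computes, here is a minimal pure-Python sketch of a one-dimensional Gaussian kernel density estimate. It returns a log density, which is then exponentiated, mirroring the `score_samples` / `exp` pattern above. The sample durations and the function name are made up, and this is not meant to reproduce cuML's implementation detail for detail.

```python
import math

def gaussian_kde_log_density(samples, x, bandwidth):
    """Log of the Gaussian kernel density estimate at point x."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    if total == 0.0:
        return float('-inf')  # x is too far from every sample for float precision
    return math.log(total / norm)

durations = [5.0, 7.0, 8.0, 12.0, 30.0]   # made-up trip durations in minutes
grid = [1.0, 6.0, 11.0, 16.0]             # points at which to evaluate the density
densities = [math.exp(gaussian_kde_log_density(durations, x, bandwidth=3.0)) for x in grid]
print(densities)
```

A larger bandwidth smooths the curve, while a smaller one follows individual samples more closely.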


In [None]:
df.loc[df['dur_min'].argsort().tail(5)]

In [None]:
trips_by_hour = df.groupby('hour').size().rename('count').reset_index().sort_values('hour')

avg_duration_by_hour = df.groupby('hour')['dur_min'].mean().rename('duration_mean').reset_index().sort_values('hour')

trips_by_hour.hvplot.bar('hour', 'count', title="Trip Starts per Hour", yformatter="%0.0f") + avg_duration_by_hour.hvplot.bar('hour', 'duration_mean', title="Trip Duration per Hour", yformatter="%0.0f") 

In [None]:
trips_by_hour_300 = df[df['dur_min'] <= 300].groupby('hour').size().rename('count').reset_index().sort_values('hour')

avg_duration_by_hour_300 = df[df['dur_min'] <= 300].groupby('hour')['dur_min'].mean().rename('duration_mean').reset_index().sort_values('hour')

trips_by_hour_300.hvplot.bar('hour', 'count', title="Trip Starts per Hour (under 300min)", yformatter="%0.0f") + avg_duration_by_hour_300.hvplot.bar('hour', 'duration_mean', title="Trip Duration per Hour (under 300min)", yformatter="%0.0f") 

In [None]:
# group data by day_of_week and hour, count the number of rows in each group
heatmap_data_dw = df.groupby(['day_of_week','hour']).size().rename('count').reset_index()
heatmap_data_dw.hvplot.heatmap(x='day_of_week', y='hour', C='count', title="Trips by Hour and Day of Week") 

In [None]:
# group data by month, day_of_week and hour, count the number of rows in each group
heatmap_data_dwm = df.groupby(['month','day_of_week','hour']).size().rename('count').reset_index()
heatmap_data_dwm.hvplot.heatmap(x='day_of_week', y='hour', C='count', groupby='month', widget_location='left_top', title="Trips by Hour and Day of Week per Month")

In [None]:
df.hvplot.hexbin(x='start_lng', y='start_lat', cmap=colorcet.bgy, geo=True, tiles="OSM", logz=False, gridsize=150, width=700, height=600, title="Trip Start Counts") + df.hvplot.hexbin(x='end_lng', y='end_lat', geo=True, cmap=colorcet.bgy, tiles="OSM", logz=False, gridsize=150, width=700, height=600, title="Trip End Counts")

And if you look at their system map (https://account.divvybikes.com/map), the lat/lng values seem to be accurate.
But this seems like a lot of start / stop places, so let's see if we can identify stations.

In [None]:
df['start_station_name'].unique()


In [None]:
df['start_lat'].round(4).unique()


There are obviously many more starting points than stations, so bikes must not have to start and stop at a station. We will have to find a way to bin the start and stop locations into a reasonable number.

In [None]:
bike_type = df.groupby('rideable_type').size().rename('count').reset_index()
bike_type.hvplot.bar(x='rideable_type', y='count', title='Total Bike Types', yformatter='%0.0f')

## cuSpatial

In [None]:
# Create a cuSpatial GeoSeries from the latitude and longitude columns
start_points = cuspatial.GeoSeries.from_points_xy(df[['start_lng','start_lat']].interleave_columns().astype("float64"))
end_points = cuspatial.GeoSeries.from_points_xy(df[['end_lng','end_lat']].interleave_columns().astype("float64"))


In [None]:
distances_in_km = cuspatial.haversine_distance(start_points, end_points)
distances_in_km
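
The haversine formula gives the great-circle distance between two longitude/latitude points. A minimal pure-Python version is sketched below as a CPU reference for spot-checking individual trips. The Earth radius constant is an assumption of this sketch, so results may differ slightly from cuSpatial's output, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant for this sketch

def haversine_km(lng1, lat1, lng2, lat2):
    """Great-circle distance in kilometers between two lng/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# One degree of longitude along the equator
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)
print(one_degree)
```

A trip that starts and ends at the same point yields a distance of zero, which is how round trips show up in the data.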

In [None]:
# add the distances back into the dataframe, rounding values to make it more obvious if a trip stopped at the same place it started
dist_m = cudf.Series(distances_in_km).values * 1000
df['dist_m'] = dist_m.round().astype('int32')
df

In [None]:
df.hvplot.hist(y='dist_m', by='rideable_type', bins=80, title="Trip Distance By Type", yformatter="%0.0f") + df[df['dist_m'] > 0].hvplot.hist(y='dist_m', by='rideable_type', bins=80, title="Trip Distance By Type (W/O Returns)", yformatter="%0.0f")

### cuxfilter Crossfilter

In [None]:
# FIX-NOTE: adding extension here explicitly RELOADS bokeh and all plots will work
hvplot.extension('bokeh')

In [None]:
# Specify the charts and widgets to use with the selected columns of data and string maps
cux_df = cuxfilter.DataFrame.from_dataframe(df)

charts = [

    cuxfilter.charts.bar('dist_m', data_points=20 , title='Distance in M'),
    cuxfilter.charts.bar('dur_min', data_points=20 , title='Duration in Min'),
    cuxfilter.charts.bar('day_of_week', title='Day of Week'),
    cuxfilter.charts.bar('hour', title='Trips per Hour'),
    cuxfilter.charts.bar('day', title='Trips per Day'),
    cuxfilter.charts.bar('month', title='Trips per Month')
]


widgets = [
            cuxfilter.charts.multi_select('year')
]

# Generate the dashboard and select a layout
d = cux_df.dashboard(charts, sidebar=widgets, layout=cuxfilter.layouts.two_by_three, theme=cuxfilter.themes.rapids, title='Bike Trips Dashboard')

# Update the yaxis ticker to an easily readable format
for i in charts:
    if hasattr(i.chart, 'yaxis'):
        i.chart.yaxis.formatter = NumeralTickFormatter(format="0,0")


# d.show() opens a separate dashboard, await d.preview() generates an inline image preview, d.app() shows the app inline
d.show()


### hvPlot + Datashader + Panel

In [None]:
start_elec = df[df['rideable_type'] == 'electric_bike'].hvplot.points(x='start_lng', y='start_lat', geo=True, tiles="CartoDark", width=700, height=500, datashade=True, dynspread=True, title="Electric Starts") 
end_elec = df[df['rideable_type'] == 'electric_bike'].hvplot.points(x='end_lng', y='end_lat', geo=True, tiles="CartoDark", width=700, height=500, datashade=True, dynspread=True, title="Electric Stops") 
elec_row = pn.Row(start_elec, end_elec)
elec_row 

In [None]:

# Get station count from the named end stations
end_stations = df[df['end_station_name'] != 'none']
unique_stations = end_stations.drop_duplicates(subset='end_station_name')


In [None]:
# Overlay station points with bike points, using end points since they are more dispersed
raster = df.hvplot.points(x='end_lng', y='end_lat', geo=True, tiles='CartoDark', projection=cartopy.crs.GOOGLE_MERCATOR, hover=True, width=700, height=500, rasterize=True) #note: raster does not aggregate with datashader
station_points = unique_stations.hvplot.points(x='end_lng', y='end_lat', geo=True, tiles=False, projection=cartopy.crs.GOOGLE_MERCATOR, hover=False, width=700, height=500, color='red', alpha=0.5)

raster * station_points

## cuML + Kmeans

In [None]:
# combine all lat values 
lat_df = cudf.DataFrame()
lat_df['lat'] = cudf.concat([df['start_lat'], df['end_lat']], ignore_index=True)

# combine all lng values
lng_df = cudf.DataFrame()
lng_df['lng'] = cudf.concat([df['start_lng'], df['end_lng']], ignore_index=True)

# create combined lat lng 
combined_lat_lng_df = cudf.concat([lat_df, lng_df], axis=1)

combined_lat_lng_df


In [None]:
# Perform k-means clustering, from the approximate station count with a bit of headroom
kmeans = cuml.cluster.KMeans(n_clusters=unique_stations.shape[0]+20, oversampling_factor=1.5, max_iter=200)

# Note: This can take a minute on larger datasets
kmeans.fit(combined_lat_lng_df)

# Get the cluster labels
cluster_labels = kmeans.labels_

cluster_labels

In [None]:
# Get the edge list from the computed clusters
half_length = len(cluster_labels) // 2

edge_list_df = cudf.DataFrame({
    'src': cluster_labels[:half_length].reset_index(drop=True).astype('int16'),
    'dst': cluster_labels[half_length:].reset_index(drop=True).astype('int16')
})

edge_list_df
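
The split works because the clustering input stacked every start point first and every end point second, so label `i` in the first half and label `i` in the second half belong to the same trip. A small pure-Python sketch of that pairing, with made-up labels and a hypothetical function name:

```python
def labels_to_edges(labels):
    """Pair the first half of the labels (trip starts) with the second half (trip ends)."""
    if len(labels) % 2 != 0:
        raise ValueError('expected one start label and one end label per trip')
    half = len(labels) // 2
    return list(zip(labels[:half], labels[half:]))

# Three made-up trips: start clusters 0, 1, 2 followed by end clusters 1, 1, 0
labels = [0, 1, 2, 1, 1, 0]
edges = labels_to_edges(labels)
print(edges)
```

An edge whose source and destination match, such as `(1, 1)`, is a trip that stayed within one cluster.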

In [None]:
# Get the cluster centers or Nodes
node_centers_df = kmeans.cluster_centers_

node_centers_df = node_centers_df.rename(columns={0: 'node_lat', 1: 'node_lng'}).astype('float32')

# Save node centers
node_centers_df.to_parquet('./data/kmean_node_center.parquet') 
node_centers_df

In [None]:
# combine
df = cudf.concat([df, edge_list_df], axis=1)


In [None]:
# Verify the clustering worked by overlaying with the previous datashader chart
cluster_map = node_centers_df.hvplot.points(x='node_lng', y='node_lat', geo=True, tiles=False, projection=cartopy.crs.GOOGLE_MERCATOR, hover=False, width=800, height=600, color='blue', alpha=0.8)

raster * cluster_map * station_points  #Note: only specify geo=True, once otherwise the map tiles overlay the data of the second chart

# Note: looks pretty good, within about a 2-block tolerance; purple (blue and red overlapping) = good

In [None]:
# save to file
df.to_parquet('./data/bike_df_full.parquet') 

## Clean up and Save

In [None]:
# Starting to get a bit messy, let's clean it up
df_dur = df['dur_min'].astype('int16')

df_geo = df[['start_lat','start_lng','end_lat','end_lng']].astype('float32')
df = df.drop(['level_0','start_lat','start_lng','end_lat','end_lng','dur_min'], axis=1)
df = cudf.concat([df,df_dur,df_geo], axis=1)

df.to_parquet('./data/bike_df_clean.parquet') 

# check dtypes
df.dtypes

### cuxfilter GeoSpatial

In [None]:
from pyproj import Proj, Transformer

df = cudf.read_parquet('./data/bike_df_clean.parquet') 

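# Note: don't run the transform twice on the same dataframe
transform_4326_to_3857 = Transformer.from_crs('epsg:4326', 'epsg:3857')

# Without always_xy=True the input order is (lat, lng) and the output order is (x, y),
# so after this step 'end_lat' holds the x (easting) value and 'end_lng' the y (northing) value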
df['end_lat'], df['end_lng'] = transform_4326_to_3857.transform(df['end_lat'].values_host, df['end_lng'].values_host)
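
For reference, the EPSG:4326 to EPSG:3857 (Web Mercator) conversion can be written out by hand with the spherical Mercator formulas. The sketch below is a CPU check under that assumption, not a replacement for pyproj; the function name and the approximate Chicago coordinates are made up for illustration.

```python
import math

WGS84_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (WGS84 semi-major axis)

def lng_lat_to_web_mercator(lng, lat):
    """Convert lng/lat in degrees to Web Mercator (x, y) in meters."""
    x = WGS84_RADIUS_M * math.radians(lng)
    y = WGS84_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y

x, y = lng_lat_to_web_mercator(-87.63, 41.88)  # roughly downtown Chicago
print(x, y)
```

Note that the result is ordered x (easting) first and y (northing) second, which is worth keeping in mind when assigning the values back to columns.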

In [None]:

G = cugraph.Graph() 
G.from_cudf_edgelist(df, source='src', destination='dst')
edges = G.edges()


In [None]:
ITERATIONS=600
THETA=5.0
OPTIMIZE=True

# Using the previously created edge list, we calculate the FA2 layout positions here
trips_FA_df = cugraph.layout.force_atlas2(G, 
                    max_iter=ITERATIONS,
                    strong_gravity_mode=True,
                    outbound_attraction_distribution=False,
                    lin_log_mode=False,
                    barnes_hut_optimize=OPTIMIZE, 
                    barnes_hut_theta=THETA,
                    verbose=False)

trips_FA_df

In [None]:
graph_df = trips_FA_df.merge(
                df,
                left_on='vertex',
                right_on='dst',
                suffixes=('', '_original'))

graph_df

In [None]:
# FIX-NOTE: adding extension here explicitly RELOADS bokeh and all plots will work
hvplot.extension('bokeh')

In [None]:
# Specifying a graph chart type will use Datashader and its required parameters
cx_df = cuxfilter.DataFrame.load_graph((graph_df, edges))

graph = cuxfilter.charts.graph(
      edge_source='src', 
      edge_target='dst',
      node_x='x',
      node_y='y',
      unselected_alpha=0.2,
      edge_color_palette=['gray', 'black'],
      node_pixel_shade_type='linear',
      edge_transparency=0.2, 
      title='ForceAtlas2 Trip Graph'
  )

scatter = cuxfilter.charts.scatter(
        x='end_lat',
        y='end_lng',
        unselected_alpha=0.1,
        pixel_shade_type='eq_hist',
        tile_provider='CartoDark', 
        title='Trip Endpoints'
    )

bar1 = cuxfilter.charts.bar('dur_min', data_points=20 , title='Duration in Min')
bar2 = cuxfilter.charts.bar('hour', title='Trips per Hour')
bar3 = cuxfilter.charts.bar('day_of_week', title='Trips per Day of Week')
bar4 = cuxfilter.charts.bar('month', title='Trips per Month')
table1 = cuxfilter.charts.view_dataframe(['start_station_name','end_station_name'], drop_duplicates=True)

layout_array = [[1,1,1,2,2],
                [3,4,5,6,7]]

# Generate the dashboard, select a layout and theme
d = cx_df.dashboard([graph,scatter,bar1,bar2,bar3,bar4,table1], layout_array=layout_array, theme=cuxfilter.themes.rapids, title='Divvy Bike Trip Clustering')

# Update the yaxis ticker to an easily readable format
for i in [bar1, bar2, bar3, bar4]:
    if hasattr(i.chart, 'yaxis'):
        i.chart.yaxis.formatter = NumeralTickFormatter(format="0,0")
        
d.show()


### Plotly Dash

In [None]:
# Setup plotly df
df = cudf.read_parquet('./data/bike_df_clean.parquet')
node_centers_df = cudf.read_parquet('./data/kmean_node_center.parquet') 

plotly_df = df[['rideable_type','member_casual','year','month','day','hour','day_of_week','src','dst']]

plotly_df

In [None]:
app = JupyterDash(__name__)

app.layout = html.Div([
        html.Div([
            html.H2("Divvy Bikeshare Ranking of Destinations"),
            html.H4("Total Selected Trips:"),
            html.H2(['{:,}'.format(plotly_df.shape[0])]),
            html.H4("Year:"),
            dcc.Dropdown(id = 'year', options=sorted(plotly_df['year'].unique().to_pandas()), value='', clearable = True),
            html.H4("Month:"),
            dcc.Dropdown(id = 'month', options=sorted(plotly_df['month'].unique().to_pandas()), value='', clearable = True),
            html.H4("Bike Type:"),
            dcc.Dropdown(id = 'bikes', options=sorted(plotly_df['rideable_type'].unique().to_pandas()), value='', clearable = True),
            html.H4("User Type:"),
            dcc.Dropdown(id = 'user', options=sorted(plotly_df['member_casual'].unique().to_pandas()), value='', clearable = True)
            ],
            style = {
                'z-index' : '99',
                'font-family':'sans-serif',
                'position': 'absolute',
                'width': '15vw',
                'height': 'calc(100vh - 3rem)',
                'padding': '1em',
                'background-color': '#3a97d3',
                'color': '#f1f1f1',
                'border-radius': '0.5rem',
                'box-shadow': '5px 0px 3px 0px rgba(0,0,0,0.3)'
            }
        ),
        html.Div([
            html.Div([
                html.H3("Area Importance PageRank(Color) by Trips(Size)", style={'font-family':'sans-serif','color':'#3a97d3'}),
                dcc.Graph(id = 'pagerank_plot', config = {'responsive': True, 'displaylogo': False, 'modeBarButtonsToRemove': ['select2d', 'lasso2d', 'toImage']})
            ],
            style = {
                'display': 'inline-block',
                'width': '70vw',
                'vertical-align':'top'
            }),
            html.Div([
                html.H3("Trips Per Day of Week", style={'font-family':'sans-serif','color':'#3a97d3'}),
                dcc.Graph(id = 'dow_plot', config = {'responsive': True, 'displaylogo': False, 'modeBarButtonsToRemove': ['zoom2d', 'zoomIn2d', 'zoomOut2d','toImage']})
            ],
            style = {
                'display': 'inline-block',
                'width': '35vw',
                'vertical-align': 'bottom'
            }),
            html.Div([
                html.H3("Trips Per Hour", style={'font-family':'sans-serif','color':'#3a97d3'}),
                dcc.Graph(id = 'hour_plot',config = {'responsive': True, 'displaylogo': False, 'modeBarButtonsToRemove': ['zoom2d', 'zoomIn2d', 'zoomOut2d','toImage']})
            ],
            style = {
                'display': 'inline-block',
                'width': '35vw',
                'vertical-align': 'bottom'
            })
            ],
            style = {
                'width': '70vw',
                'margin-left': '20vw',
                'padding-top': '1em',
                'display': 'inline-block',
                'vertical-align': 'top',
            })
        ]
)


In [None]:
# Define callback to update graph, id ties plot code to layout
@app.callback(
       [
          Output('pagerank_plot', 'figure'),
          Output('dow_plot', 'figure'),
          Output('hour_plot', 'figure')
        ],
        [
         Input('year', 'value'),
         Input('month', 'value'), 
         Input('bikes', 'value'), 
         Input('user', 'value'), 
         Input('dow_plot', 'selectedData'),
         Input('hour_plot', 'selectedData')
       ]
)
def update_figure(year, month, bikes, user, dow_data, hour_data):
    
    data = plotly_df

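    if hour_data is not None:  
        data = data[(data['hour'] >= hour_data['range']['x'][0]) & (data['hour'] <= math.floor(hour_data['range']['x'][1]))]
        
    if dow_data is not None: 
        data = data[(data['day_of_week'] >= dow_data['range']['x'][0]) & (data['day_of_week'] <= math.floor(dow_data['range']['x'][1]))]
    
    # Dropdowns are '' initially and None once cleared, so skip the filter in both cases
    if year not in (None, ''):
        data = data[data['year'] == year]

    if month not in (None, ''):
        data = data[data['month'] == month]

    if bikes not in (None, ''):
        data = data[data['rideable_type'] == bikes]

    if user not in (None, ''):
        data = data[data['member_casual'] == user]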
        
    hour_plot = get_hour_chart(data)
    dow_plot = get_dow_chart(data)
    pagerank_plot = get_pagerank_plot(data)


    return pagerank_plot, dow_plot, hour_plot
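
The callback reads the selected x-range from each bar chart's `selectedData`. A small sketch of turning that payload into inclusive integer bounds is shown below. The helper name is hypothetical, and the example dictionary is a made-up payload shaped like a box selection, so treat the exact keys as an assumption to verify against what your Dash version sends.

```python
import math

def selected_int_range(selected_data):
    """Return inclusive integer (low, high) bounds from a box selection, or None."""
    if not selected_data or 'range' not in selected_data:
        return None
    low, high = sorted(selected_data['range']['x'])
    return math.ceil(low), math.floor(high)

# Hypothetical payload shaped like a horizontal box selection on a bar chart
example = {'points': [], 'range': {'x': [6.6, 9.4], 'y': [0, 1000]}}
bounds = selected_int_range(example)
print(bounds)
```

Returning `None` when nothing is selected lets the caller skip the filter without a separate key check.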

In [None]:
# Live calc pageRank 
def calculate_page_rank(data):
    G = cugraph.Graph()
    G.from_cudf_edgelist(data, source='src', destination='dst', store_transposed=True)
    data_rank = cugraph.pagerank(G)
    return data_rank

# Geospatial bubble chart based on Page Rank and Trip data
def get_pagerank_plot(data):
    data_rank = calculate_page_rank(data)
    trips = data.groupby('dst').agg({'dst': 'size'}).rename(columns={'dst': 'arrivals'}).reset_index()
    trips = trips.merge(data_rank, left_on='dst', right_on='vertex').drop(columns=['vertex'])
    rank_chart = trips.merge(node_centers_df, left_on='dst', right_index=True).reset_index(drop=True)
    g = px.scatter_mapbox(rank_chart.to_pandas(), lat="node_lat", lon="node_lng", color="pagerank", size='arrivals',
                          hover_data=["pagerank","dst"], mapbox_style="carto-positron",
                          color_continuous_scale=px.colors.sequential.haline, size_max=20, zoom=10, height=800
                         )
    g.layout['uirevision'] = True
    return g

# Bar chart based on day of week
def get_dow_chart(data):
    dow = data.groupby('day_of_week').size().rename("count").reset_index()
    g = px.bar(dow.to_pandas(), 
               x="day_of_week", y='count', template=dict(layout={'selectdirection': 'h',}), height=300
              )
    g.layout['dragmode']='select'
    g.layout['uirevision'] = True
    return g

# Bar chart based on hour of day
def get_hour_chart(data):
    hour = data.groupby('hour').size().rename("count").reset_index()
    g = px.bar(hour.to_pandas(), 
               x="hour", y='count', template=dict(layout={'selectdirection': 'h',}), height=300
              )
    g.layout['dragmode']='select'
    g.layout['uirevision'] = True
    return g
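
To see what the PageRank score represents, here is a minimal pure-Python power-iteration sketch on a tiny made-up edge list. It treats the graph as directed and counts repeated edges as repeated links; both are simplifying assumptions of this sketch, and cuGraph's graph construction and defaults may differ, so the numbers are for intuition only.

```python
def pagerank(edges, alpha=0.85, max_iter=100, tol=1e-10):
    """Power-iteration PageRank on a directed edge list of (src, dst) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing links is spread evenly over all nodes
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {n: (1.0 - alpha + alpha * dangling) / len(nodes) for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += alpha * rank[src] / len(out_links[src])
        change = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank

# Made-up trips between three clusters: 0 and 1 both feed into 2, and 2 feeds back to 0
ranks = pagerank([(0, 2), (1, 2), (2, 0)])
print(ranks)
```

Cluster 2 receives trips from both other clusters, so it ends up with the highest score, while cluster 1, which has no arrivals, gets the lowest.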

In [None]:
# Find all the different ways you can deploy your JupyterDash App, or even without using Jupyter https://dash.plotly.com/workspaces/using-dash-in-jupyter-and-workspaces
if __name__ == '__main__':
    app.run_server(debug=True)


## To Do
Complete:
- Add more comments
- Fix plotly callback
- Add conclusion

Issues:
- Jupyter lab needs to be 3.6.4 for hvPlot
- FIX: use latest cuxfilter so can filter by string category (TBD)
- FIX: cuxfilter / hvplot bokeh.js assets bug (TBD)
- Include env yaml

## Conclusion

Text