# Big Data Dashboarding
---

In this notebook we are going to put our visualizations together into a dashboard.

We will also introduce a new tool from the HoloViz suite:

<img src="images/panel_logo.png" width="100">

Panel is a high-level app and dashboarding solution for Python.

## Let's reconnect to our Dask cluster and load our dataset

In [None]:
import dask_gateway
import dask.dataframe as dd

In [None]:
gateway = dask_gateway.Gateway()

In [None]:
if len(running_clusters := gateway.list_clusters())>0:
    cluster = gateway.connect(running_clusters[0].name)
else:
    cluster = gateway.new_cluster(conda_environment="pycon2023/pycon2023-tutorial", profile="Medium Worker")
    cluster.adapt(1,10)

In [None]:
cluster

In [None]:
client = cluster.get_client()
client

In [None]:
columns = [
    'YEAR', 'MONTH', 'DAY_OF_MONTH', 'DAY_OF_WEEK', 'FL_DATE', 'OP_CARRIER', 
    'TAIL_NUM', 'OP_CARRIER_FL_NUM', 'ORIGIN', 'DEST', 'CRS_DEP_TIME', 
    'DEP_TIME', 'DEP_DELAY', 'ARR_TIME', 'ARR_DELAY', 'CANCELLED', 
    'CANCELLATION_CODE', 'DIVERTED', 'AIR_TIME', 'FLIGHTS', 'DISTANCE',
    'CARRIER_DELAY', 'WEATHER_DELAY', 'NAS_DELAY', 'SECURITY_DELAY', 
    'LATE_AIRCRAFT_DELAY', 'DIV_ARR_DELAY'
]

In [None]:
flights = dd.read_parquet(
    "gcs://quansight-datasets/airline-ontime-performance/sorted/full_dataset.parquet",
    columns=columns
)

## Let's import our viz tools

In [None]:
import panel as pn
import pandas as pd

pn.extension()

In [None]:
import hvplot.dask
import hvplot.pandas

## Let's read in the lat/lon of all the airports in the BTS database

In [None]:
airports = pd.read_csv("prep/airports.csv", 
                        usecols=["AIRPORT", "DISPLAY_AIRPORT_NAME", "LATITUDE", "LONGITUDE"], 
                        dtype={
                            "AIRPORT": "string", 
                            "DISPLAY_AIRPORT_NAME": "string", 
                            "LATITUDE":"float", 
                            "LONGITUDE":"float",
                        }
                      ).set_index('AIRPORT')

# drop duplicates keeping the last entry
airports = airports[~airports.index.duplicated(keep='last')]

In [None]:
# plotting the map of airports
airports.hvplot.points('LONGITUDE', 'LATITUDE',  geo=True, 
                       color='red', alpha=0.2, hover_cols=['AIRPORT'],
                       tiles='CartoLight')

## Panel has 3 methods for building out interactive dashboards

- hvPlot `.interactive` turns any of your DataFrame processing pipelines into a dashboard (great if you want to explore a dataset!);
- Panel `.bind` binds your widgets with your interactive plot (great if you want to build an arbitrary app!);
- Param encapsulates your dashboard as self-contained classes (great if you want to build a complex codebase supporting both GUI and non-GUI usage).

For more details, see [3 ways to build a panel visualization dashboard](https://towardsdatascience.com/3-ways-to-build-a-panel-visualization-dashboard-6e14148f529d).

Below we will use the `hvplot.interactive` method.

First, let's build a small data pipeline on our Dask DataFrame, using the familiar pandas-style API:

In [None]:
pipeline = (
    flights[
        (flights['FL_DATE'] > "2020") &
        (flights['FL_DATE'] <= "2021")
    ]
    .groupby('DAY_OF_WEEK')["ARR_DELAY"].agg(how="mean")
    .rename(columns={"how": "ARR_DELAY - mean"})
)

In [None]:
pipeline
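
To see what this pipeline computes without touching the cluster, here is a small sketch of the same steps on a made-up pandas DataFrame. The data and the names `toy_flights`, `in_2020`, and `toy_pipeline` are hypothetical, and the date bounds are written as explicit timestamps on the assumption that the string `"2020"` is interpreted as 2020-01-01.

```python
import pandas as pd

# Made-up rows; DAY_OF_WEEK follows the convention 1 = Monday, 2 = Tuesday
toy_flights = pd.DataFrame({
    "FL_DATE": pd.to_datetime(
        ["2019-12-30", "2020-03-02", "2020-03-09", "2020-03-03", "2021-06-01"]
    ),
    "DAY_OF_WEEK": [1, 1, 1, 2, 2],
    "ARR_DELAY": [50.0, 10.0, 20.0, 5.0, 99.0],
})

# Keep only flights inside the date window
in_2020 = (
    (toy_flights["FL_DATE"] > pd.Timestamp("2020-01-01"))
    & (toy_flights["FL_DATE"] <= pd.Timestamp("2021-01-01"))
)

toy_pipeline = (
    toy_flights[in_2020]
    .groupby("DAY_OF_WEEK")["ARR_DELAY"]
    .agg(how="mean")  # named aggregation: the result column is called "how"
    .rename(columns={"how": "ARR_DELAY - mean"})
)
print(toy_pipeline)
```

The first and last rows fall outside the window, so only three flights contribute to the per-weekday means.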

## Moving from the static pipeline to a dynamic version

In the data pipeline above, let's use variables to represent the
quantities we want to select in our dashboard:

- `daterange` - start and end dates to filter by
- `groupby` - variable we wish to groupby
- `field` - data field we wish to plot
- `method` - statistic we wish to calculate (min, max, mean etc)

### Let's pick some Panel widgets for each of these

https://panel.holoviz.org/reference/index.html#widgets 

In [None]:
# daterange
import datetime as dt

daterange = pn.widgets.DateRangeSlider(
    name='Date Range Slider',
    start=dt.datetime(2003, 1, 1), end=dt.datetime(2022, 12, 31),
    value=(dt.datetime(2022, 1, 1), dt.datetime(2022, 12, 31)),
    step=24*3600*2*1000,
    bar_color = "green",
    width=800
)

In [None]:
daterange

In [None]:
daterange.value

In [None]:
groupby = pn.widgets.RadioButtonGroup(
    name='Period', 
    options=['YEAR', 'MONTH', 'DAY_OF_MONTH', 'OP_CARRIER'],
    value='MONTH',
)
groupby

In [None]:
groupby.value

## Choose an appropriate widget for `field` and `method`

https://panel.holoviz.org/reference/index.html#widgets

In [None]:
# your code here

In [None]:
method = pn.widgets.Select(
    name='Method', 
    options=['min', 'max', 'mean', 'count'],
    value='mean',
)
method

In [None]:
field = pn.widgets.RadioBoxGroup(
    name='Field', 
    options=['DEP_DELAY', 'ARR_DELAY'],
)
field

## Now let's put them all together using `hvplot.interactive`

### First we make the DataFrame into an interactive DataFrame

In [None]:
iflights = flights.interactive()

### Then we make an interactive pipeline using our widgets as variables

Our pipeline code from before:

```python
pipeline = (
    flights[
        (flights['FL_DATE'] > "2020") &
        (flights['FL_DATE'] <= "2021")
    ]
    .groupby('DAY_OF_WEEK')["ARR_DELAY"].agg(how="mean")
)
```

In [None]:
# Combine pipeline and widgets
ipipeline = (
    iflights[
        # Use the slider's reactive parameters so the filter updates when it moves
        (iflights['FL_DATE'] > daterange.param.value_start) &
        (iflights['FL_DATE'] <= daterange.param.value_end)
    ]
    .groupby(groupby)[field]
    .agg(how=method)
    .rename(columns={"how": f"{field} - {method}"})
)

In [None]:
ipipeline

### Now that we have the interactive pipeline object, we can use it in several ways

In [None]:
data_plot = ipipeline.hvplot()
data_plot

### Let's also make an interactive version of our airport map plot, but color the airports based on data values

In [None]:
flight_delays = (
    iflights[
        # Use the slider's reactive parameters so the filter updates when it moves
        (iflights['FL_DATE'] > daterange.param.value_start) &
        (iflights['FL_DATE'] <= daterange.param.value_end)
    ]
    .groupby('ORIGIN')[field]
    .agg(how=method)
    .join(airports)
    .rename(columns={"how": f"{field} - {method}"})
)

In [None]:
flight_delays

In [None]:
map_plot = flight_delays.hvplot.points('LONGITUDE', 'LATITUDE', geo=True, color=f"{field} - {method}", 
                 hover_cols=['ORIGIN', f"{field} - {method}"],
                 xlim=(-180, -30), ylim=(-20, 75), 
                 cmap='viridis', tiles='CartoLight')

# uncomment below to see the map plot (this will take a bit of time)
# map_plot

## Let's rearrange the components of our dashboard using Panel's simple Row/Column grid system

Panel also has a customizable template system that lets you build apps with a header, sidebar, main area, and popup windows. For details, see https://panel.holoviz.org/user_guide/Templates.html

In [None]:
pn.Column(
    daterange,
    pn.Row(field, pn.Column(groupby, method)),
    data_plot.panel(),
    map_plot.panel(),
)