# Intro to Dash: Visualizing workout data with reactive plots

Dash is a low-code framework for rapidly building data apps in Python. It is built on top of Plotly.js and React, with Flask as the web server, which makes it a good fit for data-driven web apps.

In [18]:
import pandas as pd

from csv_generator import generate_csv

generate_csv()

# Create a dataframe of workouts from the Peloton workouts CSV, which can be downloaded from
# https://members.onepeloton.com/profile/workouts
df = pd.read_csv('workouts.csv')


## Step 1: Import our dependencies

Dash uses Plotly for its charts. We use dash-bootstrap-components (dbc) for layout and styling, dash core components (dcc) for interactive components such as dropdowns and graphs, and dash html components for plain HTML elements. JupyterDash lets us develop the app inside a Jupyter notebook. Finally, we import Input, Output, and State from dash.dependencies; these are used to wire up callbacks.

**Useful documentation links:**

- Plotly: [make_subplots](https://plotly.com/python-api-reference/generated/plotly.subplots.make_subplots.html)
- Dash Core Components: [dash.dcc](https://dash.plotly.com/dash-core-components)
- Dash HTML Components: [dash.html](https://dash.plotly.com/dash-html-components)
- Jupyter Dash: [jupyter dash](https://dash.plotly.com/jupyter-dash)

In [2]:
import os
import plotly.graph_objects as go
import plotly.express as px
import dash_bootstrap_components as dbc

from plotly.subplots import make_subplots
from dash import dcc, html
from jupyter_dash import JupyterDash
from dash.dependencies import Input, Output, State

## Step 2: Filter and group data

In [19]:
ts = 'Workout Timestamp'
def group_data_by_timestamp(start_date=df[ts].min(), end_date=df[ts].max(), freq: str = 'W') -> pd.DataFrame:
    """
    Filters dataframe by date range and groups by frequency of week by default.
    Returns a dataframe with Calories Burned and Total Output summed by the chosen frequency.

    Parameters
    ----------
        start_date : date The start date of the range
        end_date : date The end date of the range
        freq : str The frequency of the group. Default is weekly.

    Returns
    -------
        df : pd.DataFrame The dataframe with the summed columns.

    """
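    # Work on a copy so the assignments below never write into a view of the global df.
    dff = df.loc[(start_date <= df[ts]) & (df[ts] <= end_date)].copy()

    # Strip the timezone label, e.g. "(EST)" or "(-05)"; the raw string keeps the regex escapes valid.
    for tz in ['EST', 'EDT', '-04', '-05']:
        dff[ts] = dff[ts].str.replace(rf"\({tz}\)", '', regex=True)

    # Plain column assignment lets the column take the datetime dtype that pd.Grouper needs.
    dff[ts] = pd.to_datetime(dff[ts].str.strip(), format='%Y-%m-%d %H:%M', errors='coerce')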
    return dff.groupby(pd.Grouper(key=ts, freq=freq))[['Calories Burned', 'Total Output']].agg('sum').reset_index()

group_data_by_timestamp()

|     | Workout Timestamp | Calories Burned | Total Output |
|-----|-------------------|-----------------|--------------|
| 0   | 2021-01-03        | 572.82          | 409.52       |
| 1   | 2021-01-10        | 0.00            | 0.00         |
| 2   | 2021-01-17        | 1355.93         | 1276.64      |
| 3   | 2021-01-24        | 952.37          | 464.58       |
| 4   | 2021-01-31        | 965.91          | 547.93       |
| ... | ...               | ...             | ...          |
| 58  | 2022-02-13        | 1821.00         | 573.16       |
| 59  | 2022-02-20        | 770.00          | 690.96       |
| 60  | 2022-02-27        | 1435.06         | 1094.66      |
| 61  | 2022-03-06        | 1338.59         | 375.77       |
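The weekly grouping can be easier to follow on a tiny example. The sketch below uses three made-up rows (not taken from the real export) in the same timestamp format, strips the timezone label, and sums calories per week with pd.Grouper. It only illustrates the technique and is not a replacement for the function above.

```python
import pandas as pd

# Made-up rows that mimic the "Workout Timestamp" format, including the
# trailing timezone label that has to be removed before parsing.
raw = pd.DataFrame({
    "Workout Timestamp": [
        "2021-01-04 07:30 (EST)",
        "2021-01-06 18:00 (EST)",
        "2021-01-12 06:45 (EST)",
    ],
    "Calories Burned": [300.0, 250.0, 400.0],
})

# Drop the "(EST)" style suffix, then parse what is left as a datetime.
cleaned = raw["Workout Timestamp"].str.replace(r"\s*\([^)]*\)$", "", regex=True)
raw["Workout Timestamp"] = pd.to_datetime(cleaned, format="%Y-%m-%d %H:%M")

# freq="W" buckets rows into weeks ending on Sunday, so each group is
# labelled with the Sunday that closes its week.
weekly = (
    raw.groupby(pd.Grouper(key="Workout Timestamp", freq="W"))["Calories Burned"]
    .sum()
    .reset_index()
)
print(weekly)
```

The two workouts on 4 and 6 January land in the week ending Sunday 2021-01-10, and the 12 January workout lands in the week ending 2021-01-17. This is why every date in the output above is a Sunday.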


## Step 3: Create our plot

Now that we have our data grouped the way we'd like to see it, let's plot it using Plotly. Since Dash is built on top of Plotly, we can reuse this figure in our web app later.


In [20]:
def create_timeseries_figure(df: pd.DataFrame, frequency = "W") -> go.Figure:
    """
    Creates a timeseries figure with two traces on a shared x-axis. Calories Burned
    is plotted against the primary y-axis and Total Output against the secondary
    y-axis, each summed per period (weekly by default).

    Parameters
    ----------
        df : pd.DataFrame The dataframe to be plotted.
        frequency : str The frequency of the group. Default is weekly.
    
    Returns
    -------
        go.Figure: The figure to be plotted.
    """
    switcher = {
        'D': 'Day',
        'W': 'Week',
        'M': 'Month'
    }
    freq = switcher.get(frequency, 'Week')
    line = make_subplots(specs=[[{"secondary_y": True}]])
    line.add_trace(go.Scatter(x=df['Workout Timestamp'], y=df['Calories Burned'],
                                marker=dict(size=10, color='MediumPurple'),
                                name='Total Calories'
                            ),
                            secondary_y=False
    )
    line.add_trace(go.Scatter(x=df['Workout Timestamp'], y=df['Total Output'],
                                marker=dict(size=10, color='MediumSeaGreen'),
                                name='Total Output'
                            ),
                    secondary_y=True
    )

    line.update_layout(
        title=f"Calories and Total Output per {freq}",
        title_x=0.5,
        yaxis_title="Calories Burned",
    )

    line.update_yaxes(title_text="Total Output", secondary_y=True)
    line.update_yaxes(title_text="Calories Burned", secondary_y=False)
    line.update_xaxes(title=f"{freq}")
    return line
create_timeseries_figure(group_data_by_timestamp())

## Step 4: Add our plot to our dashboard

We now have our data and our plot, so we are ready to start building our dashboard. Below we create the HTML layout that displays the plot.

Notice that we use dash html components for the layout, dash core components (dcc) for the interactive elements (date picker, radio buttons, and graph), and dash bootstrap components (dbc) for the card and styling.

![callback-graph](callback_diagram.png)

In [22]:
app = JupyterDash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP], suppress_callback_exceptions=True)

def create_calories_and_output_card(df: pd.DataFrame) -> dbc.Card:
    """
    Creates a dbc card with a title, DatePicker, RadioButton, and a timeseries figure.

    Parameters
    ----------
        df : pd.DataFrame The dataframe to be plotted.

    Returns
    -------
        dbc.Card: Card with header, datepicker, radio buttons and plot.
    """
    return dbc.Card([
        dbc.CardBody([
            html.Center(html.H1("Weekly Calorie and Output Breakdown", className='card-title')),
            html.Center(
                dcc.DatePickerRange(
                    id='date-range-picker-2',
                    min_date_allowed=df['Workout Timestamp'].min(),
                    max_date_allowed=df['Workout Timestamp'].max(),
                    initial_visible_month=df['Workout Timestamp'].min(),
                    start_date=df['Workout Timestamp'].min(),
                    end_date=df['Workout Timestamp'].max(),
                    style={
                        "margin-top": "1rem",
                        "margin-bottom": "1rem"
                    }
                )
            ),
            html.Center(dcc.RadioItems(
                options=[
                    {'label': 'Daily', 'value': 'D'},
                    {'label': 'Weekly', 'value': 'W'},
                    {'label': 'Monthly', 'value': 'M'},
                ],
                value='W',
                id='frequency-radio-2',
                labelStyle={'padding-right': '20px'}
            ),
            ),
            dcc.Graph(
                id='calories-output-graph',
            )
        ])
    ],
        outline=True,
        color='info',
        style={
            "width": "50rem",
            "margin-left": "10rem",
            "margin-bottom": "1rem"
        }
    )

@app.callback(
    Output('calories-output-graph', 'figure'),
    [   Input('date-range-picker-2', 'start_date'),
        Input('date-range-picker-2', 'end_date'),
        Input('frequency-radio-2', 'value')]
)
def update_weekly_calories_burned_chart(start_date, end_date, frequency):
    """
    Updates the weekly calories burned chart with the selected date range and frequency.
    """
    grouped_df = group_data_by_timestamp(start_date, end_date, frequency)
    return create_timeseries_figure(grouped_df, frequency)

app.layout = html.Div(
    children=[
        dbc.Row([create_calories_and_output_card(df)])
    ]
)

app.run_server(debug=True)

Dash app running on http://127.0.0.1:8050/



The 'environ['werkzeug.server.shutdown']' function is deprecated and will be removed in Werkzeug 2.1.



## Step 5: Let's add a header

In [23]:
titleCard = dbc.Card([
        dbc.CardBody([
            html.H1("Welcome to your workout dashboard, Jose!", className='card-title'),
            ])
        ],
        color='dark',
        inverse=True,
        style={
            "width": "55rem",
            "margin-left": "1rem",
            "margin-top": "1rem",
            "margin-bottom": "1rem"
        }
    )

app.layout = html.Div(
    children=[
        dbc.Row([
                html.Center(titleCard),
                ],
                justify="center",
                style={
                    'margin-left': '0.5rem'
                }
        ),
        dbc.Row([create_calories_and_output_card(df)])
    ]
)

app.run_server(debug=True)

Dash app running on http://127.0.0.1:8050/




## Step 6: Let's add a second plot

Note: the two cards now share a row, so make sure the card returned by create_calories_and_output_card has its width set to 50rem.

In [24]:
def get_fitness_discipline_chart(df: pd.DataFrame) -> go.Figure:
    """
    Returns a pie chart of the percentage of time spent per fitness discipline.

    Parameters
    ----------
        df : pd.DataFrame The dataframe to be plotted.

    Returns
    -------
        go.Figure: The pie chart to be plotted.
    """
    pie = px.pie(
        df,
        values="Length (minutes)",
        names="Fitness Discipline",
        title="Time Spent Per Fitness Discipline",
        hole=0.2,
    )
    pie.update_traces(textposition='inside', textinfo='percent+label')
    pie.update_layout(width=1600)
    pie.update_layout(title_x=0.5)
    return pie

get_fitness_discipline_chart(df)
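To check the numbers behind the slices, the same percentages can be computed directly with pandas. This is a minimal sketch on made-up rows that use the two columns the pie chart reads. It assumes the chart adds up the minutes of rows that share a discipline, which is how duplicate names in a pie chart are normally combined.

```python
import pandas as pd

# Made-up workouts using the two columns the pie chart reads.
workouts = pd.DataFrame({
    "Fitness Discipline": ["Cycling", "Cycling", "Strength", "Yoga"],
    "Length (minutes)": [30, 20, 30, 20],
})

# Sum the minutes per discipline, then divide by the overall total to get
# the percentage each slice would represent.
minutes = workouts.groupby("Fitness Discipline")["Length (minutes)"].sum()
share = (minutes / minutes.sum() * 100).round(1)
print(share)
```

Here Cycling accounts for 50 of 100 minutes, so it would take up half of the pie, with Strength at 30% and Yoga at 20%.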

In [25]:

def create_discipline_card(df: pd.DataFrame) -> dbc.Card:
    """
    Returns a dbc card with a title, body text and pie chart of 
    the percentage of time spent per fitness discipline.

    Parameters
    ----------
        df : pd.DataFrame The dataframe to be plotted.

    Returns
    -------
        dbc.Card: Card with header, body text and plot.
    
    """
    return dbc.Card([
        dbc.CardBody([
            html.Center(html.H1("Fitness Discipline Breakdown", className='card-title')),
            html.P("This chart shows the percentage of time spent in minutes for each fitness discipline.", className='card-body'),
            dcc.Graph(
                id='fitness-discipline-by-calories-chart',
                figure=get_fitness_discipline_chart(df)
            )
        ])
    ],
        color='info',
        outline=True,
        style={
            "width": "50rem",
            "margin-top": "1rem",
            "margin-left": "5rem",
            "margin-bottom": "1rem"
        }
    )

app.layout = html.Div(
    children=[
        dbc.Row([
                html.Center(titleCard),
                ],
                justify="center",
                style={
                    'margin-left': '0.5rem'
                }
        ),
        dbc.Row([
            create_calories_and_output_card(df),
            create_discipline_card(df)
        ])
    ]
)

app.run_server(debug=True)

Dash app running on http://127.0.0.1:8050/




## Step 7: Let's finish off our dashboard with a final plot on its own row

In [26]:
def get_instructors_by_discipline_chart(df: pd.DataFrame) -> go.Figure:
    """
    Returns a bar chart of the number of workouts per instructor per fitness discipline.

    Parameters
    ----------
        df : pd.DataFrame The dataframe to be plotted.

    Returns
    -------
        go.Figure: The bar chart to be plotted.
    """
    # Count workouts per instructor and discipline without adding a column to the caller's dataframe.
    counts = df.groupby(['Instructor Name', 'Fitness Discipline']).size().reset_index(name='count')
    chart = px.bar(counts, x="Instructor Name", y="count", color="Fitness Discipline", title="Total Workouts by Instructor", width=1600)
    chart.update_layout(
        title_x=0.5
    )

    return chart

get_instructors_by_discipline_chart(df)
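The bar heights are just row counts per instructor and discipline. As a small sketch of that counting step, the example below uses placeholder instructor names and groupby(...).size(), which counts rows per group without needing a helper column on the input dataframe.

```python
import pandas as pd

# Placeholder instructors and disciplines, one row per workout.
workouts = pd.DataFrame({
    "Instructor Name": ["Instructor A", "Instructor A", "Instructor A", "Instructor B"],
    "Fitness Discipline": ["Cycling", "Cycling", "Yoga", "Cycling"],
})

# size() counts the rows in each (instructor, discipline) group;
# reset_index(name="count") turns the result back into a flat dataframe.
counts = (
    workouts.groupby(["Instructor Name", "Fitness Discipline"])
    .size()
    .reset_index(name="count")
)
print(counts)
```

Each row of counts becomes one coloured segment of a bar: Instructor A gets a Cycling segment of height 2 stacked with a Yoga segment of height 1, and Instructor B gets a single Cycling segment of height 1.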


In [27]:

def create_instructor_card(df: pd.DataFrame) -> dbc.Card:
    """
    Returns a dbc card with a title, body text and bar chart of
    the number of workouts per instructor per fitness discipline.

    Parameters
    ----------
        df : pd.DataFrame The dataframe to be plotted.
    
    Returns
    -------
        dbc.Card: Card with header, body text and plot.
    """
    return dbc.Card([
        dbc.CardBody([
            html.Center(html.H1("Instructor by Fitness Discipline", className='card-title')),
            dcc.Graph(
                id='instructor-by-discipline-chart',
                figure=get_instructors_by_discipline_chart(df),
                style={
                    "margin-top": "1rem",
                    "margin-bottom": "1rem"
                }
            )
        ])
    ],
        color='info',
        outline=True,
        style={
            "margin-top": "1rem",
            "margin-left": "1rem",
            "margin-bottom": "1rem"
        }
    )

app.layout = html.Div(
    children=[
        dbc.Row([
                html.Center(titleCard),
                ],
                justify="center",
                style={
                    'margin-left': '0.5rem'
                }
        ),
        dbc.Row([
            create_calories_and_output_card(df),
            create_discipline_card(df)
        ]),
        dbc.Row([
            dbc.Col([
                create_instructor_card(df)
            ])
        ])
    ]
)

app.run_server(debug=True)


Dash app running on http://127.0.0.1:8050/



