# Introduction to [Plotly](https://plotly.com/)

Plotly is a versatile interactive plotting package that can be used with Python and JavaScript, and also through an online editor (without the need for coding).

## Why/When to use Plotly (my 2 cents)

If you already know Python and you don't really want to learn another coding language, but you do want to create interactive figures (e.g., within a Jupyter notebook and/or for use on a website), you should look into Plotly.  

In particular, [Plotly express](https://plotly.com/python/plotly-express/) is a fantastic tool for generating quick interactive figures without much code.  Plotly express covers a good amount of ground, and you may be able to do all or most of your work within Plotly express, depending on your specific needs.  In this workshop, I'll show you Plotly express, but then move beyond it for the majority of the content.  

Though you can do a lot with Plotly, it definitely has limitations (some of which we'll see in this workshop). Also, as with all of the ready-made interactive plot solutions (e.g., [Bokeh](https://docs.bokeh.org/en/latest/), [Altair](https://altair-viz.github.io/), [Glue](https://glueviz.org/), etc.), Plotly has a specific look, which can only be tweaked to a certain extent.  If you like the look well enough and you don't mind the limitations, then it's a good choice. 

##  In this tutorial... 

We will explore the basics of the Python version, using data from the following sources:

- COVID-19 data from the WHO: https://covid19.who.int/info/ 
- GDP Data from the World Bank: https://data.worldbank.org/indicator/NY.GDP.MKTP.CD

I will make two plots, one comparing COVID-19 data to GDPs and another showing COVID-19 data as a function of time.

## Installation

I recommend installing Python using [Anaconda](https://www.anaconda.com/products/individual).  Then you can create and activate a new environment for this workshop by typing the following commands into your (bash) terminal.

```
$ conda create -n plotly-env python=3.9 jupyter pandas plotly statsmodels
$ conda activate plotly-env
```

## Import the relevant packages that we will use.

In [None]:
import pandas as pd
import numpy as np
import scipy.stats

import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.express as px

## 1. Create a plot showing COVID-19 and GDP data.

### 1.1. Read in the data.

I will join multiple data tables together, using the *pandas* package so that I have one DataFrame containing all values for a given country.

In [None]:
# Current cumulative COVID-19 data from the WHO. 
# dfCT = pd.read_csv('data/WHO-COVID/WHO-COVID-19-global-table-data.csv') # in case the WHO server goes down
dfCT = pd.read_csv('https://covid19.who.int/WHO-COVID-19-global-table-data.csv', index_col=False)
dfCT

In [None]:
# Current vaccination data from the WHO
# dfV = pd.read_csv('data/WHO-COVID/vaccination-data.csv') # in case the WHO server goes down
dfV = pd.read_csv('https://covid19.who.int/who-data/vaccination-data.csv')
dfV

In [None]:
# Vaccination metadata from the WHO; this file contains the start dates (and end dates) for vaccines for each country. 
# dfVM = pd.read_csv('data/WHO-COVID/vaccination-metadata.csv') # in case the WHO server goes down
dfVM = pd.read_csv('https://covid19.who.int/who-data/vaccination-metadata.csv')

# drop rows without a start date
dfVM.dropna(subset = ['START_DATE'], inplace = True)

# convert the date columns to datetime objects for easier plotting and manipulation later on
dfVM['AUTHORIZATION_DATE'] = pd.to_datetime(dfVM['AUTHORIZATION_DATE'])
dfVM['START_DATE'] = pd.to_datetime(dfVM['START_DATE'])
dfVM['END_DATE'] = pd.to_datetime(dfVM['END_DATE'])

# I will simplify this table to just take the earliest start date for a given country
# sort by the start date and country code
dfVM.sort_values(['START_DATE', 'ISO3'], ascending = (True, True), inplace = True)
# take only the first entry for a given country
dfVM.drop_duplicates(subset = 'ISO3', keep = 'first', inplace = True)

dfVM
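
The sort-then-deduplicate step above can be checked on a toy table.  This is only a minimal sketch with made-up country codes and dates (not WHO data); the names `toy` and `earliest` are hypothetical.

```python
import pandas as pd

# Made-up rows: two vaccines for 'AAA', one for 'BBB' (not real WHO data)
toy = pd.DataFrame({
    'ISO3': ['AAA', 'BBB', 'AAA'],
    'START_DATE': pd.to_datetime(['2021-03-01', '2021-01-15', '2020-12-20']),
})

# Sort by date (then country code), and keep only the first row per country
earliest = (
    toy.sort_values(['START_DATE', 'ISO3'])
       .drop_duplicates(subset = 'ISO3', keep = 'first')
)
print(earliest)
```

Because the rows are sorted by date first, `keep = 'first'` retains the earliest start date for each country.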

In [None]:
# GDP data from the World Bank (the first three rows do not contain data)
# I don't think there's a direct link to this data on their server (but I didn't look very hard)
dfM = pd.read_csv('data/WorldBank/API_NY.GDP.MKTP.CD_DS2_en_csv_v2_6011335.csv', skiprows = 3)
dfM

In [None]:
# Join these 4 tables so that I have one DataFrame with all values for a given country.
# I will start by joining the two vaccination data tables.
dfJ1 = dfV.join(dfVM.set_index('ISO3'), on = 'ISO3', how = 'left', rsuffix = '_meta')

# Next I will join this with the COVID-19 data table.
# First rename this column in the COVID-19 data so that it is the same as the vaccine data.  Then I will join on that column.
dfCT.rename(columns = {'Name':'COUNTRY'}, inplace = True)
dfJ2 = dfJ1.join(dfCT.set_index('COUNTRY'), on = 'COUNTRY', how = 'left')

# Finally, I will join in the GDP data from the World Bank.
# I will rename a column in the World Bank data to match a column in the joined data above.
dfM.rename(columns = {'Country Code':'ISO3'}, inplace = True)
dfJoinedCOVID = dfJ2.join(dfM.set_index('ISO3'), on = 'ISO3', how = 'left')

dfJoinedCOVID
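
If the behavior of a left join is unfamiliar, the following toy sketch may help.  It uses two made-up tables (the names `left` and `right` and all values are assumptions for illustration) and the same `join` pattern as the cell above.

```python
import pandas as pd

# Made-up tables (not real data): 'left' plays the role of the vaccination table,
# and 'right' plays the role of the GDP table.
left = pd.DataFrame({'ISO3': ['AAA', 'BBB', 'CCC'], 'doses': [10, 20, 30]})
right = pd.DataFrame({'ISO3': ['AAA', 'CCC'], 'gdp': [1.0e9, 5.0e9]})

# A left join keeps every row of 'left'; countries missing from 'right' get NaN
joined = left.join(right.set_index('ISO3'), on = 'ISO3', how = 'left')
print(joined)
```

This is why some countries in the joined DataFrame have missing values: they appear in the vaccination table but have no match in one of the other tables.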

### 1.2. Create a couple of simple Plotly figures using [Plotly express](https://plotly.com/python/plotly-express/).

Plotly express is a simplified version of the Plotly interface for Python that allows users to create many types of Plotly figures with single lines of code.  This greatly simplifies the workflow for some kinds of Plotly figures.  We will start with Plotly express (and for some of your use cases, that may be enough), but we will move on to full-blown Plotly for the rest of this workshop.

In this plot, I will show total vaccinations vs. GDP with the point size scaled by the total cumulative COVID-19 cases.  

In [None]:
# Note: We imported plotly.express as px
# I will create a simple scatter plot using the DataFrame I created above, 
# x will be the total vaccinations per 100 people
# y will be the 2020 GDP, and since it spans a very wide range in values, I will plot y on a log scale
fig = px.scatter(dfJoinedCOVID, x = 'TOTAL_VACCINATIONS_PER100', y = '2020', log_y = True)
fig.show()

There are a lot of options that you can apply to a Plotly express scatter plot (e.g., see [here](https://plotly.com/python/line-and-scatter/)).  I will do the following:
- size each data point by the number of COVID-19 cases.
- add a trend line (A nice part of Plotly express is that you can add a trend line very easily.)

In [None]:
# The sizes will behave better if I set a minimum value (using np.clip)
# I also want to remove any nan values
size =  np.clip(np.nan_to_num(dfJoinedCOVID['Cases - cumulative total per 100000 population']/500.), 5, None)

fig = px.scatter(dfJoinedCOVID, x = 'TOTAL_VACCINATIONS_PER100', y = '2020', log_y = True, 
    size = size,
    trendline = 'ols', trendline_options = dict(log_y = True)
)
fig.show()
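
To see what the size expression does, here is a small sketch with made-up numbers (the array `raw` is an assumption for illustration).

```python
import numpy as np

# Made-up values, including a missing entry
raw = np.array([np.nan, 1000.0, 50000.0])

# Replace NaN with 0, scale, and then enforce a minimum marker size of 5
sizes = np.clip(np.nan_to_num(raw / 500.0), 5, None)
print(sizes)

# np.clip on its own leaves NaN as NaN, so the order of the two calls matters
clippedOnly = np.clip(raw / 500.0, 5, None)
print(clippedOnly)
```

With `np.nan_to_num` applied first, a country with a missing value still gets the minimum marker size rather than an undefined one.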

Let's also plot the first vaccination start date vs. GDP, with the size based on the total vaccinations.  In this example, I will also modify the hover and axis attributes.

In [None]:
# The command is similar to that from the previous cell, but here I'm also defining the data shown on hover in the tooltips.
# (It's not quite as easy to add a trendline here when plotting dates, though it is possible.)

size = np.clip(np.nan_to_num(dfJoinedCOVID['TOTAL_VACCINATIONS_PER100']),5,None)

fig = px.scatter(dfJoinedCOVID, x = 'START_DATE', y = '2020', log_y = True, 
    size = size,
    hover_name = 'COUNTRY', 
    hover_data = ['2020', 
      'START_DATE', 
      'TOTAL_VACCINATIONS_PER100',
      'Cases - cumulative total per 100000 population'
    ]
)

# a few manipulations to the axes 
fig.update_xaxes(title = 'Vaccine Start Date', range = [np.datetime64('2020-07-01'), np.datetime64('2021-07-01')])
fig.update_yaxes(title = '2020 GDP (USD)')

fig.show()

As an alternative example, let's also [create a histogram using plotly express](https://plotly.com/python/histograms/).

I will plot a histogram of the total vaccinations per 100 people, separated (colored) by vaccine name.

*Note that this automatically comes with an interactive legend.*

In [None]:
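# Fill missing values only in the column used for color, so that the x values stay numeric
fig = px.histogram(dfJoinedCOVID.fillna({'VACCINE_NAME': 'None'}), x = 'TOTAL_VACCINATIONS_PER100', nbins = 20, 
                   color = 'VACCINE_NAME', barmode = 'stack')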
fig.show()

### *Exercise 1: Create your own plot using Plotly express.*

Use the data we read in above (or your own data).  You can start with one of the commands above or choose a different style of plot.  Whichever format you use, choose different columns to plot than above.  Try to also add a new option to the command to change the plot.  

Hint: Go to the [Plotly express homepage](https://plotly.com/python/plotly-express/), and click on a link to see many examples (e.g., [here's the page for the scatter plot](https://plotly.com/python/line-and-scatter/))

In [None]:
# Create a plot using Plotly express


### 1.4. Create the plot using the standard Plotly [Graph Object](https://plotly.com/python/graph-objects/).

For the remainder of the workshop we will use Graph Objects for our Plotly figures.  One motivation here is so that I can create multiple panels in one figure, which can be downloaded to an html file.  (Plotly express will only make an individual figure, and does not support arbitrary subplots.) 

First you create a <b>"trace"</b>, which holds the data.  There are many kinds of traces available in Plotly (e.g., bar, scatter, etc.).  For this example, we will use a scatter trace.  (Interestingly, the scatter trace object also includes line traces, accessed by changing the "mode" key.  I will show the line version later on.)

Then you create a figure and add the trace to that figure.  A single figure can have multiple traces.

In [None]:
# Create a plot using Plotly Graph Object(s)

# Note: We imported the plotly.graph_objects as go.
# create the trace
trace1 = go.Scatter(x = dfJoinedCOVID['TOTAL_VACCINATIONS_PER100'], y = dfJoinedCOVID['2020'], # x and y values for the plot
    mode = 'markers', # setting mode to markers produces a typical scatter plot
)

# create the figure 
fig = go.Figure()

# add the trace and update a few parameters for the axes
fig.add_trace(trace1)
fig.update_xaxes(title = 'Total Vaccinations Per 100 People', range=[0,300])
fig.update_yaxes(title = 'GDP (USD)', type = 'log')

fig.show()

Re-create this figure with more customizations.

In [None]:
# Note: We imported the plotly.graph_objects as go.
# create the trace and set various parameters

trace1 = go.Scatter(x = dfJoinedCOVID['TOTAL_VACCINATIONS_PER100'], y = dfJoinedCOVID['2020'], # x and y values for the plot
    mode = 'markers', # setting mode to markers produces a typical scatter plot
    showlegend = False, # I do not need a legend
    # set various parameters for the markers in the following dict, e.g., color, opacity, size, outline, etc.
    marker = dict( 
        color = 'rgba(0, 0, 0, 0.2)',
        opacity = 1,
        size = np.nan_to_num(np.clip(dfJoinedCOVID['Cases - cumulative total per 100000 population']/1000., 5, 100)),
        line = dict(
            color = 'rgba(0, 0, 0, 1)',
            width = 1
        ),
    ),
    # set a template for the tooltips below.  
    # hovertemplate can accept the x and y data and additional "text" as defined by a separate input
    # Note, the "<extra></extra>" is included to remove some formatting that plotly imposes on tooltips
    hovertemplate = '%{text}' + 
        'Total Vaccinations / 100 people: %{x}<br><extra></extra>' +
        'GDP: $%{y}<br>',
    # additional text to add to the hovertemplate.  This needs to be a list with the same length as the x and y data.
    text = ['Country: {}<br>Total COVID Cases / 100,000 people: {}<br>Vaccine start date: {}<br>'.format(x1, x2, x3) 
        for (x1, x2, x3) in zip(dfJoinedCOVID['COUNTRY'], 
            dfJoinedCOVID['Cases - cumulative total per 100000 population'], 
            dfJoinedCOVID['START_DATE'].dt.strftime('%b %Y'))
    ],
    # style the tooltip as desired                
    hoverlabel = dict(
        bgcolor = 'white',
    )
)
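
The tooltip text above is just a list of strings, one per data point.  Here is a stripped-down sketch in plain Python with made-up country names and values (all of them assumptions for illustration).

```python
# Made-up values for illustration (not real data)
countries = ['Aland', 'Borduria']
cases = [1234.5, 678.9]

# One tooltip string per data point, in the same order as the x and y values
text = ['Country: {}<br>Cases / 100,000 people: {}<br>'.format(c, n)
        for c, n in zip(countries, cases)]

# Plotly substitutes %{text}, %{x}, and %{y} for each point on hover;
# <extra></extra> hides the secondary box that normally shows the trace name
hovertemplate = '%{text}GDP: $%{y}<br><extra></extra>'

print(text[0])
```

The `<br>` tags are simple HTML line breaks, which Plotly renders inside the tooltip.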

In [None]:
# create the figure
fig = go.Figure()

# add the trace and update a few parameters for the axes
fig.add_trace(trace1)
fig.update_xaxes(title = 'Total Vaccinations Per 100 People', range=[0,300])
fig.update_yaxes(title = 'GDP (USD)', type = 'log')

fig.show()

In [None]:
# Add a trendline
# I will use scipy.stats.linregress (and fit to the log of the GDP)
dfFit1 = dfJoinedCOVID.dropna(subset = ['TOTAL_VACCINATIONS_PER100', '2020'])
slope1, intercept1, r1, p1, se1 = scipy.stats.linregress(dfFit1['TOTAL_VACCINATIONS_PER100'], np.log10(dfFit1['2020']))
xFit1 = np.linspace(0, 300, 100)
yFit1 = 10.**(slope1*xFit1 + intercept1)
trace1F = go.Scatter(x = xFit1, y = yFit1, 
    mode = 'lines', # Set the mode to lines (rather than markers) to show a line.
    opacity = 1, 
    marker_color = 'black',
    showlegend = False,
    hoverinfo='skip' # Don't show anything on hover.  (We could show the trendline info, but I'll leave that out for now.)
)

In [None]:
# create the figure
fig = go.Figure()

# add the trace and update a few parameters for the axes
fig.add_trace(trace1)
fig.add_trace(trace1F)
fig.update_xaxes(title = 'Total Vaccinations Per 100 People', range=[0,300])
fig.update_yaxes(title = 'GDP (USD)', type = 'log')

fig.show()
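
The trendline above is a straight-line fit to the log of the GDP, transformed back for plotting on a log axis.  As a sanity check of that approach, here is a sketch using synthetic data that follow an exact relation (the data and the name `fit` are assumptions for illustration).

```python
import numpy as np
import scipy.stats

# Made-up data that follow y = 10**(0.01*x + 2) exactly
x = np.array([0.0, 50.0, 100.0, 150.0, 200.0])
y = 10.0**(0.01*x + 2.0)

# Fit a straight line to log10(y), then transform back for plotting on a log axis
fit = scipy.stats.linregress(x, np.log10(y))
xFit = np.linspace(0, 200, 5)
yFit = 10.0**(fit.slope*xFit + fit.intercept)
print(fit.slope, fit.intercept)
```

The fit should recover a slope close to 0.01 and an intercept close to 2, which are the values used to generate the data.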

### *Exercise 2: Create your own plot using Plotly Graph Object(s).*

Use the data we read in above (or your own data).  You can start with one of the commands above or choose a different style of plot.  Whichever format you use, choose different columns to plot than above.  Try to also add a new option to the command to change the plot.  

Hint: The Plotly help pages usually contain examples for both Plotly express and Graph Object.  If you go to the [Plotly express homepage](https://plotly.com/python/plotly-express/) and click on a link (e.g., [the page for the scatter plot](https://plotly.com/python/line-and-scatter/)), you can scroll down to see Graph Object examples.

In [None]:
# Create a plot using Plotly Graph Object(s)

# First, create the trace

# Second, create the figure and show it


### 1.5. Show two plots side-by-side sharing the y axis.

In [None]:
# Create the trace for the 2nd figure (similar method to above).
trace2 = go.Scatter(x = dfJoinedCOVID['START_DATE'], y = dfJoinedCOVID['2020'], 
    mode = 'markers', 
    showlegend = False,
    name = 'COVID Vaccines',
    marker = dict(
        color = 'rgba(0, 0, 0, 0.2)',
        opacity = 1,
        size = np.nan_to_num(np.clip(dfJoinedCOVID['TOTAL_VACCINATIONS_PER100']/7., 5, 100)),
        line = dict(
            color = 'rgba(0, 0, 0, 1)',
            width = 1
        ),
    ),
    hovertemplate = '%{text}' + 
        'Vaccine start date: %{x}<br><extra></extra>' +
        'GDP: $%{y}<br>',
    text = ['Country: {}<br>Total COVID Cases / 100,000 people: {}<br>Total Vaccinations / 100 people: {}<br>'.format(x1, x2, x3) 
        for (x1, x2, x3) in zip(dfJoinedCOVID['COUNTRY'], 
            dfJoinedCOVID['Cases - cumulative total per 100000 population'], 
            dfJoinedCOVID['TOTAL_VACCINATIONS_PER100'])
    ],
    hoverlabel=dict(
        bgcolor = 'white',
    )

)

# Add trendline
dfFit2 = dfJoinedCOVID.dropna(subset = ['START_DATE', '2020'])
delta = (dfFit2['START_DATE'] - dfFit2['START_DATE'].min())/np.timedelta64(1,'D')
slope2, intercept2, r2, p2, se2 = scipy.stats.linregress(delta, np.log10(dfFit2['2020']))
xx2 = np.linspace(0, 500, 100)
yFit2 = 10.**(slope2*xx2 + intercept2)
xFit2 = xx2*np.timedelta64(1,'D') + dfFit2['START_DATE'].min()
trace2F = go.Scatter(x = xFit2, y = yFit2, 
    mode = 'lines', 
    opacity = 1, 
    marker_color = 'black',
    showlegend = False,
    hoverinfo='skip' 
)

# Create the figure and add the traces
# I will use Plotly's "make_subplots" method (imported above).
# Define the number of rows and columns, the column_widths, spacing, and here I will share the y axis.
# Sharing the y axis means that if you zoom/pan on one plot, the other will also zoom/pan.
fig = make_subplots(rows = 1, cols = 2, column_widths = [0.5, 0.5], horizontal_spacing = 0.01, shared_yaxes = True)

# Add the first trace and update the axes.
# Note that I specify which row and column within each of these commands.
fig.add_trace(trace1, row = 1, col = 1)
fig.add_trace(trace1F, row = 1, col = 1)
fig.update_xaxes(title = 'Total Vaccinations Per 100 People', range=[0,280], row = 1, col = 1)
fig.update_yaxes(title = 'GDP (USD)', type = 'log', row = 1, col = 1)

# Add the second trace and update the axes.
# Note that I am using numpy's datetime64 data types in order to set the axis range here
fig.add_trace(trace2, row = 1, col = 2)
fig.add_trace(trace2F, row = 1, col = 2)
fig.update_xaxes(title = 'Vaccine Start Date', range = [np.datetime64('2020-07-02'), 
                                                        np.datetime64('2021-07-01')], row = 1, col = 2)
fig.update_yaxes(type = 'log', row = 1, col = 2)

# Provide an overall title to the figure.
fig.update_layout(title_text = 'COVID-19 Vaccine Equity')

# Add annotations to tell what the symbol sizes mean.
# I will position these relative to the data domain, and therefore they will not move around when zooming and panning.
fig.add_annotation(x = 0.01, y = 0.99, row = 1, col = 1, showarrow = False,
    xref = 'x domain', yref = 'y domain',
    text = 'Symbol size indicates total COVID-19 cases.')
fig.add_annotation(x = 0.01, y = 0.99, row = 1, col = 2, showarrow = False,
    xref = 'x domain', yref = 'y domain',
    text = 'Symbol size indicates total vaccinations.')

# Show the final result
fig.show()
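
The date trendline in the cell above relies on converting dates to a number of days (so that a regression can use them) and then converting back for plotting.  Here is a minimal sketch of that round trip with made-up dates.  It uses `pd.Timedelta` and `pd.to_timedelta`, which is one possible way to do the conversion and is not exactly the same code as above.

```python
import numpy as np
import pandas as pd

# Made-up dates for illustration
dates = pd.Series(pd.to_datetime(['2021-01-01', '2021-01-02', '2021-01-04']))
start = dates.min()

# Dates -> days since the first date (floats that a regression can use)
days = (dates - start)/pd.Timedelta(days = 1)

# Days -> dates again (for the x values of the fitted line)
xx = np.array([0.0, 1.5, 3.0])
back = start + pd.to_timedelta(xx, unit = 'D')
print(days.tolist())
print(back)
```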

#### You can save the figure in html format to use on a website.

In [None]:
fig.write_html('plotly_graph.html')

## 2. Create a plot showing COVID-19 cases and deaths vs. time for a given country.

I will also include [custom buttons](https://plotly.com/python/custom-buttons/) to toggle between various ways of viewing the data.

### 2.1. Read in the data

In [None]:
# COVID-19 cases and deaths as a function of time for multiple countries
# dfC = pd.read_csv('data/WHO-COVID/WHO-COVID-19-global-data.csv') # in case the WHO server goes down
dfC = pd.read_csv('https://covid19.who.int/WHO-COVID-19-global-data.csv')

# convert the date column to datetime objects for easier plotting and manipulation later on
dfC['Date_reported'] = pd.to_datetime(dfC['Date_reported'])
dfC

### 2.2. Choose a country, and then create the plot.

In [None]:
country = 'United States of America'

In [None]:
# Select only the data that is from the country.
use3 = dfC.loc[dfC['Country'] == country]

In [None]:
# Create the trace.
# In this example I will use a bar chart.
trace3 = go.Bar(x = use3['Date_reported'], y = use3['New_cases'], 
    opacity = 1, 
    marker_color = 'black',
    showlegend = False,
    name = 'COVID Cases'
)

# Create the figure.
fig = go.Figure()

# Add the trace and update a few parameters for the axes.
fig.add_trace(trace3)
fig.update_xaxes(title = 'Date')
fig.update_yaxes(title = 'New COVID-19 Cases')
fig.show()

#### Let's improve this plot.

- I want to take a rolling average (this is easily done with *pandas*).
- I'd prefer a filled region rather than bars.

In [None]:
# Define the number of days to use for the rolling average.
rollingAve = 7

In [None]:
# Create the trace, using Scatter to create lines and fill the region between the line and y=0.
trace3 = go.Scatter(x = use3['Date_reported'], y = use3['New_cases'].rolling(rollingAve).mean(), 
    mode = 'lines', # Set the mode to lines (rather than markers) to show a line.
    opacity = 1, 
    marker_color = 'black',
    fill = 'tozeroy',  # This will fill between the line and y=0.
    showlegend = False,
    name = 'COVID Count',
    hovertemplate = 'Date: %{x}<br>Number: %{y}<extra></extra>', #Note: the <extra></extra> removes the trace label.
)

# Create the figure.
fig = go.Figure()

# Add the trace and update a few parameters for the axes.
fig.add_trace(trace3)
fig.update_xaxes(title = 'Date')
fig.update_yaxes(title = 'New COVID-19 Cases')
fig.show()
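
For reference, here is what a rolling average does on a tiny made-up series (using a 3-day window to keep the numbers easy to follow).

```python
import pandas as pd

# Made-up daily counts
counts = pd.Series([1, 2, 3, 4, 5, 6])

# A 3-day rolling mean; the first two entries have no full window, so they are NaN
smoothed = counts.rolling(3).mean()
print(smoothed.tolist())

# With min_periods = 1, the early entries average over whatever data are available
smoothedPartial = counts.rolling(3, min_periods = 1).mean()
print(smoothedPartial.tolist())
```

The NaN values at the start are why the smoothed curve begins a few days after the first reported date.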

### *Exercise 3: Create your own plot showing COVID-19 deaths vs time.*

You can use either Plotly express or Graph Objects.  Try to pick a different country than I used above.  Also try to use a different style than I plotted above.  

In [None]:
# Create a Plotly figure showing COVID-19 deaths vs. time


### 2.3. Add some buttons to interactively change the plot.

I want to be able to toggle between new vs. cumulative counts as well as cases vs. deaths.  We can do this with [custom buttons](https://plotly.com/python/custom-buttons/) that will "restyle" the plot.  

You can also create interactions with buttons and other "widgets" using [Dash](https://plotly.com/dash/), but we won't go there in this workshop. 

In [None]:
columns = ['New_cases', 'New_deaths', 'Cumulative_cases', 'Cumulative_deaths']

#I'm going to write this as a function so that I can reuse it below
def createTraces(columns):
    # For this scenario, I am going to add each of the 4 traces to the plot but only show one at a time
    # Add traces for each column
    
    traces = [
        go.Scatter(x = use3['Date_reported'], y = use3[c].rolling(rollingAve).mean(), 
            mode = 'lines', # Set the mode to lines (rather than markers) to show a line.
            opacity = 1, 
            marker_color = 'black',
            fill = 'tozeroy',  # This will fill between the line and y=0.
            showlegend = False,
            name = 'COVID Count',
            hovertemplate = 'Date: %{x}<br>Number: %{y}<extra></extra>', #Note: the <extra></extra> removes the trace label.
            visible = i == 0
        ) for i, c in enumerate(columns)
    ]

    
    return traces


# I'm going to write this as a function so that I can reuse it below
# x,y args to position the buttons
def createButtons(columns, x = 0.0, y = 1.13):
    # create an "updatemenu" with buttons for choosing the data to plot that I will add to the figure later

    updatemenu = dict(
            type = 'buttons',
            direction = 'left', # This defines what orientation to include all buttons.  'left' shows them in one row.
            buttons = list([
                dict(
                    # 'args' tells the button what to do when clicked.  
                    #     In this case it will change the visibility of the traces
                    # 'label' is the text that will be displayed on the button
                    # 'method' is the type of action the button will take.
                    #    method = 'restyle' allows you to redefine certain preset plot styles (including the visible key).  
                    #    See  https://plotly.com/python/custom-buttons/ for different methods and their uses
                    args = [{'visible': [i == j for j in range(len(columns))]}], 
                    label = label.replace('_',' '),
                    method = 'restyle' 
                ) for i, label in enumerate(columns)]),
        
            showactive = True, # Highlight the active button
            # Below is for positioning
            x = x, 
            xanchor = 'left',
            y = y,
            yanchor = 'top'
        )
    
    return updatemenu

In [None]:
# Create the figure.
fig = go.Figure()

# create the traces
traces = createTraces(columns)

# add the traces to the figure
for t in traces:
    fig.add_trace(t)
    
# create the buttons and add them to the figure below
buttons = createButtons(columns)

# Update a few parameters for the axes and add the buttons
#   Note: I added a margin to the top ('t') of the plot within fig.update_layout to make room for the buttons.
fig.update_xaxes(title = 'Date')#, range = [np.datetime64('2020-03-01'), np.datetime64('2022-01-12')])
fig.update_yaxes(title = 'COVID-19 Count')
fig.update_layout(
    title_text = 'COVID-19 Data Explorer : '+ country + '<br>(' + str(rollingAve) +'-day rolling average)',
    margin = dict(t = 150),
    updatemenus = [buttons]
)

fig.show()
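
The key part of each button is its list of booleans, one per trace.  This plain-Python sketch builds the same button dictionaries as `createButtons` (without the positioning options) so that you can inspect them.

```python
columns = ['New_cases', 'New_deaths', 'Cumulative_cases', 'Cumulative_deaths']

# One button per column; button i shows trace i and hides all the others
buttons = [
    dict(
        label = label.replace('_', ' '),
        method = 'restyle',
        args = [{'visible': [i == j for j in range(len(columns))]}],
    )
    for i, label in enumerate(columns)
]

print(buttons[1]['label'], buttons[1]['args'][0]['visible'])
```

Each mask must have one entry per trace in the figure, in the order in which the traces were added.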

### 2.4. Create a dropdown menu to choose the country.

[Here are examples of how to include dropdown menus in Plotly](https://plotly.com/python/dropdowns/).  

The procedure will be similar to the buttons, but we will use the "update" method (rather than "restyle") for the dropdown menu.  Update will allow us to change the data being plotted.

In [None]:
# I am going to create the dropdown list here and then add it to the figure below
# I will need to update the x and y data for the time series plot 

# Identify the countries to use 
# I will put the United States of America first so that it can be the default country on load (the first button)
availableCountries = dfC['Country'].unique().tolist()
availableCountries.insert(0, availableCountries.pop(availableCountries.index('United States of America'))) 

# I will write this as a function as well and then create a new figure in the next cell that uses this function
# x,y args to position the dropdown
def createDropdown(availableCountries, columns, x = 0.0, y = 1.1):
    # create an "updatemenu" with a dropdown for choosing the data to plot that I will add to the figure later

    dropdown = []
    for c in availableCountries:
        if (c in dfJoinedCOVID['COUNTRY'].tolist()):
            dropdown.append(dict(
                args = [{'x': [dfC.loc[dfC['Country'] == c]['Date_reported']]*len(columns), # the same x values for each trace
                         'y': [dfC.loc[dfC['Country'] == c][col].rolling(rollingAve).mean() for col in columns],
                }],
                label = c,
                method = 'update'
            ))

    updatemenu = dict(
        buttons = dropdown,
        direction = 'down',
        showactive = True,
        x = x,
        xanchor = 'left',
        y = y,
        yanchor = 'top'
    )
        

    return updatemenu

In [None]:
# Create the figure.
fig = go.Figure()

# create the traces
traces = createTraces(columns)

# add the traces to the figure
for t in traces:
    fig.add_trace(t)

# generate the menus to be added to the figure below
updatemenus = [createButtons(columns, 0, 1.3), createDropdown(availableCountries, columns, 0, 1.15)]

# Update a few parameters for the axes and add the buttons and dropdown
fig.update_xaxes(title = 'Date')#, range = [np.datetime64('2020-03-01'), np.datetime64('2022-01-12')])
fig.update_yaxes(title = 'COVID-19 Count')
fig.update_layout(
    title_text = 'COVID-19 Data Explorer : '+ country + '<br>(' + str(rollingAve) +'-day rolling average)',
    title_y = 0.97,
    margin = dict(t = 140),
    updatemenus = updatemenus
)

fig.show()
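
Note that the title in this figure keeps showing the initial country after you choose a new one from the dropdown.  The "update" method can take a second dictionary containing layout changes, so one possible extension is to change the title along with the data.  The sketch below only builds and inspects such a dropdown entry with made-up values.  The helper `makeCountryButton` is hypothetical (not part of the workshop code or of Plotly), and the use of the `'title.text'` key is my assumption about how to address the title.

```python
# Hypothetical helper (not part of the workshop code or the Plotly API)
def makeCountryButton(name, xValues, yValuesPerTrace):
    nTraces = len(yValuesPerTrace)
    return dict(
        label = name,
        method = 'update',
        args = [
            # first dict: data changes (one list entry per trace)
            {'x': [xValues]*nTraces, 'y': yValuesPerTrace},
            # second dict: layout changes, here the figure title (assumed key)
            {'title.text': 'COVID-19 Data Explorer : ' + name},
        ],
    )

# Made-up values for two traces
button = makeCountryButton('Aland', [1, 2, 3], [[5, 6, 7], [0, 1, 0]])
print(button['args'][1])
```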

## 3. Put the three plots together into one "dashboard".

I will put commands (from above) into functions to clean up the code.  This is mostly copying and pasting, but with some additions that I will point out below in the comments.

In [None]:
# In order to reduce the lines of code, I created a function that generates the vaccine trace, given inputs
def generateVaccineTrace(xData, yData, size, color, hovertemplate, text, hoverbg = 'white'):
    '''
        xData : the x data for the trace
        yData : the y data for the trace
        size : sizes for the data points
        color : color for the markers
        hovertemplate : the template for the tooltip
        text : the additional text to include in the tooltip
        hoverbg : optional parameter to set the background color of the tooltip (default is white)
    '''
    trace = go.Scatter(x = xData, y = yData, 
        mode = 'markers', 
        showlegend = False,
        name = 'COVID Vaccines',
        marker = dict(
            color = color,
            opacity = 1,
            size = size,
            line = dict(
                color = 'rgba(0, 0, 0, 1)',
                width = 1
            ),
        ),
        hovertemplate = hovertemplate,
        text = text,
        hoverlabel = dict(
            bgcolor = hoverbg,
        ),
    )
    
    return trace


# Function to help with the vaccine tooltip text
def getVaccineText(co = None):
    use = dfJoinedCOVID
    if (co is not None):
        use = dfJoinedCOVID.loc[dfJoinedCOVID['COUNTRY'] == co]
        
    return ['Country: {}<br>Total COVID Cases / 100,000 people: {}<br>Total Vaccinations / 100 people: {}<br>Vaccine start date: {}<br>'.format(x1, x2, x3, x4) 
        for (x1, x2, x3, x4) in zip(
            use['COUNTRY'], 
            use['Cases - cumulative total per 100000 population'],
            use['TOTAL_VACCINATIONS_PER100'],
            use['START_DATE'].dt.strftime('%b %Y'))
    ]


# Functions to help with the vaccine marker size
def getVaccineMarkersize1(co = None):
    use = dfJoinedCOVID
    if (co is not None):
        use = dfJoinedCOVID.loc[dfJoinedCOVID['COUNTRY'] == co]
        
    return np.clip(np.nan_to_num(use['Cases - cumulative total per 100000 population']/1000.), 5, 100)

def getVaccineMarkersize2(co = None):
    use = dfJoinedCOVID
    if (co is not None):
        use = dfJoinedCOVID.loc[dfJoinedCOVID['COUNTRY'] == co]
        
    return np.clip(np.nan_to_num(use['TOTAL_VACCINATIONS_PER100']/7.), 5, 100)



# This is a large function that will generate the entire figure with all the subplots
def generateFigure(co, rollingAve = 7):
    '''
        co : the country that we want to plot
    '''
    
    ##################################
    # First, create the traces. 
    ##################################
    
    # cases over time
    useC = dfC.loc[dfC['Country'] == co]
    traces1 = []
    columns = ['New_cases', 'New_deaths', 'Cumulative_cases', 'Cumulative_deaths']
    for i, c in enumerate(columns):
        visible = False
        if (i == 0):
            visible = True

        # Create the trace, using Scatter to create lines and fill the region between the line and y=0.
        trace = go.Scatter(x = useC['Date_reported'], y = useC[c].rolling(rollingAve).mean(), 
            mode = 'lines', # Set the mode to lines (rather than markers) to show a line.
            opacity = 1, 
            marker_color = 'black',
            fill = 'tozeroy',  # This will fill between the line and y=0.
            showlegend = False,
            name = 'COVID Count',
            hovertemplate = 'Date: %{x}<br>Number: %{y}<extra></extra>', #Note: the <extra></extra> removes the trace label.
            visible = visible
        )
        traces1.append(trace)
        
    
    # vaccine fraction vs. GDP (using the function that I wrote above)
    trace2 = generateVaccineTrace(dfJoinedCOVID['TOTAL_VACCINATIONS_PER100'], dfJoinedCOVID['2020'], 
        getVaccineMarkersize1(),
        'rgba(0, 0, 0, 0.2)',
        '%{text}<extra></extra>GDP: $%{y}<br>',
        getVaccineText(),
     )
                                                                 
    # vaccine start date vs. GDP (using the function that I wrote above)
    trace3 = generateVaccineTrace(dfJoinedCOVID['START_DATE'], dfJoinedCOVID['2020'], 
        getVaccineMarkersize2(),
        'rgba(0, 0, 0, 0.2)',
        '%{text}<extra></extra>GDP: $%{y}<br>',
        getVaccineText(),                              
    )

    # Add trendlines
    #   This is simply copied from above
    dfFit1 = dfJoinedCOVID.dropna(subset = ['TOTAL_VACCINATIONS_PER100', '2020'])
    slope1, intercept1, r1, p1, se1 = scipy.stats.linregress(dfFit1['TOTAL_VACCINATIONS_PER100'], np.log10(dfFit1['2020']))
    xFit1 = np.linspace(0, 300, 100)
    yFit1 = 10.**(slope1*xFit1 + intercept1)
    trace2F = go.Scatter(x = xFit1, y = yFit1, 
        mode = 'lines', # Set the mode to lines (rather than markers) to show a line.
        opacity = 1, 
        marker_color = 'black',
        showlegend = False,
        hoverinfo='skip' # Don't show anything on hover.  (We could show the trendline info, but I'll leave that out for now.)
    )

    dfFit2 = dfJoinedCOVID.dropna(subset = ['START_DATE', '2020'])
    delta = (dfFit2['START_DATE'] - dfFit2['START_DATE'].min())/np.timedelta64(1,'D')
    slope2, intercept2, r2, p2, se2 = scipy.stats.linregress(delta, np.log10(dfFit2['2020']))
    xx2 = np.linspace(0, 500, 100)
    yFit2 = 10.**(slope2*xx2 + intercept2)
    xFit2 = xx2*np.timedelta64(1,'D') + dfFit2['START_DATE'].min()
    trace3F = go.Scatter(x = xFit2, y = yFit2, 
        mode = 'lines', 
        opacity = 1, 
        marker_color = 'black',
        showlegend = False,
        hoverinfo='skip' 
    )

    # Add 2 more traces for the vaccine plots to highlight the selected country (using the function that I wrote above).
    #   These are nearly identical to the 2 traces from above but use the limited useH dataset (below) and are colored red.
    useH = dfJoinedCOVID.loc[dfJoinedCOVID['COUNTRY'] == co]
    trace2H = generateVaccineTrace(useH['TOTAL_VACCINATIONS_PER100'], useH['2020'], 
        getVaccineMarkersize1(co),
        'rgba(255, 0, 0, 1)',
        '%{text}<extra></extra>GDP: $%{y}<br>',
        getVaccineText(co),
        hoverbg = 'red'
    )
    trace3H = generateVaccineTrace(useH['START_DATE'], useH['2020'], 
        getVaccineMarkersize2(co),
        'rgba(255, 0, 0, 1)',
        '%{text}<extra></extra>GDP: $%{y}<br>',
        getVaccineText(co),
        hoverbg = 'red'
    )


    ##################################
    # Second, create the figure and add the traces.
    ##################################
    
    # I will create a subplot object where 
    # - the top will have 1 column and contain the cases over time,
    # - the bottom will be split in two columns for the vaccine plots,
    # - and the bottom two columns will share the y axis.
    fig = make_subplots(rows = 2, cols = 2, shared_yaxes = True,   
        column_widths = [0.5, 0.5],
        row_heights = [0.35, 0.65],
        specs = [ [{"colspan": 2}, None], [{}, {}] ], # here is where I define that the first row only has one column
        horizontal_spacing = 0.01, 
        vertical_spacing = 0.08
    )

    # Add in the traces and update the axes (specifying which row and column they belong to)
    for t in traces1:
        fig.add_trace(t, row = 1, col = 1)
    # Restrict the 'Date' title to the top panel.  (Optionally also set range = [np.datetime64('2020-03-01'), np.datetime64('2022-01-12')].)
    fig.update_xaxes(title = 'Date', row = 1, col = 1)
    fig.update_yaxes(title = 'COVID-19 Count', row = 1, col = 1, rangemode = 'nonnegative')

    fig.add_trace(trace2, row = 2, col = 1)
    fig.add_trace(trace2F, row = 2, col = 1)
    fig.add_trace(trace2H, row = 2, col = 1)
    fig.update_xaxes(title = 'Total Vaccinations Per 100 People', range=[0,280], row = 2, col = 1)
    fig.update_yaxes(title = 'GDP (USD)', type = 'log', row = 2, col = 1)

    fig.add_trace(trace3, row = 2, col = 2)
    fig.add_trace(trace3F, row = 2, col = 2)
    fig.add_trace(trace3H, row = 2, col = 2)
    fig.update_xaxes(title = 'Vaccine Start Date', range = [np.datetime64('2020-07-02'), 
                                                            np.datetime64('2021-07-01')], row = 2, col = 2)
    fig.update_yaxes(type = 'log', row = 2, col = 2)

    # Add a title and define the size and margin.
    fig.update_layout(title_text = 'COVID-19 Data Explorer : '+ co + '<br>(' + str(rollingAve) +'-day rolling average)',
        title_y = 0.97,
        height = 1000,
        width = 1000, 
        margin = dict(t = 120))

    # Add the annotations to tell what the symbol sizes mean.
    fig.add_annotation(x = 0.01, y = 0.99, row = 2, col = 1, showarrow = False,
        xref = 'x domain', yref = 'y domain',
        text = 'Symbol size indicates total COVID-19 cases.')
    fig.add_annotation(x = 0.01, y = 0.99, row = 2, col = 2, showarrow = False,
        xref = 'x domain', yref = 'y domain',
        text = 'Symbol size indicates total vaccinations.')
    

    return fig, xFit1, yFit1, xFit2, yFit2
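
The trendlines in `generateFigure` are fit in log space: the regression is run on `log10` of the GDP values, and the result is converted back with `10.**(...)` so that it appears as a straight line on the log-scaled y axis. Below is a minimal sketch of that idea on made-up numbers. To keep it dependency-free, it uses a hand-written least-squares fit in place of `scipy.stats.linregress`; the function name `fit_log_linear` and the data are hypothetical.

```python
import math

def fit_log_linear(x, y):
    """Least-squares fit of log10(y) = slope * x + intercept (y must be positive)."""
    log_y = [math.log10(v) for v in y]
    n = len(x)
    mean_x = sum(x) / n
    mean_ly = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_ly) for xi, li in zip(x, log_y))
    slope = sxy / sxx
    intercept = mean_ly - slope * mean_x
    return slope, intercept

# Made-up data that follows y = 100 * 10**(0.5 * x) exactly.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [100.0 * 10 ** (0.5 * xi) for xi in x]

slope, intercept = fit_log_linear(x, y)

# Convert the fit back to linear units so it can be drawn on a log axis.
x_fit = [0.0, 2.0, 4.0]
y_fit = [10.0 ** (slope * xi + intercept) for xi in x_fit]
```

For the start-date panel, the same kind of fit is applied after converting the dates to the number of days since the earliest start date, and the x values are converted back to dates before plotting.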
                            
         

In [None]:
def createButtons(columns, x = 0.0, y = 1.05):   
    ##################################
    # create the buttons.
    ##################################
    
    # Note that here in 'args' I need to provide values for all the traces (even though only one plot will change).

    updatemenu = dict(
            type = 'buttons',
            direction = 'left', # This sets the direction in which the buttons are laid out.  'left' shows them in one row.
            buttons = list([
                dict(
                    args = [{'visible': [i == j for j in range(len(columns))] + [True, True, True, True, True, True]}], 
                    label = label.replace('_',' '),
                    method = 'restyle' 
                ) for i, label in enumerate(columns)]),
        
            showactive = True, # Highlight the active button
            # Below is for positioning
            x = x, 
            xanchor = 'left',
            y = y,
            yanchor = 'top'
        )
    
    return updatemenu
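
Each button built by `createButtons` passes `restyle` a list with one `visible` entry per trace in the figure: exactly one of the time-series traces is shown, and the six scatter and trendline traces always stay on. The sketch below builds just those masks as plain Python lists so that you can inspect them; `visibility_masks` is a hypothetical helper, not part of Plotly.

```python
def visibility_masks(n_toggled, n_always_on):
    """Return one mask per button: mask i shows toggled trace i plus every always-on trace."""
    return [
        [i == j for j in range(n_toggled)] + [True] * n_always_on
        for i in range(n_toggled)
    ]

columns = ['New_cases', 'New_deaths', 'Cumulative_cases', 'Cumulative_deaths']
masks = visibility_masks(len(columns), 6)

# Print the button label next to the mask it would apply.
for label, mask in zip(columns, masks):
    print(label.replace('_', ' '), mask)
```

The order of the entries has to match the order in which the traces were added to the figure, which is why the time-series traces come first in each mask.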

In [None]:
def createDropdown(availableCountries, columns, xFit1, yFit1, xFit2, yFit2, rollingAve = 7, x = 0.0, y = 1.1):
    ##################################
    # create the dropdown menu.
    ##################################
    
    # Note that here in 'args' I need to provide values for all the traces.

    dropdown = []
    for c in availableCountries:
        if (c in dfJoinedCOVID['COUNTRY'].tolist()):
            dropdown.append(dict(
                args = [{'x': 
                             [dfC.loc[dfC['Country'] == c]['Date_reported'].values]*len(columns) +  # time plot
                             [
                                 dfJoinedCOVID['TOTAL_VACCINATIONS_PER100'].values, # full scatter plot on the left,
                                 xFit1, # fit line 1
                                 dfJoinedCOVID.loc[dfJoinedCOVID['COUNTRY'] == c]['TOTAL_VACCINATIONS_PER100'].values, # red circle in left scatter plot
                                 dfJoinedCOVID['START_DATE'].values, # full scatter plot on the right
                                 xFit2, # fit line 2
                                 dfJoinedCOVID.loc[dfJoinedCOVID['COUNTRY'] == c]['START_DATE'].values # red circle on right scatter plot
                             ],
                         'y': 
                             [dfC.loc[dfC['Country'] == c][col].rolling(rollingAve).mean().values for col in columns] + 
                             [
                                 dfJoinedCOVID['2020'].values, 
                                 yFit1,
                                 dfJoinedCOVID.loc[dfJoinedCOVID['COUNTRY'] == c]['2020'].values,
                                 dfJoinedCOVID['2020'].values,
                                 yFit2,
                                 dfJoinedCOVID.loc[dfJoinedCOVID['COUNTRY'] == c]['2020'].values
                             ],
                        'text': 
                            [''] * len(columns) + # the time-series traces have no hover text
                            [
                                getVaccineText(), '', getVaccineText(c),
                                getVaccineText(), '', getVaccineText(c)
                            ],
                        'marker.size': 
                            [''] * len(columns) + # the time-series traces have no markers
                            [
                                getVaccineMarkersize1(), '', getVaccineMarkersize1(c),
                                getVaccineMarkersize2(), '', getVaccineMarkersize2(c)
                            ]
                   
                }],
                label = c,
                method = 'update'
            ))

    updatemenu = dict(
        buttons = dropdown,
        direction = 'down',
        showactive = True,
        x = x,
        xanchor = 'left',
        y = y,
        yanchor = 'top'
    )
        

    return updatemenu
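
The structure of `args` in `createDropdown` is easier to see on a small example. Each key (`'x'`, `'y'`, and so on) maps to a list with one element per trace, and each element is the full data array for that trace. The sketch below builds a dropdown for two made-up countries with two time-series traces each; the `data` dictionary and the `dropdown_entry` helper are hypothetical, and nothing here needs Plotly until the dict is attached to a figure.

```python
# Toy data: two time-series columns per country (all values are made up).
data = {
    'A': {'date': [1, 2, 3], 'cases': [10, 20, 30], 'deaths': [1, 2, 3]},
    'B': {'date': [1, 2, 3], 'cases': [5, 15, 25], 'deaths': [0, 1, 1]},
}
columns = ['cases', 'deaths']

def dropdown_entry(country):
    """Build one dropdown entry that swaps x and y for every trace at once."""
    rows = data[country]
    return dict(
        label = country,
        method = 'update',
        # One list element per trace, in the order the traces were added.
        args = [{
            'x': [rows['date']] * len(columns),
            'y': [rows[col] for col in columns],
        }],
    )

updatemenu = dict(
    buttons = [dropdown_entry(c) for c in data],
    direction = 'down',
    showactive = True,
)
```

Passing this dict to `fig.update_layout(updatemenus = [updatemenu])` would attach it to a figure whose traces were added in the same order as `columns`.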

In [None]:
# redefine these here for completeness
country = 'United States of America'
rollingAve = 7
columns = ['New_cases', 'New_deaths', 'Cumulative_cases', 'Cumulative_deaths']
availableCountries = dfC['Country'].unique().tolist()
availableCountries.insert(0, availableCountries.pop(availableCountries.index('United States of America'))) 


# Use the functions to create the figure.
fig, xFit1, yFit1, xFit2, yFit2 = generateFigure(country, rollingAve)
fig.update_layout(
    title_y = 0.97,
    margin = dict(t = 140),
    updatemenus = [createButtons(columns, 0.0, 1.05), 
                   createDropdown(availableCountries, columns, xFit1, yFit1, xFit2, yFit2, rollingAve, 0.0, 1.1)]
)
fig.show()

In [None]:
# You can save the plotly figure as an html file to use on your website.
fig.write_html('plotly_graph.html')