# Jupyter Widgets

### A Walkthrough of the Most Useful Widgets for Data Visualization

The cell below imports the modules used in this notebook. Pandas loads the dataset and handles basic data manipulation, NumPy supports the calculations, Matplotlib and Seaborn draw the visualizations, and IPython displays the widgets alongside them.

In [None]:
# importing other modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display

#### Importing ipywidgets

Below is the import to use ipywidgets in the notebook:

In [None]:
# loading in jupyter widgets
import ipywidgets as widgets

#### The Dataset

The dataset used throughout this notebook is the auto-mpg dataset from Kaggle:

https://www.kaggle.com/datasets/uciml/autompg-dataset

In [None]:
# load in and clean the data
auto = pd.read_csv("auto-mpg.csv")
auto.replace("?", np.nan, inplace = True)
auto.dropna(inplace = True)
auto.head()

## Jupyter Widgets

Jupyter Widgets are interactive controls that can be added to a visualization to filter or adjust the plot. There are a few main types of widgets: checkboxes, buttons, and sliders. Each widget has its own use, and it is up to the designer to decide which best fits the task at hand. Each type of widget has its own parameters for customization. The full list of a widget's parameters is available through its .keys attribute:

In [None]:
# listing the available parameters with the .keys attribute (example used on the Checkbox widget)
print(widgets.Checkbox().keys)

None of these parameters is required, but they are useful for tailoring a widget to the task.

Some of the more useful and common keys are:

    continuous_update: determines whether the value (and so the plot) updates while the user is still dragging the slider
    description:      a string acting as a label, displayed next to the widget
    disabled:         a boolean value that determines whether or not the user can interact with the widget
    indent:           a boolean value that determines whether the widget is indented or not
    layout:           used to set width and height values to the widget
    max:              the maximum value available in the widget
    min:              the minimum value available in the widget
    options:          a list of options a user can choose from
    orientation:      determines how the widget is displayed, usually 'horizontal' or 'vertical'
    readout:          determines whether or not the current value is displayed next to widget
    readout_format:   the format in which the current value is displayed next to widget
    step:             the increment in which the widget moves from one value to the next
    style:            used to customize color or font attributes of widget
    tooltip:          a text 'tip' that appears when hovering over the widget to provide guidance
    value:            the current value of the widget; the value set at creation is what the user sees before interacting
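
One way to check which of these keys a given widget actually supports is to read them off the instance. The sketch below uses a hypothetical helper, `trait_summary`, which is not part of ipywidgets. It relies only on `getattr`, so it works on any object, and it skips keys the object does not have. A stand-in object is used here so the sketch runs outside a notebook.

```python
from types import SimpleNamespace

def trait_summary(widget, names):
    """Return {name: current value} for each name the widget actually has."""
    return {name: getattr(widget, name) for name in names if hasattr(widget, name)}

# stand-in object so the sketch runs without a notebook front end;
# in the notebook, pass a real widget such as widgets.IntSlider() instead
fake_slider = SimpleNamespace(min=0, max=20, step=1, description="Slider:")

# "options" is skipped because the stand-in has no such attribute
summary = trait_summary(fake_slider, ["min", "max", "step", "options"])
print(summary)
```

In the notebook, a call such as `trait_summary(widgets.Checkbox(), ["description", "indent", "value"])` would show the current settings of a Checkbox in the same way.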


#### The Checkbox Widget

The Checkbox widget is a boolean widget, meaning it is either on or off, true or false, checked or unchecked. It creates a literal checkbox that can be checked and unchecked by the user.  

Here is a simple Checkbox implementation:

In [None]:
# simple Checkbox widget implementation
widgets.Checkbox(
    description = "Checkbox",    # description is set to "Checkbox" because that is the label I want
    indent = False,              # indent is set to False because I do not want the checkbox indented
    value = False                # value is set to False so it begins as an unchecked box
)

A checkbox can be very useful for applying a simple filter to data visualizations. Some examples include:
- Applying an average line on a visual with bar charts
- Displaying a fitted regression line to a scatter plot
- Adding a filter-by method to the visualization

Below is an example of the third option, adding a filter-by method. The visualization below plots the Miles Per Gallon variable of cars over the years. With the box unchecked, the graph shows every car, regardless of the number of cylinders. When the box is checked, the audience is given a breakdown of the number of cylinders by color.

Pro Tip: In this visualization, the palette is set to "colorblind". Many color maps contain colors that are hard to tell apart for people with a color vision deficiency (CVD), which can lead to misinterpreted visualizations. A CVD-friendly palette makes the color differences clear to those readers and lowers the chance of misinterpretation for everyone (Nuñez et al. 2017).

In [None]:
# example of Checkbox used with data
checkbox = widgets.Checkbox(description = "Click to Color by Cylinders!", indent = True, value = False)

def update(use_color):                           # function to update the input for the visual based on the checkbox value                   
    if use_color:                                # when the checkbox is clicked, this block will run
        hue = "cylinders"
        title = "MPG vs Model Year by Cylinder"
        legend = "Number of Cylinders"
    else:                                        # when the checkbox is not checked, this block will run
        hue = None
        title = "MPG vs Model Year"

    plt.figure(figsize = (8,5))
    sns.scatterplot(data = auto, x = "model year", y = "mpg", hue = hue, palette = "colorblind")
    plt.title(title)
    plt.xlabel("Model Year")
    plt.ylabel("MPG")
    if use_color:
        plt.legend(title = legend)
    plt.show()

# reacts to user interaction and updates the visualization based on the checkbox value
interactive_output = widgets.interactive_output(update, {"use_color": checkbox})

# displays the checkbox and the updated visualization
display(checkbox, interactive_output)
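
The branching inside `update` can also be pulled out into a small pure function, which keeps the plotting code free of `if` statements and guarantees that every setting is defined in both states. This is only a sketch: `plot_settings` and its dictionary keys are hypothetical names, not part of ipywidgets or Seaborn.

```python
def plot_settings(use_color):
    """Map the checkbox state to the arguments the scatter plot needs."""
    if use_color:
        return {
            "hue": "cylinders",
            "title": "MPG vs Model Year by Cylinder",
            "legend_title": "Number of Cylinders",
        }
    # every key is present in the unchecked state too, so nothing is left undefined
    return {"hue": None, "title": "MPG vs Model Year", "legend_title": None}

checked = plot_settings(True)
unchecked = plot_settings(False)
print(checked["title"])
print(unchecked["title"])
```

Inside `update`, the returned values could then be passed straight through, for example `hue = settings["hue"]` in the `sns.scatterplot` call, with the legend drawn only when `settings["legend_title"]` is not `None`.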

#### The Dropdown Widget

The Dropdown widget is a selection widget, which means the value is one of the options available. The user can click on the main bar, the options will appear, and one can be chosen. 

Here is a simple implementation of the Dropdown widget:

In [None]:
# simple Dropdown widget implementation
widgets.Dropdown(
    options = ["All","Group A","Group B","Group C"],   # options is the list of available items to be chosen from
    value = "All",                                     # value is set to "All" because I want it to start off on that
    description = "Group:",                            # description is set to "Group" because that is what is being chosen
)

The dropdown widget is a great interactive tool to use when filtering on a visualization. It can be used for:
- Filtering by a specific category from a qualitative variable
- Choosing a sorting method on a set of data to display
- Determining a type of visual to present

Below is an example of the first option, filtering by a specific category. This visualization also plots the MPG of cars over the years. It uses the dropdown widget to filter by the origin variable. The widget allows for the user to dive into each origin on its own with the click of their mouse. 

Pro Tip: The initial plot is set to "all origins" and is not filtered. The filtering should be up to the user's choice and all of the data should be displayed to start. This is so that the user is not misled in the initial plot. This minimizes the chance of the audience misunderstanding the point of the graph, even if the author does not have bad intentions (Tufte, Chapter 2-3).

In [None]:
# example of Dropdown used with data
dropdown = widgets.Dropdown(options = ["All", "1", "2", "3"], value = "All", description = "Car Origin:", disabled = False)

def update(choice):                            # function to update the input for the visual based on the chosen value
    if choice == "1":                          # if "1" is chosen, this block of code will run
        data = auto[auto["origin"] == 1]
        title = "MPG over time (origin 1)"
        color = "red"
    elif choice == "2":                        # if "2" is chosen, this block of code will run
        data = auto[auto["origin"] == 2]
        title = "MPG over time (origin 2)"
        color = "orange"
    elif choice == "3":                        # if "3" is chosen, this block of code will run
        data = auto[auto["origin"] == 3]
        title = "MPG over time (origin 3)"
        color = "pink"
    else:                                      # if "All" is chosen, this block of code will run
        data = auto
        title = "MPG over time (all origins)"
        color = "green"

    plt.figure(figsize = (8,5))
    sns.regplot(data = data, x = "model year", y = "mpg", color = color, scatter = False)
    plt.scatter(data["model year"], data["mpg"], color = color, s = 15)
    plt.title(title)
    plt.xlabel("Model Year")
    plt.ylabel("MPG")

# reacts to user interaction and updates the visualization based on the chosen value
interactive_output = widgets.interactive_output(update, {"choice": dropdown})

# displays the dropdown and the updated visualization
display(dropdown, interactive_output)
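
The `if`/`elif` chain above repeats the same three assignments for every option. As a sketch of an alternative, the options can live in one lookup table, so adding an option means adding one entry. The names `ORIGIN_STYLES` and `resolve_choice` are hypothetical and introduced only for this illustration.

```python
# one entry per dropdown option: (origin code, or None for no filter; plot color)
ORIGIN_STYLES = {
    "All": (None, "green"),
    "1": (1, "red"),
    "2": (2, "orange"),
    "3": (3, "pink"),
}

def resolve_choice(choice):
    """Return (origin_code, title, color) for a dropdown value."""
    # unknown values fall back to the unfiltered view
    origin, color = ORIGIN_STYLES.get(choice, ORIGIN_STYLES["All"])
    label = "all origins" if origin is None else f"origin {origin}"
    return origin, f"MPG over time ({label})", color

origin, title, color = resolve_choice("2")
print(origin, title, color)
```

With this in place, the dropdown could be built with `options = list(ORIGIN_STYLES)`, and `update` would filter with `auto if origin is None else auto[auto["origin"] == origin]`.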

#### The RadioButtons Widget

The RadioButtons widget is another selection widget that allows the user to choose from the available options. Instead of a drop down bar, the user is presented with buttons for each option from which they can click to choose. 

The RadioButtons simple implementation:

In [None]:
# simple RadioButtons implementation
widgets.RadioButtons(
    options = ["All", "Group A", "Group B", "Group C"],   # options is the list of available items to be chosen from
    value = "All",                                        # value is set to "All" because I want it to start off on that
    description = "Groups:"                               # description is set to "Group" because that is what is being chosen
)

Similar to the Dropdown widget, the RadioButtons widget is great for adding an interactive filtering method to a visualization. It can be used for:
- Filtering by a specific category from a qualitative variable
- Choosing a sorting method on a set of data to display
- Determining a type of visual to present

Below is an example of the first option, filtering by a specific category. This visualization is the same as the Dropdown visual, only this time using the RadioButtons widget to filter by the origin variable. This widget also allows the user to look into the origin of their choice.

Pro Tip: When visualizing change over time in time-series data, pay attention to the trend and the seasonality. The trend is the overall direction of a variable: is it rising, falling, or flat? The seasonality is whether changes recur consistently during a specific time period (Cairo, Chapter 8).

In [None]:
# example of radio button with data
button = widgets.RadioButtons(options = ["All", "1", "2", "3"], value = "All", description = "Origin:", disabled = False)

def update(choice):                                 # function to update the input for the visual based on the chosen value
    if choice == "1":                               # if "1" is chosen, this block of code will run
        data = auto[auto["origin"] == 1]
        title = "MPG over time (origin 1)"
        color = "red"
    elif choice == "2":                             # if "2" is chosen, this block of code will run
        data = auto[auto["origin"] == 2]
        title = "MPG over time (origin 2)"
        color = "orange"
    elif choice == "3":                             # if "3" is chosen, this block of code will run
        data = auto[auto["origin"] == 3]
        title = "MPG over time (origin 3)"
        color = "pink"
    else:                                           # if "All" is chosen, this block of code will run
        data = auto
        title = "MPG over time (all origins)"
        color = "green"

    plt.figure(figsize = (8,5))
    sns.regplot(data = data, x = "model year", y = "mpg", color = color, scatter = False)
    plt.scatter(data["model year"], data["mpg"], color = color, s = 15)
    plt.title(title)
    plt.xlabel("Model Year")
    plt.ylabel("MPG")

# reacts to user interaction and updates the visualization based on the chosen value
interactive_output = widgets.interactive_output(update, {"choice": button})

# displays the RadioButtons and the updated visualization
display(button, interactive_output)
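
Selection widgets also accept `options` as a list of `(label, value)` pairs, so the user can see a readable label while the callback receives the value used in the data. The sketch below assumes the origin codes stand for USA, Europe, and Japan; that mapping is my assumption about the dataset and should be checked against its documentation. The names `ORIGIN_OPTIONS`, `ALL_ORIGINS`, `keep_row`, and `make_origin_buttons` are hypothetical.

```python
ALL_ORIGINS = 0   # sentinel value meaning "do not filter" (not a real origin code)

# (label shown to the user, value passed to the callback)
# the region names are an assumption about how the dataset codes origin
ORIGIN_OPTIONS = [("All", ALL_ORIGINS), ("USA", 1), ("Europe", 2), ("Japan", 3)]

def keep_row(row_origin, selected):
    """True when a row should stay in the plot for the selected origin."""
    return selected == ALL_ORIGINS or row_origin == selected

def make_origin_buttons():
    """Build the widget; imported lazily so the helpers above run anywhere."""
    import ipywidgets as widgets
    # with no value given, the first option ("All") is selected
    return widgets.RadioButtons(options=ORIGIN_OPTIONS, description="Origin:")

origins = [1, 1, 2, 3, 3, 3]
kept = [o for o in origins if keep_row(o, 3)]
print(kept)
```

Because the callback now receives an integer, the filter inside `update` could compare it to the origin column directly instead of matching on strings.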

#### The ToggleButtons Widget

The ToggleButtons widget is a selection widget. The user can "toggle" from one choice to the next, similar to the RadioButtons. The choices are presented in rectangular boxes, which can be styled to the creator's preference using the "button_style" parameter. 
button_style accepts "primary" (blue), "success" (green), "info" (teal), "warning" (orange), or "danger" (red). It can also be left as an empty string for the default style.

Here is a simple implementation of ToggleButtons:

In [None]:
# simple ToggleButtons implementation
widgets.ToggleButtons(
    options = ["All", "Group A", "Group B", "Group C"],   # options is the list of available items to be chosen from
    value = "All",                                        # value is set to "All" because I want it to start off on that
    description = "Group:",                               # description is set to "Group" because that is what is being chosen
    button_style = ""                                     # button_style is the default style in this implementation
)

As a selection widget, ToggleButtons is great for choosing a specific group to filter by when visualizing data. Similar to Dropdown and RadioButtons, ToggleButtons can be used for:
- Filtering by a specific category from a qualitative variable
- Choosing a sorting method on a set of data to display
- Determining a type of visual to present

Below is an example of the first option, filtering by a specific category. This visualization is the same as the one used for the other two selection widgets, but uses the ToggleButtons widget to filter by the origin variable. This widget gives the user an option to filter by the origin of their choice.

Pro Tip: Notice that the title of the visualization changes for each button that is chosen. This is so that the audience knows exactly which filter is being applied, limiting the chance of misinterpretation. An author may not realize that a visualization is misleading, so we want to make sure to communicate the filter to the reader (Tufte, Chapter 2-3).

In [None]:
# example of ToggleButtons with data
button = widgets.ToggleButtons(options = ["All", "1", "2", "3"], value = "All", description = "Origin:",
                              disabled = False, button_style = "info")

def update(choice):                                 # function to update the input for the visual based on the chosen value
    if choice == "1":                               # if "1" is chosen, this block of code will run
        data = auto[auto["origin"] == 1]
        title = "MPG over time (origin 1)"
        color = "red"
    elif choice == "2":                             # if "2" is chosen, this block of code will run
        data = auto[auto["origin"] == 2]
        title = "MPG over time (origin 2)"
        color = "orange"
    elif choice == "3":                             # if "3" is chosen, this block of code will run
        data = auto[auto["origin"] == 3]
        title = "MPG over time (origin 3)"
        color = "pink"
    else:                                           # if "All" is chosen, this block of code will run
        data = auto
        title = "MPG over time (all origins)"
        color = "green"

    plt.figure(figsize = (8,5))
    sns.regplot(data = data, x = "model year", y = "mpg", color = color, scatter = False)
    plt.scatter(data["model year"], data["mpg"], color = color, s = 15)
    plt.title(title)
    plt.xlabel("Model Year")
    plt.ylabel("MPG")

# reacts to user interaction and updates the visualization based on the chosen value
interactive_output = widgets.interactive_output(update, {"choice": button})

# displays the ToggleButtons and the updated visualization
display(button, interactive_output)

#### The IntSlider Widget

The IntSlider widget is a numeric widget that allows the user to choose from a range of values on a slider. The user is presented with a slider that has a custom minimum and maximum value, along with a custom step value. 

Here is a simple IntSlider implementation:

In [None]:
# simple IntSlider implementation 
widgets.IntSlider(
    value = 10,                                  # default value of slider is set to 10
    min = 0,                                     # minimum value of slider is 0
    max = 20,                                    # maximum value of slider is 20
    step = 1,                                    # each step in the slider is one 
    description = "Slider:",                     # description is set to "Slider"
    continuous_update = False,                   # continuous_update is set to false, so that the return value 
                                                     # only updates when the user lets go of the slider
    orientation = "horizontal",                  # the slider orientation is set to "horizontal"
    readout = True,                              # the current value of the slider is shown next to it
    readout_format = "d"                         # readout_format is set to "d" to display as integer
)

As a numeric widget, IntSlider can be very useful when deciding how many of something to display on a visualization. Specifically, the IntSlider can be used to:
- Choose a number of bins in a histogram
- Determine the top x amount of some variable to plot
- Apply a coefficient or lambda on a regression

Here is an example of the first option, choosing a number of bins for a histogram. This visualization displays the distribution of Miles Per Gallon from the vehicles in the auto dataset. The widget allows the user to choose how many bins they want to utilize in the histogram.

Pro Tip: When working with a histogram (or any type of distribution plot), set a minimum number of bins so that the spread of the data remains visible. Too few bins can mislead because they hide the true shape of the distribution (Cairo, Chapter 11).

In [None]:
# example of IntSlider with data
slider = widgets.IntSlider(value = 25, min = 10, max = 40, step = 3, description = "Bins:", 
                           disabled = False, continuous_update = True, orientation = "horizontal",
                           readout = True, readout_format = "d")

def update(number):                                 # function to update the number of bins based on the chosen value
    num_bins = number                               # sets num_bins equal to the chosen value on slider

    plt.figure(figsize = (10,6))
    plt.hist(auto["mpg"], bins = num_bins, color = "firebrick")
    plt.title("Distribution of MPG")
    plt.xlabel("MPG")
    plt.ylabel("Number of Vehicles")

# reacts to user interaction and updates the visualization based on the slider value
interactive_output = widgets.interactive_output(update, {"number": slider})

# displays the slider and the updated visualization
display(slider, interactive_output)
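
To make the slider settings concrete, the sketch below lists the values a slider with `min = 10`, `max = 40`, and `step = 3` can land on, and shows what an integer number of bins means for a histogram: that many equal-width bins, and so one more edge than bins. Both helpers, `slider_stops` and `bin_edges`, are hypothetical names written for this illustration, and the range 0 to 50 is an arbitrary example rather than the actual range of the MPG column.

```python
def slider_stops(minimum, maximum, step):
    """Values an integer slider can land on, from minimum up to maximum."""
    return list(range(minimum, maximum + 1, step))

def bin_edges(low, high, num_bins):
    """Edges of num_bins equal-width bins spanning [low, high]."""
    width = (high - low) / num_bins
    # the last edge is set to high directly to avoid floating point drift
    return [low + i * width for i in range(num_bins)] + [high]

stops = slider_stops(10, 40, 3)
edges = bin_edges(0.0, 50.0, 10)
print(stops)
print(len(edges), edges[:3])
```

The starting value of 25 is one of the stops, which keeps the slider's initial position consistent with its step size.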

#### The IntRangeSlider Widget

The IntRangeSlider widget is a numeric widget and nearly identical to the IntSlider. The user is presented with a slider with a minimum and maximum value as well as a step value. The difference is that the user can select a range of values instead of one value.

Here is a simple implementation of the IntRangeSlider:

In [None]:
# simple IntRangeSlider implementation
widgets.IntRangeSlider(
    value = [40, 60],                                # default range for the slider is set to 40-60
    min = 0,                                         # minimum value of slider is set to 0
    max = 100,                                       # maximum value of slider is set to 100
    step = 5,                                        # step value for the range is set to 5
    description = "Range:",                          # description is set to "Range:"
    continuous_update = False,                       # continuous_update is set to false, so that the return value 
                                                         # only updates when the user lets go of the slider
    orientation = "horizontal",                      # the slider orientation is set to "horizontal"
    readout = True,                                  # the current range of the slider is shown next to it
    readout_format = "d",                            # readout_format is set to "d" to display as integers
)

The IntRangeSlider provides very useful data visualization applications as ranges are often used as filters. Some strong uses for the IntRangeSlider include:
- Filtering the visual by an age range
- Applying a range of months or years to the graph
- Selecting ranges of a variable for further analysis

Here is an example of applying a range of years to a visualization. This scatterplot shows the MPG vs Acceleration for the auto dataset. The IntRangeSlider allows the user to select a range of year values to filter by. 

Pro Tip: Cleveland and McGill's experiments found that readers judge "position along a common scale" more accurately than other visual encodings. Keeping the axis limits fixed regardless of the selected year range gives every filtered view the same scale, so the views can be compared directly (Cleveland and McGill).

In [None]:
# example of IntRangeSlider with data
slider = widgets.IntRangeSlider(value = [min(auto["model year"]),max(auto["model year"])], min = min(auto["model year"]), 
                                max = max(auto["model year"]), step = 1, description = "Model Year:", disabled = False,
                               continuous_update = False, orientation = "horizontal", readout = True, readout_format = "d") 

def update(year_range):                                 # function to update the year range based on the chosen slider range
    min_year = year_range[0]
    max_year = year_range[1]
    title = "MPG vs Acceleration (19" + str(min_year) + "-19" + str(max_year) + ")"

    data = auto[(auto["model year"] >= min_year) & (auto["model year"] <= max_year)]

    plt.figure(figsize = (8,6))
    plt.scatter(data["mpg"], data["acceleration"], color = "gold")
    plt.xlabel("MPG")
    plt.ylabel("Acceleration")
    plt.xlim(5,50)
    plt.ylim(7.45,25.25)
    plt.title(title)

# reacts to user interaction and updates the visualization based on the slider value
interactive_output = widgets.interactive_output(update, {"year_range": slider})

# displays the slider and the updated visualization
display(slider, interactive_output)
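
The range slider hands `update` a pair of values, which can be unpacked in one step. The sketch below shows that unpacking together with the inclusive filter, using plain Python lists so it runs on its own. It assumes, as the title code above does, that model years are stored as two-digit numbers. The names `year_range_title` and `in_range` are hypothetical.

```python
def year_range_title(year_range, century=1900):
    """Build the plot title from a (low, high) pair of two-digit model years."""
    low, high = year_range
    return f"MPG vs Acceleration ({century + low}-{century + high})"

def in_range(value, value_range):
    """Inclusive range check, matching the >= and <= filter used in update."""
    low, high = value_range
    return low <= value <= high

title = year_range_title((70, 82))
years = [70, 73, 76, 79, 82]
selected = [y for y in years if in_range(y, (73, 79))]
print(title)
print(selected)
```

Both endpoints are included, so dragging the two handles onto the same year selects that single year.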

#### The FloatRangeSlider Widget

The FloatRangeSlider widget is a numeric widget and the floating-point version of the IntRangeSlider. The user is presented with a slider with a minimum and maximum value as well as a step value. The difference is that the user can select a range of floating-point values instead of only integer values.

Here is a simple implementation of the FloatRangeSlider:

In [None]:
# simple FloatRangeSlider implementation
widgets.FloatRangeSlider(
    value = [2.0,3.0],                               # default range for the slider is set to 2.0-3.0
    min = 0,                                         # minimum value of slider is set to 0
    max = 5,                                         # maximum value of slider is set to 5
    step = 0.1,                                      # step value for the range is set to 0.1
    description = "Range:",                          # description is set to "Range:"
    continuous_update = False,                       # continuous_update is set to false, so that the return value 
                                                     # only updates when the user lets go of the slider
    orientation = "horizontal",                      # the slider orientation is set to "horizontal"
    readout = True,                                  # the current range of the slider is shown next to it
    readout_format = ".1f",                          # readout_format is set to ".1f" to display as a float to the first decimal
)

The FloatRangeSlider provides very useful data visualization applications as ranges are often used as filters. Some good uses for the FloatRangeSlider include:
- Filtering the visual by a percentage range
- Applying a range of temperature values
- Selecting ranges of a decimal variable for further analysis

Here is an example of applying a range of a decimal variable to a visualization. This scatterplot shows the weight of cars over time. The FloatRangeSlider allows the user to select a range of acceleration values with decimal precision.

Pro Tip: The initial plot is set to all data and is not filtered by the slider range. The filtering should be up to the user's choice and all of the data should be displayed to start. This is so that the user is not misled in the initial plot. This minimizes the chance of the audience misunderstanding the point of the graph, even if the author does not have bad intentions (Tufte, Chapter 2-3).

In [None]:
# example of FloatRangeSlider with data
slider = widgets.FloatRangeSlider(value = [min(auto["acceleration"]),max(auto["acceleration"])], 
                                  min = min(auto["acceleration"]), max = max(auto["acceleration"]), step = 0.1, 
                                  description = "Acceleration:", disabled = False, continuous_update = False, 
                                  orientation = "vertical", readout = True, readout_format = ".1f")

def update(acc_range):                                 # function to update the acceleration range based on the chosen slider range
    min_acc = acc_range[0]                             # sets min_acc to the chosen minimum range value
    max_acc = acc_range[1]                             # sets max_acc to the chosen maximum range value
    title = "Weights of Cars over time (Between " + str(min_acc) + "-" + str(max_acc) + " Acceleration)"

    data = auto[(auto["acceleration"] >= min_acc) & (auto["acceleration"] <= max_acc)]

    plt.figure(figsize = (8,6))
    sns.regplot(data = data, x = "model year", y = "weight", color = "limegreen", scatter =  False)
    plt.scatter(data["model year"], data["weight"], color = "green")
    plt.xlabel("Model Year")
    plt.ylabel("Weight")
    plt.title(title)
    
# reacts to user interaction and updates the visualization based on the slider value
interactive_output = widgets.interactive_output(update, {"acc_range": slider})

# displays the slider and the updated visualization
display(slider, interactive_output)
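
Floating-point values do not always print the way they look on the slider: a value built from steps of 0.1 can carry a long tail of digits, and `str()` would put that tail into the title. A minimal sketch of one way around this is to format the endpoints with the same `.1f` precision as the slider readout. The helper name `acceleration_title` is hypothetical.

```python
def acceleration_title(acc_range):
    """Build the plot title with both endpoints rounded to one decimal place."""
    low, high = acc_range
    return f"Weights of Cars over Time (Acceleration {low:.1f}-{high:.1f})"

raw = 0.1 + 0.2        # not exactly 0.3 in binary floating point
print(str(raw))        # the long form that str() would place in a title
title = acceleration_title((raw, 24.8))
print(title)
```

Only the displayed text is rounded here; the filter inside `update` would still compare against the unrounded slider values.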

#### The ColorPicker Widget

The ColorPicker widget allows the user to choose a color from a wide range. The user can choose the color by clicking on the colored box, or they can type it into the bar. 

Here is a simple ColorPicker widget:

In [None]:
# simple ColorPicker implementation
widgets.ColorPicker(
    concise = False,                                    # concise is set to false to display the color name
    description = "Color:",                             # description is set to "Color:"
    value = "green"                                     # default color is set to "green"
)

The ColorPicker can be a fun addition to a data visualization by allowing the user to choose the color that is displayed. Some good ways to put this widget to use include:
- Picking an overall color for a plot
- Choosing the color of a specific feature of the visual
- Selecting a background color

Implemented below is an example of picking an overall color for a plot. The bar chart shows the 10 cars with the highest MPG. The user is given a ColorPicker to determine the color of the bars.

Pro Tip: Use a single color throughout a visualization unless differences in color carry meaning. Here, color does not encode any variable, so the same color is applied to every bar (Few, 2008).

In [None]:
# example of ColorPicker with data
color = widgets.ColorPicker(concise = False, description = "Color:", value = "green", disabled = False)

def update(col):                                                          # function to update color of bars based on chosen color
    mpg_top10 = auto.sort_values(by="mpg", ascending=False).head(10)
    
    plt.figure(figsize=(12,6))
    plt.bar(mpg_top10["car name"], mpg_top10["mpg"], color = col)
    plt.title("Top Cars with Highest MPG")
    plt.xlabel("Car Name")
    plt.xticks(rotation = 90)
    plt.ylabel("MPG")

# reacts to user interaction and updates the visualization based on the chosen color
interactive_output = widgets.interactive_output(update, {"col": color})

# displays the ColorPicker and the updated visualization
display(color, interactive_output)

#### References
- Cairo, Alberto. "The Truthful Art: Data, Charts, and Maps for Communication". 2016.
- Cleveland, William S., and Robert McGill. "Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods". Journal of the American Statistical Association, 79(387), 531-554. 1984.
- Few, Stephen. "Practical Rules for Using Color in Charts". 2008.
- Nuñez, Jamie R., Christopher R. Anderton, and Ryan S. Renslow. "An optimized colormap for the scientific community". 2017.
- Tufte, Edward R. "The Visual Display of Quantitative Information". 2001.