# Plotly
Overview and introduction

Note: The `.ipynb` used to create these slides can be downloaded here for 

## Recall: Client vs server

![](img/2023-03-12-13-36-12.png)

## Recall: General idea behind interactive graphics
![](img/2023-03-10-11-41-04.png)

## What is Plotly?

* Plotly is a technical computing company headquartered in Montreal, Canada   
    ![](img/2023-03-12-03-10-44.png)
* The company develops tools for data visualization, analytics, and statistics, as well as graphing libraries for Python, R, MATLAB, Perl, Julia, Arduino, and REST.

## The Plotly company's tool suite

- **Dash** is an open-source framework for building analytic applications in Python, R, and Julia.
- **Dash Enterprise** is Plotly’s paid product for building, testing, deploying, managing, and scaling Dash applications.
- **Chart Studio Cloud** is a free, online tool for creating interactive graphs. It has a point-and-click user interface for importing and analyzing data.
- **Chart Studio Enterprise** is a paid product that allows teams to create, style, and share interactive graphs on a single platform. It offers expanded authentication and file export options, and does not limit sharing and viewing.
- **Data visualization libraries** `Plotly.js` is an open-source JavaScript library for creating graphs. There are wrappers in Python, R, MATLAB, Node.js, Julia, Arduino, etc.
- **Figure Converters** convert matplotlib, ggplot2, and IGOR Pro graphs into interactive, online graphs.

## Where do we stand? 
* Today we focus on the client-side libraries, which wrap `plotly.js` for R and Python.   
    <center>![](img/2023-03-09-15-38-31.png){width=775}</center>

## Plotly.js: Interactive controls 

Plotly plots have interactive controls, shown in the top-right corner, that let you do the following:

![](img/2023-03-10-13-08-41.png)

- **Pan:** Move around in the plot.
- **Box Select:** Select a rectangular region of the plot to be highlighted.
- **Lasso Select:** Draw a region of the plot to be highlighted.
- **Autoscale:** Zoom to a "best" scale.
- **Reset axes:** Return the plot to its original state.
- **Toggle Spike Lines:** Show or hide lines to the axes whenever you hover over data.
- **Show closest data on hover:** Show details for the nearest data point to the cursor.
- **Compare data on hover:** Show the nearest data point to the x-coordinate of the cursor.

<sup>Source: [click here](https://www.datacamp.com/cheat-sheet/plotly-express-cheat-sheet) </sup>

## Example-1: Plotly.js 

* We already interacted a bit with `Plotly.js` in pure `JavaScript` form in Lab-3.1.
* Many of the elements in the pure `JavaScript` version of Plotly, such as traces, layout, and data, are mirrored in the Python and R wrappers that we will see later.
    ![](img/2023-03-11-23-04-20.png){width=650}

## Plotly.js: Fundamental components

:::: {.columns}
::: {.column width="40%"}
* **Trace**: Describes a collection of data and how you want it displayed on the plotting surface. The display is determined by the `trace type` (scatter, box, etc.). 
* **Data**: Collection (list) of traces 
* **Layout**: Controls various structural and stylistic components of the figure (e.g. title, font, size, etc)  
:::
::: {.column width="60%"}
![](img/2023-03-11-23-11-27.png)
:::
::::
 
The Python (or R) wrappers translate Python (or R) commands into the JSON format that `plotly.js` expects, built from these fundamental components.
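
As a rough sketch of what such a JSON description can look like, the snippet below builds one from plain Python dictionaries and serializes it with the standard `json` module. The variable names and values are illustrative assumptions; only the `data`/`layout` keys and the trace `type` come from the components described above.

```python
import json

# One trace: the data plus how to draw it (the trace "type")
trace = {"type": "scatter", "mode": "lines+markers", "x": [1, 2, 3], "y": [2, 1, 3]}

# A figure description: "data" is a list of traces, "layout" styles the figure
figure_spec = {
    "data": [trace],
    "layout": {"title": {"text": "Hello plotly.js"}, "width": 500, "height": 400},
}

# Serialize to JSON text, the kind of payload a plotly.js front end consumes
figure_json = json.dumps(figure_spec, indent=2)
print(figure_json)
```

The wrappers generate this kind of structure for you, so in practice you rarely write it by hand.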


# Plotly in Python
Introduction to `plotly.py`


## How does Plotly.py work?

:::: {.columns}
::: {.column width="50%"}
* Let's start on the Python side for the sake of explanation
* Similar to MPL, the fundamental Plotly data structure is the `Plotly figure` (i.e. charts, plots, maps, and diagrams) 
  * Figures are represented as dictionaries or as instances of the `plotly.graph_objects.Figure` class
* Python serializes the Figure object to `JSON`, which is then rendered by `plotly.js`.    
<br>
<sup>Source: [click here](https://plotly.com/python/figure-structure/) </sup>
:::  
::: {.column width="50%"}
![](img/2023-03-10-11-49-51.png){width=300}
::: 
::::

## Constructing a graph 

* When using Plotly in Python there are generally two ways to create graphs.  
    <center>![](img/2023-03-10-11-58-21.png){width=600}</center>
* **Easy method:** The high-level `plotly.express` module, which is a set of Python functions that return completed `plotly.graph_objects.Figure` objects. 
* **Advanced method**: The Plotly `graph_objects` module, which gives more control and customization, and a better understanding of what's happening "under the hood".

## Plotly express vs Graph objects

* Plotly Express is a better starting point for plot creation, since it is powerful and easy
* Plotly Express functions use graph objects internally and return a `plotly.graph_objects.Figure`
  * Therefore, you can switch at any point to working with the figure as a graph object

* **It is good to use the graph_objects module in the following cases:**

  - **Non-standard visualizations**: It is difficult to create non-standard graphs with Plotly Express, so graph_objects is usually needed

  - **When you have a lot of customizations:** If you want to make a lot of changes to most aspects of your charts, you might end up writing the same amount of code and putting in the same amount of effort, so you might consider doing it directly using the graph_objects module.

  - **Sub-plots**: Sometimes you want to have multiple charts side by side, or in a grid. If these are different charts and not facets, then it's better to build them with graph_objects.

## Aside: Embedding HTML into documents

* When using Quarto or Jupyter notebook, often we want to embed individual HTML files (e.g. Plotly output files) into our documents. 
* There are multiple ways to do this.
* If using `.qmd` or `.rmd`, the figures will usually render correctly "inline" without saving the file or adding more code
* If using `.ipynb`, the rendering can sometimes be buggy, especially in VS Code. If it doesn't render correctly, you can do the following:
  * (1) save the figure as HTML to a file
  * (2) re-embed it into the document as an iframe 
    * You can embed in markdown using inline-html (recommended)
      * `<iframe src="file_name.html" height="1000" width="1000" title="Iframe Example"></iframe>`
    * Alternatively, you can also embed it using python commands (see next slide)

## Aside: Embedding HTML into documents

You can also embed the file from Python using the following commands, although the markdown+HTML option is recommended.


```
# CREATE FIGURE OBJECT
import plotly.express as px
fig = px.line(x=["a","b","c"], y=[1,3,2], title="sample figure")

# SAVE TO FILES
file_name="./img/plotly-0.html"
fig.write_html(file_name)

# RE-LOAD AS AN IFRAME
from IPython.display import IFrame
IFrame(src=file_name, width=700, height=375)
```

<iframe src="./img/plotly-0.html" height="375" width="700"></iframe>

## Aside: Plotly in VS-code inside `.ipynb`

* If you are using Plotly within the VS Code Notebook Editor you will need to add a line of code to ensure that your plots can be seen both within VS Code and when rendered to HTML by Quarto. 
* Note that this workaround is only required for the VS Code Notebook Editor (it is not required if you are using Jupyter Lab or if you are editing a plain-text .qmd file).
* You can do this by configuring the Plotly default renderer as follows:

In [1]:
# | code-fold: false
import plotly.io as pio

pio.renderers.default = "plotly_mimetype+notebook_connected"

<sup>Source: [click here](https://quarto.org/docs/interactive/widgets/jupyter.html) </sup>

## Example: MPL to Plotly conversion 

 * We can recycle our old plots, but add interactivity, using `plotly.tools` converters.  
 * The conversion is done with `fig = tls.mpl_to_plotly(mpl_fig)` from `import plotly.tools as tls`


In [2]:
# import warnings
# warnings.filterwarnings("ignore")

# IMPORT MODULES
import numpy as np
import matplotlib.pyplot as plt
import plotly.tools as tls

# CREATE DATA ARRAYS
x = np.linspace(-2.0 * np.pi, 2.0 * np.pi, 51)
y = np.sin(x)

# MAKE A MPL FIGURE
mpl_fig = plt.figure()
plt.plot(x, y, "ko--")
plt.xlabel("x")
plt.ylabel("sin(x)")
# plt.show()

# CONVERT USING PLOTLY
fig = tls.mpl_to_plotly(mpl_fig)

# UPDATE THEME
fig.update_layout(template="plotly_white")
fig.update_layout(width=800, height=400)

# SHOW
fig.show()

# # # SAVE AND RENDER
# file_name="./img/mpl-to-plotly.html"
# fig.write_html(file_name)
# from IPython.display import IFrame
# IFrame(src=file_name, width=900, height=500)

# Plotly.py 
Getting started with `Plotly express`   
<br>
<sup> **Note**: Repeated package imports are intentional to make the code cells self-contained </sup>

## Plotly Express: function arguments

* The input arguments for a Plotly express function are similar to other libraries.  
  * The typical data input is a Pandas data frame, list, or numpy array 
  * The x argument is a string naming the column to be used on the x-axis. 
  * The y argument can either be a string or a list of strings naming column(s) to be used on the y-axis.
  * Basic customization is straightforward   
  ![](img/2023-03-10-13-23-33.png){width=800}
* **IMPORTANT**: To make stylistic changes, e.g. figure-size, we can use `fig.update_layout()`

<sup>Source: [click here](https://www.datacamp.com/cheat-sheet/plotly-express-cheat-sheet) </sup>

## Plotly Express: Available functions 

- **Basics**: [`scatter`](https://plotly.com/python/line-and-scatter/), [`line`](https://plotly.com/python/line-charts/), [`area`](https://plotly.com/python/filled-area-plots/), [`bar`](https://plotly.com/python/bar-charts/), [`funnel`](https://plotly.com/python/funnel-charts/), [`timeline`](https://plotly.com/python/gantt/)
- **Part-of-Whole**: [`pie`](https://plotly.com/python/pie-charts/), [`sunburst`](https://plotly.com/python/sunburst-charts/), [`treemap`](https://plotly.com/python/treemaps/), [`icicle`](https://plotly.com/python/icicle-charts/), [`funnel_area`](https://plotly.com/python/funnel-charts/)
- **1D Distributions**: [`histogram`](https://plotly.com/python/histograms/), [`box`](https://plotly.com/python/box-plots/), [`violin`](https://plotly.com/python/violin/), [`strip`](https://plotly.com/python/strip-charts/), [`ecdf`](https://plotly.com/python/ecdf-plots/)
- **2D Distributions**: [`density_heatmap`](https://plotly.com/python/2D-Histogram/), [`density_contour`](https://plotly.com/python/2d-histogram-contour/), [`imshow`](https://plotly.com/python/imshow/)
- **3-Dimensional**: [`scatter_3d`](https://plotly.com/python/3d-scatter-plots/), [`line_3d`](https://plotly.com/python/3d-line-plots/)
- **Multidimensional**: [`scatter_matrix`](https://plotly.com/python/splom/), [`parallel_coordinates`](https://plotly.com/python/parallel-coordinates-plot/), [`parallel_categories`](https://plotly.com/python/parallel-categories-diagram/)
- **Tile Maps**: [`scatter_mapbox`](https://plotly.com/python/scattermapbox/), [`line_mapbox`](https://plotly.com/python/lines-on-mapbox/), [`choropleth_mapbox`](https://plotly.com/python/mapbox-county-choropleth/), [`density_mapbox`](https://plotly.com/python/mapbox-density-heatmaps/)
- **Outline Maps**: [`scatter_geo`](https://plotly.com/python/scatter-plots-on-maps/), [`line_geo`](https://plotly.com/python/lines-on-maps/), [`choropleth`](https://plotly.com/python/choropleth-maps/)
- **Polar Charts**: [`scatter_polar`](https://plotly.com/python/polar-chart/), [`line_polar`](https://plotly.com/python/polar-chart/), [`bar_polar`](https://plotly.com/python/wind-rose-charts/)
- **Ternary Charts**: [`scatter_ternary`](https://plotly.com/python/ternary-plots/), [`line_ternary`](https://plotly.com/python/ternary-plots/)

<sup>Source: [click here](https://plotly.com/python/plotly-express/) </sup>

## The `Gap-minder` data-set 

* In the following examples, we will be using the well-known `gapminder` data-set
* Gapminder collects data on global demographics, economics, health, and social indicators over time. The subset bundled with Plotly Express contains life expectancy, population, and GDP per capita by country and year.
* Let's take a quick look 


In [None]:
import plotly.express as px

df = px.data.gapminder()
df = df.drop(["iso_num"], axis=1)  # DROP COLUMN
print("Shape =", df.shape, "\n")
print(df.head())

Shape = (1704, 7) 

       country continent  year  lifeExp       pop   gdpPercap iso_alpha
0  Afghanistan      Asia  1952   28.801   8425333  779.445314       AFG
1  Afghanistan      Asia  1957   30.332   9240934  820.853030       AFG
2  Afghanistan      Asia  1962   31.997  10267083  853.100710       AFG
3  Afghanistan      Asia  1967   34.020  11537966  836.197138       AFG
4  Afghanistan      Asia  1972   36.088  13079460  739.981106       AFG


## The `Tips` data-set

* In the following examples, we will also be using the well-known `Tips` data-set
* This dataset contains information on the tips collected by a restaurant waiter during different shifts. It includes the total bill, the tip amount, the sex of the person paying, whether they smoked or not, the day of the week, the meal time (lunch or dinner), and the size of the group.
* Let's take a quick look 



In [38]:
import plotly.express as px
import seaborn as sns

tips = px.data.tips()
print(tips)

     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]


## Plotly-express hello world:  scatter-plot

* Here is a very basic example using the defaults

In [3]:
import plotly.express as px
import seaborn as sns

# GET DATA
df = sns.load_dataset("penguins")
# print(df.keys())

# GENERATE PLOTLY FIGURE
fig = px.scatter(
    data_frame=df,
    x="bill_length_mm",
    y="body_mass_g",
)

fig.show()

## Basic customization 
* Now let's add some basic customization. 

In [4]:
import plotly.express as px
import seaborn as sns

# GET DATA
df = sns.load_dataset("penguins")
# print(df.keys())
df = df.dropna()  # drop rows with nan

# GENERATE PLOTLY FIGURE
fig = px.scatter(
    data_frame=df,
    x="bill_length_mm",
    y="body_mass_g",
    # color="body_mass_g",
    color="species",
    template="plotly_white",  # THEME
    labels={"body_mass_g": "Body mass (grams)", "bill_length_mm": "Bill length (mm)"},
    title="Visualization of penguins species phenotypical clustering",
)
# MAKE POINTS LARGER AND CHANGE FONT
fig.update_traces(marker_size=14)
fig.update_layout(font={"size": 18})

# SHOW
fig.show()

## Modified version 
* The following produces a similar plot, but from lists instead of a data frame, and with color mapped to body mass instead of species
* Notice the color-palette automatically changes for qualitative vs quantitative variables

In [5]:
import plotly.express as px
import seaborn as sns

# GET DATA
df = sns.load_dataset("penguins")
# print(df.keys())
df = df.dropna()  # drop rows with nan
# print(list(df["bill_length_mm"]))

# GENERATE PLOTLY FIGURE
fig = px.scatter(
    x=list(df["bill_length_mm"]),
    y=list(df["body_mass_g"]),
    color=list(df["body_mass_g"]),
    template="plotly_white",  # THEME
    labels={"x": "Bill length (mm)", "y": "Body mass (grams)", "color": "Body mass (grams)"},
    title="Visualization of penguins species phenotypical clustering",
)
# MAKE POINTS LARGER AND CHANGE FONT
fig.update_traces(marker_size=14)
fig.update_layout(font={"size": 18}, height=500)

# SHOW
fig.show()

## Example: Plotting a dictionary object

* We can also plot from a dictionary data structure
* To plot from a dictionary, the x and y values are typically stored as lists (or numpy arrays). 


In [35]:
# GENERATE PLOTLY FIGURE OBJECT
import plotly.express as px
import numpy as np

my_dict = {"dates": ["2020-01-01", "2020-01-02"], "y_vals": np.array([100, 200])}
fig = px.scatter(my_dict, x="dates", y="y_vals")

fig.update_layout(height=500, width=500)

fig.show()

## Pair-plot: Default

As usual with most packages, the default styling is quite bad

In [None]:
import plotly.express as px
import seaborn as sns

penguin = sns.load_dataset("penguins")

fig = px.scatter_matrix(penguin, width=600, height=600)
fig.show()


## Pair-plot: Customized {.scrollable}

Even just a few minor tweaks bring the graph much closer to `publication quality`

In [None]:
import plotly.express as px
import seaborn as sns

df = sns.load_dataset("penguins")

fig = px.scatter_matrix(
    df,
    dimensions=[
        "bill_length_mm",
        "bill_depth_mm",
        "flipper_length_mm",
        "body_mass_g",
    ],
    color="species",
    width=900,
    height=900,
    labels=dict(
        bill_length_mm="Bill length (mm)",
        bill_depth_mm="Bill depth (mm)",
        flipper_length_mm="Flipper length (mm)",
        body_mass_g="Body mass (g)",
    ),
).update_traces(diagonal_visible=False)

fig.update_layout(template="plotly_white")
fig.show()

## Line-plots

* Here we show a simple example of a line-plot with `Plotly-express`

In [8]:
import plotly.express as px

# EXAMPLE-1
df = px.data.gapminder()

fig = px.line(
    df,
    x="year",
    y="lifeExp",
    color="continent",
    line_group="country",
    hover_name="country",
    height=500,
    width=1000,
    template="presentation",
    labels={"lifeExp": "Life expectancy (years)", "year": "Time (years)"},
)
fig.update_layout(showlegend=True)
fig.show()

## Bar graphs {.scrollable}

* Here we show various simple examples of bar graphs with `Plotly-express`

In [51]:
# | column: screen-inset-shaded
# | layout-nrow: 1

import plotly.express as px

tips = px.data.tips()
# print(tips)

# DEFAULT
fig = px.bar(
    tips,
    x="day",
    y="total_bill",
    width=500,
    height=500,
    labels={
        "day": "Day",
        "total_bill": "Total bill",
    },
)
fig.show()

# ADD A GROUPING VARIABLE
fig = px.bar(
    tips,
    x="day",
    barmode="group",
    y="total_bill",
    color="smoker",
    category_orders={"day": ["Thur", "Fri", "Sat", "Sun"]},
    width=500,
    height=500,
    labels={"day": "Day", "total_bill": "Total bill"},
)
fig.show()

In [52]:
# ADD FACETING GROUPING VARIABLE
fig = px.bar(
    tips,
    x="day",
    y="total_bill",
    color="smoker",
    barmode="group",
    facet_col="sex",
    category_orders={
        "day": ["Thur", "Fri", "Sat", "Sun"],
        "sex": ["Male", "Female"],
        "smoker": ["Yes", "No"],
    },
    labels={
        "day": "Day",
        "total_bill": "Total bill",
        "smoker": "Smoker",
        "sex": "Sex",
    },
    width=1000,
    height=400,
)
fig.show()

## Histograms 

* **Aside**: Combining sub-plots from various figures, which is different than "facets" (see below), actually requires `graph objects` in Plotly. However, you can get a similar effect with `Quarto` by using the following in the `code cell`
```
#| column: screen-inset-shaded
#| layout-nrow: 1
```
<br>

In [55]:
# | column: screen-inset-shaded
# | layout-nrow: 1

import plotly.express as px
import seaborn as sns

tips = px.data.tips()
# print(tips)

w = 325
h = w

# HISTOGRAM, NORMALIZED TO HAVE AREA 1
fig = px.histogram(
    tips,
    x="total_bill",
    nbins=20,
    histnorm="probability density",
    color_discrete_sequence=["indianred"],
    labels={"total_bill": "Total bill"},
    width=w,
    height=h,
)
fig.show()

# GROUPED HISTOGRAM
fig = px.histogram(
    tips,
    x="total_bill",
    color="sex",
    histnorm="probability density",
    labels={"total_bill": "Total bill"},
    width=w,
    height=h,
)
fig.show()

# THE HISTOGRAM, APPLIED TO A CATEGORICAL VARIABLE,
# PRODUCES A FREQUENCY BAR PLOT
fig = px.histogram(
    tips,
    x="day",
    category_orders={"day": ["Thur", "Fri", "Sat", "Sun"]},
    width=w,
    height=h,
)

# ORDER BY VALUE
fig.update_xaxes(categoryorder="total ascending")
fig.show()

# Bar plots of values of one variable grouped by categories
# of another
fig = (
    px.histogram(tips, x="day", y="tip", histfunc="avg")  # size is set in update_layout below
    .update_xaxes(categoryorder="total ascending")
    .update_layout(
        yaxis_tickformat="$",
        width=w,
        height=h,
    )
)
fig.show()

## Boxplots, strip-plots, and violin plots 
* A `violin plot` displays the distribution of a continuous variable through a kernel density plot, mirrored and joined along a central axis, and can also include box plots or individual points.
* A `strip plot` displays the distribution of a continuous variable as a sequence of points along a categorical axis, often with jitter or adjustment to minimize overlapping, and can also display multiple groups side by side or stacked.

In [46]:
# | column: screen-inset-shaded
# | layout-nrow: 1

import plotly.express as px
import seaborn as sns

tips = px.data.tips()
# print(tips)

w = 350
h = w

# BOXPLOT
fig = px.box(
    tips,
    x="day",
    y="total_bill",
    labels={"total_bill": "Total bill"},
    width=w,
    height=h,
)
fig.show()

# VIOLIN PLOT
fig = px.violin(
    tips,
    x="day",
    y="total_bill",
    labels={"total_bill": "Total bill"},
    width=w,
    height=h,
)
fig.show()

# STRIP PLOT
fig = px.strip(
    tips,
    x="day",
    y="total_bill",
    labels={"total_bill": "Total bill"},
    width=w,
    height=h,
)
fig.show()

## Marginal plot-1

A marginal plot, controlled with `marginal_x` and `marginal_y`, displays two variables along the x and y axes, with individual distributions on the margins.  

In [64]:
import plotly.express as px
import seaborn as sns

tips = px.data.tips()
# print(tips)

fig = px.scatter(
    tips,
    x="total_bill",
    y="tip",
    marginal_x="histogram",
    marginal_y="violin",
    labels=dict(total_bill="Total bill", tip="Tip"),
    title="Tips vs Total bill",
    width=550,
    height=550,
)
fig.show()

## Marginal plot-2:

Many of the Plotly graphs allow the addition of marginal plots, e.g. here we show a `density_heatmap` instead of a `scatter` plot

In [65]:
import plotly.express as px
import seaborn as sns

tips = px.data.tips()
# print(tips)

fig = px.density_heatmap(
    tips,
    x="total_bill",
    y="tip",
    marginal_x="histogram",
    marginal_y="histogram",
    color_continuous_scale=px.colors.sequential.Viridis,
    nbinsx=50,
    nbinsy=50,
    labels=dict(total_bill="Total bill", tip="Tip"),
    title="Joint distribution of tip and total bill",
    width=550,
    height=550,
)

fig.show()

## Marginal plot-3:

In [62]:
import plotly.express as px
import seaborn as sns

tips = px.data.tips()
# print(tips)

# # GROUPED HISTOGRAM
fig = px.histogram(
    tips,
    x="total_bill",
    color="sex",
    facet_col="day",
    nbins=10,
    marginal="rug",  # or 'box', 'violin', or 'histogram'
    # histnorm="probability density",
    labels={"total_bill": "Total bill"},
    width=1000,
    height=600,
)
fig.show()

## Example: Parallel coordinate plots

* Parallel coordinate plots are a way to visualize multivariate data by plotting each variable on a separate axis and connecting them with lines to show relationships and patterns between the variables. These are created with `px.parallel_coordinates()`
* The lines tie together points in the same row in the data-frame, and the height along each respective axis shows the variable's value.

In [11]:
import plotly.express as px
import seaborn as sns

df = sns.load_dataset("penguins")
df = df.dropna()  # drop rows with nan

df["species"] = df.species.astype("category")
df["species_id"] = df.species.cat.codes
px.parallel_coordinates(
    df,
    color="species_id",
    color_continuous_scale=px.colors.diverging.Tealrose,
    height=300,
    width=1000,
)

## Tree-maps

* A tree-map graph is a type of visualization that uses nested rectangles to display hierarchical data. Each rectangle represents a category, and its size corresponds to the value of that category. This is done with the `px.treemap()` function

In [12]:
import plotly.express as px
import numpy as np

df = px.data.gapminder().query("year == 2007")
df["world"] = "world"  # in order to have a single root node
fig = px.treemap(
    df,
    path=["world", "continent", "country"],  # << sets hierarchy
    values="pop",
    color="lifeExp",
    hover_data=["iso_alpha"],
    color_continuous_scale="RdBu",
    color_continuous_midpoint=np.average(df["lifeExp"], weights=df["pop"]),
)
fig.show()


## Plotly: 3D graphs
* We can also generate graphics in 3D Cartesian space using `px.scatter_3d`

In [15]:
# SOURCE: https://towardsdatascience.com/cheat-codes-to-better-
# visualisations-with-plotly-express-21caece3db01

import plotly.express as px
import seaborn as sns

# GET DATA
election = px.data.election()

# GENERATE FIGURE
fig = px.scatter_3d(
    election,  # dataframe
    x="Joly",  # x-values column
    y="Coderre",  # y-values column
    z="Bergeron",  # z-values column
    color="winner",  # column shown by color
    width=600,
    height=600,
    hover_name="district",  # hover title
    symbol="result",  # column shown by shape
    # specific colors for each value of the "winner" column
    color_discrete_map={"Joly": "blue", "Bergeron": "green", "Coderre": "red"},
)
fig.show()

<!-- import plotly.express as px
import seaborn as sns

# SOURCE: https://towardsdatascience.com/cheat-codes-to-better-
# visualisations-with-plotly-express-21caece3db01

gapminder = px.data.gapminder()

fig = px.scatter_geo(
    gapminder,  # dataframe
    locations="iso_alpha",  # location code
    color="continent",  # column shown by color
    hover_name="country",  # hover info title
    size="pop",  # column shown by size
    animation_frame="year",  # column animated
    projection="orthographic",  # type of map
)
fig.show() -->

## Example: Plotly themes {.scrollable}

* Plotly has various built-in themes, similar to MPL and ggplot2.
* The following code demonstrates the various themes with the Gap-minder dataset
<!-- * Also, note the use of the log scale on the x-axis -->


In [16]:
# https://plotly.com/python/templates/
import plotly.express as px

df = px.data.gapminder()
df_2007 = df.query("year==2007")
k = 0
for template in [
    "plotly",
    "plotly_white",
    "plotly_dark",
    "ggplot2",
    "seaborn",
    "none",
    "simple_white",
]:
    fig = px.scatter(
        df_2007,
        x="gdpPercap",
        y="lifeExp",
        size=df_2007["pop"],
        color="continent",
        log_x=True,
        size_max=60,
        template=template,
        title="Gapminder 2007: '%s' theme" % template,
    )

    file_name = "./img/plotly-theme-" + str(k) + ".html"
    # print(file_name)
    fig.write_html(file_name)
    fig.show()
    k += 1

# from IPython.display import IFrame
# IFrame(src=file_name, width=1000, height=500)

## Facets {.scrollable}

* Of course, we also often want to include subplots through faceting. This is done using the `facet_col` and `facet_col_wrap` arguments
* As mentioned above, combining figure objects into subplots requires `plotly.graph_objects`

In [17]:
import plotly.express as px

# import seaborn as sns

gap = px.data.gapminder()
fig = px.line(
    data_frame=gap,
    x="year",
    y="lifeExp",
    color="continent",
    facet_col="continent",
    # line_group="country",
    facet_col_wrap=3,  # << facet_col is the key
    labels={"lifeExp": "Life expectancy"},  # , "year" : 'Time (years)'},
    template="plotly_white",
    width=1000,
    height=1000,
).update_layout(showlegend=False)
fig.show()

## Customizing the tool-tip

* We can modify the hover options using the following command

<!-- * add more from https://plotly.com/python/styling-plotly-express/ -->

In [18]:
import plotly.express as px
import seaborn as sns

df = sns.load_dataset("penguins")
df = df.dropna()  # drop rows with nan

fig = px.scatter(
    data_frame=df,
    x="bill_length_mm",
    y="body_mass_g",
    size=df["flipper_length_mm"],
    color="species",
    template="plotly_white",
    labels={
        "bill_length_mm": "Bill length (mm)",
        "body_mass_g": "Body mass (g)",
        "species": "Species",
    },
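    title="Palmer Penguins",
    custom_data=["island", "sex"],  # columns made available to the hover template
)

# UPDATE TOOL TIP
# custom_data is split per trace by Plotly Express, so each point keeps its own row
# (assigning the whole data frame to every trace would misalign rows and points)
fig.update_traces(
    hovertemplate="Island: %{customdata[0]} <br>Sex: %{customdata[1]}",
)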

fig.show()

## Animation and sliders

Plotly provides a range of tools that makes adding animations to visualizations easy.


In [19]:
px.scatter(
    px.data.gapminder(),
    x="gdpPercap",
    y="lifeExp",
    animation_frame="year",
    animation_group="country",
    size="pop",
    color="continent",
    hover_name="country",
    template="plotly_white",
    log_x=True,
    size_max=45,
    range_x=[100, 100000],
    range_y=[25, 90],
)

## Choropleths

* We will cover `Choropleths` in more detail during the `Geo-spatial module`. 
* However, for now, all you need to know is that a choropleth is a type of map that uses colors or shading to represent different values of a particular data variable across a geographic area, such as a country, state, or city. 
* The choropleth map divides the area into regions or polygons, usually based on administrative boundaries, and then assigns a color or shade to each region based on the value of the data variable being represented. 
* For example, if the data variable is `population density`, then regions with higher density would be shaded darker than regions with lower population density.



## Example: Choropleths

Plotly can create the map internally from location codes for countries, such as `USA`; here the `iso_alpha` column holds these three-letter codes.

In [None]:
import plotly.express as px
import seaborn as sns

# SOURCE: https://towardsdatascience.com/cheat-codes-to-better-
# visualisations-with-plotly-express-21caece3db01

gapminder = px.data.gapminder()

fig = px.choropleth(
    gapminder,  # dataframe
    locations="iso_alpha",  # location code
    color="lifeExp",  # column shown by color
    hover_name="country",  # hover info title
    animation_frame="year",  # column animated
    range_color=[20, 80],  # color range
)
fig.show()

# Plotly.py 
`Graph objects`

<sup> **Note**: Graph objects are covered heavily in the lab, so we only introduce the fundamentals here</sup>

## Motivation 

* As we have seen, using `Plotly express` is generally easy and straightforward. 
* However, when you want to increase the complexity of a graph, you often need to switch to `plotly graph objects` and things can get quite involved. 
* **Understanding Graph objects gives you FULL customization control over your graphs**
* For example, sliders, maps, and drop-down menus are each easy to create with `Plotly express`; however, merging them with `plotly graph objects` is non-trivial.
    [![](img/2023-03-12-14-44-39.png)](img/2023-03-12-14-44-39.png)


## Structure of a graph object

* `Plotly figures` are represented as hierarchical trees, with the `Figure` graph object being the root node, and child nodes called `attributes`.
* The figure has three top-level attributes: `data`, `layout`, and `frames` (frames are only needed for animated plots)
  * Notice the connection of these attributes to the pure `JavaScript` version mentioned earlier.
* **Data:** This is a list of dictionaries referred to as "traces"
  - The `trace` represents a set of related graphical marks in a figure. 
  - Each trace must have a `type` attribute which defines the other allowable attributes.
  - Each trace has one of more than 40 possible types (see below for a list organized by subplot type, including e.g. [`scatter`](https://plotly.com/python/line-and-scatter/), [`bar`](https://plotly.com/python/bar-charts/), [`pie`](https://plotly.com/python/pie-charts/), [`surface`](https://plotly.com/python/3d-surface-plots/), [`choropleth`](https://plotly.com/python/choropleth-maps/) etc) 
* **Layout**: Controls various structural and stylistic components (e.g. title, font, size, etc)  

<sup>Sources: [figure structure](https://plotly.com/python/figure-structure/), [Graph objects](https://plotly.com/python/graph-objects/) </sup>


## Graph_object Structure summary

![](img/2023-03-12-17-03-01.png)

## Various available traces 

* **2D Cartesian trace types, and Subplots**.  
    [![](img/2023-03-12-15-34-01.png){width=720}](img/2023-03-12-15-34-01.png)
* **Geo-spatial trace Types**.  
    [![](img/2023-03-12-15-34-35.png){width=720}](img/2023-03-12-15-34-35.png)

<sup>For more [click here](https://plotly.com/python/figure-structure/)!</sup>


## Looking under the hood {.scrollable}

:::: {.columns}
::: {.column width="50%"}
* You can view the underlying data structure of any `plotly.graph_objects.Figure` object via `print(fig)` or `fig.show("json")`.
* Figures also support the `fig.to_dict()` and `fig.to_json()` methods.
:::  
::: {.column width="50%"}
![](img/2023-03-10-12-08-27.png){width=400}
::: 
::::


In [20]:
import plotly.express as px

fig = px.line(x=["a", "b", "c"], y=[1, 3, 2], title="sample figure")
print(fig)
# fig.show()

Figure({
    'data': [{'hovertemplate': 'x=%{x}<br>y=%{y}<extra></extra>',
              'legendgroup': '',
              'line': {'color': '#636efa', 'dash': 'solid'},
              'marker': {'symbol': 'circle'},
              'mode': 'lines',
              'name': '',
              'orientation': 'v',
              'showlegend': False,
              'type': 'scatter',
              'x': array(['a', 'b', 'c'], dtype=object),
              'xaxis': 'x',
              'y': array([1, 3, 2]),
              'yaxis': 'y'}],
    'layout': {'legend': {'tracegroupgap': 0},
               'template': '...',
               'title': {'text': 'sample figure'},
               'xaxis': {'anchor': 'y', 'domain': [0.0, 1.0], 'title': {'text': 'x'}},
               'yaxis': {'anchor': 'x', 'domain': [0.0, 1.0], 'title': {'text': 'y'}}}
})


## Graph_objects: Hello world

* When using graph objects (without animation), you typically do the following:
  * `(1)` Initialize the figure
  * `(2)` Add one or more traces
  * `(3)` Customize the layout 

In [75]:
import plotly.express as px  # only needed to load the gapminder example data
import plotly.graph_objects as go

# DATAFRAME
df = px.data.gapminder()

# INITIALIZE GRAPH OBJECT
fig = go.Figure()

# ADD TRACES FOR THE DATA-FRAME
fig.add_trace(  # Add A trace to the figure
    go.Scatter(  # Specify the type of the trace
        x=df["gdpPercap"],  # Data-x
        y=df["lifeExp"],  # Data-y
        mode="markers",
        # rescale population so the largest marker is 50 pixels wide;
        # the square root makes marker area (not diameter) proportional to population
        marker=dict(
            size=50 * (df["pop"] / df["pop"].max()) ** 0.5,
            color=df["pop"],
            showscale=True,
            colorscale="Viridis",
            symbol="circle",
        ),
        opacity=1.0,
    )
)

# SET THEME, AXIS LABELS, AND LOG SCALE
fig.update_layout(
    template="plotly_white",
    xaxis_title="National GDP (per capita)",
    yaxis_title="Life expectancy (years)",
    title="Country comparison: color & size = population",
    height=400,
    width=800,
)
fig.update_xaxes(type="log")

fig.show()
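
The marker sizing in the cell above takes a square root of the normalized population. A small stand-alone sketch of that rescaling, using made-up population values and a hypothetical helper name `marker_diameters`, shows why: a fourfold increase in the value doubles the diameter, so the marker area grows in proportion to the value.

```python
import math


def marker_diameters(values, max_px=50):
    """Scale values to pixel diameters so that marker AREA is proportional to the value."""
    largest = max(values)
    return [max_px * math.sqrt(v / largest) for v in values]


# Hypothetical populations, each four times the previous one
pops = [1_000_000, 4_000_000, 16_000_000]
sizes = marker_diameters(pops)
print(sizes)
```

Without the square root, large countries would visually dominate the plot, because doubling a diameter quadruples the area the eye perceives.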

## Basic charts with graph_objects
![](img/2023-03-12-15-27-15.png){width=650}
<!-- [![](img/2023-03-12-15-27-15.png){width=600}](img/2023-03-12-15-27-15.png) -->

<!-- <sup> [Source](https://images.plot.ly/plotly-documentation/images/python_cheat_sheet.pdf) <sup>  -->

## Additional charts with Graph objects

![](img/2023-03-12-15-27-51.png){width=775}
<!-- [![](img/2023-03-12-15-27-51.png){width=600}](img/2023-03-12-15-27-51.png) -->

<!-- <sup> [Source](https://images.plot.ly/plotly-documentation/images/python_cheat_sheet.pdf) <sup>  -->

## Subplots

With `plotly.graph_objects` we can finally combine figures in a subplot! [source](https://plotly.com/python/subplots/)

In [68]:
from plotly.subplots import make_subplots
import plotly.graph_objects as go

fig = make_subplots(rows=1, cols=2)

fig.add_trace(
    go.Scatter(x=[1, 2, 3], y=[4, 5, 6]),
    row=1, col=1
)

fig.add_trace(
    go.Scatter(x=[20, 30, 40], y=[50, 60, 70]),
    row=1, col=2
)

fig.update_layout(height=600, width=800, title_text="Side By Side Subplots")
fig.show()

## Multiple Subplots

Here we show a 2 x 2 subplot grid with each subplot populated with a single scatter trace.


In [69]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

fig = make_subplots(rows=2, cols=2, start_cell="bottom-left")

fig.add_trace(go.Scatter(x=[1, 2, 3], y=[4, 5, 6]),
              row=1, col=1)

fig.add_trace(go.Scatter(x=[20, 30, 40], y=[50, 60, 70]),
              row=1, col=2)

fig.add_trace(go.Scatter(x=[300, 400, 500], y=[600, 700, 800]),
              row=2, col=1)

fig.add_trace(go.Scatter(x=[4000, 5000, 6000], y=[7000, 8000, 9000]),
              row=2, col=2)

fig.show()

## Resources 

* *Graph objects are covered heavily in the lab, so we only introduced the basics here*
* Also, the documentation on the Plotly website is very good: [https://plotly.com/python/](https://plotly.com/python/)
* It provides a massive collection of examples for both `plotly-express` and `graph-objects`, which can get you started on almost any task.
    ![](img/2023-03-12-15-10-34.png){width=775}


# Plotly-R
A quick introduction

## Introduction

* Plotly has an R-based API that covers most but not all of Plotly's capabilities.
* Plotly in `R` was already introduced in this week's reading assignment ([click here](https://plotly-r.com/overview.html)), so we won't go as deep here as we did with Python. 
* Furthermore, the concepts from the Python section (e.g. data, traces, layout) also apply to the `R` case.
* As with Matplotlib (MPL) figures, the Plotly package can convert `ggplot2` plots to Plotly using a single function, `ggplotly`
  * This is fantastic, since developing graphs in ggplot is often more familiar.
  * However, you may be stuck with the default settings.

## How does Plotly-R work?

The fundamentals are similar to what we saw in Python, but with different data structures.

![](img/2023-03-12-15-48-51.png){width=850}

## Plotly-R: Hello world

This is about as simple as it gets!

In [31]:
# GENERATE PLOT
library(plotly)

fig <- plot_ly(data = iris, x = ~Sepal.Length, y = ~Petal.Length)

# SAVE PLOT
htmlwidgets::saveWidget(as_widget(fig), "img/plotly-R.html", selfcontained=TRUE)


No trace type specified:
  Based on info supplied, a 'scatter' trace seems appropriate.
  Read more about this trace type -> https://plotly.com/r/reference/#scatter

No scatter mode specifed:
  Setting the mode to markers
  Read more about this attribute -> https://plotly.com/r/reference/#scatter-mode



<!-- embed plot  -->
<iframe src="img/plotly-R.html" height="450" width="700" title="Iframe Example"></iframe>

## Basic customization

Let's improve the plot a bit with some basic customization of the layout.

In [33]:
# GENERATE PLOT
library(plotly)

fig <- plot_ly(data = iris, x = ~Sepal.Length, y = ~Petal.Length, color = ~Species)

fig <- fig %>% layout(title="Iris data set", xaxis=list(title='Sepal length (cm)'),
       yaxis=list(title='Petal length (cm)'),template="plotly_white")

# SAVE PLOT
htmlwidgets::saveWidget(as_widget(fig), "img/plotly-R-2.html", selfcontained=TRUE)


No trace type specified:
  Based on info supplied, a 'scatter' trace seems appropriate.
  Read more about this trace type -> https://plotly.com/r/reference/#scatter

No scatter mode specifed:
  Setting the mode to markers
  Read more about this attribute -> https://plotly.com/r/reference/#scatter-mode



<iframe src="img/plotly-R-2.html" height="500" width="800" title="Iframe Example"></iframe>

## Basic customization

Here is a similar example, using the penguins data set.

In [6]:

library(plotly)
library(palmerpenguins)
df <- penguins; #print(df)

fig <- plot_ly(penguins,
x= ~bill_length_mm,
y= ~body_mass_g,
color = ~species,
type='scatter', 
mode='markers',
hovertemplate = paste("Species:",penguins$species,'<br>Island: ',penguins$island,"<br>Sex:",penguins$sex)
) %>% 

# SET AXIS LABELS AND TITLE
plotly::layout(
xaxis = list(title='Bill length (mm)'),
yaxis = list(title='Body mass (g)'),
title = 'Palmer Penguins'
)
htmlwidgets::saveWidget(as_widget(fig), "img/plotly-R-scatter.html", selfcontained=TRUE)


"Ignoring 2 observations"


<iframe src="img/plotly-R-scatter.html" height="500" width="800"></iframe>

## Example: Simple line-plot

You can customize the tooltip using the `hovertemplate` argument.

In [4]:
library(plotly)
library(gapminder)

df <- filter(gapminder, continent=='Asia')
# print(df)

fig <- plot_ly(
df, 
x = ~year, 
y = ~lifeExp,
color = ~country, 
type='scatter', mode='lines',
#CUSTOMIZE TOOL-TIP
hovertemplate=paste('<b>',df$country,'</b><br>Year=%{x}<br>Life Expectancy=%{y:.2f} yrs')) %>% 

plotly::layout(xaxis=list(title='Year'),
yaxis=list(title='Life Expectancy'),
showlegend=FALSE)

# SAVE PLOT
htmlwidgets::saveWidget(as_widget(fig), "img/plotly-R-lineplot.html", selfcontained=TRUE)


"n too large, allowed maximum for palette Set2 is 8
Returning the palette you asked for with that many colors
"
"n too large, allowed maximum for palette Set2 is 8
Returning the palette you asked for with that many colors
"


<iframe src="img/plotly-R-lineplot.html" height="500" width="800"></iframe>

## Subplots

In [16]:
# install.packages('mvtnorm')
library(plotly)

s <- matrix(c(1, -.75, -.75, 1), ncol = 2)
obs <- mvtnorm::rmvnorm(500, sigma = s)
fig <- plot_ly(x = obs[,1], y = obs[,2],height=500,width=800)
fig <- subplot(
  fig %>% add_markers(alpha = 0.2),
  fig %>% add_histogram2d()
)

htmlwidgets::saveWidget(as_widget(fig), "img/plotly-R-subplots.html", selfcontained=TRUE)


<iframe src="img/plotly-R-subplots.html" height="500" width="800"></iframe>

## Resources 

* Similar to Python, the documentation for `R` on the Plotly website is very good: [https://plotly.com/r/](https://plotly.com/r/)
* This provides a massive collection of examples to get you started on almost any task.    
    ![](img/2023-03-12-16-03-19.png){width=775}


# Plot digitizers
Optional: additional content

## Overview


:::: {.columns}
::: {.column width="40%"}
* This topic is unrelated to Plotly; however, `plot digitizers` are a useful tool for quickly extracting the `raw numeric data` from plots.
* [Web plot digitizer](https://apps.automeris.io/wpd/) is a free online tool for doing this
* **Important**: This process isn't perfect, so use extra care: double-check the results and document the process, especially when the results are consequential. Furthermore, digitized results may not be suitable for academic publications.  
:::
::: {.column width="60%"}
[ ![](img/2023-03-12-12-16-08.png){width=700} ](img/2023-03-12-12-16-08.png)
:::
::::

## Digitization process: 

* 1. **Upload image**.   
  [ ![](img/2023-03-12-12-27-16.png){width=900} ](img/2023-03-12-12-27-16.png)
* 2. **Define coordinate axis**.   
  [ ![](img/2023-03-12-12-28-00.png){width=900} ](img/2023-03-12-12-28-00.png)


## Digitization process: 
* 3. **Digitize**: Either manually or using the automated method.   
  [ ![](img/2023-03-12-12-35-11.png){width=900} ](img/2023-03-12-12-35-11.png)
* 4. **Export data:** Either to a file or export automatically to `Plotly chart-studio`.   
  [ ![](img/2023-03-12-12-33-21.png){width=600} ](img/2023-03-12-12-33-21.png)
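
The axis-definition step is what makes digitization possible: once two pixel positions on an axis are tied to known data values, any other pixel can be converted by interpolation. The sketch below illustrates that idea for linear axes only, with invented calibration numbers and a hypothetical helper `pixel_to_data`; it is not taken from Web plot digitizer, whose actual implementation may differ (for example, for log axes or rotated images).

```python
def pixel_to_data(pixel, pixel_a, pixel_b, value_a, value_b):
    """Linearly map a pixel coordinate to a data value using two calibration points."""
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    return value_a + (pixel - pixel_a) * scale


# Hypothetical calibration:
#   x pixels 100 and 500 correspond to x = 0 and x = 10
#   y pixels 400 and 50 correspond to y = 0 and y = 70 (image y grows downward)
x_value = pixel_to_data(300, 100, 500, 0.0, 10.0)
y_value = pixel_to_data(225, 400, 50, 0.0, 70.0)
print(x_value, y_value)
```

Small errors in clicking the calibration points propagate to every extracted value, which is one reason digitized results should be double-checked.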


## Plotly Chart-Studio: Quick demo (optional) 
* `Plotly Chart Studio` is an online platform for creating, sharing, and publishing interactive charts, graphs, and dashboards. It offers a variety of visualization options and customization features.
* It has a "point and click" interface, similar to `Tableau`

![](img/2023-03-12-12-42-22.png){width=850}

## Conclusion

**Disclaimer:** This document is not an original work. It contains content from many sources and is **for educational purposes only**. 

### Break

Let's take a 10-minute break before moving on to the lab.  
![](img/2023-01-11-15-32-08.png){width=600}