In [1]:
import pandas as pd
import plotly.express as px

In [2]:
data = pd.read_csv("wdi.csv")
df = data[(data.continent.isin(["Europe","Africa"])) & (data.year==2020)].reset_index(drop=True)

# 1. Anatomy of a plotly figure

In [22]:
fig = px.scatter(data_frame=df, x='gdp_capita', y='life_expectancy', color='continent', title='Life Expectancy vs. GDP per Capita')

We can inspect the internal representation of the figure via `print(fig)` (or `fig.to_dict()`). This is useful to understand how the Plotly figure is structured, and how we can modify it. 


The figure consists of two top-level attributes:

- `data` attribute: **list** of data **traces** (1 trace for Europe, 1 for Africa). The traces are the actual data points that are plotted on the figure. The appearance of each trace (e.g. color, size, symbol, showlegend, etc.) is defined in the `data` attribute.
- `layout` attribute: dictionary of settings related to the title, axes, legend, background colors, etc.

In [25]:
print(fig)

Figure({
    'data': [{'hovertemplate': 'continent=Europe<br>gdp_capita=%{x}<br>life_expectancy=%{y}<extra></extra>',
              'legendgroup': 'Europe',
              'marker': {'color': '#636efa', 'symbol': 'circle'},
              'mode': 'markers',
              'name': 'Europe',
              'orientation': 'v',
              'showlegend': True,
              'type': 'scatter',
              'x': {'bdata': ('YftV8QR4y0AAAAAAAAD4f+RoVxZW9e' ... 'shrUnxQH4AHxgZNchAUNeN3wBm5kA='),
                    'dtype': 'f8'},
              'xaxis': 'x',
              'y': {'bdata': ('nu+nxks/U0AAAAAAAAD4f69OxepUTF' ... 'AAAMBUQJuQvQnZy1FAFaZnYXoWVEA='),
                    'dtype': 'f8'},
              'yaxis': 'y'},
             {'hovertemplate': 'continent=Africa<br>gdp_capita=%{x}<br>life_expectancy=%{y}<extra></extra>',
              'legendgroup': 'Africa',
              'marker': {'color': '#EF553B', 'symbol': 'circle'},
              'mode': 'markers',
              'name': 'Africa',

In [None]:
print(fig.data)

(Scatter({
    'hovertemplate': 'continent=Europe<br>gdp_capita=%{x}<br>life_expectancy=%{y}<extra></extra>',
    'legendgroup': 'Europe',
    'marker': {'color': '#636efa', 'symbol': 'circle'},
    'mode': 'markers',
    'name': 'Europe',
    'orientation': 'v',
    'showlegend': True,
    'x': array([ 14064.03861499,             nan,  57258.69022723,  20317.23192048,
                 54569.92538502,  15860.10452215,  25296.07093659,  29690.1536804 ,
                 42827.06000622,  60832.15829069,  39441.84908444,             nan,
                 52305.28931094,  48134.95979462,  56482.47563426,             nan,
                 28416.5239018 ,  34169.92228456,  54303.90416943,  93942.60574743,
                            nan,  43144.40641855,  33018.62416716,             nan,
                 40176.46004125, 120010.20830827,  44925.83118498,  12512.6003654 ,
                            nan,  20510.82530519,  59822.95426764,  17324.85624155,
                 65130.17218716,  35315.

In [27]:
print(fig.layout)

Layout({
    'legend': {'title': {'text': 'continent'}, 'tracegroupgap': 0},
    'template': '...',
    'title': {'text': 'Life Expectancy vs. GDP per Capita'},
    'xaxis': {'anchor': 'y', 'domain': [0.0, 1.0], 'title': {'text': 'gdp_capita'}},
    'yaxis': {'anchor': 'x', 'domain': [0.0, 1.0], 'title': {'text': 'life_expectancy'}}
})


## 1.1 Data: list of traces

- We can inspect the current definition of the traces via `fig.data`. 
- Note that some attributes (that are not explicitly set) have default values (e.g. marker size is 6 by default).
- We can modify the **appearance of traces** using `update_traces()` method. Any attribute defined in the `data` attribute can be modified this way (e.g. markers, name, mode).

In [5]:
fig = px.scatter(data_frame=df, x='gdp_capita', y='life_expectancy', color='continent', title='Life Expectancy vs. GDP per Capita')
fig.update_traces(marker=dict(size=10, symbol="square"))
fig

Also, we can use the `selector` argument to only modify the appearance of those traces that satisfy a condition. Typical use cases are selecting by trace name, type, or mode. 

In [6]:
fig.update_traces(marker=dict(color="gold"), selector=dict(name="Africa"))   
fig

## 1.2 Layout


- We can inspect the current definition of the layout via `fig.layout` and further zoom into specific attributes (e.g. `fig.layout.xaxis`, `fig.layout.title`, etc.).
- Note that many aspects of the layout are defined in the (default) template, see `fig.layout.template`.
- We can change the layout via `fig.update_layout()`. 


In [7]:
fig = px.scatter(data_frame=df, x='gdp_capita', y='life_expectancy', color='continent')

In [8]:
fig.update_layout(
    # Size and margins
    width=800, height=500, 
    margin=dict(l=50, r=50, t=100, b=50),

    # Universal font
    font=dict(family='Consolas', size=14),

    # Background colors
    plot_bgcolor='#E3E8F6', 
    paper_bgcolor="#ACBCDE", 
    
    # Title
    title=dict(text="Life Expectancy vs. GDP per Capita", x=0.5, font=dict(color='#6770f6'), xref='paper'),

    # Axes
    xaxis=dict(title="GDP per capita (US dollars)", showgrid=True),
    yaxis=dict(title="Life Expectancy", showgrid=True, gridcolor='white'),

    # Legend
    legend=dict(title="Continent", orientation='h', y=1.1, x=0.5, xanchor='center')
)
fig

# 2. Scaling

For **each encoding channel** (x, y, color, size, symbol), we can choose between different **scaling options** (axis scaling, color scaling, size scaling).

## 2.1 Axis

In [9]:
px.scatter(df, x='gdp_capita',  y='life_expectancy', log_x=True, range_x = [1000, 150000])

## 2.2 Color

### Helper functions to explore predefined color scales

In [10]:
#px.colors?

### Qualitative color scales

- **Qualitative** color scales are useful for **categorical data** (e.g. continents, countries, etc.). 
- Plotly creates one trace per category and assigns **each trace a different color** from the selected qualitative color scale.
- We can inspect predefined qualitative color scales via `px.colors.qualitative.swatches()`
- Alternatively we can define a **custom color palette**, either via a list of colors, or as a dictionary mapping categories to colors.

In [11]:
#px.colors.qualitative.swatches()

In [29]:
color_list = px.colors.qualitative.Set2
#color_list = ['blue','orange']
color_dict = {'Europe':'blue', 'Africa':'orange'}

# A notebook cell only renders its last expression, so show the first figure explicitly
px.scatter(df, x='gdp_capita', y='life_expectancy', color='continent', color_discrete_sequence=color_list).show()
px.scatter(df, x='gdp_capita', y='life_expectancy', color='continent', color_discrete_map=color_dict)

### Continuous and diverging color scales

- **Continuous** color scales are useful for **numeric data** (e.g. population, fertility, etc.). If we want to highlight deviations from a central value (e.g. zero or average), we can use **diverging** color scales.
- Plotly does NOT create multiple traces in this case
- We can inspect predefined color scales via `px.colors.sequential.swatches()` and `px.colors.diverging.swatches()`
- Alternatively we can define a **custom color scale** as a list of colors, which Plotly spaces evenly between the minimum and maximum value.
- Also, we can set the minimum and maximum values of the color scale via `range_color` argument.

In [13]:
#px.colors.sequential.swatches()
#px.colors.diverging.swatches()

In [30]:
color_list = px.colors.sequential.Viridis  # or just "Viridis"
#color_list = ["blue", "orange"]
color_list = ["blue", "white", "orange"]


px.scatter(df, x='gdp_capita', y='life_expectancy', color='fertility',
           color_continuous_scale=color_list,
           range_color=[0,7]) 

### How to specify colors

When setting colors or color scales, we can use **named colors**, **RGB** colors, **HEX** colors, or **predefined color scales**

| Type | Example |
| ----------- | ------- |
| Named colors |<span style="color:red">red</span> |
| RGB colors | <span style="color:rgb(255, 0, 0)">rgb(255, 0, 0)</span> |
| RGBA colors | <span style="color:rgba(255, 0, 0, 0.5)">rgba(255, 0, 0, 0.5)</span> |
| HEX colors | <span style="color:#FF0000">#FF0000</span> |
| Predefined scales | 'Reds','Viridis', etc. |

## 2.3 Size and Symbol

In [31]:
# A notebook cell only renders its last expression, so show the first figure explicitly
px.scatter(df, x='gdp_capita', y='life_expectancy', size='population', size_max=20).show()
px.scatter(df, x='gdp_capita', y='life_expectancy', symbol='continent', symbol_sequence=['star','square'])

# 3. Plotly Templates

- A template is a set of **default settings** for the figure (relating to the layout and data traces).
- There are several [predefined templates](https://plotly.com/python/templates/) available in Plotly. 
- We can set a template 
    * (1) during figure creation: `px.scatter(..., template='plotly_white')`
    * (2) after figure creation: `fig.update_layout(template='plotly_white')`
    * (3) by setting a global default: `pio.templates.default = 'plotly_white'`

In [16]:
import plotly.io as pio

In [17]:
#pio.templates
#pio.templates['plotly'].data
#pio.templates['plotly_white'].layout

In [32]:
fig = px.scatter(data_frame=df, x='gdp_capita', y='life_expectancy', color='continent')
fig.update_layout(template='plotly_white')

We can define our own templates by modifying existing ones or creating new ones from scratch.

In [19]:
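import plotly.graph_objects as go

# Work on a copy: pio.templates['plotly_white'] may return the registered object itself,
# in which case editing it in place would also change the built-in 'plotly_white' template
custom_template = go.layout.Template(pio.templates['plotly_white'])
custom_template.layout.font.update(family='Consolas', size=16)
pio.templates['custom'] = custom_template
pio.templates.default = 'custom'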

In [20]:
px.scatter(df, x='gdp_capita', y='life_expectancy', title="Life Expectancy vs. GDP per Capita")