<div class="alert alert-info">

## How to read this Jupyter Notebook
- I want to give you an impression of how simple, boring bar plots can be **turned into truthful, useful, and beautiful visualizations using Plotly**
- Note that this is an **iterative and time-consuming process**: it involves decisions about (1) which aspects are interesting or useful, (2) which visual design is appropriate, and (3) how to implement it in Plotly
- **To get the best out of Plotly**, I recommend:
    - Check online examples and adapt your code: e.g. https://plotly.com/python/bar-charts/
    - Inspect the internal structure of your figure via `figure.data` and `figure.layout` and adjust it using `update_traces()` and `update_layout()`
    - Plotly has detailed online documentation: e.g. https://plotly.com/python/reference/layout/#layout-title
    - Plotly has detailed built-in help: e.g. `help(fig.layout.title)` shows you all options for the title of your figure
    - Generative AI tools can help you navigate the multitude of options faster


</div>

<div class="alert alert-info">

# Design Checklist

- **Choose an effective plot type and encoding**: easy and intuitive to understand?
- **Improve figure-ground separation**: pick a light theme and delete all unnecessary elements (candidates: gridlines, ticks, axis titles and labels, ...)
- **Focus attention**: use color (or other pre-attentive attributes) to highlight important data or text, de-emphasize the rest
- **Add explainers**: use title, subtitle, data labels, and annotations to explain what the viewer is supposed to see
- **Check the CRAP principles**:
    - **Contrast**: Similar to "focus attention"
    - **Repetition**: Same colors, sizes, fonts, etc. used for the same types of elements?
    - **Alignment**: Are bars, axis labels, data labels, annotations, and title logically and consistently aligned?
    - **Proximity**: Data labels or annotations close to data points, legend close to figure; use white space / distance to visually separate different parts

</div>

In [1]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

In [2]:
df = pd.read_csv("titanic.csv")

# 1. One categorical variable

## Baseline

In [3]:
plotdata = df.value_counts('lifeboat').reset_index()
px.bar(plotdata, x='lifeboat', y='count')

## Second iteration

In [4]:
# Data
plotdata = df.value_counts('lifeboat').reset_index()

# Sorting
lifeboat_order = plotdata.sort_values('count', ascending=False).lifeboat.tolist()

# Colors
colors = ['lightgrey']*13 + ['orange']*7

fig = px.bar(plotdata, 
             y='lifeboat', 
             x='count', 
             text='count', 
             category_orders={'lifeboat': lifeboat_order})

# Customize data traces
fig.update_traces(
    textposition='outside',
    marker=dict(color = colors))

# Customize Layout
fig.update_layout(
    template = "none",
    title = dict(text="Number of Passengers per Titanic Lifeboat", x=0, xref='paper', font=dict(weight="bold", size=28)),
    yaxis = dict(title=None, showline=False, ticklabelstandoff=5),
    xaxis = dict(visible=False),
    font = dict(size=16, family = "Cambria"),
    height = 600, width = 900,
    annotations = [dict(text="These lifeboats rescued <span style='color:orange; font-weight:bold'>fewer than 20 passengers</span>", 
                        x=20, y=6, xanchor='left', xref='x', showarrow=True, ax=40, ay=0)]
)
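
The color list in the cell above is written out by hand (13 grey entries followed by 7 orange ones), so it stops matching the bars as soon as the data change. One possible alternative is to derive the list from the counts. The sketch below uses hypothetical toy counts in place of `plotdata` and assumes that the marker colors are matched to the bars in the row order of the data frame.

```python
import pandas as pd

# Hypothetical counts, standing in for the notebook's `plotdata`
toy = pd.DataFrame({
    "lifeboat": ["boat A", "boat B", "boat C", "boat D"],
    "count": [40, 25, 19, 5],
})

threshold = 20  # same cut-off as in the annotation text

# One color per row: highlight small counts, de-emphasize the rest
colors = ["orange" if n < threshold else "lightgrey" for n in toy["count"]]
print(colors)
```

The resulting list can be passed to `update_traces(marker=dict(color=colors))` in the same way as the hand-written one.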

## Third iteration

**The figure is currently misleading, i.e. not truthful:**

- First, the numbers computed from our current dataset are lower than the official numbers (see [Wikipedia](https://en.wikipedia.org/wiki/Lifeboats_of_the_Titanic) and [Encyclopedia Titanica](https://www.encyclopedia-titanica.org/titanic/lifeboats/lifeboat-1/)). The probable reason is that some people in the boats were not recorded by name.
- Second, the absolute numbers shown above are misleading if the lifeboats have different capacities.

--> Look for better data!
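
To see why capacities matter, here is a small sketch with made-up numbers (they are not taken from the Titanic data): two boats carry the same number of people, yet one is three-quarters full and the other less than half full.

```python
import pandas as pd

# Made-up numbers for illustration only, not real lifeboat figures
toy = pd.DataFrame({
    "lifeboat": ["boat X", "boat Y"],
    "rescued": [30, 30],
    "capacity": [40, 65],
})

# Identical absolute counts, very different occupancy
toy["proportion"] = toy["rescued"] / toy["capacity"]
print(toy)
```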

In [5]:
lifeboats = pd.read_csv("lifeboats.csv")
lifeboats['proportion'] = lifeboats.rescued / lifeboats.capacity

In [6]:
# Sorting
lifeboats.sort_values('proportion', ascending=True, inplace=True)
lifeboat_order = lifeboats.sort_values('proportion', ascending=False).lifeboat.tolist()

# Colors
colors = ["orange"]*9 + ['lightgrey']*10 + ["#4BA123"]

fig = px.bar(lifeboats, y='lifeboat', x='proportion', text='proportion')

# Customize data traces
fig.update_traces(
    marker=dict(color = colors),
    textposition='outside', 
    texttemplate='%{text:.0%}'
)

# Customize Layout
fig.update_layout(
    title = dict(text="Could Titanic's lifeboats have saved more people?", 
                 x=0, xref='paper', 
                 font=dict(weight="bold", size=28)),
    font = dict(size=18, family = "Cambria"),
    template = "none",
    yaxis = dict(title=None, showline=False, ticklabelstandoff=5),
    xaxis = dict(visible=False),
    height = 650, width = 900,
    margin=dict(t=130),
    annotations = [
        dict(text="Proportion of occupied seats per lifeboat", x=0, xref='paper', y=1.14, yref='paper', showarrow=False),
        dict(text="<span style='color:#4BA123; font-weight:bold'>Only lifeboat number 15 was filled to capacity</span>", 
             x=0, xanchor='left', xref='x', y=1.05, yref='paper', showarrow=False, ax=40, ay=0),
        dict(text="<span style='color:orange; font-weight:bold'>Less than 50% occupied</span>", 
                        x=0.48, y=4, xanchor='left', xref='x', showarrow=False, ax=40, ay=0),
        dict(text="Capacity: 65<br>Rescued: 66", x=1, xref='x', y=18.5, yref='y', showarrow=True, ax=0, ay=50, bordercolor="black")
    ]
)
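
The data labels above are formatted with the d3-format specifier `.0%` in `texttemplate`. Python's own `format()` accepts the same specifier, so it can serve as a quick preview of a label. That the two agree for this simple case is an assumption of the sketch, since d3-format and Python formatting are not identical in general. The numbers are the ones quoted in the boxed annotation.

```python
# Capacity and rescued figures as quoted in the boxed annotation above
capacity, rescued = 65, 66

proportion = rescued / capacity

# ".0%" multiplies by 100 and rounds to zero decimal places
label = format(proportion, ".0%")
print(label)  # 102%
```

A label above 100% is worth an explanation in the figure, which is what the boxed annotation provides.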



# 2. Two categorical variables

In [7]:
plotdata = pd.crosstab(df.lifeboat, df.pclass, dropna=False, normalize='index')
plotdata = plotdata.stack().reset_index(name='percent')
plotdata.dropna(subset=['lifeboat'], inplace=True)
plotdata.fillna('Crew members', inplace=True)
plotdata.head()


|   | lifeboat | pclass | percent |
|---|----------|--------|---------|
| 0 | boat 1 | 1st Class Passengers | 0.416667 |
| 1 | boat 1 | 2nd Class Passengers | 0.0 |
| 2 | boat 1 | 3rd Class Passengers | 0.0 |
| 3 | boat 1 | Crew members | 0.583333 |
| 4 | boat 10 | 1st Class Passengers | 0.225806 |
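
The reshaping steps in the cell above can be followed more easily on a tiny example. The sketch below uses hypothetical data with an explicit `"Crew"` label instead of missing values, so it does not depend on how `dropna=False` treats them. It shows that `normalize='index'` turns every row into shares that sum to 1, and that `stack()` followed by `reset_index()` produces the long format that `px.bar` expects.

```python
import pandas as pd

# Hypothetical mini data set: one row per rescued person
toy = pd.DataFrame({
    "lifeboat": ["boat 1", "boat 1", "boat 1", "boat 1", "boat 2", "boat 2"],
    "group": ["Crew", "Crew", "Crew", "1st", "3rd", "3rd"],
})

# Wide table: one row per lifeboat, one column per group, shares per row
wide = pd.crosstab(toy.lifeboat, toy.group, normalize="index")
print(wide)

# Long table: one row per (lifeboat, group) combination
tidy = wide.stack().reset_index(name="percent")
print(tidy)
```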


## Stacked Bar Plot

In [8]:
px.bar(plotdata, x='lifeboat', y='percent', color='pclass', barmode='stack') 

In [9]:
pclass_order = ['Crew members','1st Class Passengers','2nd Class Passengers','3rd Class Passengers']
lifeboat_order = plotdata.loc[plotdata.pclass=="Crew members"].sort_values('percent', ascending=False).lifeboat.tolist()

fig = px.bar(plotdata, y='lifeboat', x='percent', color='pclass', 
             barmode='stack', 
             category_orders=dict(pclass=pclass_order, lifeboat=lifeboat_order),
             color_discrete_sequence=["#FAA81A", "#8F8F8F", "#B4B4B4", "#DCDCDC"])

fig.update_traces(marker=dict(line=dict(color='black', width=0.2)))

fig.update_layout(
    template="none",
    title = dict(text="Did crew members get access to lifeboats?", font=dict(size=28, weight="bold"), x=0.5, xref='paper'),
    xaxis = dict(title=None, showgrid=False, tickformat=".0%"),
    yaxis = dict(title=None, showline=False, zeroline=False, ticklabelstandoff=5),
    bargap = 0,
    height = 600, width = 900,
    legend = dict(orientation='h', yanchor='bottom', y=0.99, xanchor='left', x=0, xref='paper', title=None),
    font = dict(size=16, family = "Cambria"),
    margin = dict(t=100)
)


## Faceted Bar Plot

In [10]:
pclass_order = ['Crew members','1st Class Passengers','2nd Class Passengers','3rd Class Passengers']
lifeboat_order = plotdata.loc[plotdata.pclass=="Crew members"].sort_values('percent', ascending=False).lifeboat.tolist()

fig = px.bar(plotdata, y='lifeboat', x='percent', color='pclass', facet_col='pclass',
             category_orders=dict(pclass=pclass_order, lifeboat=lifeboat_order),
             color_discrete_sequence=["#FAA81A", "#8F8F8F", "#B4B4B4", "#DCDCDC"])

fig.update_traces(dict(showlegend=False))

fig.update_layout(
    template="none",
    title = dict(text="Did crew members get access to lifeboats?", font=dict(size=28, weight="bold"), x=0.5, xref='paper'),
    bargap = 0.2,
    height = 600, width = 1400,
    font = dict(size=16, family = "Cambria"),
    margin = dict(t=100)
)

fig.update_xaxes(dict(title=None, showgrid=True, tickformat=".0%", showline=True))
fig.update_yaxes(dict(title=None, showline=False, ticklabelstandoff=5))

## Grouped Bar Plot

In [11]:
fig = px.bar(plotdata, x='lifeboat', y='percent', color='pclass', barmode='group',
             category_orders=dict(pclass=pclass_order, lifeboat=lifeboat_order),
             color_discrete_sequence=["#FAA81A", "#8F8F8F", "#B4B4B4", "#DCDCDC"])

fig.update_layout(
    template="none",
    title = dict(text="Did crew members get access to lifeboats?", font=dict(size=28, weight="bold"), x=0.5, xref='paper'),
    legend = dict(orientation='h', title=None, y=1.08, x=0.5, xanchor='center'),
    xaxis = dict(title=None),
    height = 600, width = 1400,
    font = dict(size=16, family = "Cambria"),
    margin = dict(t=100)
)