# Introduction to Plotly
Frauke Albrecht, Bendix Haß, Marcel Meyer, Sebastian Thomas @ neue fische Bootcamp Data Science

- **Interactive plots**
- Product of the company Plotly
- Graphs can be stored ...
  * ... online in a personal plotly profile
  * ... locally (interactive as .html or static as .png export)
- Open source plotting library with over 40 different chart types (some in 3D)
- Geo support for displaying data on maps, e.g. within the outline of a country
- Easy to get a first impression of the data

In this introduction, we concentrate on `plotly.graph_objects`. Plotly also offers other interfaces for creating figures, such as the higher-level `plotly.express`.

Tutorial: [https://plot.ly/python/](https://plot.ly/python/), cheat sheet: [https://images.plot.ly/plotly-documentation/images/python_cheat_sheet.pdf](https://images.plot.ly/plotly-documentation/images/python_cheat_sheet.pdf)

## Installation
If plotly isn't installed in your environment (`nf.yml` already contains plotly), uncomment and run the following line:

In [None]:
# %conda install plotly

## Import
We import some packages, in particular `plotly.graph_objects`, and load a well-known data set on red wine quality.

In [None]:
import plotly.graph_objects as go
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
df = pd.read_csv('winequality-red.csv', delimiter=';')
df.columns

## Create a histogram
Creating an interactive plotly graphic with `go` takes two steps. First, we create a data object, here a `Histogram` object, which stores the data together with additional information on how to display it, such as a `name`. Second, we pass the data object to a `Figure` object, which displays it.

In [None]:
data = go.Histogram(x=df['quality'], name='Quality')
print(type(data))
data

In [None]:
fig = go.Figure(data, layout={'title': 'Histogram of quality', 
                              'xaxis_title': 'Quality',
                              'yaxis_title': 'Frequency'})
print(type(fig))
fig.show()  # or just `fig` if it is the last line of the cell

Of course, we could combine the steps, if we wanted:

In [None]:
go.Figure(go.Histogram(x=df['quality'], name='Quality'),
          layout={'title': 'Histogram of quality',
                  'xaxis_title': 'Quality',
                  'yaxis_title': 'Frequency'}
         ).show()

## Create a scatter plot

In [None]:
data = go.Scatter(x=df['free sulfur dioxide'], y=df['total sulfur dioxide'], mode='markers')
print(type(data))
data

In [None]:
fig = go.Figure(data, layout={'title': 'Scatter plot: Free vs total sulfur dioxide',
                              'xaxis_title': 'Free sulfur dioxide',
                              'yaxis_title': 'Total sulfur dioxide'})
fig.show()

## Create a box plot

In [None]:
data1 = go.Box(y=df['fixed acidity'], name='fixed acidity')
data2 = go.Box(y=df['alcohol'], name='alcohol')
print(type(data1))
data1

In [None]:
fig = go.Figure(data1, layout={'title': 'Box plot of fixed acidity',
                               'yaxis_title': 'Fixed acidity'})
fig.show()

In [None]:
fig = go.Figure([data1, data2], layout={'title': 'Box plots of fixed acidity and alcohol'})
fig.show()

## Create a bar plot

In [None]:
data = go.Bar(y=df['sulphates'].iloc[0:20])
print(type(data))
data

In [None]:
fig = go.Figure(data, layout={'title': 'Bar plot of some sulphate values (restricted to first 20 observations)',
                              'xaxis_title': 'Index of observation',
                              'yaxis_title': 'Concentration of sulphates'})
fig.show()

## Create a heatmap (with correlation)

In [None]:
wine_corr = df.corr()
data = go.Heatmap(z=wine_corr, x=wine_corr.columns, y=wine_corr.columns,
                  hovertemplate='Feat1: %{x}<br />Feat2: %{y}<br />Corr: %{z}')
print(type(data))
data

In [None]:
fig = go.Figure(data, layout={'title': 'Heatmap of feature correlations for red wine'})
fig.show()

## Summary
### Workflow
1. Create the data object
    * `data = go.<plot name>(<appropriate arguments>)`
2. Create the figure object
    * `fig = go.Figure(data, <optional arguments>)`
    * `<optional arguments>` may be: `layout={<property>: <value>}`
3. Show the figure
    * `fig.show()`

## And there is much, much more...

In [None]:
data = go.Scatter3d(x=df['volatile acidity'], y=df['citric acid'], z=df['fixed acidity'], mode='markers', 
                    marker=dict(size=2))
fig = go.Figure(data, layout={'title': 'Some fancy 3d plot'})
fig.show()

In [None]:
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2011_february_us_airport_traffic.csv')
df['text'] = df['airport'] + ' ' + df['city'] + ', ' + df['state'] + ' ' + 'Arrivals: ' + df['cnt'].astype(str)

data = go.Scattergeo(lon = df['long'], lat = df['lat'], text = df['text'], mode = 'markers',
                     marker_color = df['cnt'])
fig = go.Figure(data)
fig.update_layout(
    title = 'Even maps are possible:<br />Most trafficked US airports<br />(Hover for airport names)',
    geo_scope='usa')
fig.show()