<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Getting-the-data" data-toc-modified-id="Getting-the-data-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Getting the data</a></span></li><li><span><a href="#plotly-express" data-toc-modified-id="plotly-express-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>plotly express</a></span></li><li><span><a href="#Exercise-1---10-minutes" data-toc-modified-id="Exercise-1---10-minutes-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Exercise 1 - 10 minutes</a></span></li><li><span><a href="#plotly-graph-objects" data-toc-modified-id="plotly-graph-objects-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>plotly graph objects</a></span><ul class="toc-item"><li><span><a href="#Getting-the-data-ready" data-toc-modified-id="Getting-the-data-ready-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Getting the data ready</a></span></li><li><span><a href="#Bar-Charts" data-toc-modified-id="Bar-Charts-4.2"><span class="toc-item-num">4.2&nbsp;&nbsp;</span>Bar Charts</a></span></li><li><span><a href="#Scatterplot" data-toc-modified-id="Scatterplot-4.3"><span class="toc-item-num">4.3&nbsp;&nbsp;</span>Scatterplot</a></span></li><li><span><a href="#Line-Charts" data-toc-modified-id="Line-Charts-4.4"><span class="toc-item-num">4.4&nbsp;&nbsp;</span>Line Charts</a></span><ul class="toc-item"><li><ul class="toc-item"><li><span><a href="#When-using-graph-objects,-line-charts-are-scatter-charts-with-connected-marks." data-toc-modified-id="When-using-graph-objects,-line-charts-are-scatter-charts-with-connected-marks.-4.4.0.1"><span class="toc-item-num">4.4.0.1&nbsp;&nbsp;</span>When using graph objects, line charts are scatter charts with connected marks.</a></span></li></ul></li></ul></li></ul></li><li><span><a href="#Exercise---30-minutes" data-toc-modified-id="Exercise---30-minutes-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Exercise - 30 minutes</a></span></li></ul></div>

https://plotly.com/python-api-reference/plotly.express.html

https://plotly.com/python/

1. Plotly express
    - bar chart
    - line chart
    - scatterplot
    - exercise - pick two, create share  

2. plotly graph objects (go)
    - figure structure - The structure of a figure - data, traces and layout explained
        - https://plotly.com/python/figure-structure/
        - tree of attributes
        - data (aka traces)
        - layout
        - frames (used in animated plots)
    - display figures
        - in a notebook or script: fig.show()
        - renderers (png, jpeg, etc.): fig.show(renderer="png", width=800, height=300)
        - export to html
        - static image export using Kaleido: https://plotly.com/python/static-image-export/
    - bar charts
    - line charts
    - scatterplot
    - map
3. subplots
    - https://plotly.com/python/creating-and-updating-figures/
    - go down to subplot section
        

# Getting the data

These next few steps read data from data.cdc.gov and do some clean-up and data prep.

In [None]:
import requests
import pandas as pd
import numpy as np
import plotly.express as px

In [None]:
# Get the data from CDC and look at it in json format

response = requests.get("https://data.cdc.gov/resource/saz5-9hgg.json")
jsonhold = response.json()
#jsonhold

In [None]:
# Put the data into a DataFrame

vaccines = pd.DataFrame(jsonhold)

# Create month and day columns

vaccines['month'] = pd.to_datetime(vaccines['week_of_allocations']).dt.month
vaccines['day'] =  pd.to_datetime(vaccines['week_of_allocations']).dt.day


# Changing the datatypes & column names

vaccines['month'] = vaccines.month.astype(str)
vaccines['day'] = vaccines.day.astype(str)
vaccines['_1st_dose_allocations'] = pd.to_numeric(vaccines['_1st_dose_allocations']).astype(int)
vaccines['_2nd_dose_allocations'] = pd.to_numeric(vaccines['_2nd_dose_allocations']).astype(int)
vaccines['_2nd_dose_allocations'] = vaccines._2nd_dose_allocations*1.2


short_names = {'_1st_dose_allocations':'first',
               '_2nd_dose_allocations':'second'}
vaccines.rename(columns=short_names, inplace=True)
vaccines = vaccines[vaccines.jurisdiction.isin(['Massachusetts','New Hampshire', 'Rhode Island'])]


vaccines.head()
               

In [None]:
vaccines.shape

In [None]:
vaccines = vaccines.sort_values(by='month')

fig = px.line(vaccines, 
              x = 'month', 
              y = 'first', 
              color = 'jurisdiction',
              markers = True,
              symbol = 'jurisdiction')
fig.show()
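
One thing to keep in mind: `month` is stored as a string in this notebook, so a plain sort on it is lexicographic rather than numeric. The sketch below uses a small made-up frame (the values are hypothetical, not CDC data) to show the difference and one possible way around it with the `key` argument of `sort_values`.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's data: month stored as strings.
demo = pd.DataFrame({"month": ["12", "1", "2", "10"], "first": [5, 10, 20, 40]})

# A plain sort on strings is lexicographic: "10" and "12" land before "2".
as_text = demo.sort_values(by="month")["month"].tolist()

# Sorting with a numeric key gives numeric order without changing the dtype.
as_number = demo.sort_values(by="month", key=lambda s: s.astype(int))["month"].tolist()

print(as_text)
print(as_number)
```

Numeric order is not necessarily chronological order when the data spans a year boundary, so sorting on the original date column may be the better choice in that case.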

In [None]:
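# numeric_only=True keeps string columns (e.g. jurisdiction) from being concatenated
v_day = vaccines.groupby('day').sum(numeric_only=True).reset_index()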
v_day.head()

In [None]:
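# numeric_only=True keeps string columns (e.g. jurisdiction) from being concatenated
v_month = vaccines.groupby('month').sum(numeric_only=True).reset_index()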
v_month.head()

In [None]:
# numeric_only=True keeps string columns from being concatenated
v_sm = vaccines.groupby(['jurisdiction','month']).sum(numeric_only=True).reset_index()
v_sm.head()

# plotly express

https://plotly.com/python-api-reference/plotly.express.html

In [None]:
fig = px.line(v_day, x = 'day', y = 'first')
fig.show()

In [None]:
fig = px.line(v_sm, 
              x = 'month', 
              y = 'first', 
              color = 'jurisdiction',
              markers = True,
              symbol = 'jurisdiction',
              text = 'first')
fig.show()

In [None]:
fig = px.scatter(v_sm, 
              x = 'first', 
              y = 'second')
fig.show()

In [None]:
# Using aggregated data
fig = px.bar(v_sm, 
              x = 'month', 
              y = 'first')
fig.show()

In [None]:
fig = px.bar(v_sm, 
             x = 'month', 
             y = 'first',
             color = 'jurisdiction')
fig.show()

In [None]:
# Continuous color
fig = px.bar(v_month, 
             x = 'month', 
             y = 'first',
             color = 'second')
fig.show()

In [None]:
# Unaggregated data

fig = px.bar(vaccines, x = 'jurisdiction', y = 'first', color = 'month')
fig.show()

In [None]:
# A more dramatic example of the same phenomenon

df = px.data.tips()
fig = px.bar(df, 
             x="sex", 
             y="total_bill", 
             color='time')
fig.show()

In [None]:
# Stacked unaggregated data

fig = px.bar(vaccines, x = 'jurisdiction', y = 'first', color = 'month')
fig.show()

In [None]:
# Side-by-side unaggregated data

fig = px.bar(vaccines, 
             x = 'jurisdiction', 
             y = 'first', 
             color = 'month',
             barmode = 'group')
fig.show()

In [None]:
# Use histogram to aggregate

fig = px.histogram(vaccines, 
             x = 'jurisdiction', 
             y = 'first', 
             color = 'month',
             barmode = 'group')
fig.show()

In [None]:
# faceted subplots   ##### Different dataset!

df = px.data.tips()
fig = px.bar(df, 
             x="sex", 
             y="total_bill", 
             color="smoker", 
             barmode="group",
             facet_row="time", 
             facet_col="day",
             category_orders={"day": ["Thur", "Fri", "Sat", "Sun"],
                              "time": ["Lunch", "Dinner"]})
fig.show()

# Exercise 1 - 10 minutes

In [None]:
# Exercise 1 plotly express 1 - pie chart

In [None]:
# Exercise 1 plotly express 2 - boxplot

# plotly graph objects

## Getting the data ready

In [None]:
import plotly.graph_objects as go
import pandas as pd
ob = pd.read_csv('https://raw.githubusercontent.com/jimcody2014/Python-Data/main/outbreaks-dashboard.csv')
ob.head()

In [None]:
ob_month = ob.groupby('Month')[['Illnesses','Hospitalizations', 'Fatalities']].sum().reset_index()

In [None]:
oby = ob.groupby('Year')[['Illnesses','Hospitalizations', 'Fatalities']].sum().reset_index()

In [None]:
obs = ob.groupby('State')[['Illnesses','Hospitalizations', 'Fatalities']].sum().reset_index()

## Bar Charts

In [None]:
# Basic graph object
fig = go.Figure(
    data=[go.Bar(x=['apples', 'oranges', 'bananas'], y=[1, 3, 2])],
    layout=go.Layout(
        title=go.layout.Title(text="A Figure Specified By A Graph Object")
    )
)

fig.show()

In [None]:
print(fig)

In [None]:
# Very minimal

fig = go.Figure([go.Bar(x=['apples', 'oranges', 'bananas'], y=[1, 3, 2])])
fig.show()

In [None]:
# With dataframe data - version 1
# <br> inserts a line break in the hover label (there is no closing </br> tag)
fig = go.Figure(go.Bar(x=ob['Month'], y = ob['Illnesses'], hovertemplate = "%{x}<br>Illnesses: %{y}"))
fig.show()

In [None]:
# With dataframe data - version 2 - just a different way of accessing the variables

fig = go.Figure(go.Bar(x=ob.Month, y = ob.Illnesses))
fig.show()

In [None]:
# With aggregated dataframe data
fig = go.Figure(go.Bar(x=ob_month.Month, y = ob_month.Illnesses))
fig.show()

In [None]:
fig = go.Figure(go.Bar(x=ob_month.Month, y = ob_month.Illnesses))
fig.update_layout(xaxis={'categoryorder':'array', 'categoryarray':['January','February','March','April','May','June','July','August',
                              'September','October','November','December']})
fig.show()
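
An alternative to setting `categoryarray` in the layout is to order the data itself before plotting, since plotly's default category order follows the order in which values appear in the trace. The sketch below assumes full month names, as in the cell above, and uses a small made-up frame in place of `ob_month`.

```python
import pandas as pd

month_order = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]

# Hypothetical aggregated frame; groupby returns months alphabetically.
demo = pd.DataFrame({"Month": ["April", "February", "January", "March"],
                     "Illnesses": [30, 27, 37, 33]})

# An ordered categorical makes sort_values follow calendar order.
demo["Month"] = pd.Categorical(demo["Month"], categories=month_order, ordered=True)
demo = demo.sort_values("Month").reset_index(drop=True)

ordered_months = demo["Month"].astype(str).tolist()
print(ordered_months)
```

Sorting the frame this way keeps the order consistent across every chart built from it, whereas the layout approach has to be repeated per figure.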

In [None]:
# Multiple traces
fig = go.Figure(
    data=[go.Bar(name = 'ill', x=ob_month.Month, y = ob_month.Illnesses),
         go.Bar(name = 'hosp', x=ob_month.Month, y = ob_month.Hospitalizations)],
    layout=go.Layout(
        title=go.layout.Title(text="A Figure Specified By A Graph Object")
    )
)

fig.show()

In [None]:
# Layout update
fig = go.Figure(
    data=[go.Bar(name = 'ill', x=ob_month.Month, y = ob_month.Illnesses),
         go.Bar(name = 'hosp', x=ob_month.Month, y = ob_month.Hospitalizations)],
    layout=go.Layout(
        title=go.layout.Title(text="A Figure Specified By A Graph Object")
    )
)
fig.update_layout(barmode='stack')
fig.show()

In [None]:
# From the documentation - Adding multiple 'traces'
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

fig = go.Figure()
fig.add_trace(go.Bar(
    x=months,
    y=[20, 14, 25, 16, 18, 22, 19, 15, 12, 16, 14, 17],
    name='Primary Product',
    marker_color='indianred'
))
fig.add_trace(go.Bar(
    x=months,
    y=[19, 14, 22, 14, 16, 19, 15, 14, 10, 12, 12, 16],
    name='Secondary Product',
    marker_color='lightsalmon'
))

# Here we modify the tickangle of the xaxis, resulting in rotated labels.
fig.update_layout(barmode='group', xaxis_tickangle=-45)
fig.show()

In [None]:
# Modifying the Hover text & traces update

fig = go.Figure(go.Bar(x=ob_month.Month, y = ob_month.Illnesses,
                      hovertext=['A lot', 'medium', 'Big']))

fig.update_traces(marker_color='rgb(158,202,225)', marker_line_color='rgb(8,48,107)',
                  marker_line_width=1.5, opacity=0.6)
fig.update_layout(title_text='Outbreaks by Month')
fig.show()

In [None]:
# Modifying colors

# amts = [37,27,33,30,29,30,35,33,37,32,27,24]
colors = ['lightslategray',] * 12
colors[11] = 'crimson'

fig = go.Figure(go.Bar(x=ob_month.Month, y = ob_month.Illnesses,
                      hovertext=['A lot', 'medium', 'Big'],
                      text = ob_month.Illnesses,
                      textposition = 'auto',
                      marker_color = colors)
               )

fig.update_layout(title_text='Outbreaks by Month')

fig.update_traces(texttemplate='%{text:.2s}', textposition='outside')
fig.update_layout(uniformtext_minsize=8, uniformtext_mode='hide')

fig.show()

In [None]:
# Sorting as part of the layout

fig = go.Figure(
    data=[go.Bar(name = 'ill', x=ob_month.Month, y = ob_month.Illnesses),
         go.Bar(name = 'hosp', x=ob_month.Month, y = ob_month.Hospitalizations)],
    layout=go.Layout(
        title=go.layout.Title(text="A Figure Specified By A Graph Object")
    )
)
fig.update_layout(barmode='stack', xaxis={'categoryorder':'total ascending'})  # or 'total descending'
fig.show()

## Scatterplot

**Reminder:** `ob` is the outbreaks data. `ob_month` is the outbreak data aggregated by month.

In [None]:
fig = go.Figure(data=go.Scatter(x=ob_month.Illnesses, y=ob_month.Fatalities, mode = 'markers'))
fig.show()


In [None]:
# Same data as above, built with add_trace and a custom marker color
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=ob_month.Illnesses, 
    y=ob_month.Fatalities, 
    mode = 'markers',
    marker_color='indianred'
))
fig.show()

When using Plotly graph objects, **Scatter** is also used to create line charts. The `mode` argument (`'markers'`, `'lines'`, or `'lines+markers'`) changes the style.

In [None]:
# From documentation

import numpy as np
np.random.seed(1)

N = 100
random_x = np.linspace(0, 1, N)
random_y0 = np.random.randn(N) + 5
random_y1 = np.random.randn(N)
random_y2 = np.random.randn(N) - 5

# Create traces
fig = go.Figure()
fig.add_trace(go.Scatter(x=random_x, y=random_y0, mode='lines', name='lines'))
fig.add_trace(go.Scatter(x=random_x, y=random_y1, mode='lines+markers', name='lines+markers'))
fig.add_trace(go.Scatter(x=random_x, y=random_y2, mode='markers', name='markers'))

fig.show()

In [None]:
# Change the marker size
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=ob_month.Illnesses, 
    y=ob_month.Hospitalizations, 
    mode = 'markers',
    marker_size=ob_month.Fatalities,
    marker_color='indianred'
    
# Below are different formatting options to try.    
    
    #marker_color = ob_month.Fatalities
    #marker=dict(
    #    size=16,
    #    color=ob_month.Fatalities, #set color equal to a variable
    #    colorscale='inferno', # one of plotly colorscales
    #    showscale=True
    #)
))
#fig.update_traces(mode='markers', marker_line_width=2, marker_size=ob_month.Fatalities)
# If multiple traces exist, the update will be applied to all traces.

#fig.update_layout(title='Sized Scatterplot')

# Update the x axes
#fig.update_xaxes(tickangle = 90,title_text = "Illnesses",title_font={"size": 20},title_standoff = 25)
#fig.update_xaxes(showline=True, linewidth=2, linecolor='black')
#fig.update_xaxes(showgrid=False)

# Update the y axes
#fig.update_yaxes(title_text = "Hospitalizations",title_standoff = 25)
#fig.update_yaxes(showline=True, linewidth=2, linecolor='black')
#fig.update_yaxes(title_font=dict(size=18, family='Courier', color='crimson'))
#fig.update_yaxes(ticklabelposition="inside top", title='Hospitalizations')



fig.show()

# https://plotly.com/python/builtin-colorscales/

In [None]:
# Using a large dataset - from documentation
N = 100000
fig = go.Figure(data=go.Scattergl(
    x = np.random.randn(N),
    y = np.random.randn(N),
    mode='markers',
    marker=dict(
        color=np.random.randn(N),
        colorscale='Viridis',
        line_width=1
    )
))
fig.show()

## Line Charts

#### When using graph objects, line charts are scatter charts with connected marks.

In [None]:
# Line charts are Scatter traces with connected markers.
# With no mode given, go.Scatter connects the points with a line
# (plus markers when the trace has fewer than 20 points).

fig = go.Figure(go.Scatter(x=oby.Year, y=oby.Illnesses))
fig.show()

In [None]:
fig = go.Figure()

fig.add_trace(go.Scatter(x=oby.Year, 
                           y=oby.Illnesses,
                           name = 'Illnesses'))

fig.add_trace(go.Scatter(x=oby.Year, 
                         y=oby.Hospitalizations,
                         name = 'Hospitalizations',
                         line=dict(color='lightgrey', width=4, dash='dot')))
# dash options include 'dash', 'dot', and 'dashdot'

fig.add_trace(go.Scatter(x=oby.Year, 
                           y=oby.Fatalities,
                           name = 'Fatalities'))

# The figure shows three measures, so the title and y-axis label are kept general
fig.update_layout(title='Outbreak Outcomes by Year',
                   xaxis_title='Year',
                   yaxis_title='Count')


fig.show()

# Exercise - 30 minutes

- Create a new notebook (don't forget the imports)
- Name the notebook **Diabetes Analysis Dashboard**
- Read in the diabetes_for_plotly dataset
- Group data as needed
- Use express or graph objects
- Create a scatter plot of any two measures.  Use a third measure to adjust the size.  Color by a categorical value. Add hover text to show the age group.
- Create a side-by-side bar chart showing number of lab procedures and number of non lab procedures by gender.
- Create a line chart showing the number of medications by month.
- Create a line chart showing the number of procedures by month.
- Create a fifth chart of your choice (NOT scatter, bar or line) using the documentation.

https://bitbucket.org/jimcody/sampledata/raw/b2aa6df015816ec35afc482b53df1b7ca7a31f80/diabetes_for_plotly.csv