# <center>Class 4<br>Part 1: Plotly</center>
 
## Objectives
In this class we will learn:
<ul>
    <li>How to create basic plotly graphs</li>
    <li>How to edit plotly graphs via update layout</li>
    <li>Creating more complex graphic structures</li>
    <li>Making a Choropleth graph</li>
</ul>

### First things first: we need data

In [1]:
import pandas as pd
import yfinance as yf
ticker_list = 'AAPL, TSLA, MSFT, BAC, GS, AAL'
myTickers = yf.Tickers(ticker_list)
stock_prices = myTickers.history(period="max")

stock_prices.head()

[*********************100%***********************]  6 of 6 completed


,Close,Close,Close,Close,Close,Close,Dividends,Dividends,Dividends,Dividends,...,Stock Splits,Stock Splits,Stock Splits,Stock Splits,Volume,Volume,Volume,Volume,Volume,Volume
Date,AAL,AAPL,BAC,GS,MSFT,TSLA,AAL,AAPL,BAC,GS,...,BAC,GS,MSFT,TSLA,AAL,AAPL,BAC,GS,MSFT,TSLA
1973-02-21,,,1.669042,,,,,,0.0,,...,0.0,,,,,,99200,,,
1973-02-22,,,1.674681,,,,,,0.0,,...,0.0,,,,,,47200,,,
1973-02-23,,,1.669042,,,,,,0.0,,...,0.0,,,,,,133600,,,
1973-02-26,,,1.669042,,,,,,0.0,,...,0.0,,,,,,24000,,,
1973-02-27,,,1.669042,,,,,,0.0,,...,0.0,,,,,,41600,,,
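
The frame above has two column levels (field, then ticker). As a minimal sketch of how selection on such a frame behaves, the example below builds a small synthetic stand-in (the numbers are placeholders, not market data) and shows the difference between single and double brackets:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded frame: two column levels (field, ticker).
dates = pd.date_range("2022-01-03", periods=4, freq="B")
columns = pd.MultiIndex.from_product([["Close", "Volume"], ["AAL", "BAC"]])
demo = pd.DataFrame(np.arange(16, dtype=float).reshape(4, 4), index=dates, columns=columns)

close = demo["Close"]        # selecting the first level leaves the tickers as columns
as_series = close["AAL"]     # single brackets return a Series
as_frame = close[["AAL"]]    # double brackets return a one-column DataFrame

print(type(as_series).__name__, type(as_frame).__name__)
```

Keeping a DataFrame is convenient for Plotly Express, which looks up columns by name.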


## 1. Creating a simple plot

In [2]:
import plotly.express as px

AAL = stock_prices['Close'][['AAL']].dropna()        # Keeping the double bracket makes it a DF
fig = px.line(AAL, y="AAL", x=AAL.index)
fig.show()

fig = px.scatter(AAL, y="AAL", x=AAL.index)
fig.show()


The graph above looks very generic. Let's try to make it look better. Notice that:
<ul>
    <li>I already have plotly express and the data in AAL</li>
    <li>In fact, I even have the plot ready</li>
</ul>

In [3]:
# First, let's have a look at the graph construction
print(fig)

Figure({
    'data': [{'hovertemplate': 'Date=%{x}<br>AAL=%{y}<extra></extra>',
              'legendgroup': '',
              'marker': {'color': '#636efa', 'symbol': 'circle'},
              'mode': 'markers',
              'name': '',
              'showlegend': False,
              'type': 'scattergl',
              'x': array([datetime.datetime(2005, 9, 27, 0, 0),
                          datetime.datetime(2005, 9, 28, 0, 0),
                          datetime.datetime(2005, 9, 29, 0, 0), ...,
                          datetime.datetime(2022, 2, 16, 0, 0),
                          datetime.datetime(2022, 2, 17, 0, 0),
                          datetime.datetime(2022, 2, 18, 0, 0)], dtype=object),
              'xaxis': 'x',
              'y': array([18.19491196, 19.32620049, 19.05280304, ..., 18.81999969, 18.21999931,
                          17.87000084]),
              'yaxis': 'y'}],
    'layout': {'legend': {'tracegroupgap': 0},
               'margin': {'t': 60},
         

In [4]:
# The figure has update methods built in, such as update_layout:
fig.update_layout(
    # this is a function taking multiple kwargs where complex args have to be passed as dictionaries
    title = {
        'text': 'American Airlines Historical Price',
        'y': 1,
        'x': 0.5,
        'font': {'size': 22, 'color':'orange'}
    },
    paper_bgcolor = 'white',
    plot_bgcolor = 'white',
    autosize = False,
    height = 300,
    xaxis = {
        'title': 'Closing Date',
        'showline': True, 
        'linewidth': 1,
        'linecolor': 'black'
    },
    yaxis = {
        'showline': True, 
        'linewidth': 1,
        'linecolor': 'black'
    }
)

# This updates the data portion
fig.update_traces(line = {'color': 'gray', 'width': 1})

# Try the misspelled property below: plotly validates property names and raises a ValueError
# fig.update_traces(line = {'color': 'gray', 'wwidth': 1})

fig.show()

Let's try adding some time-series features to this:

In [5]:
fig.update_layout(
    xaxis=dict(
        rangeselector = dict(
            buttons = list([
                dict(count=1,
                     label="1m",
                     step="month",
                     stepmode="backward"
                     ),
                dict(count=6,
                     label="6m",
                     step="month",
                     stepmode="backward"
                     ),
                dict(count=1,
                     label="YTD",
                     step="year",
                     stepmode="todate"
                     ),
                dict(count=1,
                     label="1y",
                     step="year",
                     stepmode="backward"
                     ),
                dict(step="all")
            ])
        ),
        rangeslider=dict(
            visible=True
        ),
        type="date",
    )
)
fig.show()

## 2. Creating more complex graphics
### Let's get deeper into plotly
So far, we have seen the express capabilities of plotly. Plotly Express is a high-level API: it is concise, but it exposes fewer configuration options directly. Let's now look into the full API.

In [6]:
import plotly.graph_objects as go

# With graph objects we start from an empty figure that is fully flexible.
fig = go.Figure()

# The new figure is now ready to have anything added to it:
fig.add_trace(go.Scatter(y=AAL["AAL"].to_list(), x=AAL.index.to_list()))
fig.show()

The above graph is pretty bare in comparison to the express graph. But we can add all the details to it very easily via updates:

In [7]:
# The figure has update methods built in, such as update_layout:
fig.update_layout(
    # this is a function taking multiple kwargs where complex args have to be passed as dictionaries
    title = {
        'text': 'American Airlines Historical Price',
        'y': 0.95,
        'x': 0.5,
        'font': {'size': 22}
    },
    paper_bgcolor = 'white',
    plot_bgcolor = 'white',
    autosize = False,
    height = 400,
    xaxis = {
        'title': 'Closing Date',
        'showline': True, 
        'linewidth': 1,
        'linecolor': 'black'
    },
    yaxis = {
        'showline': True, 
        'linewidth': 1,
        'linecolor': 'black'
    }
)

# This updates the data portion
fig.update_traces(line = {'color': 'gray', 'width': 1})

fig.update_layout(
    xaxis=dict(
        rangeselector = dict(
            buttons = list([
                dict(count=1,
                     label="1m",
                     step="month",
                     # stepmode="backward"
                     ),
                dict(count=6,
                     label="6m",
                     step="month",
                     # stepmode="backward"
                     ),
                dict(count=1,
                     label="YTD",
                     step="year",
                     stepmode="todate"
                     ),
                dict(count=1,
                     label="1y",
                     step="year",
                     # stepmode="backward"
                     ),
                dict(step="all")
            ])
        ),
        rangeslider=dict(
            visible=True
        ),
        type="date",
    )
)

fig.show()

Here's the fun part: we can add a new scatter to this without having to start from scratch.

In [8]:
BAC = stock_prices['Close'][['BAC']].dropna()
fig.add_trace(go.Scatter(y = BAC["BAC"].to_list(), x = BAC.index.to_list()))
fig.update_layout(
    title = {'text': 'Historical Stock Prices'}
)
print(fig)
fig.show()

Figure({
    'data': [{'line': {'color': 'gray', 'width': 1},
              'type': 'scatter',
              'x': [2005-09-27 00:00:00, 2005-09-28 00:00:00, 2005-09-29 00:00:00,
                    ..., 2022-02-16 00:00:00, 2022-02-17 00:00:00, 2022-02-18
                    00:00:00],
              'y': [18.19491195678711, 19.326200485229492, 19.05280303955078, ...,
                    18.81999969482422, 18.219999313354492, 17.8700008392334]},
             {'type': 'scatter',
              'x': [1973-02-21 00:00:00, 1973-02-22 00:00:00, 1973-02-23 00:00:00,
                    ..., 2022-02-16 00:00:00, 2022-02-17 00:00:00, 2022-02-18
                    00:00:00],
              'y': [1.6690424680709839, 1.6746811866760254, 1.6690424680709839,
                    ..., 47.68000030517578, 46.06999969482422, 45.959999084472656]}],
    'layout': {'autosize': False,
               'height': 400,
               'paper_bgcolor': 'white',
               'plot_bgcolor': 'white',
               

And just like before, we can format the second line.<br>
A few things to notice:
<ul>
    <li>It is easier to update the traces when they have been properly named. A name can be added to each trace upon creation.</li>
    <li>Everything is listed within the figure object, as the printout above shows.</li>
</ul>

In [9]:
# Add 1 more trace with a name
MSFT = stock_prices['Close'][['MSFT']].dropna()
fig.add_trace(go.Scatter(y = MSFT["MSFT"].to_list(), x = MSFT.index.to_list(), name = 'MSFT'))
print(fig)

Figure({
    'data': [{'line': {'color': 'gray', 'width': 1},
              'type': 'scatter',
              'x': [2005-09-27 00:00:00, 2005-09-28 00:00:00, 2005-09-29 00:00:00,
                    ..., 2022-02-16 00:00:00, 2022-02-17 00:00:00, 2022-02-18
                    00:00:00],
              'y': [18.19491195678711, 19.326200485229492, 19.05280303955078, ...,
                    18.81999969482422, 18.219999313354492, 17.8700008392334]},
             {'type': 'scatter',
              'x': [1973-02-21 00:00:00, 1973-02-22 00:00:00, 1973-02-23 00:00:00,
                    ..., 2022-02-16 00:00:00, 2022-02-17 00:00:00, 2022-02-18
                    00:00:00],
              'y': [1.6690424680709839, 1.6746811866760254, 1.6690424680709839,
                    ..., 47.68000030517578, 46.06999969482422, 45.959999084472656]},
             {'name': 'MSFT',
              'type': 'scatter',
              'x': [1986-03-13 00:00:00, 1986-03-14 00:00:00, 1986-03-17 00:00:00,
               

In [10]:
# Names can be updated in bulk, based on the order the traces were added
names = ['AAL', 'BAC', 'MSFT']
fig.for_each_trace(lambda t: t.update(name = names.pop(0)))   #that's a fun way of doing it

# now that each trace has a name, it is easy to update it via selector
fig.update_traces(selector = {'name': 'BAC'}, line = {'color': 'blue'})
fig.update_traces(selector = {'name': 'MSFT'}, line = {'color': 'lightgreen'})

fig.show()

Clearly, MSFT overshadows the others, making it harder to see the actual price changes. Let's recreate the graph, but with MSFT on its own axis.

In [11]:
from plotly.subplots import make_subplots

# make_subplots with secondary_y = True gives us a figure with a second y-axis.
fig = make_subplots(specs = [[{'secondary_y': True}]])

fig.add_trace(go.Scatter(y=AAL["AAL"].to_list(), x=AAL.index.to_list(), name = 'AAL'), 
              secondary_y = False )
fig.add_trace(go.Scatter(y = BAC["BAC"].to_list(), x = BAC.index.to_list(), name = 'BAC'), 
              secondary_y = False )
fig.add_trace(go.Scatter(y = MSFT["MSFT"].to_list(), x = MSFT.index.to_list(), name = 'MSFT'), 
              secondary_y = True )

fig.update_layout(
    title = {
        'text': 'Historical Stock Prices<br>Dual-Axis Graph',
        'y': 0.95,
        'x': 0.5,
        'font': {'size': 22}
    },
    paper_bgcolor = 'white',
    plot_bgcolor = 'white',
    autosize = False,
    # height = 300,
    xaxis = {
        'title': 'Closing Date',
        'showline': True, 
        'linewidth': 1,
        'linecolor': 'black',
        'domain': [0, 0.945]        # just for showing the effect
    },
    yaxis = {
        'title': 'BAC - AAL',
        'showline': True, 
        'linewidth': 1,
        'linecolor': 'black'
    },
    yaxis2 = {
        'title': 'MSFT',
        'showline': True, 
        'linewidth': 1,
        'linecolor': 'black',
        'anchor': 'free',
        'side': 'right',
        'position': 0.95
    },
    legend = {
        'orientation': 'h',
        # This orients the legend, but it will result in an overlay of the graph
        # 'yanchor': 'top', 'y': 0.9,
        'yanchor': 'top', 'y': 1.15,
        'xanchor': 'left', 'x': 0,
    },
    # this fixes the overlay
    margin = {'t': 100},
    
)

# shows all the tooltips at once
fig.layout.hovermode = 'x'

# hiding a trace
#fig.update_traces(selector = {'name': 'BAC'}, visible = 'legendonly')

fig.show()

## 3. Additional Complex Graphs
So far, we have plotted lines and scatterplots, but those are only some of the graph types available in Plotly. Let's check out other graphing tools. Let's make our dataframe a little more interesting by adding returns.<br><br>
We'll keep: prices, returns, and volume.

In [12]:
import numpy as np

price = stock_prices['Close'].copy()
volume = stock_prices['Volume'].copy()
returns = np.log(price / price.shift(1)) * 100   # daily log returns, in percent
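
The returns computed here are log returns in percent, i.e. 100 * ln(P_t / P_{t-1}). A small worked sketch with made-up prices shows one reason they are convenient: they add up over time.

```python
import numpy as np
import pandas as pd

prices = pd.Series([100.0, 110.0, 99.0, 108.9])        # made-up prices
log_returns = np.log(prices / prices.shift(1)) * 100   # first value is NaN

# Summing the daily log returns gives the log return over the whole period.
total = log_returns.sum()                              # NaN is skipped by default
whole_period = np.log(prices.iloc[-1] / prices.iloc[0]) * 100

print(bool(np.isclose(total, whole_period)))
```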

In [13]:
# To summarize by year, group on the index year.
# Note: mean() gives the average daily log return per year; sum() gives the yearly log return.
# returns.groupby(returns.index.year).mean()

Let's plot a bar chart of the returns of AAL grouped by the price level. For starters, let's do it on a $5 interval.

In [14]:
df = price[['AAL']].merge(volume[['AAL']], left_index = True, right_index = True).merge(
                            returns[['AAL']], left_index = True, right_index = True).dropna()
df.columns = ['Price', 'Volume', 'Return']

# We use pandas cut to bin the prices, so we first need the bin edges.
# Technically, prices have a lower limit of 0, but we can take the lower bound from the data.
bin_min = df['Price'].min()
bin_max = df['Price'].max()
bin_size = 5
# Note: despite the name, pctls holds bin edges, not percentiles. The last edge can fall
# below bin_max, in which case pd.cut leaves the highest prices unlabeled (NaN).
pctls = [round(x*bin_size + bin_min, 2) for x in range(int(bin_max // bin_size))]
print(pctls)


[1.66, 6.66, 11.66, 16.66, 21.66, 26.66, 31.66, 36.66, 41.66, 46.66, 51.66]


In [15]:
# Now, let's cut the df based on the above cut-off points
df['labels'] = pd.cut(df['Price'], pctls)
df

Date,Price,Volume,Return,labels
2005-09-28,19.326200,5747900.0,6.031972,"(16.66, 21.66]"
2005-09-29,19.052803,1078200.0,-1.424748,"(16.66, 21.66]"
2005-09-30,19.807001,3123300.0,3.882123,"(16.66, 21.66]"
2005-10-03,20.268942,1057900.0,2.305429,"(16.66, 21.66]"
2005-10-04,20.891150,1768800.0,3.023584,"(16.66, 21.66]"
...,...,...,...,...
2022-02-14,17.430000,37395900.0,-1.027408,"(16.66, 21.66]"
2022-02-15,18.840000,46528700.0,7.778940,"(16.66, 21.66]"
2022-02-16,18.820000,30557500.0,-0.106216,"(16.66, 21.66]"
2022-02-17,18.219999,29950300.0,-3.240026,"(16.66, 21.66]"
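
`pd.cut` intervals are open on the left and closed on the right, and any value outside the edges gets a NaN label. The sketch below uses a few made-up prices to compare edges built like the ones above with an alternative that rounds the edges to multiples of the bin size; `original_edges` and `edges` are illustrative names, and the rounding scheme is an assumption, not the class's method:

```python
import numpy as np
import pandas as pd

prices = pd.Series([1.66, 4.0, 18.2, 33.3, 51.7, 59.35])   # made-up prices
bin_size = 5

# Edges built as in the cell above: they start at the minimum and stop early.
original_edges = [round(x * bin_size + prices.min(), 2)
                  for x in range(int(prices.max() // bin_size))]
unlabeled = pd.cut(prices, original_edges).isna().sum()

# Alternative: round down/up to multiples of bin_size so every price is covered.
low = np.floor(prices.min() / bin_size) * bin_size
high = np.ceil(prices.max() / bin_size) * bin_size
edges = np.arange(low, high + bin_size, bin_size)
labels = pd.cut(prices, edges, include_lowest=True)

print(int(unlabeled), int(labels.isna().sum()))
```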


In [16]:
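# Average return per price bin; observed=False keeps bins that contain no rows
avg_return = df['Return'].groupby(df['labels'], observed=False).mean()

# Use the interval labels on the x-axis so that x and y have the same length
fig = go.Figure(data = go.Bar(name='AAL',
                              x = [str(interval) for interval in avg_return.index],
                              y = avg_return.to_list()))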

fig.show()

Let's try the same thing for all the securities, plus some formatting. Here are a few things to consider:
<ul>
    <li>We need to plot each series separately.</li>
    <li>The bins are the same for all series, so we should consolidate the buckets/bins.</li>
    <li>The easiest way to do this is to stack all the series in the same Price/Return columns.</li>
</ul>

In [17]:
# Let's use the same thing we did for AAL, but for all the equities. 
names = price.columns

frames = []
for name in names:
    df_tmp = price[[name]].merge(volume[[name]], left_index = True, right_index = True).merge(
                            returns[[name]], left_index = True, right_index = True).dropna()
    df_tmp.columns = ['Price', 'Volume', 'Return']
    df_tmp['name'] = name
    frames.append(df_tmp)
# DataFrame.append was removed in pandas 2.0; concatenate the pieces once instead
df = pd.concat(frames)
df.sort_index().tail(10)

Date,Price,Volume,Return,name
2022-02-17,18.219999,29950300.0,-3.240026,AAL
2022-02-17,349.059998,3102400.0,-3.099906,GS
2022-02-17,46.07,49442400.0,-3.435006,BAC
2022-02-17,876.349976,18392800.0,-5.228617,TSLA
2022-02-18,17.870001,30384100.0,-1.939648,AAL
2022-02-18,167.300003,82614200.0,-0.939981,AAPL
2022-02-18,346.040009,2907600.0,-0.868942,GS
2022-02-18,45.959999,37840800.0,-0.239054,BAC
2022-02-18,287.929993,34223200.0,-0.967767,MSFT
2022-02-18,856.97998,22710500.0,-2.235097,TSLA


As before, we need to instantiate the plotly figure with the data, so we have to create the data first. The data should be a list of plotly graph objects (one trace per ticker).

In [18]:
# Build one Bar trace per ticker and collect the traces in a list
data = []
for name in names:
    # We need to get the grouping by name
    this_name = df.loc[df['name'] == name].reset_index(drop=True)
    this_name['labels'] = pd.qcut(this_name['Price'], 10)
    this_data = this_name['Return'].groupby(this_name['labels']).mean()
    data.append(
        go.Bar(
            name = name, 
            x = [str(round(x.left, 2)) + '-' + str(round(x.right, 2)) for x in this_data.index.categories],
            y = this_data.to_list()
        )
    )
data

[Bar({
     'name': 'AAL',
     'x': [1.66-5.52, 5.52-8.55, 8.55-11.8, 11.8-16.91, 16.91-24.41, 24.41-31.41,
           31.41-36.1, 36.1-40.68, 40.68-45.94, 45.94-59.35],
     'y': [-0.4831565684822454, 0.0559507316169483, 0.027935341542606334,
           0.20804566338291566, 0.05849390602244656, -0.10920188219785033,
           0.012722172349122084, 0.03895530243701193, -0.07892564947060149,
           0.2653526685885149]
 }),
 Bar({
     'name': 'AAPL',
     'x': [0.04-0.1, 0.1-0.2, 0.2-0.27, 0.27-0.31, 0.31-0.39, 0.39-1.4, 1.4-5.83,
           5.83-18.18, 18.18-39.82, 39.82-181.78],
     'y': [-0.11518683933224018, 0.030596640170628034, -0.11824788205637009,
           0.0965573361478024, 0.1575543044658219, 0.1980426703873283,
           0.1255196372178466, 0.10470496244602194, 0.08009714053317014,
           0.15563826686267435]
 }),
 Bar({
     'name': 'BAC',
     'x': [0.3-0.59, 0.59-1.2, 1.2-2.29, 2.29-5.24, 5.24-8.8, 8.8-13.45,
           13.45-16.5, 16.5-22.73, 22.73-30.16, 3
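
The bucket boundaries above differ per ticker because `pd.qcut` bins by quantile, whereas `pd.cut` bins by value range. A short sketch on skewed synthetic values illustrates the difference:

```python
import numpy as np
import pandas as pd

# Skewed synthetic values: most are small, a few are very large.
values = pd.Series(np.arange(1, 101, dtype=float) ** 3)

equal_width = pd.cut(values, 5)     # five bins of equal width
equal_count = pd.qcut(values, 5)    # five bins holding the same number of values

width_counts = equal_width.value_counts(sort=False).to_list()
count_counts = equal_count.value_counts(sort=False).to_list()
print(width_counts)
print(count_counts)
```

With skewed data such as prices or volumes, equal-width bins put most observations in the first bucket, while quantile bins keep the groups the same size.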

In [19]:
#let's make this into a function so that we can try different ways:
def make_barchart(data):
    fig = go.Figure(data = data)

    fig.update_layout(
        barmode = 'group',
        title = 'Bar Chart of Equity Returns grouped by Prices',
        paper_bgcolor = 'white',
        plot_bgcolor = 'white',
        xaxis = dict(
            showline = True, 
            linewidth = 2, 
            linecolor = 'black'
        ),
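        yaxis=dict(
            # titlefont is deprecated in recent Plotly versions; set the font through the title dict
            title = dict(text = 'Stock Returns', font = dict(size = 16)),
            tickfont_size = 14,
            gridcolor = '#dfe5ed'
        )
    )

    fig.layout.hovermode = 'x' # hovering shows the values of all traces at that x position
    return fig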

make_barchart(data).show()

In [32]:
# let's build ranges based on all the volumes so that we can consolidate them
cut_offs = pd.qcut(df['Volume'], 10).drop_duplicates()
cut_offs_l = [round(x.left,2) for x in cut_offs]
cut_offs_r = [round(x.right,2) for x in cut_offs]
cut_offs = cut_offs_l + [cut_offs_r[-1]]
cut_offs.sort()
cut_offs


[-0.0,
 1016060.0,
 3233800.0,
 6481560.0,
 14074880.0,
 32370500.0,
 52821760.0,
 79939440.0,
 132890240.0,
 276628800.0,
 7421640800.0]
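
The same kind of edge list can also be obtained directly from `pd.qcut` through its `retbins` argument. A sketch on synthetic volumes (the values are placeholders) that then reuses the shared edges with `pd.cut`:

```python
import numpy as np
import pandas as pd

volume = pd.Series(np.arange(1, 1001, dtype=float) ** 2)   # synthetic volumes

# retbins=True returns the bin edges alongside the labels.
_, edges = pd.qcut(volume, 10, retbins=True)
cut_offs = edges.tolist()

# The shared edges can now be applied to any subset of the data.
subset = volume.iloc[::7]
subset_labels = pd.cut(subset, cut_offs, include_lowest=True)

print(len(cut_offs), int(subset_labels.isna().sum()))
```

Here the first edge is the exact minimum, so `include_lowest=True` is needed for the smallest value to land in the first bin.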

In [33]:
data = []
for name in names:
    # We need to get the grouping by name
    this_name = df.loc[df['name'] == name].reset_index(drop=True)
    this_name['labels'] = pd.cut(this_name['Volume'], cut_offs)
    this_data = this_name['Price'].groupby(this_name['labels']).mean()
    data.append(
        go.Bar(
            name = name, 
            x = [str(round(x.left, 2)) + '-' + str(round(x.right, 2)) for x in this_data.index.categories],
            y = this_data.to_list()
        )
    )

make_barchart(data).show()

Maps - do population growth with animation
Multiple plots - do a scatter/line with different time series