# **Chapter 3 - Storytelling with Visualizations**

**Shooting guards versus small forwards**

The sports media agency will be producing a blog comparing the importance of shooting guards and small forwards in points production. They have asked you to produce a scatter plot displaying points and assists per game for each of the two positions, using different glyph colors, sizes, and transparency.

The nba dataset has been filtered for "SG" and "SF", and preloaded for you as two Bokeh source objects called shooting_guards and small_forwards. A HoverTool has also been created to display "player", "team", and "field_goal_perc".

In [1]:
import pandas as pd
from bokeh.plotting import figure
from bokeh.io import output_file, show
from bokeh.io import output_notebook

# Enable viewing Bokeh plots in the notebook
output_notebook()

In [2]:
nba = pd.read_csv('nba.csv')
nba.head()

Unnamed: 0,player,position,minutes,field_goal_perc,three_point_perc,free_throw_perc,rebounds,assists,steals,blocks,points,team,conference,scorer_category
0,Russell Westbrook,PG,34.6,0.425,0.343,0.845,10.7,10.4,1.6,0.4,31.6,OKC,West,High Scorer
1,James Harden,PG,36.4,0.44,0.347,0.847,8.1,11.2,1.5,0.5,29.1,HOU,West,High Scorer
2,Isaiah Thomas,PG,33.8,0.463,0.379,0.909,2.7,5.9,0.9,0.2,28.9,BOS,East,High Scorer
3,Anthony Davis,C,36.1,0.505,0.299,0.802,11.8,2.1,1.3,2.2,28.0,NO,West,High Scorer
4,DeMar DeRozan,SG,35.4,0.467,0.266,0.842,5.2,3.9,1.1,0.2,27.3,TOR,East,High Scorer


In [3]:
shooting_guards = nba.loc[nba["position"] == "SG"]
shooting_guards.head()

Unnamed: 0,player,position,minutes,field_goal_perc,three_point_perc,free_throw_perc,rebounds,assists,steals,blocks,points,team,conference,scorer_category
4,DeMar DeRozan,SG,35.4,0.467,0.266,0.842,5.2,3.9,1.1,0.2,27.3,TOR,East,High Scorer
17,Bradley Beal,SG,34.9,0.482,0.404,0.825,3.1,3.5,1.1,0.3,23.1,WSH,East,High Scorer
22,Klay Thompson,SG,34.0,0.468,0.414,0.853,3.7,2.1,0.8,0.5,22.3,GS,West,High Scorer
23,Devin Booker,SG,35.0,0.423,0.363,0.832,3.2,3.4,0.9,0.3,22.1,PHX,West,High Scorer
35,Zach LaVine,SG,37.2,0.459,0.387,0.836,3.4,3.0,0.9,0.2,18.9,MIN,West,Average Scorer


In [4]:
small_forwards = nba.loc[nba["position"] == "SF"]
small_forwards.head()

Unnamed: 0,player,position,minutes,field_goal_perc,three_point_perc,free_throw_perc,rebounds,assists,steals,blocks,points,team,conference,scorer_category
7,LeBron James,SF,37.8,0.548,0.363,0.674,8.6,8.7,1.2,0.6,26.4,CLE,East,High Scorer
8,Kawhi Leonard,SF,33.4,0.485,0.38,0.88,5.8,3.5,1.8,0.7,25.5,SA,West,High Scorer
11,Kevin Durant,SF,33.4,0.537,0.375,0.875,8.3,4.8,1.1,1.6,25.1,GS,West,High Scorer
13,Jimmy Butler,SF,37.0,0.455,0.367,0.865,6.2,5.5,1.9,0.4,23.9,CHI,East,High Scorer
14,Paul George,SF,35.9,0.461,0.393,0.898,6.6,3.3,1.6,0.4,23.7,IND,East,High Scorer


In [5]:
TOOLTIPS = [("Name", "@player"), ("Team", "@team"), ("Field Goal %", "@field_goal_perc{0.00}")]
fig = figure(x_axis_label="Assists", y_axis_label="Points", title="Shooting Guard vs Small Forward", tooltips=TOOLTIPS)

# Add glyphs for shooting guards
fig.circle(x="assists", y="points", source=shooting_guards, legend_label="Shooting Guard", size=16, fill_color="red", fill_alpha=0.2)

# Add glyphs for small forwards
fig.circle(x="assists", y="points", source=small_forwards, legend_label="Small Forward", size=6, fill_color="green", fill_alpha=0.6)

output_file(filename="sg_vs_sf.html")
show(fig)

**Big shooters**

Traditionally, the tallest basketball players are in the center position, and they primarily shoot close to the basket. However, there has been a trend towards all positions shooting more three-point field goals in recent years.

The agency has a scatter plot visualizing "three_point_perc" versus "field_goal_perc" for centers and power forwards in the NBA. The dataset has been filtered for each position, and Bokeh source objects named centers and power_forwards have been preloaded for you. TOOLTIPS has also been created to display player names and average points per game.

The agency has asked you to change the plot's glyph settings to aid interpretation.

In [6]:
nba['position'].unique()

array(['PG', 'C', 'SG', 'SF', 'PF'], dtype=object)

In [7]:
centers = nba.loc[nba["position"] == "C"]
centers.head()

Unnamed: 0,player,position,minutes,field_goal_perc,three_point_perc,free_throw_perc,rebounds,assists,steals,blocks,points,team,conference,scorer_category
3,Anthony Davis,C,36.1,0.505,0.299,0.802,11.8,2.1,1.3,2.2,28.0,NO,West,High Scorer
5,DeMarcus Cousins,C,34.2,0.452,0.361,0.772,11.0,4.6,1.4,1.3,27.0,NO/SAC,West,High Scorer
12,Karl-Anthony Towns,C,37.0,0.542,0.367,0.832,12.3,2.7,0.7,1.3,25.1,MIN,West,High Scorer
28,Brook Lopez,C,29.6,0.474,0.346,0.81,5.4,2.3,0.5,1.7,20.5,BKN,East,High Scorer
30,Joel Embiid,C,25.4,0.466,0.367,0.783,7.8,2.1,0.9,2.5,20.2,PHI,East,High Scorer


In [8]:
power_forwards = nba.loc[nba["position"] == "PF"]
power_forwards.head()

Unnamed: 0,player,position,minutes,field_goal_perc,three_point_perc,free_throw_perc,rebounds,assists,steals,blocks,points,team,conference,scorer_category
25,Blake Griffin,PF,34.0,0.493,0.336,0.76,8.1,4.9,0.9,0.4,21.6,LAC,West,High Scorer
31,Jabari Parker,PF,33.9,0.49,0.365,0.743,6.2,2.8,1.0,0.4,20.1,MIL,East,High Scorer
33,Harrison Barnes,PF,35.5,0.468,0.351,0.861,5.0,1.5,0.8,0.2,19.2,DAL,West,Average Scorer
34,Kevin Love,PF,31.4,0.427,0.373,0.871,11.1,1.9,0.9,0.4,19.0,CLE,East,Average Scorer
39,Paul Millsap,PF,34.0,0.442,0.311,0.768,7.7,3.7,1.3,0.9,18.1,ATL,East,Average Scorer


In [9]:
fig = figure(x_axis_label="Field Goal Percentage", y_axis_label="Three Point Field Goal Percentage", tooltips=TOOLTIPS)
center_glyphs = fig.circle(x="field_goal_perc", y="three_point_perc", source=centers, legend_label="Center", fill_alpha=0.2)
power_forward_glyphs = fig.circle(x="field_goal_perc", y="three_point_perc", source=power_forwards, legend_label="Power Forward", fill_color="green", fill_alpha=0.6)

# Update glyph size
center_glyphs.glyph.size = 20
power_forward_glyphs.glyph.size = 10

# Update glyph fill_color
center_glyphs.glyph.fill_color = "red"
power_forward_glyphs.glyph.fill_color = "yellow"
output_file(filename="big_shooters.html")
show(fig)

**Evolution of the point guard**

The agency is going to run an article on the evolution of the point guard position in basketball.

They have asked you to produce a line plot displaying points and assists for two players who have redefined the standards of this position: Steph Curry and Chris Paul. Two Bokeh source objects, steph and chris, have been preloaded for you along with a figure.

You will add line glyphs representing points and assists for the two players, using different glyph settings.

In [None]:
fig = figure(x_axis_label="Season", y_axis_label="Performance")

# Add line glyphs for Steph Curry
fig.line(x="season", y="points", source=steph, line_width=2, line_color="green", alpha=0.5, legend_label="Steph Curry Points")
fig.line(x="season", y="assists", source=steph, line_width=4, line_color="purple", alpha=0.3, legend_label="Steph Curry Assists")

# Add line glyphs for Chris Paul
fig.line(x="season", y="points", source=chris, line_width=1, line_color="red", alpha=0.8, legend_label="Chris Paul Points")
fig.line(x="season", y="assists", source=chris, line_width=3, line_color="orange", alpha=0.2, legend_label="Chris Paul Assists")

output_file(filename="point_guards.html")
show(fig)

**Highlighting by glyph size**

The sports media agency you worked with previously has asked for more visualizations. They have requested a plot that uses glyphs of different sizes to convey player statistics.

The nba dataset has been preloaded for you, and subset into two DataFrames, east and west, for the East and West conferences. You'll create a plot visualizing points against assists, with the glyph size depending on how many blocks per game a player averages.

In [17]:
east = nba.loc[nba["conference"] == "East"]
east.head()

Unnamed: 0,player,position,minutes,field_goal_perc,three_point_perc,free_throw_perc,rebounds,assists,steals,blocks,points,team,conference,scorer_category
2,Isaiah Thomas,PG,33.8,0.463,0.379,0.909,2.7,5.9,0.9,0.2,28.9,BOS,East,High Scorer
4,DeMar DeRozan,SG,35.4,0.467,0.266,0.842,5.2,3.9,1.1,0.2,27.3,TOR,East,High Scorer
7,LeBron James,SF,37.8,0.548,0.363,0.674,8.6,8.7,1.2,0.6,26.4,CLE,East,High Scorer
10,Kyrie Irving,PG,35.1,0.473,0.401,0.905,3.2,5.8,1.2,0.3,25.2,CLE,East,High Scorer
13,Jimmy Butler,SF,37.0,0.455,0.367,0.865,6.2,5.5,1.9,0.4,23.9,CHI,East,High Scorer


In [18]:
west = nba.loc[nba["conference"] == "West"]
west.head()

Unnamed: 0,player,position,minutes,field_goal_perc,three_point_perc,free_throw_perc,rebounds,assists,steals,blocks,points,team,conference,scorer_category
0,Russell Westbrook,PG,34.6,0.425,0.343,0.845,10.7,10.4,1.6,0.4,31.6,OKC,West,High Scorer
1,James Harden,PG,36.4,0.44,0.347,0.847,8.1,11.2,1.5,0.5,29.1,HOU,West,High Scorer
3,Anthony Davis,C,36.1,0.505,0.299,0.802,11.8,2.1,1.3,2.2,28.0,NO,West,High Scorer
5,DeMarcus Cousins,C,34.2,0.452,0.361,0.772,11.0,4.6,1.4,1.3,27.0,NO/SAC,West,High Scorer
6,Damian Lillard,PG,35.9,0.444,0.37,0.895,4.9,5.9,0.9,0.3,27.0,POR,West,High Scorer


In [19]:
# Create sizes
east_sizes = east["blocks"]/5
west_sizes = west["blocks"]/5
fig = figure(x_axis_label="Assists", y_axis_label="Points", title="NBA Points, Blocks, and Assists by Conference")

# Add circle glyphs for east
fig.circle(x=east["assists"], y=east["points"], fill_color="blue", fill_alpha=0.3, radius=east_sizes, legend_label="East")

# Add circle glyphs for west
fig.circle(x=west["assists"], y=west["points"], fill_color="red", fill_alpha=0.3, radius=west_sizes, legend_label="West")

output_file(filename="size_contrast.html")
show(fig)
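
Because `radius` is expressed in data units, dividing blocks by 5 is a scale picked by hand for these axes. A minimal sketch of an alternative, assuming you would rather pass screen-pixel values to a glyph's `size`, is to rescale the column into a fixed range. The helper name `scale_to_range` and the 5 to 30 pixel bounds are illustrative assumptions; the block values are the ones visible in the `west.head()` output above.

```python
def scale_to_range(values, out_min=5.0, out_max=30.0):
    """Linearly rescale values so the smallest maps to out_min and the largest to out_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: fall back to the midpoint of the output range
        return [(out_min + out_max) / 2 for _ in values]
    span = hi - lo
    return [out_min + (v - lo) / span * (out_max - out_min) for v in values]

# Blocks per game from the first five rows of west
blocks = [0.4, 0.5, 2.2, 1.3, 0.3]
sizes = scale_to_range(blocks)
print([round(s, 1) for s in sizes])
```

Rescaling this way keeps the smallest glyph visible and caps the largest, whatever units the underlying column uses.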

**Steals vs. assists**

The agency has heard about linear color mapping and would like you to incorporate it into a plot visualizing steals versus assists. You will use linear color mapping to change the glyph color as assists increase.

A source object called source has been created from the nba dataset and preloaded for you.

In [20]:
from bokeh.models import ColumnDataSource
source = ColumnDataSource(data=nba)

In [21]:
# Import required modules
from bokeh.transform import linear_cmap
from bokeh.palettes import RdBu8

# Create mapper
mapper = linear_cmap(field_name="assists", palette=RdBu8, low=min(nba["assists"]), high=max(nba["assists"]))

# Create the figure
fig = figure(x_axis_label="Steals", y_axis_label="Assists", title="Steals vs. Assists")

# Add circle glyphs
fig.circle(x="steals", y="assists", source=source, color=mapper)
output_file(filename="steals_vs_assists.html")
show(fig)
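
To make the mapping concrete, here is a plain-Python approximation of what a linear color mapper does: it splits the interval from `low` to `high` into equal-width bins, one per palette color, and clamps values that fall outside the interval. This is a sketch of the idea rather than Bokeh's implementation; the function name `linear_color`, the four-color palette, and the bounds of 0 and 12 are illustrative assumptions.

```python
def linear_color(value, palette, low, high):
    """Return the palette color for value using equal-width bins between low and high."""
    if high <= low:
        raise ValueError("high must be greater than low")
    n = len(palette)
    # Position of the value within [low, high], scaled to a bin index
    index = int((value - low) / (high - low) * n)
    # Clamp so values at or beyond the ends reuse the first or last color
    index = max(0, min(n - 1, index))
    return palette[index]

palette = ["#2166ac", "#92c5de", "#f4a582", "#b2182b"]  # illustrative 4-color palette
low, high = 0.0, 12.0  # assumed bounds for assists per game

for assists in (2.1, 5.9, 11.2):
    print(assists, linear_color(assists, palette, low, high))
```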

**Adding a color bar**

The agency has requested you include a ColorBar so people viewing the plot will understand the thresholds at which the glyph color changes.

The figure from the previous exercise, a mapper, and glyphs, have all been provided for you.

In [22]:
# Import ColorBar
from bokeh.models import ColorBar

mapper = linear_cmap(field_name="assists", palette=RdBu8, low=min(nba["assists"]), high=max(nba["assists"]))
fig = figure(x_axis_label="Steals", y_axis_label="Assists", title="Steals vs. Assists")
fig.circle(x="steals", y="assists", source=source, color=mapper)

# Create the color_bar
color_bar = ColorBar(color_mapper=mapper["transform"], width=8)

# Update layout with color_bar on the right
fig.add_layout(color_bar, "right")
output_file(filename="steals_vs_assists_color_mapped.html")
show(fig)
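
The thresholds a reader looks for on the color bar are the bin edges of the linear mapping. A minimal sketch, assuming equal-width bins and hypothetical bounds of 0 and 12 assists (the real bounds come from the `assists` column), computes them directly. The helper `bin_edges` is not part of Bokeh.

```python
def bin_edges(low, high, n_colors):
    """Return the n_colors + 1 boundaries of equal-width bins spanning low to high."""
    step = (high - low) / n_colors
    return [low + i * step for i in range(n_colors + 1)]

# RdBu8 has eight colors, so there are nine edges and seven interior thresholds
edges = bin_edges(0.0, 12.0, 8)
print(edges)
```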

**Free throw percentage by position**

The agency has asked for one final plot from you. You'll use factor_cmap to build a scatter plot visualizing free throw percentage versus average points, displaying each player position as a different color.

A source object called source has been created from the nba dataset and preloaded for you. The variable TOOLTIPS, containing the name of the player, has also been created, so it can be viewed when hovering the mouse over the plot.

In [23]:
TOOLTIPS = [('Name', '@player')]

In [24]:
# Import modules
from bokeh.transform import factor_cmap
from bokeh.palettes import Category10_5

# Create positions
positions = ["PG", "SG", "SF", "PF", "C"]
fig = figure(x_axis_label="Free Throw Percentage", y_axis_label="Points", title="Free Throw Percentage vs. Average Points", tooltips=TOOLTIPS)

# Add circle glyphs
fig.circle(x="free_throw_perc", y="points", source=source, legend_field="position", fill_color=factor_cmap("position", palette=Category10_5, factors=positions))

output_file(filename="average_points_vs_free_throw_percentage.html")
show(fig)
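
Conceptually, a categorical mapper pairs each factor with the palette entry at the same position and falls back to a default color for anything not listed (gray is the default `nan_color` of `factor_cmap`). A rough plain-Python equivalent follows, with `color_for` as a hypothetical helper and a hand-typed list of the five Category10 colors standing in for `Category10_5`.

```python
positions = ["PG", "SG", "SF", "PF", "C"]
palette = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"]

# Pair each position with the color at the same index
color_by_position = dict(zip(positions, palette))

def color_for(position, default="gray"):
    """Look up a position's color, using the default for unlisted values."""
    return color_by_position.get(position, default)

print(color_for("SG"))
print(color_for("G-F"))  # not in positions, so the default is used
```

The order of `positions` decides which color each position receives, so reordering the list changes the plot's colors without changing the data.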

**Sales by time and type of day**

The bakery you are working with is considering a review of their opening hours. As such, they have asked you to produce a visualization displaying sales information by the time of day for weekdays and weekends.

The day_time column of bakery contains four values: "Morning", "Afternoon", "Evening", and "Night".

The dataset also contains "Weekend" and "Weekday" values for the day_type column.

You will produce a grouped bar plot visualizing sales by both time and type of day. FactorRange has been imported for you.

The bakery dataset has been grouped by day_time and day_type, stored as grouped_bakery, and preloaded for you. A list of tuples containing every combination of these two columns has been stored as factors and also preloaded for you.

In [25]:
bakery = pd.read_csv('bakery.csv')
bakery.head()

Unnamed: 0.1,Unnamed: 0,transaction,items,day_time,day_type,date,sales
0,0,1,Bread,Morning,Weekend,2016-10-30,1.4
1,3,3,Hot chocolate,Morning,Weekend,2016-10-30,1.9
2,4,3,Cookies,Morning,Weekend,2016-10-30,1.25
3,5,4,Muffin,Morning,Weekend,2016-10-30,2.3
4,6,5,Coffee,Morning,Weekend,2016-10-30,2.2


In [26]:
grouped_bakery = bakery.groupby(["day_time", "day_type"], as_index=False)["sales"].sum()
grouped_bakery.head()

Unnamed: 0,day_time,day_type,sales
0,Afternoon,Weekday,12940.5
1,Afternoon,Weekend,7261.6
2,Evening,Weekday,522.8
3,Evening,Weekend,161.15
4,Morning,Weekday,8982.9


In [28]:
factors = [('Weekday', 'Morning'),
 ('Weekday', 'Afternoon'),
 ('Weekday', 'Evening'),
 ('Weekday', 'Night'),
 ('Weekend', 'Morning'),
 ('Weekend', 'Afternoon'),
 ('Weekend', 'Evening'),
 ('Weekend', 'Night')]

In [29]:
from bokeh.models import NumeralTickFormatter, FactorRange

# Create figure
fig = figure(x_range=FactorRange(*factors), y_axis_label="Sales", title="Sales by type of day")

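# Create bar glyphs, looking up each factor's total so bars match their labels
# (grouped_bakery is sorted by day_time, which is not the order of factors)
sales_by_factor = dict(zip(zip(grouped_bakery["day_type"], grouped_bakery["day_time"]), grouped_bakery["sales"]))
fig.vbar(x=factors, top=[sales_by_factor.get(factor, 0) for factor in factors], width=0.9)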
fig.yaxis[0].formatter = NumeralTickFormatter(format="$0,0")

# Update title text size
fig.title.text_font_size = "25px"

# Update title alignment
fig.title.align = "center"

output_file("sales_by_type_of_day.html")
show(fig)
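
`vbar` pairs `x` and `top` by position, so the bar heights must be listed in the same order as the factors. A minimal sketch of that alignment, assuming outer-level-first tuples as in `factors` above and using only the five totals visible in the `grouped_bakery.head()` output (the remaining combinations default to 0 here purely for illustration):

```python
from itertools import product

day_types = ["Weekday", "Weekend"]
day_times = ["Morning", "Afternoon", "Evening", "Night"]

# Every (day_type, day_time) pair, outer level first
factors = list(product(day_types, day_times))

# Totals keyed by (day_type, day_time); only the rows shown by head()
sales_by_factor = {
    ("Weekday", "Afternoon"): 12940.5,
    ("Weekend", "Afternoon"): 7261.6,
    ("Weekday", "Evening"): 522.8,
    ("Weekend", "Evening"): 161.15,
    ("Weekday", "Morning"): 8982.9,
}

# Build bar heights in factor order, using 0 where no total is listed
tops = [sales_by_factor.get(factor, 0) for factor in factors]
print(list(zip(factors, tops)))
```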

**Products sold by the time of day**

The bakery would like a view of how many products are sold at different times of the day.

A figure, fig, has been set up and preloaded, including a HoverTool to display the time of day, item name, and the number of items sold.

You will need to add a title to the legend so stakeholders understand its meaning, move the legend so it does not obstruct the view of observations, and make the legend interactive so that clicking an entry hides its observations.

In [32]:
bakery.drop(columns=bakery.columns[0], inplace=True)

In [62]:
bakery_grouped = bakery.groupby(["day_time", "day_type", "items"], as_index=False).agg(
    {'transaction': 'count','sales': 'sum'}).reset_index().rename(columns={'transaction':'count'})

bakery_grouped

Unnamed: 0,index,day_time,day_type,items,count,sales
0,0,Afternoon,Weekday,Bread,1060,1484.0
1,1,Afternoon,Weekday,Brownie,139,389.2
2,2,Afternoon,Weekday,Cake,437,1092.5
3,3,Afternoon,Weekday,Coffee,1798,3955.6
4,4,Afternoon,Weekday,Coke,102,112.2
...,...,...,...,...,...,...
100,100,Morning,Weekend,Tea,128,153.6
101,101,Morning,Weekend,Toast,55,55.0
102,102,Morning,Weekend,Truffles,14,55.3
103,103,Night,Weekday,Juice,1,2.4


In [63]:
morning = bakery_grouped.loc[bakery_grouped["day_time"] == "Morning"]
morning.head()

Unnamed: 0,index,day_time,day_type,items,count,sales
67,67,Morning,Weekday,Bread,987,1381.8
68,68,Morning,Weekday,Brownie,61,170.8
69,69,Morning,Weekday,Cake,152,380.0
70,70,Morning,Weekday,Coffee,1679,3693.8
71,71,Morning,Weekday,Coke,21,23.1


In [64]:
afternoon = bakery_grouped.loc[bakery_grouped["day_time"] == "Afternoon"]
afternoon.head()

Unnamed: 0,index,day_time,day_type,items,count,sales
0,0,Afternoon,Weekday,Bread,1060,1484.0
1,1,Afternoon,Weekday,Brownie,139,389.2
2,2,Afternoon,Weekday,Cake,437,1092.5
3,3,Afternoon,Weekday,Coffee,1798,3955.6
4,4,Afternoon,Weekday,Coke,102,112.2


In [65]:
evening = bakery_grouped.loc[bakery_grouped["day_time"] == "Evening"]
evening.head()

Unnamed: 0,index,day_time,day_type,items,count,sales
36,36,Evening,Weekday,Bread,45,63.0
37,37,Evening,Weekday,Brownie,7,19.6
38,38,Evening,Weekday,Cake,23,57.5
39,39,Evening,Weekday,Coffee,66,145.2
40,40,Evening,Weekday,Coke,4,4.4


In [66]:
TOOLTIPS = [('Time of Day', '@day_time'), ('Item', '@items'), ('Volume Sold', '@count')]

In [105]:
fig = figure(x_axis_label="Count of Products Sold", y_axis_label="Sales", title="Bakery Product Sales", tooltips=TOOLTIPS)
fig.circle(x="count", y="sales", source=morning, line_color="red", size=12, fill_alpha=0.4, legend_label="Morning")
fig.circle(x="count", y="sales", source=afternoon, fill_color="purple", size=10, fill_alpha=0.6, legend_label="Afternoon")
fig.circle(x="count", y="sales", source=evening, fill_color="yellow", size=8, fill_alpha=0.6, legend_label="Evening")

# Add legend title
fig.legend.title = "Time of Day"

# Move the legend
fig.legend.location = "top_left"

# Make the legend interactive
fig.legend.click_policy = "hide"
fig.yaxis[0].formatter = NumeralTickFormatter(format="$0.00")
output_file("Sales_by_time_of_day.html")
show(fig)

**Box annotations for sales performance**

The bakery has asked for one last plot from you. The visualization will display sales by date with two box annotations, so they can easily see which dates are under their revenue target of $250.

A figure, fig, with line glyphs, has been created using the code below:
```
sales = bakery.groupby("date", as_index=False)["sales"].sum()
source = ColumnDataSource(data=sales)
fig = figure(x_axis_label="Date", y_axis_label="Revenue ($)")
fig.line(x="date", y="sales", source=source)
fig.xaxis[0].formatter = DatetimeTickFormatter(months="%b %Y")
```
fig is preloaded for you. Your task is to create box annotations to show where sales in the bakery dataset are above or below $250.

In [75]:
from bokeh.models import NumeralTickFormatter, DatetimeTickFormatter

sales = bakery.groupby("date", as_index=False)["sales"].sum()
date_format="%Y-%m-%d"
sales["date"] = pd.to_datetime(sales['date'], format=date_format)
sales.head()

Unnamed: 0,date,sales
0,2016-01-11,243.7
1,2016-01-12,144.5
2,2016-02-11,259.9
3,2016-02-12,207.25
4,2016-03-11,309.8


In [76]:
source = ColumnDataSource(data=sales)
fig = figure(x_axis_label="Date", y_axis_label="Revenue ($)")
fig.line(x="date", y="sales", source=source)
fig.xaxis[0].formatter = DatetimeTickFormatter(months="%b %Y")

In [77]:
from bokeh.models import BoxAnnotation

# Create low_box
low_box = BoxAnnotation(top=250, fill_alpha=0.1, fill_color='red')

# Create high_box
high_box = BoxAnnotation(bottom=250, fill_alpha=0.2, fill_color='green')

# Add low_box
fig.add_layout(low_box)

# Add high_box
fig.add_layout(high_box)

output_file(filename="sales_annotated.html")
show(fig)
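
The two boxes split the y-axis at the target, so any point of the line inside the red box is a date that missed it. The same split can be checked numerically. The sketch below uses the five daily totals shown in the `sales.head()` output above; `TARGET`, `daily_sales`, and `below_target` are illustrative names.

```python
TARGET = 250  # revenue target in dollars

# (date, total sales) pairs from the first five rows of sales
daily_sales = [
    ("2016-01-11", 243.7),
    ("2016-01-12", 144.5),
    ("2016-02-11", 259.9),
    ("2016-02-12", 207.25),
    ("2016-03-11", 309.8),
]

# Dates whose revenue falls inside the red (below-target) box
below_target = [date for date, total in daily_sales if total < TARGET]
print(below_target)
```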

**Setting up a polygon annotation**

A member of a hedge fund, who is an avid sports fan, saw your work for the sports media agency and has reached out as they need some plots produced for stock market analysis.

They are looking into the online media market and have provided you with a dataset called netflix (nflx in the cells below), containing stock prices for Netflix. It has been stored as a source object called source and preloaded for you.

A figure, fig, has been created containing line glyphs. They would like you to highlight a period of significant growth. You plan to use a polygon annotation to draw attention to changes in Netflix's stock price in mid-2017 and add it to the line plot. To start, you need to create the start and end dates and timestamps.

In [97]:
stocks = pd.read_csv('stocks_cleaned.csv')
stocks.head()

Unnamed: 0.1,Unnamed: 0,date,open,high,low,close,volume,name
0,0,2014-06-02,90.5656,90.6899,88.9285,89.8071,92337903,AAPL
1,1,2014-06-03,89.7799,91.2485,89.7499,91.0771,73231620,AAPL
2,2,2014-06-04,91.0628,92.5556,90.8728,92.1171,83870521,AAPL
3,3,2014-06-05,92.3142,92.767,91.8013,92.4785,75951141,AAPL
4,4,2014-06-06,92.8428,93.037,92.0671,92.2242,87620911,AAPL


In [98]:
stocks.drop(columns=stocks.columns[0], inplace=True)
stocks.head(2)

Unnamed: 0,date,open,high,low,close,volume,name
0,2014-06-02,90.5656,90.6899,88.9285,89.8071,92337903,AAPL
1,2014-06-03,89.7799,91.2485,89.7499,91.0771,73231620,AAPL


In [99]:
date_format="%Y-%m-%d"
stocks["date"] = pd.to_datetime(stocks['date'], format=date_format)
stocks.head()

Unnamed: 0,date,open,high,low,close,volume,name
0,2014-06-02,90.5656,90.6899,88.9285,89.8071,92337903,AAPL
1,2014-06-03,89.7799,91.2485,89.7499,91.0771,73231620,AAPL
2,2014-06-04,91.0628,92.5556,90.8728,92.1171,83870521,AAPL
3,2014-06-05,92.3142,92.767,91.8013,92.4785,75951141,AAPL
4,2014-06-06,92.8428,93.037,92.0671,92.2242,87620911,AAPL


In [100]:
nflx = stocks.loc[stocks["name"] == "NFLX"]
nflx.head()

Unnamed: 0,date,open,high,low,close,volume,name
135780,2014-06-02,59.9257,60.4157,58.9285,60.2942,20501467,NFLX
135781,2014-06-03,59.9985,60.7999,59.5785,59.6528,17140074,NFLX
135782,2014-06-04,59.5371,60.6428,59.0428,60.4585,17444735,NFLX
135783,2014-06-05,60.5357,61.3428,59.7957,61.1928,18946123,NFLX
135784,2014-06-06,61.4299,62.1271,61.1942,61.4471,15734915,NFLX


In [101]:
import datetime as dt

nflx = nflx[(nflx['date']>dt.datetime(2016,12,1)) & (nflx['date']<dt.datetime(2017,11,1))]
nflx.head()

Unnamed: 0,date,open,high,low,close,volume,name
136413,2016-12-02,116.75,120.98,116.75,120.81,8953590,NFLX
136414,2016-12-05,120.73,120.75,118.4,119.16,7626283,NFLX
136415,2016-12-06,120.1,124.79,119.42,124.57,11513209,NFLX
136416,2016-12-07,124.48,125.75,123.25,125.39,8204687,NFLX
136417,2016-12-08,125.4,126.35,122.16,123.24,8987712,NFLX


In [102]:
source = ColumnDataSource(data=nflx)
fig = figure(x_axis_label="Date", y_axis_label="Price ($)", title="Netflix Stock Price")
fig.line(x="date", y="close", line_color = "red", source=source)
fig.xaxis[0].formatter = DatetimeTickFormatter(months="%b %Y")

output_file(filename="netflix_stock_price.html")
show(fig)

In [103]:
# Import PolyAnnotation
from bokeh.models import PolyAnnotation

# Create start and end dates
start_date = dt.datetime(2017, 6, 30)
end_date = dt.datetime(2017, 7, 27)

# Create start and end floats
start_float = start_date.timestamp() * 1000
end_float = end_date.timestamp() * 1000
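
One caveat worth knowing: `timestamp()` on a naive `datetime` interprets it in the machine's local time zone, while Bokeh plots naive datetimes as if they were UTC, so the computed edge can land a few hours away from midnight depending on where the code runs. A hedged sketch of a conversion that does not depend on the local time zone follows; the helper name `to_epoch_ms` is an assumption, not a Bokeh function.

```python
import datetime as dt

def to_epoch_ms(naive):
    """Milliseconds since the Unix epoch, reading a naive datetime as UTC."""
    return naive.replace(tzinfo=dt.timezone.utc).timestamp() * 1000

start_float = to_epoch_ms(dt.datetime(2017, 6, 30))
end_float = to_epoch_ms(dt.datetime(2017, 7, 27))

# Number of days between the two edges of the annotation
print((end_float - start_float) / 86_400_000)
```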

**Annotating Netflix stock price growth**

As a reminder, you previously created dates and timestamps, displayed below:
```
start_date = dt.datetime(2017, 6, 30)
end_date = dt.datetime(2017, 7, 27)
start_float = start_date.timestamp() * 1000
end_float = end_date.timestamp() * 1000
```
The final steps to display the Netflix line plot with a polygon annotation are to look up the closing price on the start and end dates, call PolyAnnotation(), and add the annotation to the figure's layout.

In [104]:
source = ColumnDataSource(data=nflx)
fig = figure(x_axis_label="Date", y_axis_label="Price ($)", title="Netflix Stock Price")
fig.line(x="date", y="close", line_color = "red", source=source)
fig.xaxis[0].formatter = DatetimeTickFormatter(months="%b %Y")

# Create start and end dates
start_date = dt.datetime(2017, 6, 30)
end_date = dt.datetime(2017, 7, 27)

# Create start and end floats
start_float = start_date.timestamp() * 1000
end_float = end_date.timestamp() * 1000

# Create start and end data
start_data = nflx.loc[nflx["date"] == start_date, "close"].iloc[0]
end_data = nflx.loc[nflx["date"] == end_date, "close"].iloc[0]

# Create polygon annotation
polygon = PolyAnnotation(fill_color="green", fill_alpha=0.4,
                         xs=[start_float, start_float, end_float, end_float],
                         ys=[start_data - 10, start_data + 10, end_data + 15, end_data - 15])

# Add polygon to figure and display
fig.add_layout(polygon)
output_file(filename="netflix_annotated.html")
show(fig)
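
The order of `xs` and `ys` matters: the vertices are joined in sequence, so they should trace the outline (bottom-left, top-left, top-right, bottom-right here) rather than zigzag across it. A small sketch of a helper that builds the lists in that order follows; `band_vertices`, its padding parameters, and the sample prices are hypothetical and only illustrate the shape.

```python
def band_vertices(x0, x1, y0, y1, pad0, pad1):
    """Return (xs, ys) for a four-sided band around the segment (x0, y0)-(x1, y1)."""
    xs = [x0, x0, x1, x1]
    ys = [y0 - pad0, y0 + pad0, y1 + pad1, y1 - pad1]
    return xs, ys

# Hypothetical coordinates; the real ones come from start_float, end_float,
# start_data, and end_data
xs, ys = band_vertices(x0=0.0, x1=1.0, y0=150.0, y1=180.0, pad0=10, pad1=15)
print(xs)
print(ys)
```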