# What are glyphs?
In Bokeh, the visual shapes drawn on a plot, such as circles, lines, and markers, are called glyphs. The visual properties of these glyphs, such as position or color, can be assigned single values, for example `x=10` or `fill_color='red'`.

In [1]:
from bokeh.plotting import figure
from bokeh.io import output_file, show

# Plotting data from NumPy arrays
In the previous exercises, you made plots using data stored in lists. You learned that Bokeh can plot both numbers and datetime objects.

In this exercise, you'll generate NumPy arrays using `np.linspace()` and `np.cos()` and plot them using the circle glyph.

`np.linspace()` is a function that returns an array of evenly spaced numbers over a specified interval. For example, `np.linspace(0, 10, 5)` returns an array of 5 evenly spaced samples calculated over the interval `[0, 10]`. `np.cos(x)` calculates the element-wise cosine of an array `x`.
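
As a quick check of the description above, the following minimal sketch evaluates the `np.linspace(0, 10, 5)` example and applies `np.cos` to the result. The variable names `samples` and `cosines` are only illustrative.

```python
import numpy as np

# Five evenly spaced samples over the closed interval [0, 10]
samples = np.linspace(0, 10, 5)
print(samples.tolist())  # [0.0, 2.5, 5.0, 7.5, 10.0]

# np.cos is applied element-wise, so the result has the same shape as the input
cosines = np.cos(samples)
print(cosines.shape)  # (5,)
```

Both endpoints of the interval are included, which is why the step between samples is 2.5 rather than 2.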

In [None]:
import numpy as np

In [22]:
# Create array using np.linspace: x
x = np.linspace(0,5,100)

# Create array using np.cos: y
y = np.cos(x)

# Add circles at x and y
p = figure(x_axis_label='x', y_axis_label='y')
p.circle(x,y)

show(p)

# Plotting data from Pandas DataFrames
You can create Bokeh plots from Pandas DataFrames by passing column selections to the glyph functions.

In [None]:
import pandas as pd

# A simple scatter plot
In this example, you're going to make a scatter plot of female literacy vs fertility using data from the European Environmental Agency. This dataset highlights that countries with low female literacy have high birthrates. The x-axis data has been loaded for you as fertility and the y-axis data has been loaded as female_literacy.

Your job is to create a figure, assign x-axis and y-axis labels, and plot female_literacy vs fertility using the circle glyph.

After you have created the figure, in this exercise and the ones to follow, play around with it! Explore the different options available to you on the tab to the right, such as "Pan", "Box Zoom", and "Wheel Zoom". You can click on the question mark sign for more details on any of these tools.

In [58]:
df = pd.read_csv('../data/37. Visualización Interactiva con Bokeh/literacy_birth_rate.csv')
df

Unnamed: 0,Country,Continent,female literacy,fertility,population
0,Chine,ASI,90.5,1.769,1.324655e+09
1,Inde,ASI,50.8,2.682,1.139965e+09
2,USA,NAM,99,2.077,3.040600e+08
3,Indonésie,ASI,88.8,2.132,2.273451e+08
4,Brésil,LAT,90.2,1.827,1.919715e+08
...,...,...,...,...,...
177,Antilles néerlandaises,,96.3,,
178,Iles Caïmanes,,99,,
179,Seychelles,,92.3,,
180,Territoires autonomes palestiniens,,90.9,,


In [59]:
df.columns = ['country', 'continent', 'female_literacy', 'fertility','population']
df.dropna(inplace=True)
df.female_literacy = df.female_literacy.astype(float)
df.fertility = df.fertility.astype(float)
df

Unnamed: 0,country,continent,female_literacy,fertility,population
0,Chine,ASI,90.5,1.769,1.324655e+09
1,Inde,ASI,50.8,2.682,1.139965e+09
2,USA,NAM,99.0,2.077,3.040600e+08
3,Indonésie,ASI,88.8,2.132,2.273451e+08
4,Brésil,LAT,90.2,1.827,1.919715e+08
...,...,...,...,...,...
157,Vanuatu,OCE,79.5,3.883,2.338660e+05
158,Samoa,OCE,98.5,3.852,1.788690e+05
159,Sao Tomé-et-Principe,AF,83.3,3.718,1.601740e+05
160,Aruba,LAT,98.0,1.732,1.054550e+05


In [32]:
# Create the figure: p
p = figure(x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)')

# Add a circle glyph to the figure p
p.circle(df.fertility, df.female_literacy)

# Call the output_file() function and specify the name of the file
output_file('../outputs/fert_lit.html')

# Display the plot
show(p)

## A scatter plot with different shapes
By calling multiple glyph functions on the same figure object, we can overlay multiple data sets in the same figure.

In this exercise, you will plot female literacy vs fertility for two different regions, Africa and Latin America. The rows for each region are selected from `df` by filtering on the `continent` column, giving the DataFrames `africa` and `latinamerica`.

Your job is to plot the Latin America data with the `circle()` glyph, and the Africa data with the `x()` glyph.

In [6]:
africa = df[df.continent=='AF']
latinamerica = df[df.continent=='LAT']

In [66]:
# Create the figure: p
p = figure(x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)')

# Add a circle glyph to the figure p
p.circle(latinamerica.fertility,latinamerica.female_literacy, legend='Latin America')

# Add an x glyph to the figure p
p.x(africa.fertility,africa.female_literacy, legend='Africa')

# Assign the legend to the bottom left: p.legend.location
p.legend.location='bottom_left'

# Fill the legend background with the color 'lightgray': p.legend.background_fill_color
p.legend.background_fill_color='lightgray'

# Display the plot
show(p)

## Customizing your scatter plots
The three most important arguments to customize scatter glyphs are `color`, `size`, and `alpha`. Bokeh accepts colors as hexadecimal strings, tuples of RGB values between 0 and 255, and any of the 147 CSS color names. Size values are supplied in screen units (pixels), so a glyph keeps the same on-screen size when you zoom.

The `alpha` parameter controls transparency. It takes in floating point numbers between 0.0, meaning completely transparent, and 1.0, meaning completely opaque.
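
To make the relationship between the color formats concrete, here is a small plain-Python sketch that converts an RGB tuple into the equivalent hexadecimal string. The helper `rgb_to_hex` is a hypothetical name used for illustration; it is not part of Bokeh.

```python
def rgb_to_hex(rgb):
    """Format an (R, G, B) tuple of 0-255 integers as '#rrggbb'."""
    r, g, b = rgb
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("RGB channels must be between 0 and 255")
    return "#%02x%02x%02x" % (r, g, b)

print(rgb_to_hex((255, 0, 0)))  # #ff0000, the same color as the CSS name 'red'
print(rgb_to_hex((0, 0, 255)))  # #0000ff, the same color as the CSS name 'blue'
```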

In this exercise, you'll plot female literacy vs fertility for Africa and Latin America as red and blue circle glyphs, respectively.

In [8]:
# Create the figure: p
p = figure(x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)')

# Add a blue circle glyph to the figure p
p.circle(latinamerica.fertility,latinamerica.female_literacy, size=10, alpha=0.8, color='blue')

# Add a red circle glyph to the figure p
p.circle(africa.fertility,africa.female_literacy, size=10, alpha=0.8, color='red')

# Display the plot
show(p)

# Lines
We can draw lines on Bokeh plots with the `line()` glyph function.

In this exercise, you'll plot the daily opening price of Apple Inc.'s stock (AAPL) from 2000 to 2013.

The data is loaded into the DataFrame `appl`. The `date` column holds datetime objects to plot on the x-axis, and the `open` column holds the prices to plot on the y-axis.

Since we are plotting dates on the x-axis, you must add `x_axis_type='datetime'` when creating the figure object.

In [35]:
appl = pd.read_csv('../data/37. Visualización Interactiva con Bokeh/appl.csv')
appl.date = pd.to_datetime(appl.date)
appl

Unnamed: 0,adj_close,close,date,high,low,open,volume
0,31.68,130.31,2000-03-01,132.06,118.50,118.56,38478000
1,29.66,122.00,2000-03-02,127.94,120.69,127.00,11136800
2,31.12,128.00,2000-03-03,128.23,120.00,124.87,11565200
3,30.56,125.69,2000-03-06,129.13,125.00,126.00,7520000
4,29.87,122.87,2000-03-07,127.44,121.12,126.44,9767600
...,...,...,...,...,...,...,...
3265,437.00,442.80,2013-02-25,455.12,442.57,453.85,13306400
3266,443.09,448.97,2013-02-26,451.54,437.66,443.82,17910700
3267,438.75,444.57,2013-02-27,452.44,440.65,448.43,20976800
3268,435.62,441.40,2013-02-28,447.87,441.40,444.05,11518400


In [36]:
# Create a figure with x_axis_type="datetime": p
p = figure(x_axis_type='datetime', x_axis_label='Date', y_axis_label='US Dollars')

# Plot date along the x axis and price along the y axis
p.line(appl.date, appl.open)

# Display the plot
show(p)

# Lines and markers
Lines and markers can be combined by plotting them separately using the same data points.

In this exercise, you'll plot a line and circle glyph for the AAPL stock prices. Further, you'll adjust the `fill_color` keyword argument of the `circle()` glyph function while leaving the `line_color` at the default value.

In [20]:
# Create a figure with x_axis_type='datetime': p
p = figure(x_axis_type='datetime', x_axis_label='Date', y_axis_label='US Dollars')

# Plot date along the x-axis and price along the y-axis
p.line(appl.date,appl.open)

# With date on the x-axis and price on the y-axis, add a white circle glyph of size 4
p.circle(appl.date, appl.open, fill_color='white', size=4)

show(p)

# The Bokeh ColumnDataSource

The ColumnDataSource is a table-like data object that maps string column names to sequences (columns) of data. It is the central and most common data structure in Bokeh.


You can create a `ColumnDataSource` object directly from a Pandas DataFrame by passing the DataFrame to the class initializer.
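
The following is a minimal sketch of that conversion, assuming Bokeh and pandas are installed. The DataFrame `toy` is illustrative data, not the medals data set used in the exercise.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# Toy data for illustration only
toy = pd.DataFrame({"Year": [2008, 2012], "Time": [9.69, 9.63]})

# Passing the DataFrame to the initializer copies its columns into the source
source = ColumnDataSource(toy)

# Each DataFrame column is now available under its string name
print(sorted(source.column_names))
print(list(source.data["Time"]))
```

Glyph functions can then refer to columns by name, for example `x='Year'` and `y='Time'`, as long as `source=source` is passed as well.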

In this exercise, we have imported pandas as `pd` and read in a data set containing all Olympic medals awarded in the 100 meter sprint from 1896 to 2012. A `color` column has been added indicating the CSS color name we wish to use in the plot for every data point.

Your job is to import the `ColumnDataSource` class, create a new `ColumnDataSource` object from the DataFrame `medals`, and plot circle glyphs with 'Year' on the x-axis and 'Time' on the y-axis. Color each glyph by the color column.

In [37]:
medals = pd.read_csv('../data/37. Visualización Interactiva con Bokeh/medals.csv')
medals

Unnamed: 0,Name,Country,Medal,Time,Year,color
0,Usain Bolt,JAM,GOLD,9.63,2012,goldenrod
1,Yohan Blake,JAM,SILVER,9.75,2012,silver
2,Justin Gatlin,USA,BRONZE,9.79,2012,saddlebrown
3,Usain Bolt,JAM,GOLD,9.69,2008,goldenrod
4,Richard Thompson,TRI,SILVER,9.89,2008,silver
...,...,...,...,...,...,...
80,Stanley Rowley,AUS,BRONZE,11.20,1900,saddlebrown
81,Thomas Burke,USA,GOLD,12.00,1896,goldenrod
82,Fritz Hofmann,GER,SILVER,12.20,1896,silver
83,Alojz Sokol,HUN,BRONZE,12.60,1896,saddlebrown


In [24]:
from bokeh.plotting import ColumnDataSource

In [40]:
# Create a ColumnDataSource from medals: source
source = ColumnDataSource(medals)

# Add circle glyphs to the figure p
p = figure(x_axis_label='Year', y_axis_label='Time')

p.circle(x='Year',y='Time', source=source, color='color',size=8)

show(p)

# Hover glyphs
Now let's practice using and customizing the hover tool.

In this exercise, you're going to plot the blood glucose levels for an unknown patient. The blood glucose levels were recorded every 5 minutes on October 7th starting at 3 minutes past midnight.

The date and time of each measurement are stored in the `datetime` column of `gluc`, and the blood glucose levels in mg/dL are stored in the `glucose` column.
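
As a rough illustration of that sampling schedule, the sketch below rebuilds the timestamps with pandas. It assumes 287 readings and the year 2010, both taken from the table shown in this notebook; the variable name `times` is only illustrative.

```python
import pandas as pd

# 287 readings, five minutes apart, starting at 3 minutes past midnight
times = pd.date_range("2010-10-07 00:03", periods=287, freq="5min")

print(times[0])   # 2010-10-07 00:03:00
print(times[-1])  # 2010-10-07 23:53:00
```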

Your job is to add a circle glyph that will appear red when the mouse is hovered near the data points. You will also add a customized hover tool object to the plot.

When you're done, play around with the hover tool you just created! Notice how the points where your mouse hovers over turn red.

In [51]:
gluc = pd.read_csv('../data/37. Visualización Interactiva con Bokeh/glucose.csv')
gluc.datetime = pd.to_datetime(gluc.datetime)
gluc

Unnamed: 0,datetime,isig,glucose
0,2010-10-07 00:03:00,22.10,150
1,2010-10-07 00:08:00,21.46,152
2,2010-10-07 00:13:00,21.06,149
3,2010-10-07 00:18:00,20.96,147
4,2010-10-07 00:23:00,21.52,148
...,...,...,...
283,2010-10-07 23:38:00,16.72,96
284,2010-10-07 23:43:00,17.60,100
285,2010-10-07 23:48:00,17.10,101
286,2010-10-07 23:53:00,16.06,99


In [45]:
from bokeh.models import HoverTool

In [52]:
p = figure(x_axis_type='datetime', x_axis_label='Time of Day', y_axis_label='Blood glucose (mg/dL)')

# Add circle glyphs to figure p
p.circle(gluc.datetime, gluc.glucose, size=10,
         fill_color='grey', alpha=0.1, line_color=None,
         hover_fill_color='firebrick', hover_alpha=0.5,
         hover_line_color='white')

# Create a HoverTool: hover
hover = HoverTool(tooltips=None, mode='vline')

# Add the hover tool to the figure p
p.add_tools(hover)

# Display the plot
show(p)

# Colormapping
The final glyph customization we'll practice is using the CategoricalColorMapper to color each glyph by a categorical property.

Here, you're going to use the automobile dataset to plot miles-per-gallon vs weight and color each circle glyph by the region where the automobile was manufactured.

The `origin` column will be used in the `CategoricalColorMapper` to color automobiles manufactured in the US blue, in Europe red, and in Asia green.
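
Conceptually, a categorical color mapper pairs each factor with the palette entry at the same position. The plain-Python sketch below illustrates that idea only; it is not how Bokeh implements the mapper, and the names `factor_to_color` and `origins` are hypothetical.

```python
factors = ["Europe", "Asia", "US"]
palette = ["red", "green", "blue"]

# Pair each factor with the palette entry at the same position
factor_to_color = dict(zip(factors, palette))

# A few example values as they might appear in the origin column
origins = ["US", "Asia", "Europe", "US"]
colors = [factor_to_color[origin] for origin in origins]
print(colors)  # ['blue', 'green', 'red', 'blue']
```

Because the pairing is positional, reordering `factors` without reordering `palette` changes which region gets which color.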

In [54]:
auto = pd.read_csv('../data/37. Visualización Interactiva con Bokeh/auto.csv')
auto

Unnamed: 0,mpg,cyl,displ,hp,weight,accel,yr,origin,name,color,size
0,18.0,6,250.0,88,3139,14.5,71,US,ford mustang,blue,15.0
1,9.0,8,304.0,193,4732,18.5,70,US,hi 1200d,blue,20.0
2,36.1,4,91.0,60,1800,16.4,78,Asia,honda civic cvcc,red,10.0
3,18.5,6,250.0,98,3525,19.0,77,US,ford granada,blue,15.0
4,34.3,4,97.0,78,2188,15.8,80,Europe,audi 4000,green,10.0
...,...,...,...,...,...,...,...,...,...,...,...
387,18.0,6,250.0,88,3021,16.5,73,US,ford maverick,blue,15.0
388,27.0,4,151.0,90,2950,17.3,82,US,chevrolet camaro,blue,10.0
389,29.5,4,98.0,68,2135,16.6,78,Asia,honda accord lx,red,10.0
390,17.5,6,250.0,110,3520,16.4,77,US,chevrolet concours,blue,15.0


In [55]:
from bokeh.models import CategoricalColorMapper

In [68]:
# Convert auto to a ColumnDataSource: source
source = ColumnDataSource(auto)

# Make a CategoricalColorMapper object: color_mapper
color_mapper = CategoricalColorMapper(factors=['Europe', 'Asia', 'US'],
                                      palette=['red', 'green', 'blue'])

p = figure(x_axis_label='weight (lbs)', y_axis_label='miles-per-gallon')

# Add a circle glyph to the figure p
p.circle('weight', 'mpg', source=source,
            color=dict(field='origin',transform=color_mapper),
            legend='origin')

show(p)

# Creating rows of plots
Layouts are collections of Bokeh figure objects.

In this exercise, you're going to create two plots from the Literacy and Birth Rate data set to plot fertility vs female literacy and population vs female literacy.

By using the `row()` function, you'll create a single layout of the two figures.

Remember, as in the previous chapter, once you have created your figures, you can interact with them in various ways.

In [57]:
from bokeh.layouts import row

In [60]:
source = ColumnDataSource(df)

# Create the first figure: p1
p1 = figure(x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)')

# Add a circle glyph to p1
p1.circle('fertility','female_literacy', source=source)

# Create the second figure: p2
p2 = figure(x_axis_label='population', y_axis_label='female_literacy (% population)')

# Add a circle glyph to p2
p2.circle('population','female_literacy', source=source)

# Put p1 and p2 into a horizontal row: layout
layout = row(p1, p2)

# Display the layout
show(layout)

# Creating columns of plots
In this exercise, you're going to use the `column()` function to create a single column layout of the two plots you created in the previous exercise.

In [61]:
from bokeh.layouts import column

In [62]:
source = ColumnDataSource(df)

# Create the first figure: p1
p1 = figure(x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)')

# Add a circle glyph to p1
p1.circle('fertility','female_literacy', source=source)

# Create the second figure: p2
p2 = figure(x_axis_label='population', y_axis_label='female_literacy (% population)')

# Add a circle glyph to p2
p2.circle('population','female_literacy', source=source)

# Put p1 and p2 into a vertical column: layout
layout = column(p1, p2)

# Display the layout
show(layout)

# Adding a hover tooltip
Working with the `HoverTool` is easy for data stored in a ColumnDataSource.

In this exercise, you will create a `HoverTool` object and display the automobile name for each circle glyph in the colormapped figure from the Colormapping exercise. This is done by setting the `tooltips` keyword argument to a list of tuples, each pairing a label with a column of the ColumnDataSource referenced by the `@` prefix.
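
The list-of-tuples format can be sketched in plain Python as follows. The column names `name`, `mpg`, and `weight` come from the automobile data set; the helper `referenced_columns` is hypothetical and only handles simple column names made of letters, digits, and underscores.

```python
import re

# Each tuple pairs a label shown in the tooltip with a '@column' reference
tooltips = [("Name", "@name"), ("MPG", "@mpg"), ("Weight (lbs)", "@weight")]

def referenced_columns(tooltips):
    """Return the column names referenced with '@' in each tooltip value."""
    names = []
    for _label, value in tooltips:
        names.extend(re.findall(r"@(\w+)", value))
    return names

print(referenced_columns(tooltips))  # ['name', 'mpg', 'weight']
```

A check like this can help catch a tooltip that refers to a column missing from the data source before the plot is shown.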

In [70]:
# Convert auto to a ColumnDataSource: source
source = ColumnDataSource(auto)

# Make a CategoricalColorMapper object: color_mapper
color_mapper = CategoricalColorMapper(factors=['Europe', 'Asia', 'US'],
                                      palette=['red', 'green', 'blue'])

p = figure(x_axis_label='weight (lbs)', y_axis_label='miles-per-gallon')

# Add a circle glyph to the figure p
p.circle('weight', 'mpg', source=source,
            color=dict(field='origin',transform=color_mapper),
            legend='origin')

# Create a HoverTool object: hover
hover = HoverTool(tooltips=[('Name','@name')])

# Add the HoverTool object to figure p
p.add_tools(hover)

show(p)

# Basic Interactor 
This code shows off an interactive visualization using Bokeh for plotting and IPython interactors for widgets. The demo runs entirely inside the IPython notebook, with no Bokeh server required.

The dropdown offers a choice of trig functions to plot, and the sliders control the frequency, amplitude, and phase.
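
The curve controlled by those widgets has the form `A * func(w * x + phi)`. The sketch below evaluates it with NumPy outside of any plot; the function name `wave` is hypothetical and used only for illustration.

```python
import numpy as np

def wave(x, f="sin", w=1, A=1, phi=0):
    """Evaluate A * func(w * x + phi) for the chosen trig function."""
    func = {"sin": np.sin, "cos": np.cos}[f]
    return A * func(w * x + phi)

# Five points over one period, with doubled frequency and amplitude 3
x = np.linspace(0, 2 * np.pi, 5)
y = wave(x, f="cos", w=2, A=3)
print(np.round(y, 6).tolist())  # [3.0, -3.0, 3.0, -3.0, 3.0]
```

Here `w` scales the frequency, `A` scales the amplitude, and `phi` shifts the curve along the x-axis.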

In [25]:
from ipywidgets import interact
import numpy as np

from bokeh.io import push_notebook, show, output_notebook
from bokeh.plotting import figure
output_notebook()

In [26]:
x = np.linspace(0, 2*np.pi, 2000)
y = np.sin(x)

In [27]:
p = figure(title="simple line example", plot_height=300, plot_width=600, y_range=(-5,5),
           background_fill_color='#efefef')
r = p.line(x, y, color="#8888cc", line_width=1.5, alpha=0.8)

In [28]:
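def update(f, w=1, A=1, phi=0):
    # Pick the trig function selected in the dropdown
    if f == "sin":
        func = np.sin
    elif f == "cos":
        func = np.cos
    # Recompute y from the slider values and push the change to the shown plot
    r.data_source.data['y'] = A * func(w * x + phi)
    push_notebook()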

In [29]:
show(p, notebook_handle=True)

In [31]:
interact(update, f=["sin", "cos"], w=(0,50), A=(1,10), phi=(0, 20, 0.1));

interactive(children=(Dropdown(description='f', options=('sin', 'cos'), value='sin'), IntSlider(value=1, descr…

In [32]:
from bokeh.io import push_notebook, show, output_notebook
from bokeh.layouts import row
from bokeh.plotting import figure
output_notebook()

In [33]:
opts = dict(plot_width=250, plot_height=250, min_border=0)

In [34]:
p1 = figure(**opts)
r1 = p1.circle([1,2,3], [4,5,6], size=20)

p2 = figure(**opts)
r2 = p2.circle([1,2,3], [4,5,6], size=20)

# get a handle to update the shown cell with
t = show(row(p1, p2), notebook_handle=True)

In [36]:
# this will update the left plot circle color with an explicit handle
r1.glyph.fill_color = "white"
push_notebook(handle=t)

In [37]:
# and this will update the right plot circle color because it was in the last shown cell
r2.glyph.fill_color = "pink"
push_notebook()

# Continuous update

We can update a plot as often as we like by changing its data source and calling `push_notebook()` with the handle returned by `show()`.

In [40]:
import time

import numpy as np
from bokeh.io import push_notebook, show, output_notebook
from bokeh.models import HoverTool
from bokeh.plotting import figure 
output_notebook()

In [41]:
N = 1000
x = np.random.random(size=N) * 100
y = np.random.random(size=N) * 100
radii = np.random.random(size=N) * 2
colors = ["#%02x%02x%02x" % (int(r), int(g), 150) for r, g in zip(50+2*x, 30+2*y)]

In [42]:
TOOLS="crosshair,pan,wheel_zoom,box_zoom,reset,tap,box_select,lasso_select"

p = figure(tools=TOOLS)
p.axis.major_label_text_font_size = "24px"
hover = HoverTool(tooltips=None, mode="vline")
p.add_tools(hover)
r = p.circle(x,y, radius=radii, 
             fill_color=colors, fill_alpha=0.6, line_color=None, 
             hover_fill_color="black", hover_fill_alpha=0.7, hover_line_color=None)

In [43]:
# get an explicit handle to update the next show cell with
target = show(p, notebook_handle=True)

In [46]:
i = 0
while True:
    i +=1 
    p.title.text = str(i)
    
    r.data_source.data['radius'] = radii * (2 + np.sin(i/5))
    
    x = r.data_source.data['x']
    y = r.data_source.data['y']
    d = np.sqrt((x-50)**2 + (y-50)**2)/100
    rand = 2 * (np.random.random(size=N) - 0.5)
    r.data_source.data['x'] = x + 2 * np.sin(d) * rand
    r.data_source.data['y'] = y + np.cos(d**2) * rand
    
    p.axis.major_label_text_color = r.data_source.data['fill_color'][int(i%N)]

    # push updates to the plot continuously using the handle (interrupt the notebook kernel to stop)
    push_notebook(handle=target)
    time.sleep(0.1)

KeyboardInterrupt: 

# Some exploratory plots of the data
Here, you'll continue your Exploratory Data Analysis by making a simple plot of Life Expectancy vs Fertility for the year 1970.

Your job is to import the relevant Bokeh modules and then prepare a `ColumnDataSource` object with the `fertility`, `life` and `Country` columns, where you only select the rows with the index value 1970.

In [8]:
gapminder = pd.read_csv('../data/37. Visualización Interactiva con Bokeh/gapminder_tidy.csv', index_col='Year')
gapminder

Year,Country,fertility,life,population,child_mortality,gdp,region
1964,Afghanistan,7.671,33.639,10474903.0,339.7,1182.0,South Asia
1965,Afghanistan,7.671,34.152,10697983.0,334.1,1182.0,South Asia
1966,Afghanistan,7.671,34.662,10927724.0,328.7,1168.0,South Asia
1967,Afghanistan,7.671,35.170,11163656.0,323.3,1173.0,South Asia
1968,Afghanistan,7.671,35.674,11411022.0,318.1,1187.0,South Asia
...,...,...,...,...,...,...,...
2002,Åland,,81.800,26257.0,,,Europe & Central Asia
2003,Åland,,80.630,26347.0,,,Europe & Central Asia
2004,Åland,,79.880,26530.0,,,Europe & Central Asia
2005,Åland,,80.000,26766.0,,,Europe & Central Asia
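
The row selection described above can be sketched with pandas as follows. The DataFrame `toy` holds made-up values for illustration only and stands in for `gapminder`; the dictionary keys `x`, `y`, and `country` are assumed names.

```python
import pandas as pd

# Toy rows for illustration only; the values are made up
toy = pd.DataFrame(
    {
        "Country": ["A", "B", "A", "B"],
        "fertility": [5.0, 2.5, 4.8, 2.4],
        "life": [50.0, 70.0, 51.0, 70.5],
    },
    index=pd.Index([1970, 1970, 1971, 1971], name="Year"),
)

# .loc on the Year index keeps only the rows labelled 1970
rows_1970 = toy.loc[1970]

# Column-name-to-sequence mapping in the shape a ColumnDataSource expects
data = {
    "x": rows_1970["fertility"].tolist(),
    "y": rows_1970["life"].tolist(),
    "country": rows_1970["Country"].tolist(),
}
print(data)
```

A dictionary like `data` could then be passed to Bokeh as `ColumnDataSource(data=data)`.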
