## Plotting with Bokeh

In [4]:
import pandas as pd

devs = pd.read_csv("Data/dev_salaries.csv")
devs.head()

Unnamed: 0,Age,All_Devs,Python,JavaScript
0,18,17784,20046,16446
1,19,16500,17100,16791
2,20,18012,20000,18942
3,21,20628,24744,21780
4,22,25206,30500,25704


In [34]:
# Import figure from bokeh.plotting
from bokeh.plotting import figure

# Import show and output_notebook from bokeh.io
from bokeh.io import show, output_notebook

# Create the figure: p
p = figure(x_axis_label='Age', y_axis_label='All_Devs')

# Add a circle glyph to the figure p
p.circle(devs["Age"], devs["All_Devs"])

# To write the plot to a standalone HTML file instead, call output_file():
# output_file("devs.html")

output_notebook()

# Display the plot
show(p)

### A scatter plot with different shapes
By calling multiple glyph functions on the same figure object, we can overlay multiple data sets in the same figure.

We will plot the Python data with the circle() glyph, and the JavaScript data with the x() glyph.

In [11]:
# Create the figure: p
p = figure(x_axis_label='Age', y_axis_label='Income')

# Add a circle glyph to the figure p
p.circle(devs["Age"], devs["Python"])

# Add an x glyph to the figure p
p.x(devs["Age"], devs["JavaScript"])

# Specify the name of the file
# output_file('Python_JavaScript.html')

output_notebook()

# Display the plot
show(p)


### Customizing your scatter plots
The three most important arguments for customizing scatter glyphs are `color`, `size`, and `alpha`. Bokeh accepts colors as hexadecimal strings, tuples of RGB values between 0 and 255, and any of the 147 [CSS color names](http://www.colors.commutercreative.com/grid/). Size values are given in screen units (pixels), so a marker keeps the same on-screen size when the plot is zoomed.

The alpha parameter controls transparency. It takes in floating point numbers between 0.0, meaning completely transparent, and 1.0, meaning completely opaque.
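As a small illustration of the color formats described above, the sketch below draws the same steel-blue color three ways: by CSS name, by hex string, and by RGB tuple. The data values are made up for the example, and it uses `scatter()` with an explicit circle marker instead of `circle()`, on the assumption that this form works across Bokeh versions.

```python
from bokeh.plotting import figure

# Made-up data, for illustration only
x = [1, 2, 3]
y = [2, 4, 6]

p = figure(x_axis_label="x", y_axis_label="y")

# CSS color name
p.scatter(x, y, marker="circle", color="steelblue", size=10, alpha=0.7)

# Hexadecimal string for the same color
p.scatter(x, [v + 1 for v in y], marker="circle", color="#4682B4", size=10, alpha=0.7)

# RGB tuple (values between 0 and 255) for the same color
p.scatter(x, [v + 2 for v in y], marker="circle", color=(70, 130, 180), size=10, alpha=0.7)
```

Calling `show(p)` afterwards would display three rows of identically colored, slightly transparent markers.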

In [14]:
# Create the figure: p
p = figure(x_axis_label='Age', y_axis_label='Income')

# Add a blue circle glyph to the figure p
p.circle(devs["Age"], devs["Python"], color="blue", size=10, alpha=0.7)

# Add a red circle glyph to the figure p
p.circle(devs["Age"], devs["JavaScript"], color="red", size=10, alpha=0.7)

# Specify the name of the file
# output_file('fert_lit_separate_colors.html')

output_notebook()

# Display the plot
show(p)


### Lines

We can draw lines on Bokeh plots with the `line()` glyph function.

In [18]:
# Create a figure: p
p = figure(x_axis_label='Age', y_axis_label='All_Devs')

# Plot Age along the x-axis and All_Devs along the y-axis
p.line(devs["Age"], devs["All_Devs"])

output_notebook()

show(p)

### Lines and markers

Lines and markers can be combined by plotting them separately using the same data points.

We can adjust the fill_color keyword argument of the `circle()` glyph function while leaving the line_color at the default value.

In [20]:
# Create a figure: p
p = figure(x_axis_label='Age', y_axis_label='All_Devs')

# Plot Age along the x-axis and All_Devs along the y-axis
p.line(devs["Age"], devs["All_Devs"])

# Overlay white-filled circle glyphs of size 6 on the same points
p.circle(devs["Age"], devs["All_Devs"], fill_color='white', size=6)

output_notebook()

show(p)

### Patches
In Bokeh, extended geometrical shapes can be plotted by using the `patches()` glyph function. The patches glyph takes as input a list-of-lists collection of numeric values specifying the vertices in x and y directions of each distinct patch to plot.
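To make the list-of-lists input concrete, here is a minimal sketch with two made-up shapes, a triangle and a square. Each inner list in `xs` pairs with the inner list at the same position in `ys`, and the two must have the same length. The coordinates are illustrative and unrelated to the state-border data.

```python
from bokeh.plotting import figure

# One inner list per patch: a triangle, then a square (made-up coordinates)
xs = [[0, 1, 0.5], [2, 3, 3, 2]]
ys = [[0, 0, 1], [0, 0, 1, 1]]

p = figure(x_axis_label="x", y_axis_label="y")

# A single patches() call draws both shapes
p.patches(xs, ys, line_color="white")
```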

We will plot the state borders of Arizona, Colorado, New Mexico and Utah. The latitude and longitude vertices for each state have been read from csv files.

We will plot longitude on the x-axis and latitude on the y-axis.

In [67]:
lons = pd.read_csv("Data/lons.csv", float_precision='round_trip').fillna(0)
lons

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,198,199,200,201,202,203,204,205,206,207
0,-114.63332,-114.63349,-114.63423,-114.60899,-114.63064,-114.57354,-114.58031,-114.61121,-114.6768,-114.66076,...,-114.4355,-114.35765,-114.26017,-114.14737,-114.29195,-114.38169,-114.44166,-114.48236,-114.56953,-114.63305
1,-109.04984,-109.06017,-109.06015,-109.05655,-109.05305,-109.05158,-109.05119,-109.05077,-109.05132,-109.05077,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,-103.55583,-104.00265,-104.64165,-105.14679,-105.90075,-106.55721,-106.63119,-106.62216,-106.63325,-106.61103,...,-103.04312,-103.04338,-103.04362,-103.04374,-103.04376,-103.04993,-103.05727,-103.06464,-103.06478,-103.53275
3,-114.04392,-114.04391,-114.04375,-114.04195,-114.04061,-114.04055,-114.0398,-114.04172,-114.0391,-113.80254,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [68]:
lats = pd.read_csv("Data/lats.csv", float_precision='round_trip').fillna(0)
lats

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,198,199,200,201,202,203,204,205,206,207
0,34.87057,35.00186,35.00332,35.07971,35.11791,35.14231,35.21811,35.37012,35.49125,35.5417,...,34.04257,34.12866,34.17212,34.31087,34.41527,34.47903,34.64288,34.71453,34.79181,34.86997
1,38.215,38.40118,38.60929,38.81393,38.95788,39.11656,39.22605,39.36423,39.56752,39.79876,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,32.00032,32.00001,32.00041,32.0005,32.00198,32.00076,31.98981,31.93601,31.90997,31.84661,...,34.67259,34.53564,34.40999,34.27181,34.03983,33.71754,33.35051,33.00011,32.59516,32.00034
3,40.68928,40.68985,40.76026,41.05548,41.36,41.59062,41.89425,41.99372,41.99367,41.98895,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [77]:
# Create a figure: p
p = figure(x_axis_label='Longitude', y_axis_label='Latitude')

# Create a list of az_lons, co_lons, nm_lons and ut_lons: x
# (rows padded by fillna(0) have their 0.0 filler values dropped)
x1 = lons.iloc[0].tolist()
x2 = lons.iloc[1].tolist()
x2 = [i for i in x2 if i != 0.0] 
x3 = lons.iloc[2].tolist()
x4 = lons.iloc[3].tolist()
x4 = [i for i in x4 if i != 0.0] 
x = [x1,x2,x3,x4]

# Create a list of az_lats, co_lats, nm_lats and ut_lats: y
y1 = lats.iloc[0].tolist()
y2 = lats.iloc[1].tolist()
y2 = [i for i in y2 if i != 0.0] 
y3 = lats.iloc[2].tolist()
y4 = lats.iloc[3].tolist()
y4 = [i for i in y4 if i != 0.0] 
y = [y1,y2,y3,y4]

# Add patches to figure p with line_color=white for x and y
p.patches(x,y, line_color = 'white')

output_notebook()

show(p)

### Plotting data from NumPy arrays

We can generate NumPy arrays using `np.linspace()` and `np.cos()` and plot them using the circle glyph.

`np.linspace()` is a function that returns an array of evenly spaced numbers over a specified interval. For example, `np.linspace(0, 10, 5)` returns an array of 5 evenly spaced samples calculated over the interval [0, 10]. `np.cos(x)` calculates the element-wise cosine of some array x.
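The `np.linspace(0, 10, 5)` example from the paragraph above can be checked directly. This short sketch prints the five samples and applies `np.cos()` to them.

```python
import numpy as np

# Five evenly spaced samples over [0, 10]; both endpoints are included
x = np.linspace(0, 10, 5)
print(x.tolist())  # [0.0, 2.5, 5.0, 7.5, 10.0]

# Element-wise cosine: the result has the same shape as x
y = np.cos(x)
print(y.shape)  # (5,)
```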

In [11]:
# Import numpy as np
import numpy as np

# Create a figure : p
p = figure()

# Create array using np.linspace: x
x = np.linspace(0, 5, 100)

# Create array using np.cos: y
y = np.cos(x)

# Add circles at x and y
p.circle(x,y)

output_notebook()

show(p)

### Plotting data from Pandas DataFrames

We can create Bokeh plots from Pandas DataFrames by passing column selections to the glyph functions.

In [14]:
# Import pandas as pd
import pandas as pd

# Read in the CSV file: df
df = pd.read_csv("Data/dev_salaries.csv")

# Import figure from bokeh.plotting
from bokeh.plotting import figure

# Create the figure: p
p = figure(x_axis_label='Age', y_axis_label='All_Devs')

# Plot All_Devs vs Age with circle glyphs
p.circle(df["Age"], df["All_Devs"], size =10)

output_notebook()

show(p)

### The Bokeh ColumnDataSource 

We can create a "ColumnDataSource" object directly from a Pandas DataFrame by passing the DataFrame to the class initializer.

We will import the ColumnDataSource class, create a new ColumnDataSource object from the DataFrame df, and plot circle glyphs.
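Before plotting, it can help to look at what a ColumnDataSource holds. The sketch below builds one from a small DataFrame, using the first three rows of the salary data shown earlier, and inspects its columns. It imports the class from `bokeh.models`, where it is also available.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# A three-row stand-in for the salary DataFrame
small_df = pd.DataFrame({"Age": [18, 19, 20], "Python": [20046, 17100, 20000]})

small_source = ColumnDataSource(small_df)

# Each DataFrame column becomes a named column in the source;
# the DataFrame index is carried along as an extra column.
print(small_source.column_names)
print(list(small_source.data["Age"]))  # [18, 19, 20]
```

Glyph functions can then refer to columns by name, as in `x='Age'` together with `source=small_source`.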

In [35]:
# Import the ColumnDataSource class from bokeh.plotting
from bokeh.plotting import ColumnDataSource 

# Create a ColumnDataSource from df: source
source = ColumnDataSource(df)

# Create the figure: p
p = figure(x_axis_label='Age', y_axis_label='Python')

# Add circle glyphs to the figure p
p.circle(x='Age', y="Python", size=10, source=source)

output_notebook()

show(p)

### Customizing glyphs

### Selection and non-selection glyphs
We will add the box_select and lasso_select tools to a figure and change the selected and non-selected circle glyph properties so that selected glyphs are red and non-selected glyphs are transparent blue.

In [42]:
# Import output_file and show from bokeh.io
from bokeh.io import show, output_file

# Create a figure with the box_select, lasso_select and crosshair tools: p
p = figure(x_axis_label="Age", y_axis_label="All_Devs", tools='box_select, lasso_select, crosshair')

# Add circle glyphs to the figure p with the selected and non-selected properties
p.circle(devs["Age"], devs["All_Devs"], selection_color="red", nonselection_alpha=0.2)

# Specify the name of the output file and show the result
output_file('selection_glyph.html')

output_notebook()

show(p)

### Hover glyphs

We will add a circle glyph that will appear red when the mouse is hovered near the data points. We will also add a customized hover tool object to the plot.

In [47]:
# import the HoverTool
from bokeh.models import HoverTool

# Create a figure: p
p = figure(x_axis_label="Age", y_axis_label="All_Devs")

# Add circle glyphs to figure p
p.circle(devs["Age"], devs["All_Devs"], size =10, fill_color="grey", alpha=0.1, line_color=None,
         hover_fill_color="firebrick", hover_alpha=0.5,
         hover_line_color="white")

# Create a HoverTool: hover
hover = HoverTool(tooltips=None, mode='vline')

# Add the hover tool to the figure p
p.add_tools(hover)

output_notebook()
         
show(p)

### Colormapping

We will use the CategoricalColorMapper to color each glyph by a categorical property.

We will use the automobile dataset to plot miles-per-gallon vs weight and color each circle glyph by the region where the automobile was manufactured.

The origin column will be used in the ColorMapper to color automobiles manufactured in the US as blue, Europe as red and Asia as green.
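The mapper pairs each factor with the palette entry at the same position, so the order of the two lists matters. The plain-Python sketch below only illustrates that pairing; it is not how Bokeh implements the mapper, and the name `origin_to_color` is made up for the example.

```python
# Same factor and palette order as the CategoricalColorMapper in the next cell
factors = ["Europe", "Asia", "US"]
palette = ["red", "green", "blue"]

# Position-by-position pairing of category and color
origin_to_color = dict(zip(factors, palette))

print(origin_to_color["US"])      # blue
print(origin_to_color["Europe"])  # red
print(origin_to_color["Asia"])    # green
```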

In [49]:
# Import CategoricalColorMapper from bokeh.models
from bokeh.models import CategoricalColorMapper

df = pd.read_csv("Data/auto.csv")

# Convert df to a ColumnDataSource: source
source = ColumnDataSource(df)

# Make a CategoricalColorMapper object: color_mapper
color_mapper = CategoricalColorMapper(factors=['Europe', 'Asia', 'US'],
                                      palette=['red', 'green', 'blue'])


p = figure(x_axis_label='weight (lbs)', y_axis_label='miles per gallon')

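# Add a circle glyph to the figure p, colored by the origin column
# (legend_field replaces the older legend keyword in recent Bokeh versions)
p.circle("weight", 'mpg', source=source,
            color=dict(field='origin', transform=color_mapper),
            legend_field='origin')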

# Specify the name of the output file and show the result
output_file('colormap.html')
show(p)



Source: DataCamp - Interactive Data Visualization with Bokeh