# Bokeh 

### Bokeh is a library for interactive visualizations in Python. 

Website: https://bokeh.pydata.org/en/latest/

We will create an interactive visualization with Bokeh. Our data is a subset of the Gapminder data (population data). We will try to recreate the visualizations that Hans Rosling uses in his famous TED talk (https://www.youtube.com/watch?v=hVimVzgtD6w).

In [None]:
import bokeh
import pandas as pd
from vega_datasets import data

In [None]:
from bokeh.io import output_notebook, show
from bokeh.plotting import figure

We will import more from Bokeh as we need it. For now, `output_notebook()` tells Bokeh to display plots inline in the notebook.

In [None]:
output_notebook()

In [None]:
gapminder = pd.DataFrame(data.gapminder())
gapminder.head()

In [None]:
gapminder.loc[gapminder['year'] == 2000].head()

Let's plot fertility on the x-axis and life expectancy on the y-axis for the year 2000.

In [None]:
fertility = gapminder.loc[gapminder['year'] == 2000]['fertility']
life = gapminder.loc[gapminder['year'] == 2000]['life_expect']

grph = figure() # can adjust height
grph.circle(x = fertility, y = life) # try cross, change color
show(grph)

The ColumnDataSource is the core of most Bokeh plots: it provides the data that the plot's glyphs visualize. It is a mapping between column names and lists of data. ColumnDataSource takes a `data` parameter, which is a dict with string column names as keys and lists (or arrays) of data values as values.
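
Since the source is a mapping from column names to sequences, one quick sanity check before building it is to confirm that every column has the same number of entries. The sketch below is plain Python and does not call Bokeh. `check_columns` is a hypothetical helper name and the sample values are made up. It rests on the assumption that columns of different lengths are a mistake for this kind of plot.

```python
def check_columns(data):
    """Return the common column length, or raise ValueError if columns differ.

    `data` is a dict of column name -> sequence, the same shape that is
    passed to ColumnDataSource. Hypothetical helper, not part of Bokeh.
    """
    lengths = {name: len(values) for name, values in data.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"columns have different lengths: {lengths}")
    # An empty dict has no columns, so report a length of 0.
    return next(iter(lengths.values()), 0)


# Made-up sample values, only for illustration.
sample = {
    "x": [1.5, 2.1, 6.0],
    "y": [80.0, 75.5, 52.3],
    "country": ["A", "B", "C"],
}
n_rows = check_columns(sample)
print(n_rows)
```

Running a check like this on the dict before passing it to ColumnDataSource can surface a column that was built from a different subset of rows than the others.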

In [None]:
from bokeh.models import ColumnDataSource


In [None]:
country = gapminder.loc[gapminder['year'] == 2000]['country']

In [None]:
source = ColumnDataSource(dict(x=fertility,y=life, country = country))
#print(source.column_names)
#source.data

In [None]:
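grph = figure()  # start a fresh figure so the earlier circles are not drawn a second time
grph.circle(x = 'x', y = 'y', source = source)  # 'x' and 'y' are column names in source
show(grph)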

In [None]:
# use other columns in our data

In [None]:
from bokeh.models import HoverTool

In [None]:
hover = HoverTool(tooltips = '@country')
tool_properties = [hover,"pan,wheel_zoom,box_zoom,reset"]
grph = figure(tools = tool_properties, height = 300)
grph.circle(x = 'x', y = 'y', source =source, color = 'teal')
show(grph)

To learn more about configuring plot tools, see https://bokeh.pydata.org/en/latest/docs/user_guide/tools.html

In [None]:
# map the size of the bubble directly to the raw population values
# (size is in screen units, so the bubbles come out far too large; the next cell rescales them)

population = gapminder.loc[gapminder['year'] == 2000]['pop']
source = ColumnDataSource(dict(x=fertility,y=life, country = country, size = population))

hover = HoverTool(tooltips = '@country')
tool_properties = [hover,"pan,wheel_zoom,box_zoom,reset"]
grph = figure(tools = tool_properties, height = 300)

grph.circle(x = 'x', y = 'y', size ='size', source =source, color = 'teal')

show(grph)

In [None]:
# map the population onto a bubble size between 5 and 50 with a LinearInterpolator
from bokeh.models import LinearInterpolator

population = gapminder.loc[gapminder['year'] == 2000]['pop']
size_mapper = LinearInterpolator(x = [population.min(), population.max()], y = [5,50])

source = ColumnDataSource(dict(x=fertility,y=life, country = country, population = population))

hover = HoverTool(tooltips = [('Country','@country'),('Population','@population'),])
tool_properties = [hover,"pan,wheel_zoom,box_zoom,reset"]

grph = figure(tools = tool_properties, height = 300, title = 'Fertility and Life Expectancy')

# alpha = 0.6 adds transparency, so overlapping bubbles stay visible
grph.circle(x = 'x', y = 'y', size ={'field' : 'population', 'transform' :size_mapper}, alpha =0.6, source =source, color = 'teal') 

show(grph)
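
To make the size mapping in this cell more concrete, here is a plain-Python sketch of a linear rescaling from the population range onto the interval 5 to 50. `linear_size` is a hypothetical helper and the populations are made up. It illustrates the idea only and is not Bokeh's implementation; in particular, clamping out-of-range values is an assumption of this sketch.

```python
def linear_size(value, lo, hi, out_lo=5.0, out_hi=50.0):
    """Map `value` from [lo, hi] onto [out_lo, out_hi] linearly.

    Values outside [lo, hi] are clamped to the nearest end. Hypothetical
    helper for illustration only; it is not Bokeh's implementation.
    """
    if hi == lo:
        return out_lo  # degenerate range: avoid dividing by zero
    clamped = min(max(value, lo), hi)
    fraction = (clamped - lo) / (hi - lo)
    return out_lo + fraction * (out_hi - out_lo)


# Made-up populations, only for illustration.
populations = [1_000_000, 50_500_000, 100_000_000]
lo, hi = min(populations), max(populations)
sizes = [linear_size(p, lo, hi) for p in populations]
print(sizes)
```

The smallest population gets the smallest bubble and the largest gets the largest, with everything else placed proportionally in between.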

In [None]:
# countries colored by region 
from bokeh.models import LinearInterpolator, CategoricalColorMapper
from bokeh.palettes import Spectral6, brewer

region_values = list(gapminder.cluster.unique())
region_values = [str(r) for r in region_values ]

# Get the number of colors we'll need for the plot.
colors = brewer["Spectral"][len(gapminder.cluster.unique())]

population = gapminder.loc[gapminder['year'] == 2000]['pop']
region = gapminder.loc[gapminder['year'] == 2000]['cluster']

size_mapper = LinearInterpolator(x = [population.min(), population.max()], y = [5,50])
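#color_mapper = CategoricalColorMapper(factors = region_values ,palette = Spectral6)
# look-up table from cluster code to color (assumes the codes are 0 .. n-1, so they can index the palette)
color_mapper = {i: colors[i] for i in gapminder.cluster.unique()}

# Create a list of colors, one per row of the year-2000 subset, so it has the same length as the other columns.
colors = [color_mapper[x] for x in region]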
source = ColumnDataSource(dict(x=fertility,y=life, country = country, population = population, color = colors))

hover = HoverTool(tooltips = [('Country','@country'),('Population','@population'),])
tool_properties = [hover,"pan,wheel_zoom,box_zoom,reset"]
grph = figure(tools = tool_properties, height = 300, title = 'Fertility and Life Expectancy')

grph.circle(x = 'x', y = 'y', 
            color = 'color',
            size ={'field' : 'population', 'transform' :size_mapper}, 
            alpha =0.6, 
            source =source) 

show(grph)
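
The color column has to line up row by row with the other columns in the source, so it should be built from the same filtered rows. The following plain-Python sketch shows that pattern without Bokeh or pandas. The rows, the region codes, and the three hex colors are made-up placeholders, not values from the Gapminder data.

```python
# Placeholder hex colors, one per region code; any palette of the right length works.
palette = ["#1b9e77", "#d95f02", "#7570b3"]

# Made-up rows: (country, year, region code).
rows = [
    ("A", 1995, 0),
    ("A", 2000, 0),
    ("B", 2000, 2),
    ("C", 2000, 1),
]

# Filter first, then build every column from the same filtered rows.
rows_2000 = [row for row in rows if row[1] == 2000]
countries = [country for country, _, _ in rows_2000]
colors = [palette[code] for _, _, code in rows_2000]

print(list(zip(countries, colors)))
```

Filtering once and deriving every column from the filtered rows keeps the columns the same length by construction.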