# Bokeh

Bokeh is an interactive visualization library for Python that uses web browsers for presentation. Bokeh supports a wide variety of visualization tasks, from basic exploration through to building advanced data applications. Its goal is to provide elegant, concise construction of novel graphics in the style of D3.js, and to extend this capability with high-performance interactivity over very large or streaming datasets. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications.

 - To get started using Bokeh, see the [User Guide](http://bokeh.pydata.org/en/latest/docs/user_guide.html#userguide).
 - Check out the [Gallery](http://bokeh.pydata.org/en/latest/docs/gallery.html#gallery) to see examples.
 - A complete API reference for Bokeh is in the [Reference Guide](http://bokeh.pydata.org/en/latest/docs/reference.html#refguide).



## Importing data

In [166]:
import pandas as pd

# Load the tab-separated Gapminder data, indexed by year
data = pd.read_csv('gapminder.tsv', delimiter='\t', thousands=',', index_col='year')
data.head()

| year | country | continent | lifeExp | pop | gdpPercap |
|---|---|---|---|---|---|
| 1952 | Afghanistan | Asia | 28.801 | 8425333 | 779.445314 |
| 1957 | Afghanistan | Asia | 30.332 | 9240934 | 820.85303 |
| 1962 | Afghanistan | Asia | 31.997 | 10267083 | 853.10071 |
| 1967 | Afghanistan | Asia | 34.02 | 11537966 | 836.197138 |
| 1972 | Afghanistan | Asia | 36.088 | 13079460 | 739.981106 |


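Let's rename the `pop` column to `population`, since `pop` is also the name of a DataFrame method (`DataFrame.pop`), so attribute access such as `data.pop` would return the method rather than the column.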

In [167]:
data = data.rename(columns={'pop': 'population'})
data.head()

| year | country | continent | lifeExp | population | gdpPercap |
|---|---|---|---|---|---|
| 1952 | Afghanistan | Asia | 28.801 | 8425333 | 779.445314 |
| 1957 | Afghanistan | Asia | 30.332 | 9240934 | 820.85303 |
| 1962 | Afghanistan | Asia | 31.997 | 10267083 | 853.10071 |
| 1967 | Afghanistan | Asia | 34.02 | 11537966 | 836.197138 |
| 1972 | Afghanistan | Asia | 36.088 | 13079460 | 739.981106 |
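
To see why the rename helps, here is a small sketch on a made-up two-row frame (the names `toy` and `renamed` and the values are illustrative, not read from `gapminder.tsv`). It assumes only that pandas is installed. Attribute-style access to a column called `pop` resolves to the `DataFrame.pop` method, while the renamed column is reachable as a plain attribute.

```python
import pandas as pd

# Toy frame with a column named "pop" (values are made up for illustration)
toy = pd.DataFrame({"country": ["A", "B"], "pop": [100, 200]})

# Attribute access finds the DataFrame.pop method, not the column
print(callable(toy.pop))  # True

# After renaming, attribute access returns the column itself
renamed = toy.rename(columns={"pop": "population"})
print(renamed.population.tolist())  # [100, 200]
```

Bracket access (`toy["pop"]`) would still reach the column without a rename; the rename simply keeps attribute-style access unambiguous.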


Before we start using Bokeh on the Gapminder data, let's get an idea of its various features and usage.

* We need to begin by setting up an output interface for Bokeh, which is where the visualizations will be rendered. There are a [number of ways](https://docs.bokeh.org/en/latest/docs/user_guide/concepts.html#output-methods) to output the visualizations, but let's stick to the basic two:
  - **output_file('filename.html')** outputs the visualization to a static HTML file.
  - **output_notebook()** renders the visualization in a Jupyter Notebook.

To display Bokeh plots inline in a Jupyter notebook, use the `output_notebook()` function from `bokeh.io`. When `show()` is called, the plot is displayed inline in the next notebook output cell. To save your Bokeh plots, you can use the `output_file()` function instead (or in addition).
  
  

In [168]:
from bokeh.io import output_notebook, output_file

#output_file('filename.html')  # Render to static HTML, or 
output_notebook()  # Render inline in a Jupyter Notebook
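
For the static HTML route, the following is a minimal sketch meant to run as a standalone script rather than inside this notebook. It writes a tiny line plot to a temporary directory. The file name `demo_plot.html`, the title, and the data points are placeholders chosen for illustration; `save()` from `bokeh.io` writes the file without opening a browser, whereas `show()` would also open it.

```python
import os
import tempfile

from bokeh.io import output_file, save
from bokeh.plotting import figure

# Hypothetical output location inside a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "demo_plot.html")
output_file(path, title="Demo plot")

# A tiny figure with made-up data, just so there is something to write
p = figure(width=300, height=300)
p.line([1, 2, 3], [4, 6, 5])

# Write the HTML file configured by output_file()
save(p)
print(os.path.isfile(path))
```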

## Generating an Empty canvas with Bokeh

Let's generate an empty figure with Bokeh just to get things started.

In [169]:
from bokeh.io import show
from bokeh.plotting import figure

# Set up a generic figure() object
fig = figure()

# See what it looks like
show(fig)





Notice that the default Bokeh figure comes pre-loaded with a toolbar, which can be further configured for various interactions.

## Generating a Basic first plot with Axes

Let's create a basic first plot to learn how to configure the figure() object.

The figure() object is the key to all of Bokeh’s available tools for visualizing data. The Bokeh figure is a subclass of the [Bokeh Plot](https://docs.bokeh.org/en/latest/docs/reference/models/plots.html#bokeh.models.plots.Plot) object, which provides a lot of the parameters to configure the plot.

In [170]:
fig = figure(x_range=(100,1000),
             y_range=(0,10),
             background_fill_color='pink',
             background_fill_alpha=0.2,
             height=400,  # plot_height/plot_width are no longer accepted by recent Bokeh releases
             width=600,
             x_axis_label='X Label',
             y_axis_label='Y Label',
             title='Example Figure',
             toolbar_location='below',
             tools='save')

# See what it looks like
show(fig)





## Glyphs: the building blocks of Bokeh visualizations

Glyphs are the basic visual building blocks of Bokeh plots, e.g. lines, rectangles, squares, wedges, patches, etc. The `bokeh.plotting` interface provides a convenient way to create plots centered around glyphs. See [Plotting with Basic Glyphs](https://docs.bokeh.org/en/latest/docs/user_guide/plotting.html#userguide-plotting) for more information.

In [171]:
# Circles

p = figure(width=300, height=300)

# add a circle renderer with a size, color, and alpha
p.circle([1, 2, 3, 4, 5], [6, 7, 1, 4, 3], size=20, color="green", alpha=0.5)

# show the results
show(p)

Available markers in Bokeh:

* asterisk()
* circle()
* circle_cross()
* circle_x()
* cross()
* dash()
* diamond()
* diamond_cross()
* inverted_triangle()
* square()
* square_cross()
* square_x()
* triangle()
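
The markers listed above can also be requested by name through the figure's `scatter()` method and its `marker` argument, which makes it easy to compare several on one plot. The sketch below assumes a Bokeh version where `scatter()` accepts marker names as strings. The data points, the vertical offsets, and the `renderers` list are illustrative only.

```python
from bokeh.plotting import figure

p = figure(width=300, height=300)

xs = [1, 2, 3, 4, 5]
ys = [6, 7, 1, 4, 3]

# Draw the same points three times, shifted upward, once per marker type
renderers = []
for offset, marker in enumerate(["circle", "square", "triangle"]):
    shifted = [y + 2 * offset for y in ys]
    renderers.append(p.scatter(xs, shifted, marker=marker, size=12, alpha=0.5))

print(len(renderers))  # 3
```

Calling `show(p)` afterwards would display all three marker types in one figure.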



In [172]:
# Squares

p = figure(width=300, height=300)

# add a square renderer with a size, color, and alpha
p.square([1, 2, 3, 4, 5], [6, 7, 1, 4, 3], size=20, color="red", alpha=0.5)

# show the results
show(p)

In [173]:
# Line

p = figure(width=300, height=300)

# add a line renderer with a line width and color
p.line([1, 2, 3, 4, 5], [6, 7, 1, 4, 3], line_width=2, color="pink")

# show the results
show(p)

## Creating a basic scatter plot between GDP and Life Expectancy
Now that we have a basic idea about working with Bokeh, let's start working with our original **Gapminder** dataset.

In [174]:
from bokeh.io import show
from bokeh.plotting import figure

p = figure(height = 400, width = 400)
p.circle(x = data['gdpPercap'], y = data['lifeExp'])
p.xaxis.axis_label = "GDP per Capita"
p.yaxis.axis_label = "Life Expectancy"
show(p)

## Customizing the plot

Let's customize our plot by:

* Changing the color and the alpha value, which controls the transparency.
* Changing the x-axis to a logarithmic scale, since the plot is skewed and will make much more sense that way. We will also set the ranges for both axes.
* Adding the [NumeralTickFormatter](https://docs.bokeh.org/en/latest/docs/reference/models/formatters.html#bokeh.models.formatters.NumeralTickFormatter) to format the x-axis ticks as dollar amounts.

In [175]:
from bokeh.io import show
from bokeh.plotting import figure


p = figure(
    height = 400, 
    width = 600,
    x_axis_type = 'log',
    x_range =(100,100000), y_range = (0,100),
    title = "Life Expectancy vs GDP of Nations")

p.circle(x = data['gdpPercap'], y = data['lifeExp'],color="orange",alpha = 0.6)
p.xaxis.axis_label = "GDP per Capita"
p.yaxis.axis_label = "Life Expectancy"


from bokeh.models import NumeralTickFormatter
p.xaxis[0].formatter = NumeralTickFormatter(format='$0,')

show(p)

## ColumnDataSource

Let's choose a specific year instead of plotting all the years at once, which is a lot of data. Let's look at the relationship between GDP and life expectancy in 2007. For this we shall use the ColumnDataSource object.

ColumnDataSource is Bokeh's built-in container for handing data to glyphs. It can be built from various data structures, such as:
* Python dict
* Pandas DataFrame
* Pandas groupby

The ColumnDataSource passes the data to the glyphs for visualization. It essentially maps the name of each column to that column's data.
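
As a rough illustration of the column-name-to-data mapping, the sketch below builds one source from a plain dict and one from a small DataFrame. All values are toy numbers, and the names `source_from_dict`, `toy`, and `source_from_df` are made up for this example. It assumes pandas and Bokeh are installed.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# From a dict: keys become column names
source_from_dict = ColumnDataSource(data={"x": [1, 2, 3], "y": [4, 5, 6]})
print(list(source_from_dict.data["x"]))  # [1, 2, 3]

# From a DataFrame: each column is copied in, and a named index
# ("year" here) becomes a column as well
toy = pd.DataFrame(
    {"gdpPercap": [1000.0, 2000.0], "lifeExp": [50.0, 60.0]},
    index=pd.Index([2007, 2007], name="year"),
)
source_from_df = ColumnDataSource(toy)
print(sorted(source_from_df.data.keys()))  # ['gdpPercap', 'lifeExp', 'year']
```

Glyph methods can then refer to columns by name, for example `x='gdpPercap'`, when the source is passed through the `source` argument.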


In [176]:
# Choosing year 2007

from bokeh.models import ColumnDataSource
source = ColumnDataSource(data.loc[2007])

In [177]:
# Adding the source option in our plot

from bokeh.io import show
from bokeh.plotting import figure

p = figure(
    height = 400, 
    width = 600,
    x_axis_type = 'log',
    x_range = (100,100000), y_range = (0,100),
    title = "Life Expectancy vs GDP of Nations for 2007")

p.circle(x='gdpPercap', y='lifeExp',color="orange",alpha=0.6,source=source)
p.xaxis.axis_label = "GDP per Capita"
p.yaxis.axis_label = "Life Expectancy"


from bokeh.models import NumeralTickFormatter
p.xaxis[0].formatter = NumeralTickFormatter(format='$0,')

show(p)

In [178]:
source

## Customizing bubble size according to Population

In the above graph all the bubbles are the same size. Let's scale each bubble by population, so that countries with larger populations get bigger bubbles. For this we shall use the **LinearInterpolator**.
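
Bokeh evaluates the interpolation in the browser, so there is nothing to inspect on the Python side. The plain-Python sketch below only illustrates the arithmetic such a mapping performs: a value between the smallest and largest population is placed proportionally between the smallest and largest bubble size. The helper `linear_size` is hypothetical, and clamping out-of-range values to the end points is an assumption about the default behavior.

```python
def linear_size(value, x_min, x_max, y_min=10, y_max=100):
    """Map value from [x_min, x_max] onto [y_min, y_max] along a straight line."""
    # Clamp so that out-of-range inputs stay within the size range (assumed default)
    value = max(x_min, min(x_max, value))
    fraction = (value - x_min) / (x_max - x_min)
    return y_min + fraction * (y_max - y_min)


print(linear_size(0, 0, 1000))     # 10.0
print(linear_size(500, 0, 1000))   # 55.0
print(linear_size(1000, 0, 1000))  # 100.0
```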

In [179]:
#Resizing the bubbles w.r.t population

from bokeh.io import show
from bokeh.models import LinearInterpolator
from bokeh.plotting import figure


p = figure(
    height = 400, 
    width = 600,
    x_axis_type = 'log',
    x_range = (100,100000), y_range = (0,100),
    title = "Life Expectancy vs GDP of Nations for 2007")


size_mapper = LinearInterpolator(
    x = [data.population.min(), data.population.max()],
    y = [10,100]
)


p.circle(x='gdpPercap', y='lifeExp',color="orange",alpha=0.6,source=source,
         size={'field':'population', 'transform':size_mapper})

p.xaxis.axis_label = "GDP per Capita"
p.yaxis.axis_label = "Life Expectancy"


from bokeh.models import NumeralTickFormatter
p.xaxis[0].formatter = NumeralTickFormatter(format='$0,')

show(p)

Voilà! Here we have different countries represented by bubbles of different sizes. But they all have the same color. Now let us give each continent a color to add another dimension, and provide a legend.

## Coloring the countries by the continent column

For this we shall use the **CategoricalColorMapper** model from Bokeh and the **Spectral6** palette.
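
Conceptually, a categorical color mapping pairs each factor with the palette entry at the same position. The sketch below shows that pairing with an ordinary dict. The continent list is an assumption (the five continent names used in the Gapminder data, in arbitrary order), and `color_for` is an illustrative name rather than anything Bokeh provides.

```python
from bokeh.palettes import Spectral6

# Assumed factor list; in the notebook it comes from data.continent.unique()
continents = ["Asia", "Europe", "Africa", "Americas", "Oceania"]

# Pair each continent with the palette color at the same position
color_for = dict(zip(continents, Spectral6))

for continent, color in color_for.items():
    print(continent, color)
```

With five continents and six palette entries, the last color in `Spectral6` simply goes unused.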

In [180]:
from bokeh.io import show
from bokeh.models import LinearInterpolator
from bokeh.plotting import figure
from bokeh.models import CategoricalColorMapper
from bokeh.palettes import Spectral6

p = figure(
    height = 400, 
    width = 600,
    x_axis_type = 'log',
    x_range = (100,100000), y_range = (0,100),
    title = "Life Expectancy vs GDP of Nations")


size_mapper = LinearInterpolator(
    x = [data.population.min(), data.population.max()],
    y = [10,100]
)

# Mapping color to continent
color_mapper = CategoricalColorMapper(
    factors=list(data.continent.unique()),
    palette=Spectral6
)


p.circle(x='gdpPercap', y='lifeExp',alpha=0.6,source=source,
         size={'field':'population', 'transform':size_mapper},
         color={'field':'continent', 'transform':color_mapper},
         legend_field='continent'  # spelled legend='continent' before Bokeh 1.4
        )

p.xaxis.axis_label = "GDP per Capita"
p.yaxis.axis_label = "Life Expectancy"


from bokeh.models import NumeralTickFormatter
p.xaxis[0].formatter = NumeralTickFormatter(format='$0,')

show(p)

## Adding Hovertool Capabilities

Finally, it's time to add Bokeh's HoverTool functionality. This will help us identify the country represented by each bubble.
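
A tooltip can also show several fields at once by passing a list of `(label, value)` pairs, where `@name` refers to a column in the data source and an optional `{...}` suffix sets the number format. The labels and format strings in this sketch are illustrative choices; the column names match those of the Gapminder source used in this notebook.

```python
from bokeh.models import HoverTool

# Each tuple is (label shown in the tooltip, value template)
hover = HoverTool(tooltips=[
    ("Country", "@country"),
    ("GDP per capita", "@gdpPercap{$0,0}"),
    ("Life expectancy", "@lifeExp{0.0}"),
])

print(len(hover.tooltips))  # 3
```

The resulting `hover` object can be passed to `figure(tools=[hover])` in the same way as the single-field version.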

In [181]:
from bokeh.io import show
from bokeh.models import LinearInterpolator
from bokeh.plotting import figure
from bokeh.models import CategoricalColorMapper
from bokeh.palettes import Spectral6
from bokeh.models import HoverTool


hover = HoverTool(tooltips='@country')

p = figure(
    height = 400, 
    width = 600,
    x_axis_type = 'log',
    tools = [hover],
    x_range = (100,100000), y_range = (0,100),
    title = "Life Expectancy vs GDP of Nations")


size_mapper = LinearInterpolator(
    x = [data.population.min(), data.population.max()],
    y = [10,100]
)

# Mapping color to continent
color_mapper = CategoricalColorMapper(
    factors=list(data.continent.unique()),
    palette=Spectral6
)


p.circle(x='gdpPercap', y='lifeExp',alpha=0.6,source=source,
         size={'field':'population', 'transform':size_mapper},
         color={'field':'continent', 'transform':color_mapper},
         legend_field='continent'  # spelled legend='continent' before Bokeh 1.4
        )

p.xaxis.axis_label = "GDP per Capita"
p.yaxis.axis_label = "Life Expectancy"


from bokeh.models import NumeralTickFormatter
p.xaxis[0].formatter = NumeralTickFormatter(format='$0,')

# formatting the legend
p.legend.location = (30, -5)
p.right.append(p.legend[0])
p.legend.border_line_color = None


show(p)

In [184]:
from bokeh.io import show
from bokeh.models import LinearInterpolator
from bokeh.plotting import figure
from bokeh.models import CategoricalColorMapper
from bokeh.palettes import Spectral6
from bokeh.models import HoverTool


hover = HoverTool(tooltips='@country')

p = figure(
    height = 400, 
    width = 600,
    x_axis_type = 'log',
    tools = [hover],
    x_range = (100,100000), y_range = (0,100),
    title = "Life Expectancy vs GDP of Nations")


size_mapper = LinearInterpolator(
    x = [data.population.min(), data.population.max()],
    y = [10,100]
)

# Mapping color to continent
color_mapper = CategoricalColorMapper(
    factors=list(data.continent.unique()),
    palette=Spectral6
)


from bokeh.io import push_notebook

base_title = "Life Expectancy vs GDP of Nations"

def update(year):
    # Swap in the rows for the requested year
    rows = data.loc[year]
    source.data = dict(
        gdpPercap=rows.gdpPercap,
        lifeExp=rows.lifeExp,
        country=rows.country,
        population=rows.population,
        continent=rows.continent
    )
    # Rebuild the title from the base text so the year never accumulates
    p.title.text = base_title + " " + str(year)
    # Push the changes to the figure already displayed in the notebook
    push_notebook()


p.circle(x='gdpPercap', y='lifeExp',alpha=0.6,source=source,
         size={'field':'population', 'transform':size_mapper},
         color={'field':'continent', 'transform':color_mapper},
         legend_field='continent'  # spelled legend='continent' before Bokeh 1.4
        )

p.xaxis.axis_label = "GDP per Capita"
p.yaxis.axis_label = "Life Expectancy"


from bokeh.models import NumeralTickFormatter
p.xaxis[0].formatter = NumeralTickFormatter(format='$0,')

# formatting the legend
p.legend.location = (30, -5)
p.right.append(p.legend[0])
p.legend.border_line_color = None


show(p,notebook_handle=True)

import time

# Cycle through the survey years (1952 to 2007 in steps of 5)
# until the kernel is interrupted
year = 1952
while True:
    update(year)
    if year == 2007:
        year = 1952
    else:
        year += 5
    time.sleep(0.5)

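
The loop above resets the year by hand once it reaches 2007. As a possible alternative, the same wrap-around can be expressed with `itertools.cycle`. The sketch below only generates the sequence of years and does not call `update()` or Bokeh at all; the names `years` and `first_thirteen` are illustrative.

```python
from itertools import cycle, islice

# Survey years in the data: 1952, 1957, ..., 2007
years = range(1952, 2008, 5)

# Take thirteen values to show one full pass plus the wrap-around
first_thirteen = list(islice(cycle(years), 13))
print(first_thirteen[-2:])  # [2007, 1952]
```

In the notebook, iterating with `for year in cycle(years):` and calling `update(year)` followed by `time.sleep(0.5)` in the body would behave like the `while` loop.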