# BOKEH: An Interactive Data Visualization Python Library

**Bokeh is a Python library for creating interactive visualizations for modern web browsers. With the help of this library we can build attractive graphics, ranging from simple plots to complex dashboards with streaming datasets.**

Bokeh is an interactive Python library for data visualization that targets modern web browsers for presentation. It aims to provide high-performance interactivity and concise construction of novel graphics over very large or even streaming datasets.

![image](https://drive.google.com/uc?id=1dl-IdQcVHoJpiH7xR3ZRW-NDqleeY5h0)

* Bokeh is a Python library which provides interactive visualizations and targets web browsers for presentation. This is one of the key differences between Bokeh and other visualization libraries.
* Bokeh is capable of producing interactive and elegant visualizations with high performance over very large or streaming datasets.
* Unlike other visualization libraries, Bokeh renders its graphics using HTML and JavaScript.

## **I.** **Python BOKEH Library**

  Bokeh provides us with three interface levels:
  * ```bokeh.models```: A low-level interface which provides developers with the most flexibility.
  * ```bokeh.plotting```: An intermediate-level interface centered on composing visual glyphs.
  * ```bokeh.charts```: A high-level interface for standard chart types. It was later removed from the core library, so current code generally relies on the two interfaces above.
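
As a rough illustration of how the interface levels relate, the sketch below builds a plot with ```bokeh.plotting``` and then inspects the ```bokeh.models``` objects behind it. The data values and the title are made up for illustration.

```python
from bokeh.models import GlyphRenderer, Plot
from bokeh.plotting import figure

# figure() comes from the intermediate-level bokeh.plotting interface
p = figure(title="Interface check")

# line() adds a glyph and returns the renderer that draws it
renderer = p.line([1, 2, 3], [4, 5, 6], line_width=2)

# the objects created above are low-level bokeh.models objects
print(isinstance(p, Plot))
print(isinstance(renderer, GlyphRenderer))
print(type(renderer.glyph).__name__)
```

In other words, ```bokeh.plotting``` is a convenience layer: it assembles the same model objects that we could otherwise create by hand with ```bokeh.models```.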

## **II. BOKEH Dependencies**

Before we get started with Bokeh, we need to have NumPy and Pandas installed on our machine. We then import these data handling libraries.

In [1]:
import pandas as pd
import numpy as np

## **III. Installing BOKEH Module**

The simplest way to install Bokeh and its dependencies is by using ```conda``` or ```pip```.


In [None]:
pip install bokeh
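
Once the install command finishes, one quick sanity check is to import the package and print its version. This is only a minimal sketch; the version printed depends on what was installed on your machine.

```python
import bokeh

# if the import succeeds, the package is installed; the version string
# tells us which release the rest of the examples will run against
print(bokeh.__version__)
```

Knowing the installed version is useful later on, because some keyword arguments differ between Bokeh releases.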

## **IV. Verifying Installation of Bokeh**

We could verify the installation with a few commands, but instead we will run a very small program that produces Bokeh output if the installation is successful. We will generate an empty Bokeh plot and render it in a static HTML page.

But first we need to import the required Bokeh functions.

In [4]:
from bokeh.io import output_file
from bokeh.plotting import figure, show

In order to render the plot in a static HTML page, we need to use the following command.

In [5]:
output_file('output_file_test.html', 
            title='Empty Bokeh Plot')


Set up a generic figure object and then see what it looks like.

In [6]:
fig = figure()
show(fig)



An empty Bokeh plot is rendered in a static HTML page as follows.

![image](https://drive.google.com/uc?id=1WwhF98ZSdOt9Y3B2WZTn0wZ8rgfPP578)

Hence, we have verified the installation of Bokeh.

## **V. Python Bokeh Examples** (Graphs and Plots)

Now we will look at some examples of how various graphs and plots can be represented using Bokeh.
* **Plotting a simple line graph.**

A simple step-by-step tutorial will be given on how to plot a simple line graph.




**STEP 1 : Import the necessary functions from the bokeh.plotting module.**

In [14]:
from bokeh.plotting import figure, show

**STEP 2 : Prepare some data. To do this, define two lists which contain the data for the line chart.**

In [15]:
# prepare some data
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

**STEP 3 : Use the figure() function to create the plot. The following arguments need to be passed:**
- ```title```: Title of our line chart.
- ```x_axis_label```: Used to label the X-axis of the plot.
- ```y_axis_label```: Used to label the Y-axis of the plot.

In [16]:
# create a new plot with a title and axis labels
p = figure(title="Simple line example", x_axis_label='x', y_axis_label='y')

**STEP 4 : Now we will add a line graph to the plot we created, using the line() function and passing the following arguments:**
- ```legend_label```: Adds a legend entry for the line graph.
- ```line_width```: Defines the width of the line.

In [17]:
# add a line renderer with legend and line thickness to the plot
p.line(x, y, legend_label="Temp.", line_width=2)

**STEP 5 : Use the output_file() function to name the static HTML file, then use the show() function to generate the graph and open a web browser to display it.**

In [19]:
# output to static HTML file (output_file was imported from bokeh.io earlier)
output_file("lines.html")

# show the results
show(p)

*The same steps will be followed for other examples as well.*

 A simple line graph is rendered in a static HTML page as shown below:
 ![image](https://drive.google.com/uc?id=1tXw9hxs8hQuHafe4pEFq_AaXLmKPJMWD)
 
*By using the figure() function and its parameters, we are able to provide titles for the axes and give more description of the data we wish to represent.*
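
The show() function opens a browser window, which is not always wanted, for example in a script that runs on a server. A possible alternative, sketched below, is the save() function from ```bokeh.io```, which only writes the HTML file. Writing to the system temporary directory is an assumption made for this sketch; any writable path would do.

```python
import os
import tempfile

from bokeh.io import save
from bokeh.plotting import figure
from bokeh.resources import CDN

# same data and plot as in the steps above
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]
p = figure(title="Simple line example", x_axis_label="x", y_axis_label="y")
p.line(x, y, legend_label="Temp.", line_width=2)

# write the plot to an HTML file without opening a browser;
# the temporary directory is used here only for illustration
out_path = os.path.join(tempfile.gettempdir(), "lines.html")
saved_path = save(p, filename=out_path, resources=CDN, title="Simple line example")
print(saved_path)
```

The resulting file can be opened in a browser later or shared like any other HTML page.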

* **Multiple Plots**

Multiple plots can also be represented easily using Bokeh. The following steps are to be followed.

**STEP 1 : Add more data as the basis of additional graphs.**


In [20]:
x = [1, 2, 3, 4, 5]
y1 = [6, 7, 2, 4, 5]
y2 = [2, 3, 4, 5, 6]
y3 = [4, 5, 5, 7, 2]

**STEP 2 : Change the title from 'Simple line example' to 'Multiple line example'.**

In [21]:
# create a new plot with a title and axis labels
p = figure(title="Multiple line example", x_axis_label='x', y_axis_label='y')

**STEP 3 : Add more calls to the line() function.**

In [22]:
# add multiple renderers
p.line(x, y1, legend_label="Temp.", color="blue", line_width=2)
p.line(x, y2, legend_label="Rate", color="red", line_width=2)
p.line(x, y3, legend_label="Objects", color="green", line_width=2)

**STEP 4 : Use the show() function to see the results.**

In [23]:
# show the results
show(p)

*We get to see the following graph:*
![image](https://drive.google.com/uc?id=13FYq7q7t-430xTZityTOY2Eq3dDBF6iF)

Similarly, we can plot multiple graphs that combine lines with other glyphs and shapes.


In [10]:
# prepare some data
x = [0.1, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y0 = [i**2 for i in x]
y1 = [10**i for i in x]
y2 = [10**(i**2) for i in x]

# output to static HTML file
output_file("log_lines.html")

# create a new plot
p = figure(
   tools="pan,box_zoom,reset,save",
   y_axis_type="log", y_range=[0.001, 10**11], title="log axis example",
   x_axis_label='sections', y_axis_label='particles'
)

# add some renderers (legend_label replaces the older legend keyword)
p.line(x, x, legend_label="y=x")
p.circle(x, x, legend_label="y=x", fill_color="white", size=8)
p.line(x, y0, legend_label="y=x^2", line_width=3)
p.line(x, y1, legend_label="y=10^x", line_color="red")
p.circle(x, y1, legend_label="y=10^x", fill_color="red", line_color="red", size=6)
p.line(x, y2, legend_label="y=10^x^2", line_color="orange", line_dash="4 4")

# show the results
show(p)




This gives us a multiple plot rendered in a static HTML page.
![image](https://drive.google.com/uc?id=10QoD2Cq3aHw7koMtNYyTgK558LbVA3vJ)

By using multiple plots, we get a more customised graph with legends and line colours.

This lets us differentiate between multiple line plots on the same graph.
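
Since the legend is what tells the lines apart, it can help to adjust it. The sketch below reuses the data of the multiple line example and sets a legend position, a legend title and a click policy. The title text "Series" and the chosen position are illustrative assumptions.

```python
from bokeh.plotting import figure

x = [1, 2, 3, 4, 5]
p = figure(title="Multiple line example", x_axis_label="x", y_axis_label="y")
p.line(x, [6, 7, 2, 4, 5], legend_label="Temp.", color="blue", line_width=2)
p.line(x, [2, 3, 4, 5, 6], legend_label="Rate", color="red", line_width=2)
p.line(x, [4, 5, 5, 7, 2], legend_label="Objects", color="green", line_width=2)

# move the legend and give it a heading
p.legend.location = "top_left"
p.legend.title = "Series"

# clicking a legend entry hides or shows the matching line
p.legend.click_policy = "hide"
```

Calling show(p) afterwards displays the plot with the adjusted legend.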

* **Rendering Bar Graphs**

By using the same procedure as stated above we can represent bar graphs using BOKEH. We just need to execute the following code.

In [24]:
from bokeh.plotting import figure, show

# prepare some data
x = [1, 2, 3, 4, 5]
y1 = [6, 7, 2, 4, 5]
y2 = [2, 3, 4, 5, 6]
y3 = [4, 5, 5, 7, 2]

# create a new plot with a title and axis labels
p = figure(title="Bar graph example", x_axis_label="x", y_axis_label="y")

# add bar graph

p.vbar(x=x, top=y2, legend_label="Rate", width=0.5, bottom=0, color="red")


# show the results
show(p)

A bar graph is rendered on a static HTML page as follows:
![image](https://drive.google.com/uc?id=1tgNeKTGHxP5nSiWnWdBsYOC7Qohmm8Lv)
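
The bar graph above uses numeric x values. Bar charts often have category names on the x-axis instead, and a possible way to do that is to pass the list of categories as ```x_range```. The category names and counts below are made-up values for illustration only.

```python
from bokeh.plotting import figure

# hypothetical categories and counts, used only for this sketch
categories = ["sedan", "hatchback", "wagon"]
counts = [5, 3, 2]

# passing a list of strings as x_range makes the x-axis categorical
p = figure(x_range=categories, title="Categorical bar example",
           x_axis_label="body style", y_axis_label="count")
p.vbar(x=categories, top=counts, width=0.8, color="red")

# start the bars at zero and drop the vertical grid lines
p.y_range.start = 0
p.xgrid.grid_line_color = None
```

As before, show(p) renders the result.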

* **Vectorized Colours and Sizes**

When we need to plot a large amount of data, we may want to vary the colour and size of each point so that more information is visible at once.

In [11]:
# prepare some data
N = 4000
x = np.random.random(size=N) * 100
y = np.random.random(size=N) * 100
radii = np.random.random(size=N) * 1.5
colors = [
    "#%02x%02x%02x" % (int(r), int(g), 150) for r, g in zip(50+2*x, 30+2*y)
]

# output to static HTML file (with CDN resources)
output_file("color_scatter.html", title="color_scatter.py example", mode="cdn")
TOOLS="crosshair,pan,wheel_zoom,box_zoom,reset,box_select,lasso_select"

# create a new plot with the tools above, and explicit ranges
p = figure(tools=TOOLS, x_range=(0,100), y_range=(0,100))

# add a circle renderer with vectorized colors and sizes
p.circle(x,y, radius=radii, fill_color=colors, fill_alpha=0.6, line_color=None)

# show the results
show(p)


![image](https://drive.google.com/uc?id=1VDJjsgejb9erxw8dC1dGnuSDDP9T00bn)

Vectorized graphs are applicable in scenarios like:
* Showing heatmap-like data.
* Showing data where the density of points depends on some parameters.
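
The ```colors``` list in the cell above is built with a string format that turns red, green and blue values into a hex colour. The small sketch below isolates that step for a single point. The helper name ```rgb_to_hex``` and the sample coordinates are introduced here for illustration and are not part of Bokeh.

```python
def rgb_to_hex(r, g, b):
    """Format three channel values (0-255) as a hex colour string."""
    return "#%02x%02x%02x" % (int(r), int(g), int(b))

# one sample point; the scatter example applies the same formula to every point
x_val, y_val = 10.0, 20.0
color = rgb_to_hex(50 + 2 * x_val, 30 + 2 * y_val, 150)
print(color)  # "#464696"
```

Because x and y lie between 0 and 100 in that example, the red channel stays within 50 to 250 and the green channel within 30 to 230, so both remain valid colour values.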

* **Linked Panning and Brushing**

A useful technique in data visualization is linking plots, so that panning or zooming one plot updates the others that share its ranges.

This can be achieved in Bokeh by executing the following lines of code.

In [13]:
from bokeh.layouts import gridplot

# prepare some data
N = 100
x = np.linspace(0, 4*np.pi, N)
y0 = np.sin(x)
y1 = np.cos(x)
y2 = np.sin(x) + np.cos(x)

# output to static HTML file
output_file("linked_panning.html")

# create a new plot
s1 = figure(width=250, height=250, title=None)
s1.circle(x, y0, size=10, color="navy", alpha=0.5)

# NEW: create a new plot and share both ranges
s2 = figure(width=250, height=250, x_range=s1.x_range, y_range=s1.y_range, title=None)
s2.triangle(x, y1, size=10, color="firebrick", alpha=0.5)

# NEW: create a new plot and share only one range
s3 = figure(width=250, height=250, x_range=s1.x_range, title=None)
s3.square(x, y2, size=10, color="olive", alpha=0.5)

# NEW: put the subplots in a gridplot
p = gridplot([[s1, s2, s3]], toolbar_location=None)

# show the results
show(p)


We get a graph rendered in a static HTML page as follows:

![image](https://drive.google.com/uc?id=1Qf4dP_riMEAdyk-BJ1J71Ng6_shwaE1t)

These kinds of plots are helpful when we need to show how one parameter varies with others.
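
The cell above links panning by sharing ranges. Linked brushing, where a selection made in one plot is highlighted in the others, is usually set up differently: the plots share one ```ColumnDataSource```. The following is a minimal sketch under that assumption, with made-up data and with the ```width``` and ```height``` keywords as used elsewhere in this tutorial.

```python
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# illustrative data: two different y columns over the same x values
x = list(range(-20, 21))
source = ColumnDataSource(data=dict(
    x=x,
    y0=[abs(v) for v in x],
    y1=[v ** 2 for v in x],
))

TOOLS = "box_select,lasso_select,reset"

# both plots read from the same source, so a selection is shared
left = figure(tools=TOOLS, width=300, height=300, title=None)
r_left = left.scatter("x", "y0", source=source)

right = figure(tools=TOOLS, width=300, height=300, title=None)
r_right = right.scatter("x", "y1", source=source)

grid = gridplot([[left, right]])
```

After show(grid), selecting points with the box or lasso tool in one plot highlights the same rows in the other.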

## Benefits of BOKEH:

* Bokeh permits us to build complex statistical plots quickly via some simple commands.
* Via Bokeh, we can present our visualizations in various media, such as notebooks and static HTML pages.
* Bokeh has the benefit of applying interactivity and other styling options.

## Challenges with BOKEH:
Bokeh is an up-and-coming open source library, which means that it is still under active development.
Its API can change between releases, so the code which we are writing today may need updating in the future.

**In this tutorial we learnt how to incorporate the Bokeh library into data visualizations. Bokeh helps us to visualize data in a more attractive manner and make readable visualizations.**

Refer to the official Bokeh documentation for more tutorials <a href="https://docs.bokeh.org/en/latest/index.html">here.</a>


# Working with a dataset
Now that we know the basics of Bokeh, we can proceed to work with a dataset and produce visualizations using Bokeh.

We will be working with the dataset Automobile_data.csv and produce various visualized patterns.

**First, we will import the NumPy and Pandas libraries.**

In [1]:
import numpy as np 
import pandas as pd

**Now we will read our dataset.**

In [2]:
auto_df = pd.read_csv("Automobile_data.csv")

**Now we will import the required functions and classes from Bokeh.**

In [3]:
from bokeh.io import output_file,show,output_notebook,push_notebook
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource,HoverTool,CategoricalColorMapper
from bokeh.layouts import row,column,gridplot
from bokeh.models.widgets import Tabs,Panel
output_notebook()

## Basic Glyphs

We will now produce some simple visualizations using Bokeh.

In [5]:
# Scatter Markers

p = figure(width=500, height=500)

# add a circle renderer with a size, color, and alpha
p.circle(auto_df["engine-size"], auto_df["wheel-base"], size=22, color="red", alpha=0.5)

# show the results
show(p)

In [6]:
p = figure(width=500, height=500)

# add a square renderer with a size, color, and alpha
p.square(auto_df["engine-size"], auto_df["wheel-base"], size=20, color="purple", alpha=0.5)

# show the results
show(p)
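
The HoverTool imported earlier can be used to show the values behind each marker. The sketch below uses a few made-up rows with the same column names as the dataset, since the real file is not assumed to be available here. Column names that contain a hyphen are written as ```@{column-name}``` in the tooltip.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# hypothetical sample rows that mimic the dataset's column names
source = ColumnDataSource(data={
    "engine-size": [97, 109, 130, 152],
    "wheel-base": [93.7, 96.5, 99.8, 104.3],
    "body-style": ["hatchback", "sedan", "sedan", "wagon"],
})

# each tooltip row is a (label, value) pair; @{...} refers to a column
hover = HoverTool(tooltips=[
    ("Body style", "@{body-style}"),
    ("Engine size", "@{engine-size}"),
    ("Wheel base", "@{wheel-base}"),
])

p = figure(width=500, height=500, tools="pan,reset")
p.add_tools(hover)
p.scatter("engine-size", "wheel-base", source=source, size=22, color="red", alpha=0.5)
```

With show(p), hovering over a marker displays the three values for that row.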

In [7]:
from bokeh.transform import factor_cmap, factor_mark
BODY_STYLE = ['sedan', 'hatchback', 'Other']
MARKERS = ['hex', 'circle_x', 'triangle']

p = figure(title = "Automobile Dataset")
p.xaxis.axis_label = 'Engine Size'
p.yaxis.axis_label = 'Wheel Base'

p.scatter("engine-size", "wheel-base", source=auto_df, legend_field="body-style", fill_alpha=0.4, size=12,
          marker=factor_mark('body-style', MARKERS, BODY_STYLE),
          color=factor_cmap('body-style', 'Category10_3', BODY_STYLE))

show(p)
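
In the cell above the DataFrame is passed directly as ```source```, and Bokeh converts it into a ColumnDataSource for us. The sketch below performs that conversion explicitly on a small made-up DataFrame, to show which columns the renderers can then refer to by name.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# hypothetical sample rows that mimic the dataset's column names
sample_df = pd.DataFrame({
    "engine-size": [97, 109, 130],
    "wheel-base": [93.7, 96.5, 99.8],
    "body-style": ["hatchback", "sedan", "sedan"],
})

source = ColumnDataSource(sample_df)

# every DataFrame column becomes an entry in source.data,
# and the DataFrame index is added as an extra "index" column
print(sorted(source.data.keys()))
print(list(source.data["engine-size"]))
```

Creating the ColumnDataSource once and reusing it also lets several renderers share the same data.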

## Filtering Data
With the help of Bokeh, we can filter our data as per our requirements. To plot a certain subset of data, we create a ```CDSView``` and pass it as the ```view``` argument to the renderer-adding methods.

The ```CDSView``` has two properties, ```source``` and ```filters```. ```source``` is the ```ColumnDataSource``` that the view is associated with. ```filters``` is a list of Filter objects, listed and described below.

* **IndexFilter:** The IndexFilter is the simplest filter type. It has an ```indices``` property which is a list of integers that are the indices of the data you want to be included in the plot.

* **BooleanFilter:** A BooleanFilter selects rows from a data source through a list of True or False values in its ```booleans``` property.

* **GroupFilter:** The GroupFilter allows you to select rows from a dataset that have a specific value for a categorical variable. The GroupFilter has two properties: ```column_name```, the name of a column in the ColumnDataSource, and ```group```, the value of the column to select for.
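
Before attaching them to a view, it can help to see the three filter objects on their own. The sketch below constructs each one for a small made-up DataFrame; the sample rows and the threshold of 100 are illustrative assumptions, and the filters are only created here, not yet used in a plot.

```python
import pandas as pd
from bokeh.models import BooleanFilter, GroupFilter, IndexFilter

# hypothetical sample rows that mimic the dataset's column names
sample_df = pd.DataFrame({
    "wheel-base": [88.6, 94.5, 99.8, 105.8, 110.0],
    "body-style": ["hatchback", "sedan", "hatchback", "sedan", "wagon"],
})

# IndexFilter: keep rows by position
index_filter = IndexFilter(indices=[0, 2])

# BooleanFilter: one True/False value per row
booleans = (sample_df["wheel-base"] > 100).tolist()
boolean_filter = BooleanFilter(booleans=booleans)

# GroupFilter: keep rows whose column equals the given value
group_filter = GroupFilter(column_name="body-style", group="hatchback")

print(list(index_filter.indices))
print(list(boolean_filter.booleans))
print(group_filter.column_name, group_filter.group)
```

Each of these objects is then handed to a ```CDSView```, as the examples that follow do with the real dataset.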

**IndexFilter Example**

In [8]:
from bokeh.models import CDSView, IndexFilter

source = ColumnDataSource(auto_df)
# keep only the rows at these positions
view = CDSView(source=source, filters=[IndexFilter([70, 90, 110, 130])])

tools = ["box_select", "hover", "reset"]
# left plot: the full data set
p = figure(height=300, width=300, tools=tools)
p.circle(x="engine-size", y="wheel-base", size=10, hover_color="red", source=source)

# right plot: the same source, restricted by the view
p_filtered = figure(height=300, width=300, tools=tools)
p_filtered.circle(x="engine-size", y="wheel-base", size=10, hover_color="red", source=source, view=view)

show(gridplot([[p, p_filtered]]))


**BooleanFilter Example**

In [9]:
# Boolean Filter 

from bokeh.models import BooleanFilter

booleans = [True if y_val > 110 else False for y_val in source.data['wheel-base']]
view = CDSView(source=source, filters=[BooleanFilter(booleans)])

tools = ["box_select", "hover", "reset"]
p = figure(height=300, width=300, tools=tools)
p.circle(x="engine-size", y="wheel-base", size=10, hover_color="red", source=source)

p_filtered = figure(height=300, width=300, tools=tools,
                    x_range=p.x_range, y_range=p.y_range)
p_filtered.circle(x="engine-size", y="wheel-base", size=10, hover_color="red", source=source, view=view)

show(gridplot([[p, p_filtered]]))

**GroupFilter Example**

In [10]:
#GroupFilter

from bokeh.models import GroupFilter

view1 = CDSView(source=source, filters=[GroupFilter(column_name='body-style', group='hatchback')])
plot_size_and_tools = {'height': 300, 'width': 300,
                        'tools': ['box_select', 'reset', 'help']}

p1 = figure(title="Full data set", **plot_size_and_tools)
p1.circle(x='engine-size', y='wheel-base', source=source, color='black')

p2 = figure(title="Hatchbacks only", x_range=p1.x_range, y_range=p1.y_range, **plot_size_and_tools)
p2.circle(x='engine-size', y='wheel-base', source=source, view=view1, color='red')

show(gridplot([[p1, p2]]))

For further tutorials, refer to this <a href="https://www.kaggle.com/saurav9786/data-visualization-with-bokeh/notebook">Kaggle Notebook</a>.