# Bokeh

Bokeh is an interactive visualization library that targets modern web browsers for presentation. Bokeh is a tool in the Charting Libraries category of a tech stack. It is good for:<br>
- Interactive visualization<br>
- Standalone HTML documents, or server-backed apps<br>
- Expressive and versatile graphics<br>
- Large, dynamic or streaming data<br>
- Easy usage from Python (or Scala, or R, or...)



Bokeh currently has two main interfaces:<br>
  - bokeh.models - A low-level interface that provides the most flexibility to application developers.<br>
  - bokeh.plotting - A higher-level interface centered around composing visual glyphs.<br>

For installation, get the package from the Python Package Index (pypi.org) and see the __installation__ instructions for Bokeh.
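
As a rough illustration of how the two interfaces relate, the sketch below builds a bar chart with bokeh.plotting and then inspects the low-level bokeh.models object it created. The sample values come from the drinks table that appears later in this document; the title and bar width are arbitrary choices made for this example.

```python
from bokeh.models import VBar
from bokeh.plotting import figure

# Sample values from the drinks dataset used later in this document.
countries = ["Albania", "Algeria", "Andorra", "Angola"]
litres = [4.9, 0.7, 12.4, 5.9]

# High-level interface: a figure plus one glyph method call.
p = figure(x_range=countries, title="Litres of pure alcohol per person")
renderer = p.vbar(x=countries, top=litres, width=0.6)

# The glyph method created a low-level bokeh.models object for us.
print(type(renderer.glyph).__name__)
```

Building the same chart from bokeh.models alone would mean creating the ranges, axes, glyph and renderer by hand, which is where the extra flexibility (and extra work) comes from.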

Alternatives to Bokeh:<br>
   - Plotly<br>
   - Matplotlib<br>
   - Tableau, etc.

According to the StackShare community, Matplotlib has broader approval: it is mentioned in 10 company stacks and 19 developer stacks, while Bokeh is listed in 4 company stacks and 7 developer stacks.<br>
- __Matplotlib__ is modeled on the plotting capabilities of MATLAB, the scientific programming, analysis and plotting environment and programming language. Its default output is static, and it __does not__ focus on generating interactive plots that can be viewed in the web browser.<br>

When it comes to Tableau, it's like comparing apples and oranges:<br>
- __Tableau__ can help anyone see and understand their data. Connect to almost any database, drag and drop to create visualizations, and share with a click.<br>
    - From personal experience, I would say troubleshooting is hard when integrating with databases. I am not sure if that was because I was using the free version, but I still couldn't find enough resources.
- __Bokeh__ - Its goal is to provide elegant, concise construction of __versatile graphics__, and to extend this capability with __high-performance__ interactivity over very large or streaming datasets. It can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications.<br>
  - It is not easy to learn each new function, and doing so takes time.
  - It provides the most important types of graph, all rendered in line with the modern HTML5 web standard, which has a speed advantage. However, some older browsers do not support the HTML5 standard.

As for photography: Bokeh, also known as “Boke”, is one of the most popular subjects in photography. It is popular because Bokeh makes photographs visually appealing, drawing our attention to a particular area of the image.

__To run from the terminal__<br>
bokeh serve --show filename.ipynb
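
A minimal sketch of the kind of script the bokeh serve command can run. The file name app.py and the sample points are hypothetical, chosen only for this example; the key step is attaching the plot to the current document with curdoc().

```python
# app.py (hypothetical file name), started with: bokeh serve --show app.py
from bokeh.io import curdoc
from bokeh.plotting import figure

# A small line plot with made-up sample points.
p = figure(title="Served by bokeh serve")
p.line([1, 2, 3], [4, 6, 5])

# The server renders whatever has been added to the current document.
curdoc().add_root(p)
```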

__ColumnDataSource__ is the core of most Bokeh plots, providing the data that is visualized by the glyphs of the plot. With the ColumnDataSource, it is easy to share data between multiple plots and widgets, such as the DataTable.
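
A small sketch of that sharing, assuming two bar charts drawn from one ColumnDataSource. The values are sample rows from the drinks dataset used in this document; the titles, the tap tool and the bar width are choices made for this example.

```python
from bokeh.layouts import row
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# One data source, referenced by column name from both plots.
source = ColumnDataSource(data={
    "country": ["Albania", "Algeria", "Andorra", "Angola"],
    "beer_servings": [89, 25, 245, 217],
    "wine_servings": [54, 14, 312, 45],
})
countries = list(source.data["country"])

beer = figure(x_range=countries, title="Beer servings", tools="tap")
beer_bars = beer.vbar(x="country", top="beer_servings", width=0.6, source=source)

wine = figure(x_range=countries, title="Wine servings", tools="tap")
wine_bars = wine.vbar(x="country", top="wine_servings", width=0.6, source=source)

# Both renderers point at the same source, so a selection made in one
# plot is reflected in the other.
layout = row(beer, wine)
```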

Dataset -> https://fivethirtyeight.com/features/dear-mona-followup-where-do-people-drink-the-most-beer-wine-and-spirits/<br>
From the article: "WHO presents the data in liters of pure alcohol consumed in 2010. But to make the comparisons more comprehensible (after all, when was the last time you ordered a liter of pure alcohol?), I’ve taken average alcohol content and average serving size for each beverage and converted those numbers into standard serving sizes. The results show how many glasses of wine, cans of beer and shots of spirits were drunk per person in each country in 2010. Here are the countries that drink the most of each drink in per-capita terms (you can find the full results in the lengthy table at the bottom of this article)." Refer to the dataset link provided above.

In [1]:
import pandas as pd

In [2]:
df_drinks = pd.read_csv("/Users/mtessema/Desktop/PY/drinks.csv", skipinitialspace=True, engine='python')

In [3]:
df_drinks

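| Unnamed: 0 | country | beer_servings | spirit_servings | wine_servings | total_litres_of_pure_alcohol |
|---|---|---|---|---|---|
| 0 | Afghanistan | 0 | 0 | 0 | 0.0 |
| 1 | Albania | 89 | 132 | 54 | 4.9 |
| 2 | Algeria | 25 | 0 | 14 | 0.7 |
| 3 | Andorra | 245 | 138 | 312 | 12.4 |
| 4 | Angola | 217 | 57 | 45 | 5.9 |
| ... | ... | ... | ... | ... | ... |
| 188 | Venezuela | 333 | 100 | 3 | 7.7 |
| 189 | Vietnam | 111 | 2 | 1 | 2.0 |
| 190 | Yemen | 6 | 0 | 0 | 0.1 |
| 191 | Zambia | 32 | 19 | 4 | 2.5 |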


__=>__ I have taken the head and tail of the dataset to build the examples.

In [4]:
from bokeh.io import output_notebook, show
output_notebook()

In [5]:
import numpy as np 
from bokeh.plotting import figure
from bokeh.io import output_file, show
from bokeh.models.tools import HoverTool

In [6]:
import psycopg2
from sqlalchemy import create_engine
# postgres db connection

In [7]:
engine = create_engine('postgresql://postgres:zipcoder@localhost/airflow_test')
# created engine

In [8]:
df_pi=pd.read_sql('select * from drinkers where total_litres_of_pure_alcohol >0.0 and total_litres_of_pure_alcohol < 1.0 order by "total_litres_of_pure_alcohol" ASC;', engine)
# rows with total_litres_of_pure_alcohol between 0.0 and 1.0 (both exclusive)

In [9]:
pidict = dict(zip(df_pi.country, df_pi.total_litres_of_pure_alcohol))
# created dictionary for the pie chart(data source)

In [10]:
from math import pi
from bokeh.palettes import Category20c, viridis
from bokeh.transform import cumsum

output_file("pie.html")

x = {
 'Timor-Leste': 0.1,
 'Saudi Arabia': 0.1,
 'Indonesia': 0.1,
 'Myanmar': 0.1,
 'Comoros': 0.1,
 'Yemen': 0.1,
 'Niger': 0.1,
 'Iraq': 0.2,
 'Egypt': 0.2,
 'Nepal': 0.2,
 'Guinea': 0.2,
 'Malaysia': 0.3,
 'Senegal': 0.3,
 'Tajikistan': 0.3,
 'Bhutan': 0.4,
 'Chad': 0.4,
 'Morocco': 0.5,
 'Eritrea': 0.5,
 'Jordan': 0.5,
 'Brunei': 0.6,
 'Mali': 0.6,
 'Algeria': 0.7,
 'Ethiopia': 0.7,
 'Oman': 0.7,
 'Madagascar': 0.8,
 'Vanuatu': 0.9,
 'Qatar': 0.9}

data = pd.Series(x).reset_index(name='value').rename(columns={'index':'country'})
data['angle'] = data['value']/data['value'].sum() * 2*pi
# one distinct color per wedge; viridis(28)[len(x)] would pick a single color for all rows
data['color'] = list(viridis(len(x)))

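# height replaces plot_height, which Bokeh 3.0 removed
p = figure(height=550, title="Pie Chart", toolbar_location=None,
           tools="hover", tooltips="@country: @value", x_range=(-0.5, 0.5))

# each wedge runs from the cumulative angle before it to the cumulative angle after it
p.wedge(x=0, y=1, radius=0.4,
        start_angle=cumsum('angle', include_zero=True), end_angle=cumsum('angle'),
        fill_color='color',
        source=data)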
p.axis.axis_label=None
p.axis.visible=False
p.grid.grid_line_color = None
# tools="hover" with tooltips above already adds the hover tool,
# so no extra HoverTool is needed here
show(p)
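
The angle column in the pie chart cell is plain arithmetic: each value's share of the total, multiplied by 2π. The sketch below works it through with pandas alone for four countries taken from the dict above. The start and end column names are introduced only for this example, to mimic what the cumsum transform computes for start_angle (with include_zero=True) and end_angle.

```python
from math import isclose, pi

import pandas as pd

# Four entries copied from the pie chart dict.
values = pd.Series({"Morocco": 0.5, "Eritrea": 0.5, "Jordan": 0.5, "Brunei": 0.6})
data = values.reset_index(name="value").rename(columns={"index": "country"})

# Each wedge's angle is its share of the total, scaled to a full circle.
data["angle"] = data["value"] / data["value"].sum() * 2 * pi

# A running total gives where each wedge ends; subtracting the wedge's
# own angle gives where it starts.
data["end"] = data["angle"].cumsum()
data["start"] = data["end"] - data["angle"]

print(data[["country", "start", "end"]])
```

The first wedge starts at 0 and the last one ends at 2π, so the wedges close the circle without gaps or overlaps.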

In [11]:
df_sec=pd.read_sql('select * from drinkers where total_litres_of_pure_alcohol >= 6.5 and total_litres_of_pure_alcohol < 10.0 order by "total_litres_of_pure_alcohol" ASC;', engine)
#df_sec
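
The two range queries differ only in their bounds, so they could be wrapped in one parameterized helper. The sketch below is an assumption-heavy stand-in: it uses an in-memory SQLite table instead of the Postgres drinkers table, a hypothetical helper named countries_between, and SQLite's ? placeholders (other drivers use a different placeholder style). The rows are sample values from the drinks dataset.

```python
import sqlite3

import pandas as pd

# In-memory SQLite table standing in for the Postgres "drinkers" table.
conn = sqlite3.connect(":memory:")
pd.DataFrame({
    "country": ["Albania", "Andorra", "Angola", "Venezuela", "Vietnam"],
    "total_litres_of_pure_alcohol": [4.9, 12.4, 5.9, 7.7, 2.0],
}).to_sql("drinkers", conn, index=False)


def countries_between(connection, low, high):
    """Return rows with low <= total_litres_of_pure_alcohol < high, ascending."""
    query = (
        "SELECT * FROM drinkers "
        "WHERE total_litres_of_pure_alcohol >= ? "
        "AND total_litres_of_pure_alcohol < ? "
        "ORDER BY total_litres_of_pure_alcohol ASC"
    )
    # Bounds are passed as parameters instead of being pasted into the SQL.
    return pd.read_sql(query, connection, params=(low, high))


df_sec = countries_between(conn, 6.5, 10.0)
print(df_sec)
```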

In [12]:
from bokeh.models import ColumnDataSource
from bokeh.transform import factor_cmap

source = ColumnDataSource(df_sec)
output_file('index.html')
country_list = source.data['country'].tolist()

p = figure(
    y_range=country_list,
    # width/height replace plot_width/plot_height, which Bokeh 3.0 removed
    width=1000,
    height=1200,
    title='Total alcohol consumption per country (6.5-10)',
    x_axis_label='total_litres_of_pure_alcohol',
    tools='pan, box_select, zoom_in, zoom_out, save, reset')
# Render glyph: one horizontal bar per country
p.hbar(
    y='country',
    right='total_litres_of_pure_alcohol',
    left=0,
    height=0.2,
    # color='orange',
    # map each country to its own color; the palette is sized to the data
    fill_color=factor_cmap(
        'country',
        palette=viridis(len(country_list)),
        factors=country_list
    ),
    fill_alpha=0.9,
    source=source,
    # legend_field replaces the legend keyword, deprecated since Bokeh 1.4
    legend_field='country'
)
# add legend
p.legend.orientation = 'vertical'
p.legend.location = 'bottom_right'
p.legend.label_text_font_size = '6px'
# add tooltips (hover)
hover = HoverTool()
hover.tooltips = """
    <div>
        <h3>@country</h3>
        <div><strong>tot: </strong>@total_litres_of_pure_alcohol</div>
        <div><strong>spirit_servings: </strong>@spirit_servings</div>
        <div><strong>beer_servings: </strong>@beer_servings</div>
        <div><strong>wine_servings: </strong>@wine_servings</div>
    </div>
"""
p.add_tools(hover)

show(p)
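
For readers without the Postgres database, the same hbar and factor_cmap pattern can be tried on a few rows typed in by hand. This is a minimal sketch, not the notebook's exact chart: the four rows are sample values from the drinks dataset, and the title, bar height and tooltip string are choices made for this example.

```python
from bokeh.models import ColumnDataSource
from bokeh.palettes import viridis
from bokeh.plotting import figure
from bokeh.transform import factor_cmap

# Sample rows from the drinks dataset.
source = ColumnDataSource(data={
    "country": ["Albania", "Algeria", "Andorra", "Angola"],
    "total_litres_of_pure_alcohol": [4.9, 0.7, 12.4, 5.9],
})
countries = list(source.data["country"])

p = figure(
    y_range=countries,
    title="Total alcohol consumption (sample rows)",
    tools="hover",
    tooltips="@country: @total_litres_of_pure_alcohol",
)

# One palette entry per country, so every bar gets its own color.
palette = viridis(len(countries))
p.hbar(
    y="country",
    right="total_litres_of_pure_alcohol",
    left=0,
    height=0.5,
    fill_color=factor_cmap("country", palette=palette, factors=countries),
    source=source,
)
```

Passing p to show() would render the chart in the browser or notebook, as in the cells above.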