In [1]:
from bokeh.io import show, output_notebook
from bokeh.models import ColumnDataSource, FactorRange
from bokeh.plotting import figure
import pandas as pd

output_notebook()

#### Data for wine sales
Made-up data on the number of bottles of wine sold in a store over a 3-year period. The Origin and Type fields are both categorical, while Quantity is the only numeric field.

In [2]:
sales = pd.read_csv('datasets/wine_sales.csv')
sales

Unnamed: 0,Origin,Type,Quantity
0,France,Red,2500
1,Italy,Red,2100
2,Australia,Red,1300
3,USA,Red,2800
4,Chile,Red,1700
5,France,White,2200
6,Italy,White,2600
7,Australia,White,1500
8,USA,White,2100
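9,Chile,White,1700
10,France,Sparkling,1300
11,Italy,Sparkling,900
12,Australia,Sparkling,800
13,USA,Sparkling,1100
14,Chile,Sparkling,600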


#### Converting categories to tuples
Each category is a tuple of (primary group, subgroup), in that order. We build the list of categories by iterating over the rows of the Origin and Type columns in the data.

In [3]:
categories = [tuple(x) for x in sales[['Origin', 'Type']].values]
categories

[('France', 'Red'),
 ('Italy', 'Red'),
 ('Australia', 'Red'),
 ('USA', 'Red'),
 ('Chile', 'Red'),
 ('France', 'White'),
 ('Italy', 'White'),
 ('Australia', 'White'),
 ('USA', 'White'),
 ('Chile', 'White'),
 ('France', 'Sparkling'),
 ('Italy', 'Sparkling'),
 ('Australia', 'Sparkling'),
 ('USA', 'Sparkling'),
 ('Chile', 'Sparkling')]
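
A minimal sketch of the same pairing in plain Python, assuming the two columns are available as ordinary lists. The names `origins`, `types` and `categories_sketch` are illustrative, and only the first three rows are used. `zip` walks both columns in step and yields one (group, subgroup) tuple per row:

```python
# Hypothetical stand-ins for the first rows of the Origin and Type columns
origins = ['France', 'Italy', 'Australia']
types = ['Red', 'Red', 'Red']

# Pair the columns row by row; each tuple is (primary group, subgroup)
categories_sketch = list(zip(origins, types))
print(categories_sketch)
```

The same call should also work directly on the pandas columns, for example `list(zip(sales['Origin'], sales['Type']))`.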

#### Define the figure
* The FactorRange object generates a range of values for categorical dimensions. Passing it our tuples groups the wine origin and type on the X axis of the plot
* We disable the toolbar by setting its location to None

In [4]:
p = figure(x_range=FactorRange(*categories), 
           
           plot_height=300, 
           
           title="Wine Sales by Origin and Type",
           
           toolbar_location=None
          )

#### Define the bars
* The X axis conveys the categories in our data, which is a list of tuples holding the group and subgroup. Each bar represents a subgroup, and bars of the same group are placed together
* The bar height, set by the <b>top</b> attribute, represents the sales quantity
* The <b>width</b> attribute sets the width of each bar. A width of 1 makes the edges of neighbouring bars in the same group coincide, so we introduce some spacing between the bars by setting the width to 0.9
* The <b>bottom</b> attribute sets where each bar starts, which lets us trim the bar from the bottom. The default value for bottom is 0, which we state here explicitly

Notice how the major groups are separated. The labels of the secondary groups overlap with each other which makes it hard to read them

In [5]:
p.vbar(x=categories,
       
       top=sales['Quantity'],
       
       width=0.9,
       
       bottom=0,)

show(p)

#### Format the axes
The following formatting makes the plot cleaner:
* the bars at the left and right extremes sit right on the edge of the figure. We add some padding to the X axis using the <b>x_range.range_padding</b> property
* the x grid lines don't add value, so we remove them by setting the <b>xgrid.grid_line_color</b> property to None
* we angle the subgroup labels (which are the major labels) by specifying a tilt in radians with the <b>xaxis.major_label_orientation</b> property
* we color the group labels by setting the <b>xaxis.group_text_color</b> property
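
Since the tilt is given in radians, it can help to convert to and from degrees. The following is a small sketch using only the standard `math` module; the variable names are illustrative:

```python
import math

# A tilt of 1 (as used for major_label_orientation) is 1 radian
tilt_degrees = math.degrees(1)
print(round(tilt_degrees, 1))

# Going the other way: a 45 degree tilt expressed in radians
tilt_radians = math.radians(45)
print(round(tilt_radians, 4))
```

A value such as `math.radians(45)` could then be assigned to `p.xaxis.major_label_orientation` in place of the literal 1.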

In [6]:
p.x_range.range_padding = 0.1
p.xgrid.grid_line_color = None

p.xaxis.major_label_orientation = 1

p.xaxis.group_text_color = 'navy'

In [7]:
show(p)

#### Using Factor ColorMap to color the bars
We import a color palette called Spectral3 (containing 3 colors) along with the factor_cmap function. The list of available color palettes is here: <br />
https://bokeh.pydata.org/en/latest/docs/reference/palettes.html

In [8]:
from bokeh.palettes import Spectral3
from bokeh.transform import factor_cmap

#### The Spectral3 color palette has 3 colors

In [9]:
Spectral3

['#99d594', '#ffffbf', '#fc8d59']

#### Redefine the bar graph with the color map
The arguments for the call to factor_cmap include:
* <b>field_name</b>, the data field whose values are mapped to colors - here 'x', which holds our category tuples
* <b>palette</b>, the list of colors to use - the Spectral3 palette in our example
* <b>factors</b>, the list of factors which the colors will map to - since we want each wine type to be represented by a color, we list the types of wine in our data set
* <b>start</b> and <b>end</b>, which slice each factor before it is mapped. Our data contains two levels of factors, Origin and Type, so start = 1 and end = 2 ensure that the colors are assigned by the second level (the Type)
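
To make the effect of start and end concrete, here is a plain-Python emulation of the lookup. It is a sketch of the idea, not Bokeh's own code, and the helper `color_for` is hypothetical. The palette and factor values are the ones shown in this notebook:

```python
# Values shown in the notebook
palette = ['#99d594', '#ffffbf', '#fc8d59']   # Spectral3
factors = ['Red', 'White', 'Sparkling']       # the wine types, in order

def color_for(category, start=1, end=2):
    """Emulate the mapping: slice the tuple, then match it against factors."""
    key = category[start:end]                 # ('France', 'Red') -> ('Red',)
    return palette[factors.index(key[0])]

print(color_for(('France', 'Red')))
print(color_for(('Chile', 'Sparkling')))
```

With start = 0 and end = 1 the slice would pick the Origin instead, so the factors list would have to contain the origins.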


In [10]:
p.vbar(x = categories, 
       top = sales['Quantity'], 
       width = 0.9,
       bottom = 0,
       
       color = factor_cmap(field_name = 'x', 
                           
                           palette = Spectral3, 
                           
                           factors = sales['Type'].unique(), 
                           
                           start = 1, 
                           end = 2
                          )
      )

show(p)

## Stacked Bars
We can create a stacked bar graph using our categorical factors. We will produce a stack for each Origin of the wines with each individual bar representing a type of wine. The stacked bar will represent the total sales of wine from the region with the individual sub-bars in the stack representing each type of wine from that region. 

However, some reformatting of the data is required. 

#### We need all the regions to be listed

In [11]:
origin_list = list(sales['Origin'].unique())
origin_list

['France', 'Italy', 'Australia', 'USA', 'Chile']

#### We produce a list of the quantities sold of each wine type for each region
Each list contains the number of bottles sold of that type, in the same order as the list of regions

In [12]:
red_sales = list(sales['Quantity'][sales['Type'] == 'Red'])
white_sales = list(sales['Quantity'][sales['Type'] == 'White'])
sparkling_sales = list(sales['Quantity'][sales['Type'] == 'Sparkling'])

print(red_sales)
print(white_sales)
print(sparkling_sales)

[2500, 2100, 1300, 2800, 1700]
[2200, 2600, 1500, 2100, 1700]
[1300, 900, 800, 1100, 600]


#### We create a source dictionary with all the lists we just created

In [13]:
data_source = {'Origin': origin_list,
               'Red': red_sales,
               'White': white_sales,
               'Sparkling': sparkling_sales
              }

data_source

{'Origin': ['France', 'Italy', 'Australia', 'USA', 'Chile'],
 'Red': [2500, 2100, 1300, 2800, 1700],
 'White': [2200, 2600, 1500, 2100, 1700],
 'Sparkling': [1300, 900, 800, 1100, 600]}
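
The reshaping from one row per (Origin, Type) pair to one list per wine type can also be done in a single pass. Below is a minimal sketch in plain Python on a hypothetical two-origin subset of the data; the names `rows` and `wide` are illustrative, and it assumes the rows appear in the same origin order within every type:

```python
# Long-format rows as (Origin, Type, Quantity); a subset of the data
rows = [
    ('France', 'Red', 2500), ('Italy', 'Red', 2100),
    ('France', 'White', 2200), ('Italy', 'White', 2600),
    ('France', 'Sparkling', 1300), ('Italy', 'Sparkling', 900),
]

wide = {'Origin': []}
for origin, wine_type, quantity in rows:
    # Record each origin once, in order of first appearance
    if origin not in wide['Origin']:
        wide['Origin'].append(origin)
    # Append the quantity to the list for its wine type
    wide.setdefault(wine_type, []).append(quantity)

print(wide)
```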

#### Define the figure
Passing the list of origins as x_range makes the X axis categorical, with one position per origin

In [14]:
p = figure(x_range = origin_list,
           
           plot_width = 600,
           plot_height=300, 
           
           title="Wine Sales by Origin and Type"
          )

#### Define the stacked bars
We call the vbar_stack function for the stacked bars with the following arguments:
* stackers defines which fields in the data source contain the values for each component of the stack
* x names the field in the data source which is represented on the X axis
* width sets the width of each bar in our plot. A width of 1 will cause the bar edges to touch the neighbouring bars
* source is the data the fields above are read from
* color is the list of colors to use, one for each category passed to stackers
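
Conceptually, each stacker's bars start where the previous stacker's bars end. The arithmetic can be sketched in plain Python as follows. This illustrates the idea and is not Bokeh's implementation; `layers` and `running` are illustrative names, and the values are those of the data source built earlier:

```python
# Values from the data source built earlier
data = {
    'Red': [2500, 2100, 1300, 2800, 1700],
    'White': [2200, 2600, 1500, 2100, 1700],
    'Sparkling': [1300, 900, 800, 1100, 600],
}
stackers = ['Red', 'White', 'Sparkling']

# Running total per origin; it becomes the bottom of the next layer
running = [0] * len(data['Red'])
layers = {}
for name in stackers:
    bottoms = list(running)
    tops = [b + q for b, q in zip(bottoms, data[name])]
    layers[name] = (bottoms, tops)
    running = tops

print(layers['White'])   # (bottoms, tops) of the White segments
print(running)           # total height of each stacked bar
```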

In [15]:
p.vbar_stack(stackers = sales['Type'].unique(), 
             
             x = 'Origin', 
             
             width=0.5, 
             
             source = data_source,
             
             color = Spectral3          
            )

show(p)

## Add a legend to the plot
We will add a legend to identify the individual bars representing the types of wine.

To create space for the legend we will extend the range of the Y axis, which requires the Range1d object.

The value function will help us create a valuespec, as we shall soon see.

In [16]:
from bokeh.core.properties import value
from bokeh.models import Range1d

#### Define our own color palette
The colors will correspond more closely with the type of wine

In [17]:
wine_colors = ['#800000', '#F0E68C', '#F7E7CE']

#### Declare a valuespec for the wine types
A valuespec here is a list of dictionaries, each wrapping one literal value. In our example, we need a valuespec containing the labels for our legend.

In [18]:
types_valuespec = [value(x) for x in sales['Type'].unique()]
types_valuespec

[{'value': 'Red'}, {'value': 'White'}, {'value': 'Sparkling'}]
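
The wrapping matters because 'Red', 'White' and 'Sparkling' are also column names in our data source; as far as we can tell, marking each one as a value tells Bokeh to use the string itself as the legend label rather than look it up as a field. The shape of the result can be reproduced in plain Python with a hypothetical stand-in, here called `literal`, which is not part of Bokeh:

```python
def literal(label):
    """Hypothetical stand-in that mirrors the dictionaries shown above."""
    return {'value': label}

wine_types = ['Red', 'White', 'Sparkling']
labels = [literal(t) for t in wine_types]
print(labels)
```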

#### Define the figure
We set a Y range such that we can place our legend on the top of the plot

In [19]:
p = figure(x_range = origin_list,
           
           y_range = Range1d(0, 7000),
           
           plot_width = 600,
           plot_height=300, 
           
           title="Wine Sales by Origin and Type"
          )
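
The upper bound of 7000 can be checked against the data instead of being picked by eye. The sketch below computes the tallest stack and adds headroom; the rule used (10% extra, rounded up to the next multiple of 1000) is an assumption chosen for illustration, not something the notebook prescribes:

```python
import math

# Values from the data source built earlier
red = [2500, 2100, 1300, 2800, 1700]
white = [2200, 2600, 1500, 2100, 1700]
sparkling = [1300, 900, 800, 1100, 600]

# The height of each stacked bar is the sum of its three components
totals = [r + w + s for r, w, s in zip(red, white, sparkling)]
tallest = max(totals)

# Assumed rule: 10% headroom, rounded up to the next multiple of 1000
upper = math.ceil(tallest * 1.1 / 1000) * 1000
print(tallest, upper)
```

The computed value could then be passed as `Range1d(0, upper)` in place of the hard-coded bound.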

#### Define the bars
Here, we pass our own color palette instead of the Spectral3 palette and pass our valuespec to the legend argument

In [20]:
p.vbar_stack(stackers = sales['Type'].unique(), 
             x = 'Origin', 
             width=0.5, 
             
             source = data_source,
             
             color = wine_colors,
             
             legend = types_valuespec
            )


#### Place the legend on the plot
* the orientation can be 'vertical' or 'horizontal'
* the legend location combines a vertical position (top, bottom) with a horizontal one (left, right), for example 'top_right'

In [21]:
p.legend.orientation = 'vertical'
p.legend.location = 'top_right'

In [22]:
show(p)