# Bokeh Charts Attributes

One of Bokeh Charts' main contributions is a flexible interface for applying unique attributes based on the unique values in one or more columns of a DataFrame.

Internally, a Bokeh chart uses an `AttrSpec` to define the mapping, but you can pass in your own spec or use a function to produce a customized one.

In [1]:
from bokeh.charts.attributes import AttrSpec, ColorAttr, MarkerAttr

## Simple Examples

The `AttrSpec` assigns the values in `iterable` to the values in `items`, pairing them in order.

In [2]:
attr = AttrSpec(items=[1, 2, 3], iterable=['a', 'b', 'c'])
attr.attr_map

{(1,): 'a', (2,): 'b', (3,): 'c'}

Notice that each key in the mapping is a tuple; this is always the case. The mapping works this way because `AttrSpec` objects are often used with the pandas DataFrame `groupby` method, whose group keys are a single value when grouping by one column and a tuple of values when grouping by several. Always using tuples keeps the keys consistent.

However, you can still access the values in the following way:

In [3]:
attr[1]

'a'
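To make the tuple-key behavior concrete, here is a minimal pure-Python sketch of the idea. It is not the Bokeh implementation; `build_attr_map` and `lookup` are hypothetical helper names used only for illustration.

```python
def as_key(item):
    """Wrap a bare item in a 1-tuple so every key has the same shape."""
    return item if isinstance(item, tuple) else (item,)


def build_attr_map(items, iterable):
    """Pair each item with the value at the same position in iterable."""
    return {as_key(item): value for item, value in zip(items, iterable)}


def lookup(attr_map, item):
    """Accept either a bare item or a tuple, mirroring attr[1] above."""
    return attr_map[as_key(item)]


attr_map = build_attr_map([1, 2, 3], ['a', 'b', 'c'])
print(attr_map)             # {(1,): 'a', (2,): 'b', (3,): 'c'}
print(lookup(attr_map, 1))  # a
```

Normalizing keys in one place means single-column and multi-column groupings can share the same lookup code.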

The `ColorAttr` is a custom `AttrSpec` that uses a default palette as the iterable. The palette can be customized, and the class will likely provide other color generation functionality.

In [4]:
color = ColorAttr(items=[1, 2, 3])
color.attr_map

{(1,): '#f22c40', (2,): '#5ab738', (3,): '#407ee7'}

Suppose you don't know how many unique items you are working with, but you have defined the values you want to assign to them. The `AttrSpec` will automatically cycle the iterable for you, which is important for exploratory analysis.

In [5]:
color = ColorAttr(items=list(range(0, 10)))
color.attr_map

{(0,): '#f22c40',
 (1,): '#5ab738',
 (2,): '#407ee7',
 (3,): '#df5320',
 (4,): '#00ad9c',
 (5,): '#c33ff3',
 (6,): '#f22c40',
 (7,): '#5ab738',
 (8,): '#407ee7',
 (9,): '#df5320'}

Because there are only 6 unique colors in the default palette, the palette repeats starting on the 7th item.
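The cycling behavior can be sketched with `itertools.cycle` from the standard library. This is only an illustration of the idea, not how Bokeh implements it; `assign_cycled` is a hypothetical helper, and `PALETTE` simply copies the six colors that appear in the output above.

```python
from itertools import cycle

# The six colors visible in the default-palette output above
PALETTE = ['#f22c40', '#5ab738', '#407ee7', '#df5320', '#00ad9c', '#c33ff3']


def assign_cycled(items, iterable):
    """Pair items with values, restarting the iterable when it runs out."""
    # zip stops when items is exhausted, so the endless cycle is safe here
    return {(item,): value for item, value in zip(items, cycle(iterable))}


color_map = assign_cycled(range(10), PALETTE)
print(color_map[(0,)])  # #f22c40
print(color_map[(6,)])  # #f22c40 (the palette has wrapped around)
```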

## Using with Pandas

In [6]:
from bokeh.sampledata.autompg import autompg as df

In [7]:
df.head()

Unnamed: 0,mpg,cyl,displ,hp,weight,accel,yr,origin,name
0,18.0,8,307.0,130,3504,12.0,70,1,chevrolet chevelle malibu
1,15.0,8,350.0,165,3693,11.5,70,1,buick skylark 320
2,18.0,8,318.0,150,3436,11.0,70,1,plymouth satellite
3,16.0,8,304.0,150,3433,12.0,70,1,amc rebel sst
4,17.0,8,302.0,140,3449,10.5,70,1,ford torino


In [8]:
color_attr = ColorAttr(df=df, columns=['cyl', 'origin'])

In [9]:
color_attr.attr_map

{(3, 3): '#f22c40',
 (4, 1): '#5ab738',
 (4, 2): '#407ee7',
 (4, 3): '#df5320',
 (5, 2): '#00ad9c',
 (6, 1): '#c33ff3',
 (6, 2): '#f22c40',
 (6, 3): '#5ab738',
 (8, 1): '#407ee7'}

Notice that this is similar to a pandas Series with a MultiIndex, shown below.

In [10]:
color_attr.series

cyl  origin
3    3         #f22c40
4    1         #5ab738
     2         #407ee7
     3         #df5320
5    2         #00ad9c
6    1         #c33ff3
     2         #f22c40
     3         #5ab738
8    1         #407ee7
dtype: object

You can think of this as a SQL table with three columns, two of which form the index. You can imagine joining this view into the original data source to assign these colors to the associated rows.
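The join described here can be sketched in plain Python, assuming rows are represented as dictionaries. The helper `join_attr` is hypothetical; the rows are taken from the dataset and the colors from the `attr_map` output shown above.

```python
# A few rows from the autompg data (only the columns needed here)
rows = [
    {'name': 'chevrolet chevelle malibu', 'cyl': 8, 'origin': 1},
    {'name': 'chevrolet chevette', 'cyl': 4, 'origin': 1},
    {'name': 'buick regal sport coupe (turbo)', 'cyl': 6, 'origin': 1},
]

# Subset of the (cyl, origin) -> color mapping shown above
color_by_group = {(4, 1): '#5ab738', (6, 1): '#c33ff3', (8, 1): '#407ee7'}


def join_attr(rows, columns, attr_map, attr_name):
    """Add attr_name to each row by looking up its composite key."""
    joined = []
    for row in rows:
        key = tuple(row[col] for col in columns)
        joined.append({**row, attr_name: attr_map[key]})
    return joined


joined = join_attr(rows, ('cyl', 'origin'), color_by_group, 'fill_color')
for row in joined:
    print(row['name'], row['fill_color'])
```

The composite key plays the role of the two index columns, and the lookup is the equivalent of joining on them.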

## Combining with ChartDataSource

In [11]:
from bokeh.charts.data_source import ChartDataSource

In [12]:
fill_color = ColorAttr(columns=['cyl', 'origin'])

ds = ChartDataSource.from_data(df)

In [13]:
ds.join_attrs(fill_color=fill_color).head()

Unnamed: 0,mpg,cyl,displ,hp,weight,accel,yr,origin,name,chart_index,fill_color
0,18.0,8,307.0,130,3504,12.0,70,1,chevrolet chevelle malibu,"((cyl, 8), (origin, 1))",#f22c40
1,15.0,8,350.0,165,3693,11.5,70,1,buick skylark 320,"((cyl, 8), (origin, 1))",#f22c40
2,18.0,8,318.0,150,3436,11.0,70,1,plymouth satellite,"((cyl, 8), (origin, 1))",#f22c40
3,16.0,8,304.0,150,3433,12.0,70,1,amc rebel sst,"((cyl, 8), (origin, 1))",#f22c40
4,17.0,8,302.0,140,3449,10.5,70,1,ford torino,"((cyl, 8), (origin, 1))",#f22c40


### Multiple Attributes

In [14]:
# add new column
df['large_displ'] = df['displ'] >= 350

fill_color = ColorAttr(columns=['cyl', 'origin'])
line_color = ColorAttr(columns=['large_displ'])

ds = ChartDataSource.from_data(df)

ds.join_attrs(fill_color=fill_color, line_color=line_color).head(10)

Unnamed: 0,mpg,cyl,displ,hp,weight,accel,yr,origin,name,large_displ,chart_index,fill_color,line_color
0,18.0,8,307.0,130,3504,12.0,70,1,chevrolet chevelle malibu,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,#f22c40
1,15.0,8,350.0,165,3693,11.5,70,1,buick skylark 320,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,#f22c40
2,18.0,8,318.0,150,3436,11.0,70,1,plymouth satellite,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,#f22c40
3,16.0,8,304.0,150,3433,12.0,70,1,amc rebel sst,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,#f22c40
4,17.0,8,302.0,140,3449,10.5,70,1,ford torino,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,#f22c40
5,15.0,8,429.0,198,4341,10.0,70,1,ford galaxie 500,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,#f22c40
6,14.0,8,454.0,220,4354,9.0,70,1,chevrolet impala,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,#f22c40
7,14.0,8,440.0,215,4312,8.5,70,1,plymouth fury iii,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,#f22c40
8,14.0,8,455.0,225,4425,10.0,70,1,pontiac catalina,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,#f22c40
9,15.0,8,390.0,190,3850,8.5,70,1,amc ambassador dpl,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,#f22c40


The output contains the combined `chart_index` and a column for each attribute. The values of each attribute are joined in based on its own assignment. For example, `line_color` can take only two colors, because the `large_displ` column has only two values.

### Custom Iterable

To change the colors assigned to the `True`/`False` values, pass a custom palette to the `ColorAttr`.

In [15]:
line_color = ColorAttr(df=df, columns=['large_displ'], palette=['Green', 'Red'])
ds.join_attrs(fill_color=fill_color, line_color=line_color).head(10)

Unnamed: 0,mpg,cyl,displ,hp,weight,accel,yr,origin,name,large_displ,chart_index,fill_color,line_color
0,18.0,8,307.0,130,3504,12.0,70,1,chevrolet chevelle malibu,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Green
1,15.0,8,350.0,165,3693,11.5,70,1,buick skylark 320,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,Red
2,18.0,8,318.0,150,3436,11.0,70,1,plymouth satellite,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Green
3,16.0,8,304.0,150,3433,12.0,70,1,amc rebel sst,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Green
4,17.0,8,302.0,140,3449,10.5,70,1,ford torino,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Green
5,15.0,8,429.0,198,4341,10.0,70,1,ford galaxie 500,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,Red
6,14.0,8,454.0,220,4354,9.0,70,1,chevrolet impala,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,Red
7,14.0,8,440.0,215,4312,8.5,70,1,plymouth fury iii,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,Red
8,14.0,8,455.0,225,4425,10.0,70,1,pontiac catalina,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,Red
9,15.0,8,390.0,190,3850,8.5,70,1,amc ambassador dpl,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,Red
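As a rough illustration of how several attributes can be joined at once, the following pure-Python sketch builds a combined `chart_index` and one column per attribute. It assumes rows are dictionaries and is not the `ChartDataSource` implementation; `join_attrs_sketch` is a hypothetical name, and the colors are copied from the output above.

```python
rows = [
    {'name': 'chevrolet chevelle malibu', 'cyl': 8, 'origin': 1, 'large_displ': False},
    {'name': 'buick skylark 320', 'cyl': 8, 'origin': 1, 'large_displ': True},
]

# attribute name -> (columns it is keyed on, mapping from key tuple to value)
attrs = {
    'fill_color': (('cyl', 'origin'), {(8, 1): '#f22c40'}),
    'line_color': (('large_displ',), {(False,): 'Green', (True,): 'Red'}),
}


def join_attrs_sketch(rows, attrs):
    """Attach a combined chart_index and each attribute's value to every row."""
    # Every column used by any attribute, in first-seen order
    index_columns = list(dict.fromkeys(
        col for columns, _ in attrs.values() for col in columns))
    joined = []
    for row in rows:
        new_row = dict(row)
        new_row['chart_index'] = tuple((col, row[col]) for col in index_columns)
        for attr_name, (columns, attr_map) in attrs.items():
            new_row[attr_name] = attr_map[tuple(row[col] for col in columns)]
        joined.append(new_row)
    return joined


result = join_attrs_sketch(rows, attrs)
for row in result:
    print(row['chart_index'], row['fill_color'], row['line_color'])
```

Each attribute is looked up only on its own columns, which is why `line_color` varies with `large_displ` while `fill_color` stays constant for these rows.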


## Altering Attribute Assignment Order

You may not want to assign the values in the order in which they occur. In that case, you have five options.


1. Pre-order the data and tell the attribute not to sort.
2. Make the column a categorical and set the order.
3. Specify the sort options to the `AttrSpec`.
4. Manually specify the items in the order you want them to be assigned.
5. Specify the iterable in the order you want.
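Before walking through each option with `ColorAttr`, the following pure-Python sketch shows why all five lead to the same assignment. It only models the ordering logic and makes no claim about Bokeh internals; `assign` and `unique_in_order` are hypothetical helpers.

```python
from itertools import cycle


def assign(items, palette):
    """Pair each item with a palette value, in order, cycling the palette."""
    return {(item,): color for item, color in zip(items, cycle(palette))}


def unique_in_order(values):
    """Unique values in order of first appearance."""
    return list(dict.fromkeys(values))


# large_displ values for the first six rows of the dataset shown above
large_displ = [False, True, False, False, False, True]
palette = ['Green', 'Red']

# Default: items sorted ascending, so False is assigned first
default = assign(sorted(set(large_displ)), palette)

# 1. Pre-order the data, then keep the order of appearance (no sorting)
pre_ordered = assign(unique_in_order(sorted(large_displ, reverse=True)), palette)

# 2. Sort by an explicit category order (stand-in for a categorical column)
category_order = [True, False]
categorical = assign(sorted(set(large_displ), key=category_order.index), palette)

# 3. Sort descending
descending = assign(sorted(set(large_displ), reverse=True), palette)

# 4. List the items explicitly
explicit = assign([True, False], palette)

# 5. Keep the default item order but reverse the palette
reordered = assign(sorted(set(large_displ)), ['Red', 'Green'])

print(default)    # {(False,): 'Green', (True,): 'Red'}
print(reordered)  # {(False,): 'Red', (True,): 'Green'}
```

Options 1 through 4 change the order of the items, while option 5 changes the order of the values; either way `True` ends up with `'Green'` and `False` with `'Red'`.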

### 1. Pre-order the data

In [16]:
df_sorted = df.sort_values(by=['large_displ'], ascending=False)

line_color = ColorAttr(df=df_sorted, columns=['large_displ'], palette=['Green', 'Red'], sort=False)

ds.join_attrs(fill_color=fill_color, line_color=line_color).head()


Unnamed: 0,mpg,cyl,displ,hp,weight,accel,yr,origin,name,large_displ,chart_index,fill_color,line_color
0,18.0,8,307.0,130,3504,12.0,70,1,chevrolet chevelle malibu,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
1,15.0,8,350.0,165,3693,11.5,70,1,buick skylark 320,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,Green
2,18.0,8,318.0,150,3436,11.0,70,1,plymouth satellite,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
3,16.0,8,304.0,150,3433,12.0,70,1,amc rebel sst,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
4,17.0,8,302.0,140,3449,10.5,70,1,ford torino,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red


### 2. Make the column a categorical and set the order

First, look at the default sort order of a boolean column, which is ascending (`False` before `True`).

In [17]:
df.sort_values(by='large_displ').head()


Unnamed: 0,mpg,cyl,displ,hp,weight,accel,yr,origin,name,large_displ
0,18.0,8,307.0,130,3504,12.0,70,1,chevrolet chevelle malibu,False
264,30.0,4,98.0,68,2155,16.5,78,1,chevrolet chevette,False
263,17.5,8,318.0,140,4080,13.7,78,1,dodge magnum xe,False
262,18.1,8,302.0,139,3205,11.2,78,1,ford futura,False
261,17.7,6,231.0,165,3445,13.4,78,1,buick regal sport coupe (turbo),False


In [18]:
import pandas as pd
df_cat = df.copy()

# create the categorical and set the category order explicitly (True before False)
df_cat['large_displ'] = pd.Categorical(df['large_displ']).reorder_categories([True, False])

# sorting is not required here; it only shows the order the attr spec will see
df_cat.sort_values(by='large_displ').head()



Unnamed: 0,mpg,cyl,displ,hp,weight,accel,yr,origin,name,large_displ
39,14.0,8,351.0,153,4154,13.5,71,1,ford galaxie 500,True
288,15.5,8,351.0,142,4054,14.3,79,1,ford country squire (sw),True
287,16.9,8,350.0,155,4360,14.9,79,1,buick estate wagon (sw),True
68,12.0,8,350.0,160,4456,13.5,72,1,oldsmobile delta 88 royale,True
285,16.5,8,351.0,138,3955,13.2,79,1,mercury grand marquis,True


In [19]:
line_color = ColorAttr(df=df_cat, columns=['large_displ'], palette=['Green', 'Red'])

ds.join_attrs(fill_color=fill_color, line_color=line_color).head()

Unnamed: 0,mpg,cyl,displ,hp,weight,accel,yr,origin,name,large_displ,chart_index,fill_color,line_color
0,18.0,8,307.0,130,3504,12.0,70,1,chevrolet chevelle malibu,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
1,15.0,8,350.0,165,3693,11.5,70,1,buick skylark 320,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,Green
2,18.0,8,318.0,150,3436,11.0,70,1,plymouth satellite,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
3,16.0,8,304.0,150,3433,12.0,70,1,amc rebel sst,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
4,17.0,8,302.0,140,3449,10.5,70,1,ford torino,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red


### 3. Specify the sort options to the `AttrSpec`

In [20]:
# the items will be sorted descending (uses same sorting options as pandas)
line_color = ColorAttr(df=df, columns=['large_displ'], palette=['Green', 'Red'], sort=True, ascending=False)

ds.join_attrs(fill_color=fill_color, line_color=line_color).head()

Unnamed: 0,mpg,cyl,displ,hp,weight,accel,yr,origin,name,large_displ,chart_index,fill_color,line_color
0,18.0,8,307.0,130,3504,12.0,70,1,chevrolet chevelle malibu,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
1,15.0,8,350.0,165,3693,11.5,70,1,buick skylark 320,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,Green
2,18.0,8,318.0,150,3436,11.0,70,1,plymouth satellite,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
3,16.0,8,304.0,150,3433,12.0,70,1,amc rebel sst,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
4,17.0,8,302.0,140,3449,10.5,70,1,ford torino,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red


### 4. Manually specify the items in the order you want them

In [21]:
# remove df so the items aren't auto-calculated
# still need column name for when palette is joined into the dataset
line_color = ColorAttr(columns=['large_displ'], items=[True, False], palette=['Green', 'Red'])

ds.join_attrs(fill_color=fill_color, line_color=line_color).head()

Unnamed: 0,mpg,cyl,displ,hp,weight,accel,yr,origin,name,large_displ,chart_index,fill_color,line_color
0,18.0,8,307.0,130,3504,12.0,70,1,chevrolet chevelle malibu,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
1,15.0,8,350.0,165,3693,11.5,70,1,buick skylark 320,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,Green
2,18.0,8,318.0,150,3436,11.0,70,1,plymouth satellite,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
3,16.0,8,304.0,150,3433,12.0,70,1,amc rebel sst,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
4,17.0,8,302.0,140,3449,10.5,70,1,ford torino,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red


### 5. Change the order of the iterable

In [22]:
line_color = ColorAttr(df=df, columns=['large_displ'], palette=['Red', 'Green'])

ds.join_attrs(fill_color=fill_color, line_color=line_color).head()

Unnamed: 0,mpg,cyl,displ,hp,weight,accel,yr,origin,name,large_displ,chart_index,fill_color,line_color
0,18.0,8,307.0,130,3504,12.0,70,1,chevrolet chevelle malibu,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
1,15.0,8,350.0,165,3693,11.5,70,1,buick skylark 320,True,"((cyl, 8), (origin, 1), (large_displ, True))",#f22c40,Green
2,18.0,8,318.0,150,3436,11.0,70,1,plymouth satellite,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
3,16.0,8,304.0,150,3433,12.0,70,1,amc rebel sst,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
4,17.0,8,302.0,140,3449,10.5,70,1,ford torino,False,"((cyl, 8), (origin, 1), (large_displ, False))",#f22c40,Red
