<table style="float:left; border:none">
   <tr style="border:none">
       <td style="border:none">
           <a href="http://bokeh.pydata.org/">     
           <img 
               src="assets/images/bokeh-transparent.png" 
               style="width:50px"
           >
           </a>    
       </td>
       <td style="border:none">
           <h1>Bokeh Tutorial</h1>
       </td>
   </tr>
</table>

<div style="float:right;"><h2>05. Data Transformations</h2></div>

Data transformations let you describe how data should be transformed instead of transforming it yourself. You can then send the raw data to the client, and the transformation happens client side. This can be efficient because:
* your numerical data might be smaller than its color representation
* you may be using multiple transforms on the same data
* you have to write less code to get the plot you want
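
The first point can be made concrete with a rough, Bokeh-free sketch. It compares the JSON size of a few sample weights with the size of one hex color string per value. The hex colors are made-up placeholders, and JSON length is only a stand-in for what is actually sent to the browser.

```python
import json

# Raw numeric values, as they would sit in a data source column.
weights = [3504, 3693, 3436, 3433, 3449]

# Hypothetical pre-computed colors, one hex string per value (placeholders only).
colors = ["#440154", "#3b528b", "#21918c", "#5ec962", "#fde725"]

# Compare the serialized length of the two columns.
raw_size = len(json.dumps(weights))
color_size = len(json.dumps(colors))

print(f"raw values: {raw_size} characters, colors: {color_size} characters")
```

The gap depends on how the numbers are formatted, so treat this as an illustration of the idea rather than a measurement.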

In Bokeh, the following Transforms and ColorMappers fall into this category:
* Jitter
* LinearInterpolator
* StepInterpolator
* [LinearColorMapper](http://bokeh.pydata.org/en/latest/docs/reference/models/mappers.html#bokeh.models.mappers.LinearColorMapper)
* [LogColorMapper](http://bokeh.pydata.org/en/latest/docs/reference/models/mappers.html#bokeh.models.mappers.LogColorMapper)

In [3]:
from bokeh.sampledata.autompg import autompg
autompg

| | mpg | cyl | displ | hp | weight | accel | yr | origin | name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.0 | 8 | 307.0 | 130 | 3504 | 12.0 | 70 | 1 | chevrolet chevelle malibu |
| 1 | 15.0 | 8 | 350.0 | 165 | 3693 | 11.5 | 70 | 1 | buick skylark 320 |
| 2 | 18.0 | 8 | 318.0 | 150 | 3436 | 11.0 | 70 | 1 | plymouth satellite |
| 3 | 16.0 | 8 | 304.0 | 150 | 3433 | 12.0 | 70 | 1 | amc rebel sst |
| 4 | 17.0 | 8 | 302.0 | 140 | 3449 | 10.5 | 70 | 1 | ford torino |
| 5 | 15.0 | 8 | 429.0 | 198 | 4341 | 10.0 | 70 | 1 | ford galaxie 500 |
| 6 | 14.0 | 8 | 454.0 | 220 | 4354 | 9.0 | 70 | 1 | chevrolet impala |
| 7 | 14.0 | 8 | 440.0 | 215 | 4312 | 8.5 | 70 | 1 | plymouth fury iii |
| 8 | 14.0 | 8 | 455.0 | 225 | 4425 | 10.0 | 70 | 1 | pontiac catalina |
| 9 | 15.0 | 8 | 390.0 | 190 | 3850 | 8.5 | 70 | 1 | amc ambassador dpl |


In [2]:
from bokeh.io import output_notebook, show
output_notebook()

In [5]:
from bokeh.models import ColumnDataSource
source = ColumnDataSource(autompg)
source.column_names

['origin',
 'index',
 'hp',
 'mpg',
 'displ',
 'yr',
 'name',
 'weight',
 'cyl',
 'accel']

In [4]:
from bokeh.plotting import figure

In [11]:
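# Plot mpg against model year. Many points overlap because yr takes integer values.
p = figure(height=400, x_axis_label='year', y_axis_label='mpg')
t = p.circle(x='yr', y='mpg', alpha=0.6, size=10, source=source)
# The glyph can also be restyled after it has been created.
t.glyph.fill_color = "green"
show(p)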

### Use Jitter to see the distribution a little better

We use the explicit field specification to spell out the data transform: a dict that names the column (`field`) and the transform to apply to it (`transform`).

In [13]:
from bokeh.models import Jitter
p = figure(height=400, width=800, x_axis_label='year', y_axis_label='mpg')
# Refer to both columns by name: passing autompg.mpg alongside source= is deprecated.
p.circle(x={'field': 'yr', 'transform': Jitter(width=0.5)}, y='mpg', alpha=0.6, size=15, source=source)
show(p)
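
To make the effect of `Jitter(width=0.5)` more concrete, here is a plain-Python sketch of what a uniform jitter does to a column. It assumes the offset is drawn uniformly from an interval of the given `width` centred on `mean`. `jitter_uniform` is a hypothetical helper written for this illustration, not part of Bokeh, and the real transform runs in the browser.

```python
import random

def jitter_uniform(values, width=1.0, mean=0.0, rng=random):
    """Add a uniform random offset in [mean - width/2, mean + width/2) to each value."""
    return [v + mean + (rng.random() - 0.5) * width for v in values]

years = [70, 70, 70, 71, 71]
rng = random.Random(0)  # seeded only so that the sketch is repeatable
jittered = jitter_uniform(years, width=0.5, rng=rng)

# Every point moves by at most width / 2, so the years stay visually separate.
print(jittered)
```

With `width=0.5`, points that share a year spread over half a unit on the x axis, which is enough to reveal how many points sit at each mpg value without blurring neighbouring years together.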


### Use a Linear Interpolator to size by horsepower

In [15]:
from bokeh.models import LinearInterpolator

size_mapper = LinearInterpolator(
    x=[autompg.hp.min(), autompg.hp.max()],
    y=[3, 50]
)
p = figure(height=400, width=800, x_axis_label='year', y_axis_label='mpg')
# Both coordinates are given by column name; size is mapped from 'hp' through size_mapper.
p.circle(x='yr', y='mpg', alpha=0.6, size={'field': 'hp', 'transform': size_mapper}, source=source)
show(p)
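
The mapping that `size_mapper` describes can be sketched in plain Python. This is only an illustration of two-point linear interpolation for values inside the given range. `linear_interpolate` is a hypothetical helper, and the horsepower range below is made up rather than read from the dataset.

```python
def linear_interpolate(value, x, y):
    """Map value from the range x = [x0, x1] onto y = [y0, y1] along a straight line."""
    (x0, x1), (y0, y1) = x, y
    fraction = (value - x0) / (x1 - x0)
    return y0 + fraction * (y1 - y0)

# Hypothetical horsepower range; the notebook uses autompg.hp.min() and autompg.hp.max().
hp_range = [50, 250]
size_range = [3, 50]

# The smallest hp maps to size 3, the largest to size 50, and the rest fall in between.
sizes = {hp: linear_interpolate(hp, hp_range, size_range) for hp in (50, 150, 250)}
print(sizes)
```

The sketch does not cover values outside the range. In the notebook that case does not arise, because the control points are the column's own minimum and maximum.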


### Exercise: Use a Color Mapper to color by weight

In [None]:
from bokeh.models import LinearColorMapper
from bokeh.palettes import Viridis256
color_mapper = LinearColorMapper
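
As a hint for the exercise, the idea behind a linear color mapper can be sketched without Bokeh: split the range between a low and a high value into equal-width bins, one per palette entry, and pick the color of the bin a value falls into. `linear_color` is a hypothetical helper, the five-color palette and the weight range are placeholders, and clamping out-of-range values to the end colors is a choice made for this sketch.

```python
def linear_color(value, palette, low, high):
    """Pick the palette entry whose equal-width bin between low and high contains value."""
    # Clamp so that out-of-range values fall back to the first or last color.
    clamped = min(max(value, low), high)
    fraction = (clamped - low) / (high - low)
    # value == high would index one past the end, so cap at the last entry.
    index = min(int(fraction * len(palette)), len(palette) - 1)
    return palette[index]

palette = ["#440154", "#3b528b", "#21918c", "#5ec962", "#fde725"]  # placeholder palette
low, high = 1500, 5500  # hypothetical weight range

picked = {w: linear_color(w, palette, low, high) for w in (1500, 3504, 5500)}
print(picked)
```

In the notebook, the analogous step would be to construct `LinearColorMapper` with `palette=Viridis256` and with `low` and `high` set to the smallest and largest weight, then pass it as the `transform` in a field specification for the color, in the same way `size_mapper` is used for the size.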