# Chapter 1 : Plotting with glyphs

## Interactive Data Visualization with Bokeh

### 1 - Plotting with glyphs
- `bokeh.plotting` interface: this interface gives you a basic empty plot with sensible (but customizable) defaults for things like axes, grids, and tools. Into this plot, you can add glyphs that connect visual properties directly to your data.

### 2 - What are glyphs?

- Glyphs are visual shapes that can be drawn on the screen. They can be:
    - simple point-like markers such as circles, squares, or triangles
    - more sophisticated shapes such as rectangles, lines, patches, wedges, and others
- In every case, these shapes have visual properties that can include:
    - position: the x and y coordinates that locate a shape on the plot
    - size or radius, fill and outline colors, and transparency (also called alpha)
- Let's see what this looks like in typical usage.

### 3 - Let's see the code now

- First steps and functions for using Bokeh:
    - `output_file`, `show`: these two functions make it easy to save the plots we make in an HTML file and to open a browser to display the file.
    - `output_notebook`: displays plots inline in a Jupyter notebook.
    - `figure`: creates the basic empty plot with sensible defaults mentioned earlier.


In [2]:
from bokeh.io import output_notebook, push_notebook, show, output_file
from bokeh.plotting import figure
output_notebook()

- We call the `figure` function with some arguments that control general properties of the plot. In this case, we pass `plot_width=400` to specify that the overall canvas for drawing the plot should be 400 pixels wide. There is a corresponding `plot_height` if you want to change the default canvas height. (Bokeh 3.0 and later name these arguments `width` and `height`.) We also pass the argument `tools='pan,box_zoom'`. The `tools` parameter accepts a list of actual tool objects or, more commonly, a comma-separated string that lists the names of built-in tools. Here we add a tool for panning the plot region and a tool for drawing rectangular regions to zoom in on.

- We call the `.circle` method on the plot returned by `figure`. In this example, we pass two Python lists that represent the x- and y-coordinates of the circles, respectively. All other visual properties (such as size and color) take on default values.

- To display the plot in the notebook, we call `show` with `notebook_handle=True`, which returns a handle to the displayed plot. After changing the plot (here, its title), we call `push_notebook` with `handle` set to that handle so the displayed plot is updated.

- Finally, to save the result outside the notebook, we call `output_file` to specify that we want to save the output to an HTML file. Calling `show()` afterwards saves the file and conveniently opens a browser to display it.

In [5]:
plot = figure(plot_width = 400, tools = 'pan,box_zoom')
plot.circle([1,2,3,4,5],[8,6,5,2,3])
handle = show(plot, notebook_handle=True)
plot.title.text = "New Title"
push_notebook(handle=handle)
output_file('circle.html')
#show(plot)
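
The `tools` parameter and the HTML output described above can also be tried outside a notebook. The following is only a minimal sketch under a few assumptions: it passes tool objects (`PanTool`, `BoxZoomTool`) instead of the string form, it uses `scatter(..., marker='circle')` as a stand-in for `plot.circle`, it writes to a temporary directory (a hypothetical location chosen for illustration), and it calls `save` rather than `show` so that no browser is opened.

```python
import os
import tempfile

from bokeh.io import output_file, save
from bokeh.models import BoxZoomTool, PanTool
from bokeh.plotting import figure

# Tools passed as objects instead of the string 'pan,box_zoom'
plot = figure(tools=[PanTool(), BoxZoomTool()])

# Circle markers at the same coordinates as in the cell above
plot.scatter([1, 2, 3, 4, 5], [8, 6, 5, 2, 3], marker='circle')

# Hypothetical output location: a temporary directory, so nothing is overwritten
html_path = os.path.join(tempfile.mkdtemp(), 'circle.html')
output_file(html_path, title='Circle glyphs')

# save() writes the HTML file without opening a browser;
# show(plot) would write the same file and then open it
saved_path = save(plot)
```

Replacing `save(plot)` with `show(plot)` gives the behavior described in the notes: the file is written and then opened in a browser.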

### 4 - Glyph properties

- What kinds of values can be attached to glyph properties? We have seen that Python lists can be passed in, but more generally any sequence type will do: tuples, arrays, and columns from DataFrames all work as well. It is also possible to configure a property with a single fixed value. In that case, however many glyphs are actually drawn, they will all have the same value for that property. This already happened implicitly in the previous example: we supplied lists for the x and y coordinates, but the size, color, and transparency had single default values that carried over to every circle that was drawn. In the example code here, we set the x value to 10 but give lists for y and for size. In the output we can see that all the circles are centered at x=10, while the y values and sizes vary according to the lists that we passed in.

In [8]:
plot = figure(plot_width = 400, tools = 'pan,box_zoom')
plot.circle(x = 10, y = [1,2,3,4], size = [10,20,30,40])
show(plot)
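
To illustrate the point that any sequence type works, here is a small sketch that passes a NumPy array for `y` and a tuple for `size` while keeping `x` fixed. It assumes NumPy is installed and uses `scatter(..., marker='circle')` as a stand-in for `plot.circle`; the variable names `renderer` and `columns` are illustrative only.

```python
import numpy as np
from bokeh.plotting import figure

plot = figure(tools='pan,box_zoom')

# x is one fixed value; y is a NumPy array and size is a tuple
renderer = plot.scatter(
    x=10,
    y=np.array([1, 2, 3, 4]),
    size=(10, 20, 30, 40),
    marker='circle',
)

# Sequence-valued properties are stored as columns of the glyph's data source,
# while the fixed x value is not
columns = sorted(renderer.data_source.data.keys())
print(columns)
```

Inspecting `columns` is one way to see which properties vary per glyph and which are shared by all of them.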

### 5 - Markers
- The first set of exercises will give us practice with marker-like glyphs. We will use circles throughout, but Bokeh comes with many standard marker shapes built in. Here is a list of the built-in markers:
    - asterisk()
    - circle()
    - circle_cross()
    - circle_x()
    - cross()
    - diamond()
    - diamond_cross()
    - inverted_triangle()
    - square()
    - square_cross()
    - square_x()
    - triangle()
    - x()
- The other glyph methods draw shapes that are not markers:
    - annulus()
    - annular_wedge()
    - wedge()
    - rect()
    - quad()
    - vbar()
    - hbar()
    - image()
    - image_rgba()
    - image_url()
    - patch()
    - patches()
    - line()
    - multi_line()
    - oval()
    - ellipse()
    - arc()
    - quadratic()
    - bezier()

- Customizing your scatter plots
    - The three most important arguments for customizing scatter glyphs are color, size, and alpha. Bokeh accepts colors as hexadecimal strings, tuples of RGB values between 0 and 255, and any of the 147 CSS color names. Size values are supplied in screen units (pixels), so a marker keeps the same on-screen size when the plot is zoomed.

    - The alpha parameter controls transparency. It takes in floating point numbers between 0.0, meaning completely transparent, and 1.0, meaning completely opaque.
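
As a rough illustration of the three color formats and the alpha range, the sketch below draws three rows of markers. The specific colors and alpha values are arbitrary choices made for this example, and `scatter(..., marker='circle')` is used as a stand-in for `circle`.

```python
from bokeh.plotting import figure

p = figure()
x = [1, 2, 3, 4, 5]

# Hexadecimal string, fully opaque
hex_r = p.scatter(x, [1, 1, 1, 1, 1], marker='circle', size=10,
                  color='#1f77b4', alpha=1.0)

# Tuple of RGB values between 0 and 255, partly transparent
rgb_r = p.scatter(x, [2, 2, 2, 2, 2], marker='circle', size=20,
                  color=(255, 127, 14), alpha=0.6)

# CSS color name, mostly transparent
css_r = p.scatter(x, [3, 3, 3, 3, 3], marker='circle', size=30,
                  color='firebrick', alpha=0.2)
```

Calling `show(p)` displays the three rows, which makes the effect of `size` and `alpha` easy to compare side by side.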
    
    
### Lines

In [12]:
from bokeh.io import output_file, output_notebook, push_notebook, show
from bokeh.plotting import figure

x = [1,2,3,4,5]
y = [8,6,4,2,3]
p = figure()

# Draw a red line, then mark each data point with a circle on top of it
p.line(x,y,color = 'red',line_width=3)
p.circle(x,y,fill_color = 'black', size = 10, color = 'white')
output_file('line.html')
show(p)

- A string that is not a valid color (for example the typo `'width'` instead of `'white'`) is treated as a column name, so Bokeh reports `E-1001 (BAD_COLUMN_NAME): Glyph refers to nonexistent column name` for the `line_color` property.


### Patches
- `patches()` is a useful glyph function for drawing multiple polygon shapes in a single plot, for example countries or states.
- The data is given as lists of lists: one list-of-lists of numeric values for the x coordinates and one for the y coordinates.
- Each pair of sublists holds the x and y coordinates of the vertices of one distinct patch.


In [14]:
from bokeh.io import output_file, show, output_notebook
from bokeh.plotting import figure

p = figure()
xs = [[1,1,2,2],[2,2,4],[2,2,3,3]]
ys = [[2,5,5,2],[3,5,5],[2,3,4,2]]

p.patches(xs,ys,fill_color=['red','blue','green'],line_color='black')
output_file('patches.html')
show(p)

Notes
- When plotting dates on the x-axis, you must add `x_axis_type='datetime'` when creating the figure object:
`p = figure(x_axis_type='datetime', x_axis_label='Date', y_axis_label='US Dollars')`

- line_color, line_width, ...
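
The datetime note can be tried with a few made-up values. In this sketch the dates and prices are hypothetical sample data, and the `line_color` and `line_width` values are arbitrary; it only assumes that `datetime` objects are passed as the x values.

```python
from datetime import datetime

from bokeh.models import DatetimeAxis
from bokeh.plotting import figure

# Hypothetical sample data: three dates and made-up prices
dates = [datetime(2020, 1, 1), datetime(2020, 2, 1), datetime(2020, 3, 1)]
prices = [100.0, 104.5, 98.2]

p = figure(x_axis_type='datetime', x_axis_label='Date', y_axis_label='US Dollars')
p.line(dates, prices, line_color='navy', line_width=2)

# With x_axis_type='datetime' the figure gets a date-aware x-axis
x_axis = p.xaxis[0]
```

Without `x_axis_type='datetime'` the axis would be a plain numeric axis and the tick labels would not be formatted as dates.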


## Data Formats
- Python built-in lists
- NumPy arrays: NumPy is a Python library for working with multidimensional arrays of data
- Pandas: provides the DataFrame structure (tabular data, time series)
- ColumnDataSource:
    - The ColumnDataSource is a table-like data object that maps string column names to sequences (columns) of data. It is the central and most common data structure in Bokeh.
    - It carries the data from Python to the final JavaScript and HTML document that is displayed to users
    - It is often created automatically for you
    - It can be shared between glyphs to link selections
    - Extra columns can be used with hover tooltips
    - You can create a ColumnDataSource object directly from a Pandas DataFrame by passing the DataFrame to the class initializer.
    


In [16]:
# NUMPY

from bokeh.io import output_file,show
from bokeh.plotting import figure
import numpy as np

p = figure()
# Create a NumPy array of 1000 evenly spaced values from 0 to 10
x = np.linspace(0,10,1000)
y = np.sin(x) + np.random.random(1000)*0.2

p.line(x,y)
#output_file('numpy.html')
show(p)

In [18]:
# PANDAS
# Flowers is a pandas dataframe
from bokeh.sampledata.iris import flowers
p = figure()
p.circle(flowers['petal_length'],flowers['sepal_length'],size = 10)
#output_file('pandas.html')
show(p)

In [19]:
# Column Data Source
from bokeh.models import ColumnDataSource
x = [1,2,3,4,5]
y = [8,6,4,2,3]
source = ColumnDataSource(data={'x':x,'y':y})
source.data

{'x': [1, 2, 3, 4, 5], 'y': [8, 6, 4, 2, 3]}

In [28]:
from bokeh.sampledata.iris import flowers as df
from bokeh.models import ColumnDataSource
#print(df.head(5))
source = ColumnDataSource(df)
source
# With a source, glyph methods can take column names instead of arrays:
#p.circle(source.data['petal_length'],source.data['sepal_length'],size = 8)
#p.circle('petal_length','sepal_length',size = 8, source = source)

In [33]:
source.data['petal_length']

array([1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 1.5, 1.5, 1.6, 1.4,
       1.1, 1.2, 1.5, 1.3, 1.4, 1.7, 1.5, 1.7, 1.5, 1. , 1.7, 1.9, 1.6,
       1.6, 1.5, 1.4, 1.6, 1.6, 1.5, 1.5, 1.4, 1.5, 1.2, 1.3, 1.4, 1.3,
       1.5, 1.3, 1.3, 1.3, 1.6, 1.9, 1.4, 1.6, 1.4, 1.5, 1.4, 4.7, 4.5,
       4.9, 4. , 4.6, 4.5, 4.7, 3.3, 4.6, 3.9, 3.5, 4.2, 4. , 4.7, 3.6,
       4.4, 4.5, 4.1, 4.5, 3.9, 4.8, 4. , 4.9, 4.7, 4.3, 4.4, 4.8, 5. ,
       4.5, 3.5, 3.8, 3.7, 3.9, 5.1, 4.5, 4.5, 4.7, 4.4, 4.1, 4. , 4.4,
       4.6, 4. , 3.3, 4.2, 4.2, 4.2, 4.3, 3. , 4.1, 6. , 5.1, 5.9, 5.6,
       5.8, 6.6, 4.5, 6.3, 5.8, 6.1, 5.1, 5.3, 5.5, 5. , 5.1, 5.3, 5.5,
       6.7, 6.9, 5. , 5.7, 4.9, 6.7, 4.9, 5.7, 6. , 4.8, 4.9, 5.6, 5.8,
       6.1, 6.4, 5.6, 5.1, 5.6, 6.1, 5.6, 5.5, 4.8, 5.4, 5.6, 5.1, 5.1,
       5.9, 5.7, 5.2, 5. , 5.2, 5.4, 5.1])
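
Two of the ColumnDataSource points listed earlier, sharing a source between glyphs and using extra columns in hover tooltips, can be sketched as follows. The data, the `label` column, and the variable names are hypothetical, and `scatter(..., marker='circle')` is used as a stand-in for `circle`.

```python
from bokeh.layouts import row
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Hypothetical data: 'label' is an extra column used only by the tooltip
source = ColumnDataSource(data={
    'x': [1, 2, 3, 4, 5],
    'y': [8, 6, 4, 2, 3],
    'label': ['a', 'b', 'c', 'd', 'e'],
})

# '@name' in a tooltip refers to a column of the data source
hover = HoverTool(tooltips=[('label', '@label'), ('(x, y)', '(@x, @y)')])

left = figure(tools=[hover, 'box_select'])
right = figure(tools='box_select')

# Both glyphs read from the same source, so a selection made in one
# figure is reflected in the other
r_left = left.scatter('x', 'y', size=10, marker='circle', source=source)
r_right = right.scatter('y', 'x', size=10, marker='circle',
                        color='firebrick', source=source)

layout = row(left, right)
```

Passing `layout` to `show` displays the two figures next to each other; selecting points with the box select tool in one figure should highlight the same rows in the other.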

### Customizing glyphs

#### Selection appearance: tools for selecting points
- box_select: select points by drawing a rectangular region over the plot
- lasso_select: select points by drawing a free-form curve


In [37]:
from bokeh.io import output_notebook, output_file, show
from bokeh.plotting import figure
from bokeh.sampledata.iris import flowers

p = figure(tools = "box_select, lasso_select")
p.circle("petal_length", "sepal_length", source=flowers,
         selection_color="red",
         nonselection_fill_alpha=0.2,
         nonselection_fill_color="grey")
show(p)

#### Hover Appearance
```
from bokeh.models import HoverTool

# Add circle glyphs to figure p
p.circle(x, y, size=10,
         fill_color='grey', alpha=0.1, line_color=None,
         hover_fill_color='firebrick', hover_alpha=0.5,
         hover_line_color='white')

# Create a HoverTool: hover
hover = HoverTool(tooltips = None,mode='vline')

# Add the hover tool to the figure p
p.add_tools(hover)

```

In [44]:
from bokeh.models import HoverTool
hover = HoverTool(tooltips = None, mode = 'vline')  # mode can also be 'hline'
p = figure(tools=[hover,'crosshair'])
p.circle("petal_length", "sepal_length", source= flowers,hover_color='red')
show(p)

#### Color Mapping

```
#Import CategoricalColorMapper from bokeh.models
from bokeh.models import CategoricalColorMapper

# Convert df to a ColumnDataSource: source
source = ColumnDataSource(df)

# Make a CategoricalColorMapper object: color_mapper
color_mapper = CategoricalColorMapper(factors=['Europe', 'Asia', 'US'],
                                      palette=['red', 'green', 'blue'])

# Add a circle glyph to the figure p
p.circle('weight', 'mpg', source=source,
            color=dict(field='origin',transform=color_mapper),legend='origin')

# Specify the name of the output file and show the result
output_file('colormap.html')
show(p)


```

In [49]:
from bokeh.models import CategoricalColorMapper

mapper = CategoricalColorMapper(factors = ['setosa','virginica','versicolor'], palette = ['red','blue','green'])
p = figure()
p.circle('petal_length','sepal_length',source = flowers,color={'field':'species','transform':mapper},legend = 'species')
show(p)
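
The cells above use the `legend` keyword. In Bokeh 1.4 and later this keyword was split into `legend_label`, `legend_field`, and `legend_group`, so the following sketch assumes such a version and uses `legend_field`. The data values are made up for illustration, and `scatter(..., marker='circle')` is used as a stand-in for `circle`.

```python
from bokeh.models import CategoricalColorMapper, ColumnDataSource
from bokeh.plotting import figure

# Hypothetical data with a categorical 'origin' column
source = ColumnDataSource(data={
    'weight': [2100, 3400, 2800, 1900],
    'mpg': [31.0, 18.5, 24.0, 35.2],
    'origin': ['Asia', 'US', 'Europe', 'Asia'],
})

# Each factor is paired with the palette entry at the same position
color_mapper = CategoricalColorMapper(factors=['Europe', 'Asia', 'US'],
                                      palette=['red', 'green', 'blue'])

p = figure()
renderer = p.scatter('weight', 'mpg', source=source, size=10, marker='circle',
                     color={'field': 'origin', 'transform': color_mapper},
                     legend_field='origin')
```

With `legend_field`, the legend entries are built from the values of the `origin` column, so each category appears once in the legend with its mapped color.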

