# More with `bokeh`

There are two libraries that are primarily used for web-based visualization in Python: `bokeh` and `plotly`. While `plotly` has its merits and can be exceptionally useful (particularly for 3D plotting), we will focus entirely on `bokeh` and its associated technologies in this lecture. Tools like `panel` and `hvPlot`, while supporting many different tools and libraries, use `bokeh` as their default visualization library. `bokeh` offers a number of ways to plot data, so we will look at each of them to start building an understanding of how they work.

As we saw previously, we can also use `HoloViews` to abstract away from the specific plotting backend if we wish. It is useful to know how to try out all of these methods in practice. In any open source community, there will always be many (sometimes competing) tools available at any given time. It is advisable to test each one for a short period of time, note what makes it more or less useful for your project, and then make an informed decision. Avoid blindly committing to a tool for a particular project, but also avoid spending more time testing different tools than you spend working on the core functionality of your project. Make a quick, informed decision that keeps in mind the unique considerations of your project (where it needs to function, who needs to use it, and what your goals are), and then commit to a tool.

## Getting Started

The first thing we need to do is tell `bokeh` that it is going to be outputting plots for a Jupyter environment. Plots need to be rendered differently depending on where they are viewed. After running the cell below, we should see a note that `bokeh` was loaded properly in the notebook.

In [None]:
from bokeh.io import (
    output_notebook,
)  # import the utility to setup bokeh for Jupyter

output_notebook()  # load the bokeh/Jupyter resources

Now we are going to create a simple plot. We need to import two utility functions: one that creates a figure to plot on, and another that renders, or shows, the plot.

In [None]:
from bokeh.plotting import (
    figure,
    show,
)  # import utilities for plotting/displaying

p = figure()  # create a figure
p.scatter(
    [1, 2, 3, 4, 5], [1, 4, 9, 16, 25]
)  # create a scatter plot of circles at the given x and y coordinates
show(p)  # show the plot
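Since plots are rendered differently depending on where they are viewed, it can help to see a non-notebook target as well. The following is a minimal sketch, not part of the original lecture, that builds a standalone HTML page as a string with `bokeh.embed.file_html` and the `CDN` resources. The title text and the data are made up for illustration.

```python
from bokeh.embed import file_html  # renders a model as a complete HTML page (a string)
from bokeh.plotting import figure
from bokeh.resources import CDN  # tells the page to load BokehJS from the public CDN

p = figure(title="Standalone example")  # hypothetical title, for illustration only
p.scatter([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])

# build the full HTML document as text instead of displaying it inline
html = file_html(p, CDN, "Standalone example")
print(len(html))  # size of the generated page in characters
```

The resulting string can be written to a `.html` file and opened in any browser, which is one way to share a plot with someone who does not run Jupyter.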

We can achieve the exact same behavior by passing a data frame as the data source and then specifying which column names to use for the axes.

In [None]:
import numpy as np  # import numpy for making up some data
import pandas as pd  # import pandas for data frames

df = pd.DataFrame(
    {  # create a data frame, break into multiple lines for easier reading
        "x": np.arange(0, 100, 1),
        "y": np.random.uniform(0, 10, 100),
    }
)

p = figure()  # create a figure
p.scatter(
    x="x", y="y", source=df
)  # create a scatter plot of circles at the given x and y coordinates, using the data frame. note the use of labels to specify the x and y values.
show(p)  # show the plot

We can also achieve the exact same behavior using a tad more code. The code below uses a `ColumnDataSource` to specify the data and the columns to plot. The benefit here is that `bokeh` is built on top of the column data source structure, so by constructing it ourselves we gain much more control over `bokeh` (generally with respect to interactivity).

In [None]:
from bokeh.models import (
    ColumnDataSource,
)  # import ColumnDataSource for setting up data

data = {  # create a dictionary of data, mapping 'x' and 'y' to arrays of data, break into multiple lines for easier reading
    "x": np.arange(0, 100, 1),
    "y": np.random.uniform(0, 10, 100),
}

source = ColumnDataSource(
    data=data
)  # create a column data source from the dictionary by setting the data attribute. we can also create this from a data frame!

p = figure()  # create a figure
p.scatter(
    x="x", y="y", source=source
)  # create a scatter plot of circles at the given x and y coordinates, using the column data source. note the use of labels to specify the x and y values.
show(p)  # show the plot
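To make the two remarks above concrete (a source can be built from a data frame, and owning the source helps with interactivity), here is a small sketch. It assumes a hover tooltip is a reasonable stand-in for "interactivity"; the tooltip labels and number format are illustrative choices, not something the lecture prescribes.

```python
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

df = pd.DataFrame(
    {
        "x": np.arange(0, 100, 1),
        "y": np.random.uniform(0, 10, 100),
    }
)

source = ColumnDataSource(df)  # the frame's columns become columns of the source
print(source.column_names)  # the frame's index is carried along with "x" and "y"

p = figure()
p.scatter(x="x", y="y", source=source)

# tooltips refer to columns of the source with the "@column" syntax
p.add_tools(HoverTool(tooltips=[("x", "@x"), ("y", "@y{0.00}")]))
# show(p)  # uncomment in a notebook to render the plot
```

Because the glyph and the tooltip both read from the same source, any extra column added to the frame can be shown in the tooltip without touching the glyph call.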

## Common Actions with `bokeh`

Let's cover a few of the things we will commonly do: changing colors, sizes, marker types, and plot types.

### Sizes

Just as before, we can adjust the size of our points by specifying `size`.

In [None]:
p = figure()  # create a figure
p.scatter(
    df.x, df.y, size=5
)  # create a scatter plot of circles at the given x and y coordinates, all sizes set to 5
show(p)

Note that as we zoom in and out, our points stay the same size on screen. This may not be entirely desirable, as we may lose some context for the sizes. This is especially important when the sizes of our markers are inherently related to the scales on which they are plotted!

In [None]:
sizes = np.random.uniform(0, 75, 100)  # create random set of sizes
p = figure()  # create a figure
p.scatter(
    df.x, df.y, alpha=0.6, size=sizes
)  # create a scatter plot of circles at the given x and y coordinates
show(p)

When the size of the scatter points should be on the same scale as the axes, we can specify `radius` instead of `size`. This tells `bokeh` to size the points in *data units*, so they will match the scale of the data. Be careful though, as data units are very different from *screen units*: a radius of 5 here spans 5 units along the axis, not 5 pixels.

In [None]:
sizes = np.random.uniform(0, 5, 100)  # create random set of sizes
p = figure()  # create a figure
p.scatter(
    df.x, df.y, alpha=0.6, radius=sizes
)  # create a scatter plot of circles at the given x and y coordinates
show(p)
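The difference between screen units and data units is easiest to see with both on one figure. This is a hedged sketch with made-up data: it draws the data-unit circles with `p.circle`, which also accepts `radius`, and it assumes `match_aspect=True` is wanted so that circles sized in data units look round.

```python
import numpy as np
from bokeh.plotting import figure

x = np.arange(0, 10, 1)
y = np.random.uniform(0, 10, 10)

# match_aspect keeps one data unit the same length on both axes
p = figure(match_aspect=True)

# size is in screen units (pixels): these markers keep their size when zooming
p.scatter(x, y, size=10, color="navy", alpha=0.6)

# radius is in data units: these circles grow and shrink with the axes when zooming
p.circle(x, y, radius=0.5, color="firebrick", alpha=0.3)
# show(p)  # uncomment in a notebook, then zoom to compare the two glyphs
```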

### Colors

Colors in `bokeh` can be handled in a number of ways. One of these ways is to define them as RGB triplets. We can also use HSL format, hex codes, and named colors.

In [None]:
colors = np.random.randint(
    0, 256, (100, 3), dtype="uint8"
)  # generate random colors as RGB triplets
p = figure()  # create a figure
p.scatter(
    df.x, df.y, radius=sizes, color=colors
)  # create a scatter plot of circles at the given x and y coordinates
show(p)  # show the plot

In [None]:
colors = np.random.choice(
    ["#123456", "#789ABC", "#CEF012"], 100
)  # generate random colors as hex codes
p = figure()  # create a figure
p.scatter(
    df.x, df.y, radius=sizes, color=colors, alpha=0.6
)  # create a scatter plot of circles at the given x and y coordinates, with colors and an alpha
show(p)  # show the plot

In [None]:
colors = np.random.choice(
    ["red", "blue", "green"], 100
)  # generate random colors as named colors
p = figure()  # create a figure
p.scatter(
    df.x, df.y, radius=sizes, color=colors, alpha=0.6
)  # create a scatter plot of circles at the given x and y coordinates, with colors and an alpha
show(p)  # show the plot
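The color formats above are interchangeable spellings of the same thing. As a small illustration (not from the original lecture), the sketch below writes pure red four ways and uses the `RGB` class and the `named` colors from `bokeh.colors` to convert between them. The choice of red and the marker layout are arbitrary.

```python
from bokeh.colors import RGB, named
from bokeh.plotting import figure

as_tuple = (255, 0, 0)  # RGB triplet
as_hex = "#FF0000"  # hex code
as_name = "red"  # named color
as_object = RGB(255, 0, 0)  # bokeh color object

print(as_object.to_hex())  # hex string for the RGB object
print(named.red.to_hex())  # hex string for the named color

p = figure()
for i, c in enumerate([as_tuple, as_hex, as_name, as_object]):
    p.scatter([i], [0], size=30, color=c)  # one marker per spelling
# show(p)  # uncomment in a notebook; all four markers should look the same
```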

### Fill Color, Border Color, Hatches

We can control face colors and line colors separately, as well as apply patterned fills to glyphs.

In [None]:
p = figure()  # create a figure
p.scatter(  # create a scatter plot, break into multiple lines for easier reading
    df.x,  # set the x data
    df.y,  # set the y data
    radius=sizes,  # set the sizes to use for the radii
    fill_color=colors,  # set the fill color separately to our list of colors
    line_color="black",  # set the line color separately to black
    alpha=0.6,
)
show(p)

In [None]:
p = figure()  # create a figure
p.scatter(  # create a scatter plot, break into multiple lines for easier reading
    df.x,  # set the x data
    df.y,  # set the y data
    radius=sizes,  # set the sizes to use for the radii
    fill_color=colors,  # set the fill color separately to our list of colors
    fill_alpha=0.6,  # set the fill alpha separately (line alpha can also be set)
    line_color="black",  # set the line color separately to black
)
show(p)

In [None]:
hatches = np.random.choice(
    ["dot", "horizontal_line", "vertical_line"], 100
)  # generate random hatch patterns
p = figure()  # create a figure
p.scatter(  # create a scatter plot, break into multiple lines for easier reading
    df.x,  # set the x data
    df.y,  # set the y data
    radius=sizes,  # set the sizes to use for the radii
    fill_color=colors,  # set the fill color separately to our list of colors
    fill_alpha=0.6,  # set the fill alpha separately (line alpha can also be set)
    line_color="black",  # set the line color separately to black
    hatch_pattern=hatches,  # set the hatch patterns to our list of patterns
)
show(p)
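The fill and line properties are independent groups, so the outline can be styled just as finely as the face. The sketch below is an illustration with made-up data and arbitrary values; it sets `line_alpha`, `line_width`, and `line_dash` alongside the fill properties used above.

```python
import numpy as np
from bokeh.plotting import figure

x = np.arange(0, 20, 1)
y = np.random.uniform(0, 10, 20)

p = figure()
p.scatter(
    x,
    y,
    size=25,
    fill_color="orange",  # face color
    fill_alpha=0.4,  # transparency of the face only
    line_color="black",  # outline color
    line_alpha=0.9,  # transparency of the outline only
    line_width=2,  # outline thickness in screen units
    line_dash="dashed",  # outline dash pattern
)
# show(p)  # uncomment in a notebook to render the plot
```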

### Themes

There are some preset themes that control how our canvas is styled. We need to import a utility to access the "document" that `bokeh` operates on, and with that we can select a prebuilt theme. Built-in themes are [documented here](https://docs.bokeh.org/en/latest/docs/reference/themes.html#bokeh-themes). After the dark example below, we switch to `"caliber"`, one of the light built-in themes, to undo the dark styling.

In [None]:
from bokeh.io import curdoc  # import utility to get the bokeh document

curdoc().theme = (
    "night_sky"  # set the theme to "night_sky" on the current document
)

p = figure()  # create a figure
p.scatter(  # create a scatter plot, break into multiple lines for easier reading
    df.x,  # set the x data
    df.y,  # set the y data
    radius=sizes,  # set the sizes to use for the radii
    fill_color=colors,  # set the fill color separately to our list of colors
    fill_alpha=0.6,  # set the fill alpha separately (line alpha can also be set)
    line_color="black",  # set the line color separately to black
    hatch_pattern=hatches,  # set the hatch patterns to our list of patterns
)
show(p)

In [None]:
# switch back to a light built-in theme
curdoc().theme = "caliber"
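If we are not sure which theme names are available, or want to return to exactly the theme that was active before, a sketch along these lines may help. It relies on the `built_in_themes` mapping in `bokeh.themes`; the choice of `"dark_minimal"` and the variable name `previous_theme` are only for illustration.

```python
from bokeh.io import curdoc
from bokeh.plotting import figure
from bokeh.themes import built_in_themes  # mapping of theme name -> Theme object

print(sorted(built_in_themes))  # names that can be assigned to curdoc().theme

previous_theme = curdoc().theme  # remember whichever theme is active right now
curdoc().theme = "dark_minimal"  # switch to another built-in theme by name

p = figure()
p.scatter([1, 2, 3], [1, 4, 9], size=15)
# show(p)  # uncomment in a notebook to render with the dark theme

curdoc().theme = previous_theme  # restore the theme object we saved earlier
```

Saving and restoring the theme object avoids having to know the name of the theme we started from.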

### Marker Types

As expected, we can also change the marker type by specifying `marker`.

In [None]:
p = figure()  # create a figure
p.scatter(  # create a scatter plot, break into multiple lines for easier reading
    df.x,  # set the x data
    df.y,  # set the y data
    size=sizes,  # note the use of sizes instead of radius here!
    fill_color=colors,  # set the fill color separately to our list of colors
    fill_alpha=0.6,  # set the fill alpha separately (line alpha can also be set)
    line_color="black",  # set the line color separately to black
    marker="square",  # set the marker type to squares
)
show(p)

In [None]:
markers = np.random.choice(
    ["diamond", "hex", "inverted_triangle"], 100
)  # generate random markers
p = figure()  # create a figure
p.scatter(  # create a scatter plot, break into multiple lines for easier reading
    df.x,  # set the x data
    df.y,  # set the y data
    size=sizes,  # note the use of sizes instead of radius here!
    fill_color=colors,  # set the fill color separately to our list of colors
    fill_alpha=0.6,  # set the fill alpha separately (line alpha can also be set)
    line_color="black",  # set the line color separately to black
    marker=markers,  # set the marker type per point from our list of markers
)
show(p)

This should have worked! It appears we have found a bug in `bokeh`, at least in the version used here. While rare with libraries as popular and mature as `bokeh`, bugs do turn up from time to time. We can work around this one by using a `ColumnDataSource` to specify all of our options.

In [None]:
df2 = df.copy()  # copy our old data frame
df2["size"] = 10  # add a size column with constant value
df2["color"] = colors  # set the colors
df2["marker"] = markers  # set the marker

source = ColumnDataSource(data=df2)  # create a column data source

p = figure()  # create a figure
p.scatter(  # create a scatter plot, break into multiple lines for easier reading
    x="x",  # specify which column to use for the x data
    y="y",  # specify which column to use for the y data
    size="size",  # specify which column to use for the marker size
    color="color",  # specify which column to use for the marker color
    marker="marker",  # specify which column to use for the marker type
    source=source,  # specify the column data source to visualize
)
show(p)

The set of markers available through `bokeh` can be [found here](https://docs.bokeh.org/en/latest/docs/reference/models/glyphs/scatter.html#scatter).
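When the marker should follow a category in the data rather than being stored as a marker name per row, `bokeh.transform.factor_mark` offers another route. The sketch below is an assumption-laden illustration: the `group` column and its categories `a`, `b`, `c` are invented, and each one is mapped to one of the marker names used above.

```python
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.transform import factor_mark

groups = ["a", "b", "c"]  # hypothetical categories
markers = ["diamond", "hex", "inverted_triangle"]  # one marker per category

df = pd.DataFrame(
    {
        "x": np.arange(0, 30, 1),
        "y": np.random.uniform(0, 10, 30),
        "group": np.random.choice(groups, 30),  # random category for each row
    }
)
source = ColumnDataSource(df)

p = figure()
p.scatter(
    x="x",
    y="y",
    size=12,
    source=source,
    marker=factor_mark("group", markers, groups),  # map each category to a marker
    legend_group="group",  # one legend entry per category
)
# show(p)  # uncomment in a notebook to render the plot
```

The data stays free of presentation details here: the source only holds the category, and the mapping to markers lives in the plotting call.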

### Plot Types

And of course, we have many different plot types readily available to us. We will not cover them all here, but will instead explore them as we need them. The basics are covered, though, along with a few highly convenient plotting methods:

* scatter
* line (methods for single, multi, stacked, step, and more)
* bar (vertical, horizontal)
* hexbin tiling
* polygons, patches, ellipses, wedges, and more
