# Plotting with Palmer penguins {#exr-palmer-penguins-plotting}

[Data set download](https://s3.amazonaws.com/bebi103.caltech.edu/data/penguins_subset_tidy.csv)

<hr />

In @exr-palmer-penguins-split-apply-combine, you tidied a subset of the [Palmer penguins data set](https://s3.amazonaws.com/bebi103.caltech.edu/data/penguins_subset.csv) and saved the result in a file named [penguins_subset_tidy.csv](https://s3.amazonaws.com/bebi103.caltech.edu/data/penguins_subset_tidy.csv). Use that tidied data set to make a scatter plot of bill length versus flipper length with the glyphs colored by species.

## Solution

<hr>

In [1]:
# Colab setup ------------------
import os, sys, subprocess
if "google.colab" in sys.modules:
    cmd = "pip install --upgrade polars bebi103 watermark"
    process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    data_path = "https://s3.amazonaws.com/bebi103.caltech.edu/data/"
else:
    data_path = "../data/"
# ------------------------------

import polars as pl

import bokeh.models
import bokeh.plotting
import bokeh.io

bokeh.io.output_notebook()

I will take a more verbose approach to plotting, adding the glyphs for each species separately, so that clicking a legend entry hides that species. Alternatively, I could use a single column data source with a color column.

In [2]:
# Load in the data frame
df = pl.read_csv(os.path.join(data_path, 'penguins_subset_tidy.csv'))

# Colors I would like to use
colors = ["#66c2a5", "#fc8d62", "#8da0cb"]

# Axes for plot
p = bokeh.plotting.figure(
    frame_height=200,
    frame_width=200,
    x_axis_label="flipper length (mm)",
    y_axis_label="bill length (mm)",
    toolbar_location="above",
)

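# Add glyphs, adding to items for legend creation
# (maintain_order=True keeps the species-to-color pairing the same from run to run)
items = []
for color, (species, g) in zip(colors, df.group_by(["species"], maintain_order=True)):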
    items.append(
        (
            species[0],
            [
                p.scatter(
                    source=g.to_dict(),
                    x="flipper_length_mm",
                    y="bill_length_mm",
                    color=color,
                )
            ],
        )
    )

# Add the legend
p.add_layout(
    bokeh.models.Legend(items=items, click_policy="hide", location="right"), "right"
)

bokeh.io.show(p)

## Computing environment

In [3]:
%load_ext watermark
%watermark -v -p numpy,polars,jupyterlab

Python implementation: CPython
Python version       : 3.12.9
IPython version      : 9.1.0

numpy     : 2.1.3
polars    : 1.29.0
jupyterlab: 4.3.7

