# GoFish Python Wrapper - Test Notebook

This notebook demonstrates the Python wrapper for the GoFish graphics library.


## Setup

First, let's import the necessary libraries and create some sample data.


In [1]:
import pandas as pd
from gofish import chart, spread, stack, derive, rect, circle

# Create sample seafood data
seafood_data = [
  {
    "lake": "Lake A",
    "species": "Bass",
    "count": 23,
  },
  {
    "lake": "Lake A",
    "species": "Trout",
    "count": 31,
  },
  {
    "lake": "Lake A",
    "species": "Catfish",
    "count": 29,
  },
  {
    "lake": "Lake A",
    "species": "Perch",
    "count": 12,
  },
  {
    "lake": "Lake A",
    "species": "Salmon",
    "count": 8,
  },
  {
    "lake": "Lake B",
    "species": "Bass",
    "count": 25,
  },
  {
    "lake": "Lake B",
    "species": "Trout",
    "count": 34,
  },
  {
    "lake": "Lake B",
    "species": "Catfish",
    "count": 41,
  },
  {
    "lake": "Lake B",
    "species": "Perch",
    "count": 21,
  },
  {
    "lake": "Lake B",
    "species": "Salmon",
    "count": 16,
  },
  {
    "lake": "Lake C",
    "species": "Bass",
    "count": 15,
  },
  {
    "lake": "Lake C",
    "species": "Trout",
    "count": 25,
  },
  {
    "lake": "Lake C",
    "species": "Catfish",
    "count": 31,
  },
  {
    "lake": "Lake C",
    "species": "Perch",
    "count": 22,
  },
  {
    "lake": "Lake C",
    "species": "Salmon",
    "count": 31,
  },
  {
    "lake": "Lake D",
    "species": "Bass",
    "count": 12,
  },
  {
    "lake": "Lake D",
    "species": "Trout",
    "count": 17,
  },
  {
    "lake": "Lake D",
    "species": "Catfish",
    "count": 23,
  },
  {
    "lake": "Lake D",
    "species": "Perch",
    "count": 23,
  },
  {
    "lake": "Lake D",
    "species": "Salmon",
    "count": 41,
  },
  {
    "lake": "Lake E",
    "species": "Bass",
    "count": 7,
  },
  {
    "lake": "Lake E",
    "species": "Trout",
    "count": 9,
  },
  {
    "lake": "Lake E",
    "species": "Catfish",
    "count": 13,
  },
  {
    "lake": "Lake E",
    "species": "Perch",
    "count": 20,
  },
  {
    "lake": "Lake E",
    "species": "Salmon",
    "count": 40,
  },
  {
    "lake": "Lake F",
    "species": "Bass",
    "count": 4,
  },
  {
    "lake": "Lake F",
    "species": "Trout",
    "count": 7,
  },
  {
    "lake": "Lake F",
    "species": "Catfish",
    "count": 9,
  },
  {
    "lake": "Lake F",
    "species": "Perch",
    "count": 21,
  },
  {
    "lake": "Lake F",
    "species": "Salmon",
    "count": 47,
  },
]

seafood = pd.DataFrame(seafood_data)
print(seafood.head())


     lake  species  count
0  Lake A     Bass     23
1  Lake A    Trout     31
2  Lake A  Catfish     29
3  Lake A    Perch     12
4  Lake A   Salmon      8
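
The literal list above repeats the same three keys for all 30 records. As a more compact alternative, the records can be generated from a per-lake table of counts. This is only a sketch: the names `species`, `counts_by_lake`, and `compact_records` are illustrative and not part of the wrapper.

```python
# Species order shared by every lake (same order as the records above).
species = ["Bass", "Trout", "Catfish", "Perch", "Salmon"]

# Counts per lake, listed in the same order as `species`.
counts_by_lake = {
    "Lake A": [23, 31, 29, 12, 8],
    "Lake B": [25, 34, 41, 21, 16],
    "Lake C": [15, 25, 31, 22, 31],
    "Lake D": [12, 17, 23, 23, 41],
    "Lake E": [7, 9, 13, 20, 40],
    "Lake F": [4, 7, 9, 21, 47],
}

# Expand into one record per (lake, species) pair.
compact_records = [
    {"lake": lake, "species": name, "count": count}
    for lake, counts in counts_by_lake.items()
    for name, count in zip(species, counts)
]
```

`compact_records` matches `seafood_data` element for element, so `pd.DataFrame(compact_records)` produces the same table.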


## Example 1: Basic Bar Chart

A simple bar chart: `spread("lake", dir="x")` places one bar per lake along the x-axis, and `rect(h="count")` takes each bar's height from the `count` column.


In [2]:
(
    chart(seafood)
    .flow(spread("lake", dir="x"))
    .mark(rect(h="count"))
    .render(w=500, h=300, axes=True)
)
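
To sanity-check the rendered bars, it can help to compute the expected heights by hand. The sketch below assumes that each bar's height is the sum of `count` over that lake's rows. That is an assumption about how GoFish aggregates a group, not something this notebook confirms. The helper `totals_by_group` is hypothetical, and only two lakes are included to keep it short.

```python
from collections import defaultdict

# Two lakes from the sample data, as (lake, species, count) tuples.
records = [
    ("Lake A", "Bass", 23), ("Lake A", "Trout", 31), ("Lake A", "Catfish", 29),
    ("Lake A", "Perch", 12), ("Lake A", "Salmon", 8),
    ("Lake B", "Bass", 25), ("Lake B", "Trout", 34), ("Lake B", "Catfish", 41),
    ("Lake B", "Perch", 21), ("Lake B", "Salmon", 16),
]


def totals_by_group(rows):
    """Sum the count of every row that shares the same lake (assumed aggregation)."""
    totals = defaultdict(int)
    for lake, _species, count in rows:
        totals[lake] += count
    return dict(totals)


bar_heights = totals_by_group(records)
print(bar_heights)  # {'Lake A': 103, 'Lake B': 137}
```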


## Example 2: Stacked Bar Chart

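A stacked bar chart that combines `spread` (one bar per lake along x) with `stack` (one segment per species along y). Segments are colored by species via `fill="species"`.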


In [3]:
(
    chart(seafood)
    .flow(
        spread("lake", dir="x"),
        stack("species", dir="y", label=False)
    )
    .mark(rect(h="count", fill="species"))
    .render(w=500, h=300, axes=True)
)
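
Conceptually, stacking places each segment on top of the previous one, so a segment starts at the running total of the counts before it. The sketch below works this out for Lake A in plain Python. It illustrates the general idea of a stacked layout and is not GoFish's implementation. `stack_segments` is a hypothetical helper.

```python
from itertools import accumulate

# Lake A rows from the sample data, in their original order.
lake_a = [("Bass", 23), ("Trout", 31), ("Catfish", 29), ("Perch", 12), ("Salmon", 8)]


def stack_segments(rows):
    """Return (label, start, end) for segments placed end to end along one axis."""
    counts = [count for _label, count in rows]
    ends = list(accumulate(counts))   # running totals: top edge of each segment
    starts = [0] + ends[:-1]          # each segment starts where the previous one ended
    return [(label, start, end) for (label, _), start, end in zip(rows, starts, ends)]


segments = stack_segments(lake_a)
for label, start, end in segments:
    print(f"{label:8s} {start:3d} -> {end:3d}")
```

The last segment ends at 103, the total count for Lake A.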


## Example 3: Stacked Bar Chart with Derive

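This example uses the `derive` operator to sort rows by `count` before stacking. In the debug output below, the lambda receives a 5-row DataFrame containing only Lake A's rows.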


In [4]:
(
    chart(seafood)
    .flow(
        spread("lake", dir="x"),
        derive(lambda d: d.sort_values("count")),
        stack("species", dir="y", label=False)
    )
    .mark(rect(h="count", fill="species"))
    .render(w=500, h=300, axes=True)
)


[DEBUG] Executing derive lambda 1174fc61-2989-4a52-ac6a-5c9920e8b270
[DEBUG] DataFrame shape: (5, 3)
[DEBUG] DataFrame columns: ['lake', 'species', 'count']
[DEBUG] DataFrame dtypes:
lake       category
species    category
count       float64
dtype: object
[DEBUG] DataFrame head:
     lake  species  count
0  Lake A     Bass   23.0
1  Lake A    Trout   31.0
2  Lake A  Catfish   29.0
3  Lake A    Perch   12.0
4  Lake A   Salmon    8.0
[DEBUG] DataFrame info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype   
---  ------   --------------  -----   
 0   lake     5 non-null      category
 1   species  5 non-null      category
 2   count    5 non-null      float64 
dtypes: category(2), float64(1)
memory usage: 506.0 bytes

[DEBUG] Result DataFrame shape: (5, 3)
[DEBUG] Result DataFrame columns: ['lake', 'species', 'count']
[DEBUG] Result DataFrame head:
     lake  species  count
4  Lake A   Salmon    8.0
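
The effect of the `derive` lambda can be reproduced outside the chart pipeline. The sketch below rebuilds the Lake A frame by hand with the dtypes reported in the debug output (`category` for `lake` and `species`, `float64` for `count`) and applies the same `sort_values("count")` call. The frame is constructed manually for illustration and does not come from the wrapper. `sort_by_count` is a named stand-in for the lambda.

```python
import pandas as pd

# One lake's rows, with dtypes matching the debug output above.
lake_a = pd.DataFrame({
    "lake": pd.Categorical(["Lake A"] * 5),
    "species": pd.Categorical(["Bass", "Trout", "Catfish", "Perch", "Salmon"]),
    "count": [23.0, 31.0, 29.0, 12.0, 8.0],
})


def sort_by_count(d):
    """Same transformation as the derive lambda: ascending sort on count."""
    return d.sort_values("count")


result = sort_by_count(lake_a)
stack_order = result["species"].tolist()
print(result)
```

`sort_values` returns a new frame and keeps the original index labels, which is why the first result row in the debug output still carries label 4 (Salmon). The resulting order, smallest count first, is Salmon, Perch, Bass, Catfish, Trout.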


## Notes

- The `derive` operator lets you run arbitrary Python code on the DataFrame it receives
- Data is converted to Apache Arrow format for efficient transfer to JavaScript
- Charts are rendered using the GoFish JavaScript library via a Node.js bridge
- In Jupyter notebooks, charts are displayed inline using HTML
- In standalone Python scripts, charts are saved to HTML files and opened in a browser
