# Foxtrot-Core · Interactive *Bitstream Mapping* Tutorial

This hands-on notebook walks you through the *Bitmap* module’s interactive clustering pipeline.
You will learn to load raw bit-offset data, sweep DBSCAN / K-Means parameters, monitor runs with a live dashboard, and drill into the resulting clusters – **without writing boilerplate code**.

> **Setup (do this first)**
>
> **Required:** install **foxtrot-core with the _analysis_ extra**  
> `pip install -U "foxtrot-core[analysis]"`
>
> This pulls in the data-science stack used by `bitmap.*` (NumPy, pandas, scikit-learn, ipywidgets, ipympl, matplotlib, etc.), so the widgets and plots in this notebook work out-of-the-box.
>
> **Runtime prerequisites** (installed by the line above):  
> • `ipywidgets` ≥ 7.6  ·  • `ipympl` (for `%matplotlib widget`)
>
> **Optional (GPU acceleration):** RAPIDS cuML  
> `pip install --extra-index-url https://pypi.nvidia.com "foxtrot-core[analysis,rapids]"`  
> or with TensorFlow CUDA as well:  
> `pip install --extra-index-url https://pypi.nvidia.com "foxtrot-core[analysis,gpu,rapids]"`
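
To confirm that the runtime prerequisites listed above are present before starting, a small check along these lines can help. It is only a sketch built on the standard library's `importlib.metadata`; the helper names `installed_version` and `major_minor` are illustrative and not part of Foxtrot-Core.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major_minor(version_string: str) -> tuple:
    """Parse the leading 'major.minor' of a version string into a tuple of ints."""
    numbers = re.findall(r"\d+", version_string)
    return tuple(int(n) for n in numbers[:2])


# Minimum versions taken from the setup note; ipympl has no stated minimum.
for package, minimum in [("ipywidgets", (7, 6)), ("ipympl", None)]:
    found = installed_version(package)
    if found is None:
        print(f"{package}: missing")
    elif minimum is not None and major_minor(found) < minimum:
        print(f"{package}: {found} is older than {'.'.join(map(str, minimum))}")
    else:
        print(f"{package}: {found} OK")
```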

---

## Quick Navigation

|  Step | Widget             | What you’ll master                                   |
| :---: | :----------------- | :--------------------------------------------------- |
|  1    | Backend setup      | Verify install, enable live widgets & hot-reloading  |
|  2    | `file_picker`      | Load `.off` offsets + optional one-hot feature masks |
|  3    | `clustering_grid`  | Build a DBSCAN / K-Means sweep declaratively         |
|  4    | `metric_selector`  | Toggle CSR, silhouette and other metrics on/off      |
|  5    | `frame_grid`       | Sweep multiple `frame_size` candidates               |
|  6    | `dashboard_runner` | Launch the full experiment with progress bars        |
|  7    | `metric_plot`      | Visualise metric trends across the sweep             |
|  8    | `cluster_viewer`   | Inspect individual clusters interactively            |
|  9    | Tips               | RAM usage, sub-sampling, Parquet reuse, GPU          |


## 1 · Backend & Imports

The `%matplotlib widget` magic switches on interactive figures, and the version print confirms that the package imported correctly:

In [None]:
# │ codecell 1 │
# Interactive backend
%matplotlib widget

# Fallback (if ipympl is unavailable)
# %matplotlib tk

from foxtrot_core import __version__ as fc_version
print(f"Foxtrot-Core {fc_version} loaded ✓")


## 2 · Load Bit‑Offset Data `(file_picker)`

Run the widget below and point it to **one** `.off` file plus any number of *feature‑mask* files.
Each mask is converted into a one‑hot column.

In [None]:
# │ codecell 2 │
from foxtrot_core.bitmap.ui.sources import file_picker

ui_source, data = file_picker()   # ⇦ widget + live DataDict

A green ✓ message confirms that `data` now contains:

```text
{
  "offsets"      : list[int],
  "feature_files": list[{"description": str, "offsets": set[int]}],
  "data_hash"    : str   # SHA‑256, used by the cache layer
}
```
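
To make the one-hot conversion and the role of `data_hash` concrete, here is a minimal sketch on toy data. The mask descriptions, the offsets, and the hashing scheme are all assumptions made for illustration; `file_picker` may build its columns and its hash differently.

```python
import hashlib

# Toy stand-in for the DataDict; all values are invented for illustration.
data = {
    "offsets": [3, 10, 42, 57],
    "feature_files": [
        {"description": "lut_bits", "offsets": {10, 57}},
        {"description": "routing_bits", "offsets": {3}},
    ],
}


def one_hot_rows(data: dict) -> list:
    """Build one row per offset with a 0/1 column per feature mask."""
    rows = []
    for offset in data["offsets"]:
        row = {"offset": offset}
        for mask in data["feature_files"]:
            # 1 if this offset appears in the mask, else 0.
            row[mask["description"]] = int(offset in mask["offsets"])
        rows.append(row)
    return rows


def content_hash(offsets: list) -> str:
    """SHA-256 over the ordered offsets (an assumed scheme, not the library's)."""
    payload = ",".join(str(o) for o in offsets).encode("ascii")
    return hashlib.sha256(payload).hexdigest()


rows = one_hot_rows(data)
print(rows[1])
print(content_hash(data["offsets"])[:12])
```

Because the hash depends only on the loaded content, a cache keyed on it can recognise an unchanged dataset across kernel restarts.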

## 3 · Build a Parameter Grid `(clustering_grid)`

`clustering_grid` discovers every algorithm under `foxtrot_core.bitmap.algorithms` that exposes a `__param_schema__`.  Each parameter row supports:

* **static** value *or* **sweep** over a numeric range
* categorical drop‑downs / multi‑selects
* conditional visibility via *show\_if* rules

> **DBSCAN quick‑start**
>
> * Xilinx 7‑Series → `eps = 1`, `min_samples = 5`
> * Intel Flex 10K20 → `eps = 4`, `min_samples = 6`

In [None]:
# │ codecell 3 │
from foxtrot_core.bitmap.ui.params import clustering_grid

ui_params, grid_maker = clustering_grid()
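
Conceptually, mixing static values with sweeps yields the Cartesian product of all swept parameters. The sketch below shows that expansion with `itertools.product`; the `grid_spec` layout and `expand_grid` helper are hypothetical and do not describe what `grid_maker` returns.

```python
from itertools import product

# Hypothetical grid: a list means "sweep", a scalar means "static".
grid_spec = {
    "algorithm": "dbscan",
    "eps": [1, 2, 4],
    "min_samples": [5, 6],
}


def expand_grid(spec: dict) -> list:
    """Expand sweep rows into the Cartesian product of concrete configurations."""
    names = list(spec)
    # Wrap static values in a one-element list so product() treats them uniformly.
    value_lists = [v if isinstance(v, list) else [v] for v in spec.values()]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


configs = expand_grid(grid_spec)
print(len(configs))  # 3 eps values x 2 min_samples values = 6
print(configs[0])
```

The number of jobs grows multiplicatively with every swept row, so it is worth keeping ranges narrow on a first pass.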

## 4 · Select Quality Metrics `(metric_selector)`

The toggle widget is populated directly from the *metrics registry* – see `foxtrot_core.bitmap.metrics`.

In [None]:
# │ codecell 4 │
from foxtrot_core.bitmap.ui.params import metric_selector

ui_metrics, get_metrics = metric_selector()
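
A registry-driven selector of this kind is commonly built on a name-to-function mapping. The following is a generic sketch of that pattern, not the actual contents of `foxtrot_core.bitmap.metrics`; `METRICS`, `register_metric`, and both example metrics are invented for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical registry; the real one lives in foxtrot_core.bitmap.metrics.
METRICS: Dict[str, Callable[[List[int]], float]] = {}


def register_metric(name: str):
    """Decorator that stores a metric function under `name`."""
    def decorator(func):
        METRICS[name] = func
        return func
    return decorator


@register_metric("n_clusters")
def n_clusters(labels):
    """Number of clusters, ignoring the noise label -1 used by DBSCAN."""
    return float(len(set(labels) - {-1}))


@register_metric("noise_ratio")
def noise_ratio(labels):
    """Fraction of points labelled as noise (-1)."""
    return labels.count(-1) / len(labels)


enabled = ["n_clusters", "noise_ratio"]  # what the toggles might return
labels = [0, 0, 1, -1, 1, 2, -1, 0]
scores = {name: METRICS[name](labels) for name in enabled}
print(scores)
```

With this design, a UI can list `METRICS.keys()` as toggles and evaluate only the enabled entries for each run.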

## 5 · Frame‑Size & Mapping Sweep `(frame_grid)`

In [None]:
# │ codecell 5 │
from foxtrot_core.bitmap.ui.params import frame_grid

ui_frames, fmap_grid = frame_grid()

> **Typical frame sizes**
>
> * Intel Flex → `L = 334`
> * Xilinx 7‑Series → `L = 3232`
>
> Only the `row_major` mapping is validated at the moment.
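
To see what a frame-size candidate does to the data, the sketch below assumes that `row_major` places a linear bit offset at frame `offset // L` and bit position `offset % L`. That reading of the mapping is an assumption, and the `row_major` function here is a stand-in rather than the library's implementation.

```python
def row_major(offset: int, frame_size: int) -> tuple:
    """Map a linear bit offset to (frame, bit) under an assumed row-major layout."""
    return divmod(offset, frame_size)


L = 3232  # the typical Xilinx 7-Series frame size quoted above
for offset in (0, 3231, 3232, 10_000):
    frame, bit = row_major(offset, L)
    print(f"offset {offset:>6} -> frame {frame}, bit {bit}")
```

Under this layout a wrong `L` shears the 2-D picture, which is why sweeping several candidates and comparing metrics can reveal the true frame size.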

## 6 · Run the Experiment `(dashboard_runner)`

The dashboard executes the full grid **in parallel** on a thread pool (8 threads by default; override with `workers`) and streams a live progress bar.


In [None]:
# │ codecell 6 │
from foxtrot_core.bitmap.ui.runners import dashboard_runner

ui_dash, get_results = dashboard_runner(
    data,
    grid_maker,
    workers=24,          # tune to your CPU
    live_logging=False,  # set to True for verbose per‑job logs
)
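
For intuition about what a parallel sweep with progress reporting involves, here is a minimal sketch using the standard library's `concurrent.futures`. `run_job` is a placeholder that fakes a score; the real runner's internals are not shown in this notebook and may differ.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_job(config: dict) -> dict:
    """Placeholder job: a real job would cluster the data and compute metrics."""
    return {**config, "score": config["eps"] * config["min_samples"]}


def run_grid(configs: list, workers: int = 8) -> list:
    """Run every configuration on a thread pool and report progress."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_job, cfg) for cfg in configs]
        # as_completed yields futures as they finish, not in submission order.
        for done, future in enumerate(as_completed(futures), start=1):
            results.append(future.result())
            print(f"{done}/{len(futures)} jobs finished")
    return results


configs = [{"eps": e, "min_samples": m} for e in (1, 4) for m in (5, 6)]
results = run_grid(configs, workers=4)
```

Results arrive in completion order, so each one carries its own configuration rather than relying on list position.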

**Options:**

|  Flag                 |  Purpose                                                                                      |
| :-------------------- | :-------------------------------------------------------------------------------------------- |
|  Keep clustering data | Stores every heavy `clustered_df`. Enable for inspection; disable for long sweeps to save RAM |
|  Save metrics         | Writes slim Parquet + CSV files to `./metrics/` right after the run                           |
|  Crop window          | Limits the processed region either **by Y‑max** or **by #points** (see the widget tooltips)   |

## 7 · Plot Metrics `(metric_plot)`

In [None]:
# │ codecell 7 │
from foxtrot_core.bitmap.ui.posts import metric_plot

ui_plot = metric_plot(get_results)

`metric_plot` can draw directly from the **last sweep** (*fast*) or from any saved Parquet file (*post‑analysis*).  Enable **Target L** to draw a red dashed guideline, and **Top‑20 table** to print the best configurations by your chosen metric (default = CSR Index).
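
The ranking behind a top-N table can be reproduced in a few lines. In this sketch the rows, the `csr_index` column name, and the assumption that a higher value is better are all illustrative; check the actual column names and metric direction in your saved metrics before relying on it.

```python
import heapq

# Invented metric rows; real ones come from the sweep or a saved Parquet file.
rows = [
    {"frame_size": 334, "eps": 4, "csr_index": 0.81},
    {"frame_size": 334, "eps": 1, "csr_index": 0.42},
    {"frame_size": 3232, "eps": 1, "csr_index": 0.93},
    {"frame_size": 3232, "eps": 4, "csr_index": 0.67},
]


def top_n(rows: list, metric: str, n: int = 20, higher_is_better: bool = True) -> list:
    """Return the n best rows by `metric`, best first."""
    pick = heapq.nlargest if higher_is_better else heapq.nsmallest
    return pick(n, rows, key=lambda row: row[metric])


for row in top_n(rows, "csr_index", n=2):
    print(row)
```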


## 8 · Interactive Cluster Viewer `(cluster_viewer)`

In [None]:
# │ codecell 8 │
from foxtrot_core.bitmap.ui.posts import cluster_viewer

ui_viewer, refresh_viewer = cluster_viewer(get_results)

Select a run, then click **Plot** to open a zoomed‑in scatter plot of its clusters. If you forgot to tick *Keep clustering data*, the widget warns you instead of failing silently.

## 9 · Tips

* **RAM usage** – disable *Keep clustering data* once you are happy with the metrics; `clustered_df` can reach many GB.
* **Sub‑sampling** – use the **Max pts** crop mode to explore a giant bitstream quickly.
* **Parquet reuse** – any Parquet generated by *Save metrics* can be re‑plotted later, shared with collaborators, or version‑controlled.
* **Re‑entrancy** – spawn multiple dashboards in one kernel; each keeps its own thread‑pool and widgets.
* **GPU acceleration** – when RAPIDS is installed, choose *implementation = gpu* in either DBSCAN or K‑Means for a 10‑50× speed‑up.