LLM-first tools for operating on Jupyter notebooks.
notebook-tools is a Python CLI that lets an LLM or another agent work with .ipynb notebooks as notebooks, not as raw JSON blobs.
An .ipynb file is not a text file. It is a JSON document that stores code, markdown, outputs, metadata, and execution state in a deeply nested structure. When an LLM agent treats a notebook like a generic file, everything becomes expensive, fragile, and error-prone.
A real student submission notebook is 924KB with 93 cells. The raw JSON is 740,959 characters.
A simple task: "What does the data loading cell output?"
Without notebook-tools:
- Read the entire 924KB file into context (~185,000 tokens)
- Manually navigate nested JSON to find the right cell
- Extract output from `cells[4].outputs[0].text`, buried inside arrays of arrays
- Parse through base64 images, HTML renderers, and Colab metadata
With notebook-tools:
```shell
notebook-tools cell-output --index 4 --notebook notebook.ipynb
```

One command. ~500 tokens. 370x less context.
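To see what "manually navigate nested JSON" costs in practice, here is the manual route sketched in Python on a tiny synthetic notebook (the cell index and output shape are illustrative, not taken from the real 924KB file):

```python
import json

# A tiny synthetic .ipynb payload; real notebooks nest far more metadata.
raw = json.dumps({
    "nbformat": 4, "nbformat_minor": 5, "metadata": {},
    "cells": [
        {"cell_type": "code", "source": ["import pandas as pd\n"], "outputs": []},
        {"cell_type": "code", "source": ["df = pd.read_csv('data.csv')\n"],
         "outputs": [{"output_type": "stream", "name": "stdout",
                      "text": ["Loaded 1000 rows\n"]}]},
    ],
})

nb = json.loads(raw)  # step 1: parse the whole file, however large it is

# Step 2: index into arrays of arrays and hope the shape matches --
# a stream output stores its text as a list of line fragments.
text = "".join(nb["cells"][1]["outputs"][0]["text"])
print(text.strip())
```

With notebook-tools, `cell-output` does this navigation (plus summarizing images and rich renderers) on the tool side, so none of the raw JSON enters the agent's context.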
Read amplification. To find a cell by keyword, a generic agent reads the entire file (740K chars). search-cells returns only matches (~1K chars). To understand notebook structure, it reads everything. list-cells returns an outline (~3K chars).
Edit fragility. To insert a markdown cell, an agent must read the entire file, manually construct a valid cell object with correct id and metadata, splice it into the JSON array at the right index, and rewrite the entire 924KB file. One misplaced comma and the notebook is corrupted. insert-cell handles cell ID generation, metadata, and JSON validity automatically.
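For contrast, a structured insert can be done safely in memory and serialized whole. This is an illustrative sketch of the idea, not notebook-tools' actual implementation:

```python
import json
import uuid

def insert_markdown_cell(nb: dict, position: int, content: str) -> dict:
    """Insert a markdown cell the way a structured tool would: generate the
    id and metadata, splice at the index, and keep the document valid JSON.
    (Sketch only -- notebook-tools' internals may differ.)"""
    cell = {
        "cell_type": "markdown",
        "id": uuid.uuid4().hex[:8],   # nbformat 4.5+ requires a cell id
        "metadata": {},
        "source": content.splitlines(keepends=True),
    }
    nb["cells"].insert(position, cell)
    return nb

nb = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}
insert_markdown_cell(nb, 0, "## New Section")

# Round-trips cleanly: no hand-spliced commas, no truncated rewrite.
json.loads(json.dumps(nb))
```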
No concept of notebook structure. A generic file editor sees nested JSON. It does not understand which cells depend on which, which outputs are stale, what section a cell belongs to, or whether a cell has errors. Notebook-tools provides this through get-dependencies, summarize, and cell-output.
No safe mutation. Generic text edits on notebooks are inherently dangerous. Notebook-tools enforces revision preconditions on every mutation, requires confirmation tokens for deletions, provides structured diff summaries, and auto-generates cell IDs.
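The revision-precondition idea can be illustrated with a content hash used as an optimistic-concurrency token. This is a hypothetical mechanism for illustration; the tool's actual token format and storage may differ:

```python
import hashlib
import json

def file_revision(path: str) -> str:
    """Content hash used as a revision token (illustrative only)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()[:12]

def mutate(path: str, expected_revision: str, transform) -> str:
    """Apply a structured mutation only if the notebook is unchanged
    since the caller last read it; otherwise refuse to write."""
    if file_revision(path) != expected_revision:
        raise RuntimeError("revision mismatch: notebook changed since last read")
    with open(path) as f:
        nb = json.load(f)
    transform(nb)                # structured, in-memory edit
    with open(path, "w") as f:
        json.dump(nb, f)         # the rewrite is always valid JSON
    return file_revision(path)   # fresh token for the next mutation
```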
Tested on a 93-cell data science notebook (924KB):
| Metric | Without tools | With tools | Improvement |
|---|---|---|---|
| Tokens to find a cell | ~185,000 | ~500 | 370x |
| Tokens to get an output | ~185,000 | ~500 | 370x |
| Tokens to understand structure | ~185,000 | ~3,000 | 62x |
| Notebook corruption risk | High (manual JSON surgery) | Zero (structured mutations) | Eliminated |
| Edit precision | File-level | Cell-level | Granular |
| Dependency tracing | Manual, error-prone | Automated, derived | Reliable |
| Command | What it does |
|---|---|
| `list-cells` | Cell outline with summaries, status, section labels |
| `read-cells` | Read specific cells with budget controls |
| `search-cells` | Find cells by keyword in source or markdown |
| `cell-output` | Summarized output, not raw JSON |
| `get-dependencies` | Dependency graph between cells |
| `summarize` | Workflow overview, key variables, open issues |
| Command | What it does |
|---|---|
| `edit-cell` | Replace, append, or prepend cell source |
| `insert-cell` | Add a new cell at any position |
| `delete-cell` | Remove a cell (requires confirmation) |
| `move-cell` | Reposition a cell |
| `split-cell` | Split a cell at a line number |
| `merge-cells` | Merge adjacent cells |
| Command | What it does |
|---|---|
| `run-cells` | Execute specific cells or ranges |
| `kernel-state` | Inspect the runtime environment |
| `list-variables` | Bounded variable inventory |
| `inspect-variable` | Type-aware variable previews |
| `inspect-dataframe` | Shape, dtypes, null counts, samples |
| `interrupt` | Interrupt a running kernel |
| `restart-kernel` | Restart the kernel session |
| `shutdown-kernel` | Shut down the kernel session |
- Python 3.11+
```shell
# Clone the repo
git clone https://github.com/Fariz36/notebook-tools.git
cd notebook-tools

# Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate

# Install in editable mode
pip install -e .

# Set the runtime directory
export NOTEBOOK_TOOLS_RUNTIME_DIR="$HOME/.local/state/notebook-tools"
```

Then verify the install:

```shell
notebook-tools list-cells --notebook examples/demo.ipynb --pretty
```

The examples/ directory contains notebooks for trying out the tools:
| Notebook | Purpose |
|---|---|
| `examples/demo.ipynb` | Clean data science workflow — imports, EDA, plot, model |
| `examples/error_demo.ipynb` | Intentionally broken — test error triage and debugging |
All commands follow the same shape:

```shell
notebook-tools <command> --notebook /absolute/path/to/notebook.ipynb [options]
```

Examples:

```shell
notebook-tools list-cells --notebook examples/demo.ipynb --pretty
notebook-tools read-cells --notebook examples/demo.ipynb --index 2 --include-outputs --pretty
notebook-tools search-cells --notebook examples/demo.ipynb --query "df" --pretty
notebook-tools get-dependencies --notebook examples/demo.ipynb --index 3 --pretty
notebook-tools cell-output --notebook examples/demo.ipynb --index 1 --pretty
notebook-tools summarize --notebook examples/demo.ipynb --pretty
notebook-tools edit-cell --notebook examples/demo.ipynb --index 5 --edit-mode replace --content "new source" --pretty
notebook-tools insert-cell --notebook examples/demo.ipynb --position 3 --cell-type markdown --content "## New Section" --pretty
notebook-tools run-cells --notebook examples/demo.ipynb --index 0 --pretty
notebook-tools list-variables --notebook examples/demo.ipynb --start-if-missing --pretty
notebook-tools skills --pretty
```

Recommended LLM setup:
- Expose each CLI command as a tool.
- Make the tool wrapper return parsed JSON, not raw terminal text.
- Keep the CLI machine-oriented.
- Point the agent to `AGENTS.md` for operating policy, or use `notebook-tools skills` to load them programmatically.
- Use the installed executable or a fixed venv executable path.
- Prefer absolute notebook paths.
- Set `NOTEBOOK_TOOLS_RUNTIME_DIR` if multiple clients should share the same runtime state location.
Skills define higher-level workflows that orchestrate tools. There are three ways to integrate them:
This project ships with agent skills for multiple coding agents. Skills are auto-discovered from .claude/skills/ (Claude Code), .codex/skills/ (Codex CLI), and .opencode/skills/ (OpenCode).
Available skills:
| Skill | Description |
|---|---|
| `notebook-orientation` | Explain what a notebook does without reading everything |
| `error-triage` | Diagnose and minimally fix failing cells |
| `eda-copilot` | Explore datasets and summarize dataframes |
| `visualization` | Improve or debug plots |
| `notebook-cleanup` | Make notebooks readable and shareable |
| `reproducibility-audit` | Detect hidden state and rerun risks |
Use the `skills` command to load skill definitions as JSON:

```shell
# List all available skills
notebook-tools skills

# Get a specific skill
notebook-tools skills --skill "Error Triage"

# Get raw skills.md content
notebook-tools skills --raw
```

Point the LLM to `AGENTS.md` in the repo as system context or a reference document.
For an LLM agent, wrap commands like:
```shell
/absolute/path/to/venv/bin/notebook-tools <command> ...
```

The wrapper should:
- pass arguments predictably
- parse JSON stdout
- surface `ok`, `errors`, and `warnings`
- avoid free-form human formatting
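A minimal wrapper along these lines, assuming the CLI prints a JSON object whose top level carries `ok`, `errors`, and `warnings` on stdout (the authoritative schema is in CLI_CONTRACT.md; the executable path below is a placeholder):

```python
import json
import subprocess

TOOL = "/absolute/path/to/venv/bin/notebook-tools"  # pin one executable

def run_tool(args: list[str]) -> dict:
    """Run a notebook-tools command and hand the agent parsed JSON,
    not raw terminal text."""
    proc = subprocess.run([TOOL, *args], capture_output=True, text=True)
    return parse_result(proc.stdout)

def parse_result(stdout: str) -> dict:
    result = json.loads(stdout)
    # Surface status fields explicitly so the agent never scrapes text.
    return {
        "ok": result.get("ok", False),
        "errors": result.get("errors", []),
        "warnings": result.get("warnings", []),
        "data": result,
    }
```

Returning a fixed-shape dict keeps the agent's tool interface stable even if the CLI adds fields later.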
The LLM should:
- start with `list-cells`
- use `search-cells` and `get-dependencies` before broad reads
- use `cell-output` instead of reading raw output JSON
- use the smallest possible edit
- rerun the smallest possible slice
- report cells read, changed, and run
See:
- `AGENTS.md`
- `CLI_CONTRACT.md`
Live commands require:
- `ipykernel`
- `jupyter_client`
These are declared in pyproject.toml.
Important:
- the interpreter running `notebook-tools` must be the same environment that has these packages installed
- in this repo, `venv/bin/python` is the safest choice for live commands
Works on notebook files and stored outputs.
Good for:
- structure inspection
- static edits
- stored output review
- summarization
- dependency analysis
Attaches to a kernel session.
Good for:
- execution
- runtime variable inspection
- dataframe inspection
- targeted validation after edits
Run the test suite:
```shell
python -m unittest discover -s tests -p "test_*.py"
```

See CONTRIBUTING.md for development setup, code style, and how to add new commands or skills.
MIT. See LICENSE for details.