CodePrism

Local-first context saving and focused codebase maps for AI coding agents.

CodePrism gives an agent a task-sized map before it reads your code. It scans files, symbols, imports, routes, docs, and hierarchy into a local graph, then writes compact task slices and exact retrieval handles for the work in front of you.

The goal is simple: map first, slice next, retrieve exactly, and read raw files only when they matter. Visuals are there to inspect and replay the map, not to replace the token-saving workflow.

Quick Commands

pip install -e ".[dev]"
codeprism setup
codeprism prime "server boot path"
codeprism gain
codeprism mcp --list-tools

codeprism setup installs Codex/Claude/Copilot helper files and verifies them with codeprism doctor. codeprism prime maps the repo, writes a focused slice, and prints estimated token savings. codeprism gain reports estimated savings again later and warns when the map is stale. codeprism mcp --list-tools shows the MCP tools available to agent clients.

The legacy contextopt command remains available for existing scripts while the public CLI moves to codeprism. Internal Python imports still use the contextopt package.

For external or read-only repositories:

codeprism prime "server boot path" --root PATH_TO_REPO --artifact-dir PATH_TO_ARTIFACTS --readonly-root

Why It Exists

Agents waste context when they brute-read file trees, repeated shell output, generated folders, and source files that are only loosely related to the current task. CodePrism is the local preflight layer: it gives the agent a compact, inspectable starting point so the expensive reasoning window stays focused.

What It Does

Builds a local SQLite graph of your repository.
Extracts deterministic structure from Python, Markdown, JavaScript/TypeScript, and Java files.
Uses a broad static fallback for common languages such as C/C++, C#, Go, Rust, Ruby, PHP, Kotlin, Swift, shell, PowerShell, and Lua.
Reuses file hashes so unchanged files are not rescanned.
Runs codeprism prime "<task>" to map the repo and write a targeted slice in one step.
Caps generated slice Markdown to about 8K estimated tokens by default so the slice itself does not swamp the agent context window.
Writes a tiny .brief.md beside each slice so agents can resume after compaction without reloading the full slice.
Fetches exact mapped source with codeprism get <node-id> so agents can inspect one symbol or file before opening broader code.
Reads files through token-aware modes with codeprism read <path> --mode map|signatures|diff|full.
Shows graph references with codeprism references <node-id>.
Estimates context size and creates targeted Markdown slices for focused work.
Reports estimated saved tokens and stale-map status with codeprism gain.
Warns when context-reading commands see a stale map, with --refresh and --strict-fresh for enforced freshness.
Serializes map refreshes with an inspectable context.lock file so concurrent agents do not rewrite the graph at the same time.
Provides codeprism watch for lightweight polling refresh loops during active multi-agent work.
Writes local project memory with codeprism onboard and codeprism memory.
Produces reproducible savings reports with codeprism benchmark and codeprism benchmark-suite.
Routes generated artifacts outside a target repo with --artifact-dir and --readonly-root.
Exports Markdown, JSON, DOT, and static browser visualizations.
Generates stable graph data for tool integration and optional visual inspection.
Replays JSONL activity streams over the graph with agent markers, trails, timeline controls, and token estimates.
Writes a lightweight local Live Trace at .codeprism/live-trace.jsonl so the viewer can replay CodePrism commands without reading private agent logs.

Status

CodePrism is alpha software. The current release is meant for local development workflows where you want a smaller, inspectable starting point before handing a codebase to an AI assistant.

The core loop is usable today:

map a repository into a local SQLite graph
estimate context size and generate focused slices
check whether the graph is stale before trusting map output
keep the graph current with refresh locks and codeprism watch --once or a polling watch loop
expose core context tools through an optional MCP server
install and verify Codex/Claude/Copilot helpers that nudge agents toward slice-first exploration
export Markdown, JSON, DOT, and static HTML views
inspect/search/filter the visual map as a bonus layer
replay safe JSONL activity events

Token counts are local estimates based on text length; the benchmark section below explains how to read them.

Benchmark Snapshot

CodePrism token counts are local estimates based on text length. They are useful for comparing source, context-pack, graph, and focused-slice sizes, but they are not billing-grade token measurements.

Reproducible Fixture Suite

The checked-in fixture suite is the public baseline. It is intentionally small enough for CI and release checks, so it is a conservative proof of the workflow rather than a best-case demo.

Scope	Languages	Files	Source estimate	Focused slice estimate	Estimated reduction
8 checked-in fixtures	Python, TypeScript, Java, Kotlin	40	15,291 tokens	4,801 tokens	68.75% average per fixture

Run the same suite locally:

codeprism benchmark-suite examples/benchmarks --out .codeprism/benchmarks/suite.json
python scripts/render_benchmark_chart.py .codeprism/benchmarks/suite.json --out docs/assets/benchmark-snapshot.svg
python scripts/render_benchmark_chart.py .codeprism/benchmarks/suite.json --out docs/assets/benchmark-snapshot.svg --check

The suite writes per-fixture JSON reports plus .codeprism/benchmarks/suite.md. The chart renderer turns that JSON into the checked-in SVG above. See docs/benchmarks.md for the full table, caveats, and release-review workflow.

Field Notes

These are early local runs on real development repositories. They are included to show the shape of the workflow on larger codebases, not to guarantee identical savings on every project.

Project shape	Source estimate	CodePrism output	Estimated reduction
Java server, 1,040 files	~2.22M tokens	~80.7K-token context pack	~96% smaller
Android app, 197 files	~351K tokens	~77.4K-token focused slice	~78% smaller
Game/tooling repo	~185K tokens	~7.5K-token focused slice	~96% smaller
Large generated/noisy repo	~13.2M tokens	~8K capped focused slice	~99% smaller

Actual savings depend on when the agent invokes CodePrism, query scope, file churn, and how much raw source the agent reads outside the map. The best results come from using codeprism prime early, then following with targeted query, get, references, and token-aware read commands.

To audit whether an agent actually followed that pattern, point CodePrism at a local Codex JSONL session:

codeprism audit-session SESSION_ID_OR_JSONL --out .codeprism/session-audit.md

The audit reports CodePrism command timing, raw reads, search commands, compaction mentions, large outputs, and savings observed in the session. It is local-only and reads a session log only when you explicitly pass one.

To compare fixture output between releases:

codeprism benchmark-suite examples/benchmarks --out .codeprism/benchmarks/suite.json
codeprism benchmark-compare previous-suite.json .codeprism/benchmarks/suite.json --out .codeprism/benchmarks/comparison.md
python scripts/benchmark_trend.py --baseline-suite previous-suite.json
python scripts/pre_release_proof.py --baseline-suite previous-suite.json

The suite command writes per-fixture JSON reports plus .codeprism/benchmarks/suite.md, a Markdown summary table suitable for release notes or README updates. The compare command writes a Markdown delta report between two suite JSON files. The trend helper wraps the release-review flow. If you omit --baseline-suite, it uses the GitHub CLI to download the latest successful benchmark artifact from the public workflow. The pre-release proof helper writes a local .codeprism/pre-release/ packet with benchmark trend output, sample session audit output, test/lint logs, and a public hygiene scan.

Current checked-in fixture suite: 8 Python, TypeScript, Java, and Kotlin fixtures with a 68.75% average estimated source-to-slice reduction.

Install From Source

python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -e ".[dev]"

MCP support uses the official Python MCP SDK and is optional:

pip install -e ".[mcp]"
codeprism mcp --list-tools

codeprism init creates .codeprism/config.toml for local settings. Generated .codeprism/ files are ignored by Git; see examples/contextopt.config.example.toml for the default config shape. Existing .contextopt/ artifact folders are still read as a legacy fallback, but new artifacts use .codeprism/.

Quick Start

codeprism init
codeprism prime "main"
codeprism visualize --context .codeprism/slices/main.json --outdir .codeprism/visual

Read the generated .codeprism/slices/main.brief.md before opening broad raw file trees. Open .codeprism/slices/main.md only when the brief is insufficient. codeprism prime also appends a safe local event to .codeprism/live-trace.jsonl; codeprism visualize auto-loads that trace when no explicit --activity file is supplied. Open .codeprism/visual/index.html in a browser when you want the optional graph view. See docs/demo.md for the full activity replay and context-overlay walkthrough.

For read-only checkouts or CI jobs, route generated artifacts outside the repository:

codeprism prime "server boot" --root PATH_TO_REPO --artifact-dir PATH_TO_ARTIFACTS --readonly-root

Use a project-specific temp or output directory for PATH_TO_ARTIFACTS on Windows, macOS, or Linux.

Agent Install

Install local helper prompts and skills so Codex, Claude, and Copilot start with CodePrism before broad exploration:

codeprism setup
codeprism doctor

This copies project helpers into .claude/commands/ and .github/copilot-instructions.md, and installs the CodePrism skill into local Codex and Claude skill folders. codeprism doctor reports whether those files are installed and current. Restart Codex/Claude after installing global skills.

Live Trace And Activity Replay

CodePrism records its own command-level events in .codeprism/live-trace.jsonl. The trace is local JSONL, uses the same public activity schema, and contains only explicit command metadata such as event type, touched path or node ID, estimated tokens, and slice statistics. Set CODEPRISM_TRACE=0 to disable writing it.

Generate a viewer from the current local trace:

codeprism prime "current task" --changed
codeprism visualize --outdir .codeprism/visual

CodePrism can also normalize a supplied JSONL event stream and replay touched nodes in the viewer:

codeprism activity adapt-tool-log examples/tool-events.sample.jsonl --out .codeprism/activity-events.jsonl
codeprism activity normalize examples/activity-stream.sample.jsonl --out .codeprism/activity-stream.json
codeprism visualize --activity examples/activity-stream.sample.jsonl --outdir .codeprism/visual

Activity rows can reference node_id, from_node_id, to_node_id, or path. Malformed rows are skipped and reported as warnings in the generated activity file. Optional estimated_tokens and actual_tokens fields power the replay HUD without requiring CodePrism to read private agent session logs. The viewer activity panel includes local event search, run/agent filters, jump-to-node, and a touched-only map mode.

CLI Commands

Command	Purpose
`codeprism init`	Create a local `.codeprism/config.toml` file.
`codeprism map .`	Scan the repo and update the SQLite graph.
`codeprism watch . --once`	Refresh the graph only when the current map is stale.
`codeprism watch .`	Poll for changes and refresh the graph with a local map lock.
`codeprism export --format md`	Export a Markdown context pack.
`codeprism export --format json`	Export stable graph JSON.
`codeprism export --format dot`	Export DOT graph data.
`codeprism prime "topic"`	Map the repo, write a focused slice plus a small slice brief, and print a savings report.
`codeprism prime "topic" --changed`	Seed the slice with changed, staged, and untracked Git files.
`codeprism prime "topic" --artifact-dir <dir> --readonly-root`	Write prime artifacts outside the target repo and refuse root writes.
`codeprism prime "topic" --max-tokens 6000`	Tighten the generated slice budget for short task loops.
`codeprism prime "topic" --allow-large-context --max-tokens 24000`	Explicitly opt into a large slice when the task genuinely needs it.
`codeprism visualize --outdir <dir>`	Generate a static browser viewer and auto-load `.codeprism/live-trace.jsonl` when present.
`--refresh`	Incrementally refresh the map before a context-consuming command.
`--strict-fresh`	Fail when the map is stale instead of warning.
`--lock-timeout <seconds>`	Wait for another map refresh before failing with a lock error.
`codeprism get <node-id>`	Print exact source for a mapped file, doc, or symbol node.
`codeprism references <node-id>`	Show incoming and outgoing graph references for a node.
`codeprism read <path> --mode map`	Print mapped nodes for a file without source bodies.
`codeprism read <path> --mode signatures`	Print mapped symbols/headings/routes without source bodies.
`codeprism read <path> --mode diff`	Print only the working-tree diff for one path.
`codeprism read <path> --mode full`	Explicitly print the full file.
`codeprism activity adapt-tool-log`	Convert simple safe tool-event JSONL into CodePrism activity JSONL.
`codeprism activity normalize`	Normalize safe JSONL activity events into replay JSON.
`codeprism query "topic"`	Rank relevant files and symbols.
`codeprism stats`	Estimate source, graph, and pack token sizes.
`codeprism gain`	Report estimated token savings and map freshness.
`codeprism slice <target>`	Export focused Markdown plus a JSON context overlay manifest.
`codeprism benchmark <root>`	Write a reproducible local token-savings report.
`codeprism benchmark-suite <fixtures>`	Run all local benchmark fixtures and write JSON plus a Markdown table.
`codeprism benchmark-compare <old> <new>`	Compare two benchmark-suite JSON reports and flag savings regressions.
`codeprism audit-session <session>`	Audit a local Codex JSONL session for CodePrism adoption and context risk.
`codeprism onboard`	Write local project memory under `.codeprism/memory/`.
`codeprism memory list/read/write`	Manage inspectable local memory files.
`codeprism mcp --list-tools`	List optional MCP tools for agent clients.
`codeprism setup`	Install and verify agent helper files in one step.
`codeprism install-integrations`	Install local Codex/Claude/Copilot helper files.
`codeprism doctor`	Check whether installed helper files are present and current.

Token-Saving Workflow

Use CodePrism as a preflight step before broad code reading:

codeprism prime "billing webhook"
codeprism gain
codeprism read src/app.py --mode signatures
codeprism get function::src/app.py::billing_webhook
codeprism references function::src/app.py::billing_webhook
codeprism visualize --context .codeprism/slices/billing-webhook.json --outdir .codeprism/visual

That gives an assistant a smaller, inspectable starting point. The prime command maps the repo, writes Markdown for the assistant, writes a JSON manifest for the viewer, and prints source/full-context/slice token estimates plus estimated savings. The gain command repeats the savings report and warns if files changed after the last map. The read command lets an agent inspect file shape before bodies, and the get command uses stable node IDs from slices, query results, or graph JSON to return only the requested source span.

Generated slices are capped to about 8K estimated tokens by default, with a safe ceiling of 16K unless --allow-large-context is supplied. Each slice also gets a compact .brief.md file for resumes and compacted conversations. If a slice says it was capped, keep the next step narrow: use codeprism query, codeprism get, codeprism references, or codeprism read --mode signatures/diff instead of raising the budget reflexively. Do not rerun a broad prime only because the conversation compacted; read the existing brief and continue from its node IDs and paths.

During active edits, seed the slice from Git changes:

codeprism prime "what I am changing" --changed

The assistant should read the slice first, then verify important details in raw source files before editing.

To keep context consistent as the repo evolves, use freshness-aware reads:

codeprism watch . --once
codeprism read src/app.py --mode signatures --refresh
codeprism get function::src/app.py::billing_webhook --strict-fresh
codeprism visualize --refresh --outdir .codeprism/visual

Context-consuming commands warn when the map is stale. Use --refresh to incrementally remap before serving context, codeprism watch . --once when an agent wants to refresh only if needed, or --strict-fresh in CI/agent guardrails to fail instead of reading stale graph state. Map-writing commands use a local context.lock file beside the graph database. If another agent is already refreshing, CodePrism waits up to --lock-timeout seconds and exits with code 4 if the lock remains.

For a repository that should not receive generated files, use:

codeprism prime "what I need" --root PATH_TO_REPO --artifact-dir PATH_TO_ARTIFACTS --readonly-root

This writes context.db, Markdown slices, and JSON manifests under the artifact directory instead of .codeprism/ in the target repo.

For an MCP client, install the optional extra and launch the local stdio server:

pip install -e ".[mcp]"
codeprism mcp --root PATH_TO_REPO

The MCP server currently exposes prime, gain, query, read, get, and references. It is intentionally local-first and does not call external APIs.

To create a local project memory file for future agent sessions:

codeprism onboard --notes "Build/test commands, project purpose, and safety notes."
codeprism memory read project

To reproduce token-saving examples:

codeprism benchmark examples/benchmarks/basic-python --query report --out .codeprism/benchmarks/basic-python.json

Privacy Model

No network calls are made by default.
Generated artifacts live under .codeprism/ by default, or under --artifact-dir when supplied.
Outputs are inspectable text, JSON, DOT, HTML, or SQLite.
Optional LLM summarization is intentionally out of scope for the default path.

Project Layout

src/contextopt/        Python package and CLI
src/contextopt/exporters/
src/contextopt/extractors/
apps/brain-viz/       Future browser app scaffold
apps/pixel-brain/     Future replay renderer scaffold
docs/                 Architecture, roadmap, and schemas
examples/             Sample project and activity stream
integrations/         Claude, Codex, and Copilot templates
scripts/              Maintainer helpers for benchmark and release review
tests/                Regression tests

Development

pip install -e ".[dev]"
pytest
ruff check .
codeprism map .
codeprism watch . --once
codeprism export --format json --out .codeprism/context-pack.json
codeprism read README.md --mode signatures
codeprism get "heading::README.md::Quick Start"
codeprism gain
codeprism benchmark-suite examples/benchmarks --out .codeprism/benchmarks/suite.json
codeprism benchmark-compare .codeprism/benchmarks/suite.json .codeprism/benchmarks/suite.json --out .codeprism/benchmarks/comparison.md
python scripts/benchmark_trend.py --baseline-suite .codeprism/benchmarks/suite.json
python scripts/pre_release_proof.py --baseline-suite .codeprism/benchmarks/suite.json
codeprism visualize --activity examples/activity-stream.sample.jsonl --outdir .codeprism/visual
codeprism slice main --out .codeprism/slices/main.md
codeprism visualize --context .codeprism/slices/main.json --outdir .codeprism/visual
codeprism setup --target project
codeprism doctor

CI runs tests, Ruff, and a CLI smoke path across Python 3.10, 3.11, and 3.12.

Contributing

See CONTRIBUTING.md for local setup, pull request expectations, benchmark guidance, and privacy rules for public issues and fixtures.

Security reports should follow SECURITY.md. Please do not post exploit details, secrets, private source, or raw session logs in public issues.

Release

Use docs/release.md before drafting a public release. It covers the local proof packet, benchmark trend summary, public hygiene scan, and release-note template.

Roadmap

Near-term work is focused on improving slice ranking, adding git-diff-aware context, benchmark fixtures for measured token savings, and deepening static extraction without making the tool heavyweight. Visual polish is still planned, but context savings stay the main product. See docs/roadmap.md for the current plan.

Support

If CodePrism saves you time or tokens, sponsorship helps fund parser coverage, MCP support, reproducible benchmarks, and public docs.

License

MIT. See LICENSE.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CodePrism

Quick Commands

Why It Exists

What It Does

Status

Benchmark Snapshot

Reproducible Fixture Suite

Field Notes

Install From Source

Quick Start

Agent Install

Live Trace And Activity Replay

CLI Commands

Token-Saving Workflow

Privacy Model

Project Layout

Development

Contributing

Release

Roadmap

Support

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 52 Commits
.claude/commands		.claude/commands
.github		.github
apps		apps
docs		docs
examples		examples
integrations		integrations
scripts		scripts
src/contextopt		src/contextopt
tests		tests
.gitattributes		.gitattributes
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
pyproject.toml		pyproject.toml

Folders and files

Latest commit

History

Repository files navigation

CodePrism

Quick Commands

Why It Exists

What It Does

Status

Benchmark Snapshot

Reproducible Fixture Suite

Field Notes

Install From Source

Quick Start

Agent Install

Live Trace And Activity Replay

CLI Commands

Token-Saving Workflow

Privacy Model

Project Layout

Development

Contributing

Release

Roadmap

Support

License

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages