Deep codebase analysis and redesign tool. Index once, query cheaply many times.
Prism indexes any codebase through a five-layer pipeline — structural parsing, documentation extraction, LLM-powered semantic understanding, architectural analysis, and redesign blueprint generation — then serves the results through a CLI and an interactive dashboard.
prism init ~/my-project
prism index my-project --layer structural
prism index my-project --layer semantic
prism analyze my-project
prism blueprint my-project --goal "Productionize this PoC for enterprise deployment"
prism serve
| Use Case | How Prism Helps |
|---|---|
| Redesign a legacy app | Understand what exists, detect anti-patterns, get a proposed modern architecture |
| Productionize a PoC | Analyze a prototype, produce blueprints for a production-grade rebuild |
| Architectural audit | Detect circular dependencies, god modules, dead code, coupling issues, doc gaps |
| Onboard onto a codebase | Semantic search + hierarchical summaries for fast understanding |
┌──────────────────────────────────────────────────────────────┐
│ @prism/app │
│ │
│ ┌──────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │ CLI │ │ Dashboard │ │ Blueprint Generator │ │
│ │commander │ │ Express+HTMX │ │ Claude Sonnet │ │
│ └────┬─────┘ └──────┬───────┘ └───────────┬────────────┘ │
│ │ │ │ │
├───────┼───────────────┼──────────────────────┼───────────────┤
│ @prism/core │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Pipeline Orchestrator │ │
│ │ │ │
│ │ Layer 1 ─► Layer 2 ─► Layer 3 ─► Layer 4 │ │
│ │ Structural Docs Semantic Analysis │ │
│ │ tree-sitter README Claude Haiku Rollup + │ │
│ │ parsing configs summaries pattern │ │
│ │ symbols comments embeddings detection │ │
│ │ deps graph intent pgvector │ │
│ └───────────────────────────┬────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────┴────────────────────────────┐ │
│ │ Database (Drizzle ORM) │ │
│ │ 9 tables: projects, files, symbols, dependencies, │ │
│ │ summaries, embeddings, findings, blueprints, runs │ │
│ └───────────────────────────┬────────────────────────────┘ │
└──────────────────────────────┼───────────────────────────────┘
│
┌─────────┴──────────┐
│ PostgreSQL │
│ + pgvector │
└────────────────────┘
┌─────────┐ ┌─────────────┐ ┌──────────────┐ ┌──────────────┐ ┌───────────┐
│ init │────►│ Layer 1 │────►│ Layer 2 │────►│ Layer 3 │────►│ Layer 4 │
│ │ │ Structural │ │ Docs │ │ Semantic │ │ Analysis │
│ register│ │ │ │ │ │ │ │ │
│ project │ │ tree-sitter │ │ README parse │ │ Claude Haiku │ │ rollup │
│ detect │ │ symbols │ │ config parse │ │ summaries │ │ patterns │
│ language│ │ deps graph │ │ comments │ │ embeddings │ │ gap │
│ count │ │ metrics │ │ intent │ │ pgvector │ │ analysis │
│ files │ │ │ │ │ │ │ │ │
└─────────┘ └─────────────┘ └──────────────┘ └──────────────┘ └─────┬─────┘
│
▼
┌──────────────┐ ┌──────────────┐ ┌───────────┐
│ Dashboard │◄────│ search │ │ Layer 5 │
│ │ │ │ │ Blueprint │
│ overview │ │ embed query │ │ │
│ file browser │ │ cosine sim │ │ Claude │
│ findings │ │ ranked │ │ Sonnet │
│ blueprints │ │ results │ │ redesign │
│ dep graph │ │ │ │ proposals │
│ modules │ │ │ │ │
└──────────────┘ └──────────────┘ └───────────┘
Prism tracks every file by SHA-256 content hash and every project by its last indexed git commit. On re-index:
git diff --name-only <last-commit>..HEAD ──► changed files only
│
┌───────────┴───────────┐
│ content hash check │
│ skip unchanged files │
└───────────┬───────────┘
│
┌───────────┴───────────┐
│ input hash check │
│ skip unchanged │
│ LLM summaries │
└───────────────────────┘
| Component | Technology |
|---|---|
| Runtime | Node.js 22 |
| Language | TypeScript (strict mode) |
| Database | PostgreSQL + pgvector |
| ORM | Drizzle ORM |
| Code Parsing | tree-sitter (WASM) — TS/JS, Python, C# |
| LLM Summaries | Claude Haiku via Anthropic SDK |
| LLM Analysis | Claude Sonnet via Anthropic SDK |
| Embeddings | Voyage AI or OpenAI (configurable) |
| CLI | Commander |
| Dashboard | Express + HTMX (server-rendered) |
| Auth | Azure Entra ID (MSAL) |
| Logging | Pino |
| Testing | Vitest (236 tests) |
- Node.js >= 22
- PostgreSQL with pgvector extension
- Azure Entra ID app registration (for dashboard auth)
git clone https://github.com/esbenwiberg/prism.git
cd prism
npm install
npm run buildCopy .env.example and fill in your values:
cp .env.example .env# Required
DATABASE_URL=postgresql://localhost:5432/prism
# For LLM summaries (Layer 3)
ANTHROPIC_API_KEY=sk-ant-...
# For embeddings — choose one provider
EMBEDDING_PROVIDER=voyage # or "openai"
VOYAGE_API_KEY=pa-... # if using Voyage
OPENAI_API_KEY=sk-... # if using OpenAI
# For dashboard auth
AZURE_TENANT_ID=...
AZURE_CLIENT_ID=...
AZURE_CLIENT_SECRET=...
SESSION_SECRET=your-secret-here
# Optional
DASHBOARD_PORT=3100 # default: 3100
SKIP_AUTH=true # bypass Entra ID in development
LOG_LEVEL=info # debug, info, warn, errorEnsure pgvector is installed, then run migrations:
CREATE DATABASE prism;
\c prism
CREATE EXTENSION IF NOT EXISTS vector;npx drizzle-kit migrateRegister a project for indexing. Scans the directory to count files and detect the primary language.
prism init ~/repos/my-app
prism init ~/repos/my-app --name "My App"Project 'my-app' registered (id: 1, 1,847 files detected, primary language: TypeScript)
Run the indexing pipeline. Supports layer selection and incremental indexing by default.
prism index my-app # run all configured layers
prism index my-app --layer structural # Layer 1 only
prism index my-app --layer docs # Layer 2 only
prism index my-app --layer semantic # Layer 3 (LLM summaries + embeddings)
prism index my-app --layer analysis # Layer 4 (rollup + pattern detection)
prism index my-app --full # force full re-indexShow indexing progress with per-layer details and cost tracking.
prism status my-app
prism status --all # all registered projectsProject: my-app (id: 1)
Path: /home/user/repos/my-app
Status: completed
Files: 1,847 | Symbols: 12,340
Layer Status Progress Cost Duration
structural completed 1847/1847 $0.0000 12.3s
docs completed 234/234 $0.0000 0.8s
semantic completed 1847/1847 $1.2300 4m 12s
analysis completed — $3.4500 2m 45s
Total cost: $4.68
Run Layer 4: hierarchical summary rollup, pattern detection, and gap analysis.
prism analyze my-app
prism analyze my-app --full # force full re-analysisGenerate Layer 5: subsystem-focused redesign proposals. Use --goal to steer the
proposals toward a specific outcome, and --focus to limit scope to a subsystem.
prism blueprint my-app
# Steer proposals with a goal
prism blueprint my-app --goal "Productionize this PoC for enterprise deployment"
prism blueprint my-app --goal "Migrate from Express to Fastify and replace Sequelize with Drizzle"
prism blueprint my-app --goal "Improve testability and reduce coupling"
# Focus on a specific subsystem
prism blueprint my-app --focus src/api
prism blueprint my-app --goal "Modernize the data layer" --focus src/db| Flag | Description |
|---|---|
-g, --goal <text> |
Redesign goal — all proposals will be aligned to this directive |
-f, --focus <path> |
Limit blueprint scope to findings and modules under this path |
Semantic search across indexed symbols using pgvector cosine similarity.
prism search my-app "authentication flow"
prism search my-app "database connection pooling" --limit 20 # Score Kind Symbol File Summary
1 0.92 function authenticateUser src/auth/middleware.ts Validates JWT token...
2 0.87 class AuthService src/auth/service.ts Handles user auth...
3 0.84 function createSession src/auth/session.ts Creates a new...
Start the dashboard.
prism serve # default port 3100
prism serve --port 4000The dashboard is a server-rendered Express + HTMX application with live partial updates.
| Route | Description |
|---|---|
/ |
Project overview — all registered projects with status badges |
/projects/:id |
Project detail — file/symbol/finding counts, navigation |
/projects/:id/files |
File browser — complexity, coupling, cohesion metrics per file |
/projects/:id/findings |
Findings — severity filter (critical/high/medium/low/info) |
/projects/:id/search |
Semantic search — debounced HTMX live search |
/projects/:id/modules |
Module overview — directory-level summaries and metrics |
/projects/:id/blueprints |
Redesign proposals — architecture, changes, risks, migration path |
/projects/:id/graph |
Dependency graph — interactive D3 force-directed visualization |
Authentication is via Azure Entra ID. Set SKIP_AUTH=true for local development.
Parses every file using tree-sitter (WASM bindings) to extract a complete structural model.
Languages supported: TypeScript, TSX, JavaScript, Python, C#
What it extracts:
- Symbols: functions, classes, interfaces, types, enums, exports, imports
- Dependencies: import/export relationships between files
- Metrics: cyclomatic complexity, afferent/efferent coupling, cohesion
Parses all non-code files to build the "intent layer" — what the codebase is supposed to do.
What it processes:
- README files, changelogs, docs/ directories
- Config files (60+ patterns: package.json, tsconfig, Dockerfile, CI configs, etc.)
- Inline comments and docstrings (JSDoc, Python docstrings, C# XML docs)
- Assembles a structured
ProjectIntentwith description, architecture, and tech stack
Calls Claude Haiku to generate natural-language summaries of every function and class, then embeds them for vector search.
Budget-controlled: configurable budgetUsd limit, tracks actual API cost per token.
Staleness detection: SHA-256 hash of the prompt input — only re-summarizes when code changes.
Embedding providers: Voyage AI (voyage-code-3) or OpenAI (text-embedding-3-small), 1536 dimensions.
Rolls up summaries hierarchically (function → file → module → system) and runs pattern detectors.
Detectors:
| Detector | What it finds |
|---|---|
| Circular Dependencies | Strongly connected components (Tarjan's algorithm) |
| Dead Code | Exported symbols with zero inbound references |
| God Modules | Files with excessive fan-in and fan-out |
| Layering Violations | Cross-layer imports (e.g., UI → DB directly) |
| Coupling Issues | Files exceeding coupling/cohesion thresholds |
Gap Analysis: Compares documentation intent with code reality to find discrepancies.
Feeds the complete project understanding — system summary, all findings, and project intent — to Claude Sonnet to generate subsystem-focused redesign proposals.
Each blueprint includes: proposed architecture, module changes, migration path, risks with mitigations, and rationale.
Prism reads prism.config.yaml from the project root, with environment variable overrides following the PRISM_<SECTION>_<KEY> pattern.
structural:
skipPatterns:
- "node_modules/**"
- ".git/**"
- "dist/**"
- "build/**"
- "vendor/**"
- "*.min.js"
- "*.min.css"
- "*.lock"
- "*.map"
maxFileSizeBytes: 1048576 # 1MB
semantic:
enabled: true
model: "claude-haiku-4-5-20251001"
embeddingProvider: "voyage" # or "openai"
embeddingModel: "voyage-code-3"
embeddingDimensions: 1536
budgetUsd: 10.0
analysis:
enabled: true
model: "claude-sonnet-4-6-20250514"
budgetUsd: 5.0
blueprint:
enabled: true
model: "claude-sonnet-4-6-20250514"
budgetUsd: 5.0
indexer:
batchSize: 100
maxConcurrentBatches: 4
incrementalByDefault: true
dashboard:
port: 3100All tables use the prism_ prefix. PostgreSQL with pgvector extension.
┌──────────────┐ ┌──────────────┐ ┌──────────────────┐
│ projects │────<│ files │────<│ symbols │
│ │ │ │ │ │
│ id │ │ id │ │ id │
│ name │ │ project_id │ │ file_id │
│ path (UNIQ) │ │ path │ │ project_id │
│ language │ │ language │ │ kind │
│ total_files │ │ size_bytes │ │ name │
│ total_symbols│ │ line_count │ │ start/end_line │
│ index_status │ │ content_hash │ │ exported │
│ last_commit │ │ complexity │ │ signature │
│ settings │ │ coupling │ │ docstring │
│ │ │ cohesion │ │ complexity │
│ │ │ is_doc/test/ │ │ │
│ │ │ config │ └──────────────────┘
│ │ │ doc_content │
│ │ │ metadata │ ┌──────────────────┐
│ │ │ UNIQ(proj, │────<│ dependencies │
│ │ │ path) │ │ │
│ │ └──────────────┘ │ source_file_id │
│ │ │ target_file_id │
│ │ ┌──────────────┐ │ source_symbol_id │
│ │────<│ summaries │ │ target_symbol_id │
│ │ │ │ │ kind │
│ │ │ level │ └──────────────────┘
│ │ │ target_id │
│ │ │ content │ ┌──────────────────┐
│ │ │ model │────<│ embeddings │
│ │ │ input_hash │ │ │
│ │ │ cost_usd │ │ summary_id │
│ │ └──────────────┘ │ embedding v(1536)│
│ │ │ model │
│ │ ┌──────────────┐ │ HNSW index │
│ │────<│ findings │ └──────────────────┘
│ │ │ │
│ │ │ category │ ┌──────────────────┐
│ │ │ severity │────<│ blueprints │
│ │ │ title │ │ │
│ │ │ description │ │ title │
│ │ │ evidence │ │ subsystem │
│ │ │ suggestion │ │ summary │
│ │ └──────────────┘ │ architecture │
│ │ │ module_changes │
│ │ ┌──────────────┐ │ migration_path │
│ │────<│ index_runs │ │ risks │
│ │ │ │ │ rationale │
│ │ │ layer │ │ model │
│ │ │ status │ │ cost_usd │
│ │ │ files_proc │ └──────────────────┘
│ │ │ files_total │
│ │ │ cost_usd │
│ │ │ duration_ms │
│ │ │ error │
└──────────────┘ └──────────────┘
prism/
├── packages/
│ ├── core/ @prism/core — indexing engine
│ │ └── src/
│ │ ├── indexer/
│ │ │ ├── pipeline.ts Pipeline orchestrator
│ │ │ ├── types.ts IndexContext, LayerResult, BudgetTracker
│ │ │ ├── structural/ Layer 1: tree-sitter parsing
│ │ │ │ ├── parser.ts WASM runtime + grammar loading
│ │ │ │ ├── extractor.ts Symbol extraction from AST
│ │ │ │ ├── graph.ts Dependency graph builder
│ │ │ │ ├── metrics.ts Complexity, coupling, cohesion
│ │ │ │ └── languages.ts Language detection + registry
│ │ │ ├── docs/ Layer 2: documentation parsing
│ │ │ │ ├── readme.ts README/markdown parsing
│ │ │ │ ├── comments.ts Inline comment extraction
│ │ │ │ ├── config.ts Config file detection (60+ patterns)
│ │ │ │ └── intent.ts Intent layer assembly
│ │ │ ├── semantic/ Layer 3: LLM summaries + embeddings
│ │ │ │ ├── summarizer.ts Claude Haiku summarization
│ │ │ │ ├── embedder.ts Voyage/OpenAI embedding providers
│ │ │ │ └── chunker.ts AST-aware code chunking
│ │ │ └── analysis/ Layer 4: pattern detection
│ │ │ ├── rollup.ts Hierarchical summary rollup
│ │ │ ├── patterns.ts Pattern detection orchestrator
│ │ │ ├── gap-analysis.ts Docs vs code comparison
│ │ │ └── detectors/ 5 architectural pattern detectors
│ │ ├── db/
│ │ │ ├── schema.ts 9 prism_ tables (Drizzle)
│ │ │ ├── connection.ts Pool + Drizzle singleton
│ │ │ ├── migrate.ts Migration runner
│ │ │ └── queries/ CRUD for all tables
│ │ ├── domain/
│ │ │ ├── types.ts Enums + interfaces
│ │ │ └── config.ts YAML config loader
│ │ └── logger.ts Pino singleton
│ │
│ └── app/ @prism/app — CLI + dashboard
│ └── src/
│ ├── cli/
│ │ ├── index.ts Commander dispatch
│ │ └── commands/ 7 commands
│ ├── dashboard/
│ │ ├── server.ts Express setup + auth
│ │ ├── routes/ 8 route handlers
│ │ ├── views/ 10 HTML-generating view modules
│ │ └── public/ HTMX extensions + D3 graph
│ ├── blueprint/
│ │ ├── generator.ts Claude Sonnet → proposals
│ │ ├── splitter.ts Subsystem grouping
│ │ └── types.ts Blueprint types
│ └── auth/
│ ├── entra.ts Azure Entra ID OAuth2
│ ├── session.ts Express session
│ └── middleware.ts Auth guard
│
├── prompts/ LLM prompt templates
│ ├── summarize-function.md
│ ├── summarize-file.md
│ ├── summarize-module.md
│ ├── summarize-system.md
│ ├── gap-analysis.md
│ └── blueprint.md
│
├── drizzle/ Database migrations
├── prism.config.yaml Default configuration
├── .env.example Environment variable template
└── vitest.config.ts Test configuration
npm run build # Build all packages (tsc)
npm test # Run all 236 tests (vitest)
npm run test:watch # Watch mode
npm run lint # Type-check without emittingProprietary. All rights reserved.