vectorian-rs/chizu

Chizu (地図)

Subject-to-file routing for coding agents.

Chizu is a local repository understanding engine. It extracts deterministic structural facts about a codebase (components, files, symbols, docs, infra units, and their relationships) and uses those facts to route a subject to the most relevant files and components. It also materializes a graph for human visualization and navigation: the visualize command can emit either a static SVG graph or a self-contained interactive HTML tree explorer over the indexed graph slice.

Chizu Knowledge Graph

Quick Start

Installation

Install the current workspace package from source:

cargo install --path chizu-cli

Install from crates.io after a registry release:

cargo install chizu-cli --bin chizu

Naming in this repo:

  • Project and CLI name: chizu
  • Installable Rust package in this workspace: chizu-cli
  • Indexed data directory created in target repos: .chizu/

The chizu-cli package is not published to crates.io yet, so the only supported install path today is cargo install --path chizu-cli.

1. Configure and Index

chizu --repo /path/to/repo config init
chizu --repo /path/to/repo index

This creates .chizu/graph.db and .chizu/graph.db.usearch in your repository, containing entities, edges, summaries, and embeddings. Indexing requires a configured LLM and embedding provider (e.g. Ollama) to be running.

2. Search

chizu --repo /path/to/repo search "how does authentication work"

Returns a ranked reading plan: which files and entities to read first and why. The pipeline classifies the query, retrieves candidates via task routes, keyword/name/path matching, and vector search, expands graph neighbors, then reranks with weighted multi-signal scoring.
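
The final rerank step can be pictured as a weighted sum over per-candidate signal scores. The sketch below is illustrative only: the weight names and values are the `[search.rerank_weights]` defaults listed in the Configuration section, while the `score` and `rerank` functions, the candidate dictionaries, and the assumption that every signal is normalized to the range 0 to 1 are hypothetical and not taken from Chizu's source.

```python
# Default weights as listed under [search.rerank_weights] in .chizu.toml.
WEIGHTS = {
    "task_route": 0.00,
    "keyword": 0.25,
    "name_match": 0.20,
    "vector": 0.25,
    "kind_preference": 0.10,
    "exported": 0.10,
    "path_match": 0.10,
}


def score(signals, weights=WEIGHTS):
    """Weighted sum of signal scores; a missing signal counts as 0."""
    return sum(w * signals.get(name, 0.0) for name, w in weights.items())


def rerank(candidates, weights=WEIGHTS, limit=15):
    """Sort candidates by descending combined score and keep the top `limit`."""
    ranked = sorted(candidates, key=lambda c: score(c["signals"], weights), reverse=True)
    return ranked[:limit]


# Hypothetical candidates with made-up signal values.
candidates = [
    {"id": "symbol::src/db.rs::connect",
     "signals": {"keyword": 0.2, "vector": 0.4}},
    {"id": "symbol::src/auth.rs::validate_token",
     "signals": {"keyword": 1.0, "name_match": 0.5, "vector": 0.8, "exported": 1.0}},
]
plan = rerank(candidates)
```

Because the default weights sum to 1.0, a combined score in this sketch stays in the same 0 to 1 range as the individual signals, which makes it easy to compare the effect of changing one weight.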

3. Inspect

chizu --repo /path/to/repo entities
chizu --repo /path/to/repo entity "component::cargo::crates/my-crate"
chizu --repo /path/to/repo edges --from "component::cargo::crates/my-crate"
chizu --repo /path/to/repo routes --task deploy
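
The entity IDs passed to these commands appear to follow a `kind::scope::name` shape. A script that consumes them might split them as in the sketch below. This is an assumption based only on the example IDs in this README: the `EntityId` type, its field names, and the three-part rule are hypothetical, and other entity kinds may use a different layout.

```python
from typing import NamedTuple


class EntityId(NamedTuple):
    kind: str   # e.g. "component" or "symbol"
    scope: str  # e.g. "cargo", or a file path for symbols
    name: str   # e.g. a component root path or a symbol name


def parse_entity_id(raw: str) -> EntityId:
    """Split on the first two '::' separators only.

    Limiting the split keeps any further '::' (for example in a
    qualified symbol name) inside the final `name` field.
    """
    parts = raw.split("::", 2)
    if len(parts) != 3:
        raise ValueError(f"expected kind::scope::name, got {raw!r}")
    return EntityId(*parts)


component = parse_entity_id("component::cargo::crates/my-crate")
symbol = parse_entity_id("symbol::src/auth.rs::validate_token")
```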

4. Visualize

Generate a static SVG snapshot when you want a shareable graph artifact:

chizu --repo /path/to/repo visualize -o graph.svg
open graph.svg

Generate an interactive HTML tree explorer when you want to browse a focused slice of the graph locally:

chizu --repo /path/to/repo visualize --interactive -o graph.html
open graph.html

  • SVG output is a static graph view suited for screenshots, docs, and quick inspection.
  • HTML output is a self-contained explorer with search, breadcrumbs, an inspector pane, theme toggle, and optional editor deep links.
  • If you omit --output, Chizu writes the SVG or HTML to stdout.

Commands

| Command | Description | Key flags |
|---|---|---|
| index | Extract facts + summarize + embed | --force |
| search | Full query pipeline -> reading plan | --limit, --category, --format, positional query |
| entity | Look up a single entity by id | positional id |
| entities | List entities | --component, --kind |
| routes | List task routes | --task, --entity |
| edges | List edges | --from, --to, --rel |
| visualize | Generate SVG or interactive HTML | --entity-id, --depth, --kind, --exclude, --interactive, --max-nodes, --output |
| config | Initialize or validate config | subcommands: init, validate |
| guide | Interactive usage guide | none |

Onboarding

Prerequisites

  1. Rust toolchain (1.85+): https://rustup.rs
  2. Ollama running locally: https://ollama.com
  3. Pull the required models:

ollama pull llama3:8b
ollama pull nomic-embed-text-v2-moe:latest

Verify Ollama is running:

curl -s http://localhost:11434/v1/models | head -1

Step 1: Install chizu

git clone https://github.com/l1x/chizu.git
cd chizu
cargo install --path chizu-cli

This installs the chizu binary from the local chizu-cli package.

Step 2: Configure

From your target repository root:

cd /path/to/your/repo
chizu config init

This creates .chizu.toml with sensible defaults pointing to a local Ollama instance. Edit it to customize exclude patterns, models, or rerank weights:

[index]
exclude_patterns = [
    "**/target/**",
    "**/.git/**",
    "**/node_modules/**",
    "**/.venv/**",
    "**/*.lock",
]

[providers.ollama]
base_url = "http://localhost:11434/v1"
timeout_secs = 120
retry_attempts = 3

[summary]
provider = "ollama"
model = "llama3:8b"
max_tokens = 512
temperature = 0.2
batch_size = 4
concurrency = 1

[embedding]
provider = "ollama"
model = "nomic-embed-text-v2-moe:latest"
dimensions = 768
batch_size = 32

Validate your config:

chizu config validate

Step 3: Index the repository

chizu index

This walks the repo, extracts entities and edges, generates LLM summaries, and builds the embedding index. On a mid-size repo (~60 files, ~650 entities) with a local Ollama instance, expect 5-10 minutes for the first run. Re-runs are incremental and skip unchanged files.

Output:

Indexed 64 files (64 walked)
Discovered 4 components
Inserted 656 entities and 649 edges
Summaries: 345 generated, 0 skipped, 0 failed
Embeddings: 345 generated, 0 skipped, 0 failed
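
This README does not say how incremental re-runs detect unchanged files. One common approach, shown here purely as an illustration, is to compare a content hash per file against the hash recorded on the previous run. The function names and the choice of SHA-256 are assumptions, not a description of Chizu's implementation.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Stable fingerprint of a file's bytes."""
    return hashlib.sha256(data).hexdigest()


def files_to_reindex(current: dict[str, bytes], previous_hashes: dict[str, str]) -> list[str]:
    """Return paths that are new or whose content changed since the last run."""
    return sorted(
        path
        for path, data in current.items()
        if previous_hashes.get(path) != content_hash(data)
    )


# Hypothetical state: one file recorded by an earlier run, one file added since.
previous = {"src/auth.rs": content_hash(b"fn validate_token() {}")}
current = {
    "src/auth.rs": b"fn validate_token() {}",
    "src/session.rs": b"fn open_session() {}",
}
changed = files_to_reindex(current, previous)
```

Under this scheme only `src/session.rs` would be re-extracted, re-summarized, and re-embedded, which is where most of the first-run time goes.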

Step 4: Search

chizu search "how does authentication work"
chizu search "deploy to prod" --category deploy
chizu search "fix the login bug" --format json --limit 5

Step 4b: Visualize

# Static SVG output
chizu visualize --entity-id "component::cargo::." --output graph.svg

# Interactive HTML output
chizu visualize --interactive --entity-id "component::cargo::." --output graph.html

The default SVG output gives you a static graph snapshot. The --interactive variant writes a single HTML file that embeds the graph data and renders a tree explorer with keyboard search, structural navigation, summary copy, and optional Open in editor links when [visualize].editor_link is configured.
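
To make the editor link setting concrete, the sketch below expands the template shown in the Configuration section (`vscode://file/{abs_path}:{line}:{column}`) for a single location. The `editor_link` helper and the sample path are hypothetical, and plain placeholder substitution is an assumption: how Chizu itself escapes or normalizes paths is not documented here.

```python
def editor_link(template: str, abs_path: str, line: int, column: int = 1) -> str:
    """Fill the {abs_path}, {line}, and {column} placeholders in a link template."""
    return template.format(abs_path=abs_path, line=line, column=column)


# Template copied from the commented-out [visualize] example in .chizu.toml.
TEMPLATE = "vscode://file/{abs_path}:{line}:{column}"

link = editor_link(TEMPLATE, "C:/repo/src/auth.rs", 42, 5)
print(link)  # vscode://file/C:/repo/src/auth.rs:42:5
```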

Step 5: Onboard an agent

To give a coding agent (Claude Code, Cursor, Aider, etc.) access to chizu's knowledge graph, add a section to your CLAUDE.md or equivalent agent config:

## Repository map

This repo is indexed with chizu. Before exploring code, use chizu to find
relevant files:

```bash
# Find entities related to a topic
chizu search "your question here"

# Get details on a specific entity
chizu entity "symbol::src/auth.rs::validate_token"

# List entities in a component
chizu entities --component "component::cargo::crates/core"

# Explore edges from an entity
chizu edges --from "component::cargo::crates/core"

# Get task-specific routes
chizu routes --task debug
```

Use `chizu search` with `--format json` for structured output that can be
parsed programmatically. The search pipeline classifies the query into a task
category (understand, debug, build, test, deploy, configure), retrieves
candidates from multiple signals, expands graph neighbors, and returns a
ranked reading plan.

The agent can then run chizu search to orient itself before reading files, reducing the number of files it needs to explore and improving context relevance.

For non-interactive agent pipelines, use JSON output:

chizu search "how does the store layer work" --format json | jq '.entries[:3]'

This returns structured data the agent can parse to extract file paths, entity IDs, and relevance scores.
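
An agent harness written in Python could wrap the same call as sketched below. The `--format`, `--limit`, and positional query arguments come from the command table above, and the top-level `entries` key comes from the `jq` example. The per-entry field names (`path`, `score`) and the sample payload are hypothetical, so check them against real output before relying on them.

```python
import json
import subprocess


def run_search(query: str, limit: int = 5) -> str:
    """Invoke the chizu CLI and return its raw JSON output."""
    result = subprocess.run(
        ["chizu", "search", query, "--format", "json", "--limit", str(limit)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def top_paths(json_text: str, n: int = 3) -> list[str]:
    """Return up to `n` distinct file paths from the reading plan, in rank order."""
    entries = json.loads(json_text).get("entries", [])
    paths: list[str] = []
    for entry in entries:
        path = entry.get("path")  # hypothetical field name
        if path and path not in paths:
            paths.append(path)
        if len(paths) == n:
            break
    return paths


# Made-up payload standing in for real `chizu search --format json` output.
sample = json.dumps({"entries": [
    {"path": "src/auth.rs", "score": 0.9},
    {"path": "src/auth.rs", "score": 0.7},
    {"path": "src/session.rs", "score": 0.5},
]})
paths = top_paths(sample)
```

In a real pipeline, `top_paths(run_search("how does the store layer work"))` would give the agent a short list of files to open first.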

Architecture

source inputs
  -> adapters
  -> deterministic fact extraction
  -> fact store (sqlite)
  -> derived projections
       - graph relationships / traversal
       - summaries
       - task routes
       - vector index (usearch HNSW)
  -> query / expansion / rerank
  -> reading plan

The store backend is sqlite+usearch. SQLite stores canonical repo facts and derived metadata (entities, edges, files, summaries, task routes, embedding metadata). usearch provides HNSW-based approximate nearest neighbor search over embedding vectors. Vectors live only in usearch, not duplicated in SQLite.
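
The split between the two stores can be illustrated with a toy version: facts live in SQLite, vectors live in a separate index keyed by an integer, and a query joins the two. In this sketch a plain dictionary with exact cosine search stands in for usearch's approximate HNSW index, and the table name, column names, and two-dimensional vectors are invented for the example.

```python
import math
import sqlite3

# Fact store: canonical entity records only, no vectors.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE entities (key INTEGER PRIMARY KEY, id TEXT UNIQUE, path TEXT)")
db.executemany(
    "INSERT INTO entities (key, id, path) VALUES (?, ?, ?)",
    [
        (1, "symbol::src/auth.rs::validate_token", "src/auth.rs"),
        (2, "symbol::src/db.rs::connect", "src/db.rs"),
    ],
)

# Stand-in for the vector index: integer key -> embedding vector.
vectors = {1: [1.0, 0.0], 2: [0.0, 1.0]}


def cosine(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def nearest(query, k=1):
    """Find the k most similar keys, then look up their facts in SQLite."""
    keys = sorted(vectors, key=lambda key: cosine(query, vectors[key]), reverse=True)[:k]
    return [
        db.execute("SELECT id, path FROM entities WHERE key = ?", (key,)).fetchone()
        for key in keys
    ]


hits = nearest([0.9, 0.1])
```

Keeping only the integer key in both places is what lets the vectors stay out of SQLite: the index answers "which keys are close", and SQLite answers "what are those keys".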

Component Identity

Components use canonical path-based IDs derived from the repo-relative component root path, not from manifest display names:

  • component::cargo::crates/chizu-core (Rust crate)
  • component::npm::packages/web (npm package)
  • component::npm::. (root package)

Every file under a component root inherits that component's canonical component_id. Every entity derived from that file inherits the same ID. Component discovery happens before file extraction (two-phase indexing).

Configuration

All runtime configuration lives in .chizu.toml at the repository root. If the file is missing, all defaults apply. Generate one with chizu config init.

[index]
exclude_patterns = [
    "**/target/**",
    "**/.git/**",
    "**/node_modules/**",
    "**/.venv/**",
    "**/fuzz/**",
    "**/*.lock",
]

[search]
default_limit = 15

[search.rerank_weights]
task_route = 0.00
keyword = 0.25
name_match = 0.20
vector = 0.25
kind_preference = 0.10
exported = 0.10
path_match = 0.10

[providers.ollama]
base_url = "http://localhost:11434/v1"
timeout_secs = 120
retry_attempts = 3

[summary]
provider = "ollama"
model = "llama3:8b"
max_tokens = 512
temperature = 0.2
batch_size = 4
concurrency = 1
exported_only = true

[embedding]
provider = "ollama"
model = "nomic-embed-text-v2-moe:latest"
dimensions = 768
batch_size = 32

[visualize]
# Optional: enable editor links in interactive HTML output
# editor_link = "vscode://file/{abs_path}:{line}:{column}"

Provider connection config is defined once per provider under [providers.<name>]. The [summary] and [embedding] sections reference a provider by name. See docs/prd.md for configuration design rules.

Target Repositories

Chizu targets mixed-language repos with infrastructure and documentation: Rust workspaces, npm workspaces, Terraform roots, Docker deployments, Astro/Hugo sites, and combinations thereof.

Documentation

License

MIT
