File-level codebase intelligence for Claude Code. Encodes every source file in your repo as an HDC (hyperdimensional computing) vector. Claude Code queries the index instead of scanning files.
Same architecture as glyphh-pipedream (3,146 apps) and glyphh-bfcl (#1 on BFCL V4). No LLM at build time. No LLM at search time. Pure HDC encoding and cosine search.
Built on Glyphh Ada 1.1.
WORK IN PROGRESS: This model is under active development. Benchmarks show Glyphh uses 20% fewer tokens and 22% fewer turns than bare Claude Code, with equal search accuracy (13/15). Overall accuracy is 76% with Glyphh vs 84% for bare Claude Code; the gap comes from MCP startup latency causing timeouts, not from HDC search. See benchmark/BENCHMARK.md for full results and analysis.
# Create and activate a virtual environment (recommended)
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install with runtime dependencies (includes FastAPI, SQLAlchemy, pgvector)
pip install 'glyphh[runtime]'
This model requires PostgreSQL + pgvector for similarity search.
git clone https://github.com/glyphh-ai/model-code.git
cd model-code
# Start the Glyphh shell (prompts login on first run)
glyphh
# Inside the shell:
# glyphh> docker init # generates docker-compose.yml + init.sql
# glyphh> exit
# Start PostgreSQL + pgvector and the Glyphh runtime
docker compose up -d --wait
This starts:
- PostgreSQL 16 + pgvector on port 5432 (with HNSW indexing)
- Glyphh Runtime on port 8002
Swagger docs available at http://localhost:8002/docs in local mode.
glyphh
# glyphh> model deploy . # deploy code model to runtime
# Full compile (all indexable files)
python compile.py /path/to/your/repo --runtime-url http://localhost:8002
# Incremental (changed files since last commit)
python compile.py /path/to/your/repo --incremental
# Incremental from a child repo / submodule commit
python compile.py /path/to/your/repo --incremental --diff-repo /path/to/child/repo
# Dry run (show what would be indexed)
python compile.py /path/to/your/repo --dry-run
The --diff-repo flag tells compile.py to run git diff HEAD^ HEAD in a
different repo than the source directory. Changed file paths are resolved
relative to the child repo but the source directory is still used as the
compile root. This is how the post-commit hook handles commits in monorepo
subdirectories, child repos, and submodules.
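
The path handling described above can be sketched roughly as follows. This is not the code in compile.py: the function names `changed_files` and `resolve_against_root` are hypothetical, and the sketch assumes the child repo lives somewhere under the compile root.

```python
import subprocess
from pathlib import Path


def changed_files(diff_repo: Path) -> list[str]:
    """Return paths changed by the last commit, relative to diff_repo."""
    out = subprocess.run(
        ["git", "-C", str(diff_repo), "diff", "--name-only", "HEAD^", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line.strip()]


def resolve_against_root(rel_paths, diff_repo: Path, compile_root: Path) -> list[str]:
    """Map child-repo-relative paths to paths relative to the compile root.

    Paths that fall outside the compile root are dropped.
    """
    root = compile_root.resolve()
    resolved = []
    for rel in rel_paths:
        absolute = (diff_repo / rel).resolve()
        try:
            resolved.append(absolute.relative_to(root).as_posix())
        except ValueError:
            continue  # changed file is not under the compile root
    return resolved
```

For example, a change to `src/a.py` committed in `/tmp/mono/child` would be recompiled as `child/src/a.py` when the compile root is `/tmp/mono`.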
Add the MCP server using the Claude Code CLI:
claude mcp add --transport http glyphh http://localhost:8002/{org_id}/code/mcp
To find your org ID, run glyphh auth status in the Glyphh shell:
glyphh
# glyphh> auth status
# org_id: your-org-id-here
In local mode the org ID is local-dev-org:
claude mcp add --transport http glyphh http://localhost:8002/local-dev-org/code/mcp
Restart Claude Code to pick up the MCP config. In VS Code: Cmd+Shift+P →
"Claude Code: Restart". In the CLI: exit and re-enter the session.
Verify the connection with /mcp — you should see glyphh_search,
glyphh_related, and glyphh_stats listed as available tools.
Copy the included CLAUDE.md into your project root:
cp CLAUDE.md /path/to/your/project/CLAUDE.md
Claude Code loads this file automatically at the start of every conversation.
It teaches Claude Code to always search the Glyphh index before reading files,
check blast radius before editing, and use top_tokens and imports from
search results to avoid unnecessary file reads.
Without it, Claude Code will still have the MCP tools available but will fall back to its default file scanning behavior.
Compiles your codebase into a vector index. Exposes it to Claude Code via MCP.
Without Glyphh: Claude reads project structure, scans likely files, reads module, reads tests. ~6,000 tokens before first useful output.
With Glyphh:
Claude calls glyphh_search("auth token validation").
Returns: file path, confidence, top concepts, imports, related files.
Claude reads one file and acts.
~400 tokens before first useful output.
The index stores not just the vector but the token vocabulary of every file. Search results return enough context that Claude often does not need to read the file at all. When it does read, it already knows exactly what to look for.
Same paradigm as all Glyphh models. The file is the exemplar.
Build time: read file → tokenize path + identifiers + imports
→ encode into HDC vector → store vector + metadata in pgvector
Runtime: NL query → encode with same pipeline
→ cosine search against index
→ return file path + top tokens + imports
→ Claude reads one file, acts
No LLM at build time. No LLM at runtime for search.
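
To make the build-time and runtime steps concrete, here is a minimal, stdlib-only sketch of bag-of-words HDC encoding and cosine search. It is an illustration under assumptions, not Glyphh's encoder: the hash-seeded bipolar vectors, the plain element-wise sum, and the names `token_vector`, `encode`, and `search` are all hypothetical, and an in-memory dict stands in for pgvector.

```python
import hashlib
import math
import random

DIM = 2000  # assumed here; chosen to match the encoder size this README describes


def token_vector(token: str, dim: int = DIM) -> list[int]:
    """Deterministic pseudo-random bipolar (+1/-1) vector for a token."""
    seed = int.from_bytes(hashlib.sha256(token.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return [1 if rng.random() < 0.5 else -1 for _ in range(dim)]


def encode(tokens: list[str], dim: int = DIM) -> list[int]:
    """Bundle token vectors by element-wise sum (a bag-of-words encoding)."""
    total = [0] * dim
    for tok in tokens:
        for i, v in enumerate(token_vector(tok, dim)):
            total[i] += v
    return total


def cosine(a, b) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def search(query_tokens, index, top_k=5):
    """Rank indexed files by cosine similarity to the encoded query."""
    q = encode(query_tokens)
    scored = [(cosine(q, vec), path) for path, vec in index.items()]
    return sorted(scored, reverse=True)[:top_k]


# Build a tiny index, then query it with the same encoding pipeline.
index = {
    "src/services/auth.py": encode(["auth", "token", "validate", "jwt"]),
    "src/db/migrations.py": encode(["migration", "schema", "alter", "table"]),
}
best_score, best_path = search(["auth", "token", "validation"], index, top_k=1)[0]
```

Because the query and the files pass through the same encoder, files that share tokens with the query end up with a noticeably higher cosine score than unrelated files, which is why no LLM is needed at either stage.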
Two-layer HDC encoder at 2,000 dimensions (pgvector HNSW compatible):
| Layer | Weight | Signal |
|---|---|---|
| path | 0.30 | File path tokens (BoW): src/services/user_service.py → src services user service py |
| content | 0.70 | Source file vocabulary |
| ↳ identifiers | 1.0 | All tokens from file content. camelCase/snake_case split before encoding |
| ↳ imports | 0.8 | Import/require/include targets. Strong cross-file dependency signal |
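
The path and identifier tokenization in the table might be approximated like this. The regular expressions and the function names `split_identifier` and `path_tokens` are assumptions for illustration; the actual splitter may treat digits, acronyms, or separators differently.

```python
import re


def split_identifier(name: str) -> list[str]:
    """Split camelCase / PascalCase / snake_case into lowercase tokens."""
    # Break at lower-to-upper boundaries ("getUser" -> "get User") and
    # before the last capital of an acronym run ("HTTPServer" -> "HTTP Server").
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    spaced = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1 \2", spaced)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def path_tokens(path: str) -> list[str]:
    """Bag-of-words tokens for a file path, including the extension."""
    tokens = []
    for part in re.split(r"[\\/.]+", path):
        tokens.extend(split_identifier(part))
    return tokens
```

With this sketch, `path_tokens("src/services/user_service.py")` yields `["src", "services", "user", "service", "py"]`, matching the example in the path row.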
Metadata stored per file (not encoded, returned at search time):
- top_tokens: 20 most frequent meaningful tokens
- imports: list of imported module/package names
- extension: file type
- file_size: bytes
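
As a rough sketch, the per-file metadata could be assembled as below. The stop list and the rule for what counts as a "meaningful" token are hypothetical, since this README does not document them, and `file_metadata` is an illustrative name rather than a function from the repo.

```python
import os
import re
from collections import Counter

# Hypothetical stop list; the real "meaningful token" filter is not documented here.
STOP_TOKENS = {"self", "return", "import", "from", "def", "class", "if", "else", "for", "in"}


def file_metadata(path: str, source: str, imports: list[str], top_n: int = 20) -> dict:
    """Build the metadata record returned alongside a search hit."""
    # Treat underscores as separators so snake_case names contribute each word.
    words = re.findall(r"[A-Za-z][A-Za-z0-9]*", source.replace("_", " "))
    counts = Counter(
        w.lower() for w in words if len(w) > 1 and w.lower() not in STOP_TOKENS
    )
    return {
        "top_tokens": [tok for tok, _ in counts.most_common(top_n)],
        "imports": imports,
        "extension": os.path.splitext(path)[1].lstrip("."),
        "file_size": len(source.encode("utf-8")),  # size in bytes
    }
```

Returning these fields with each hit is what lets Claude judge a file's relevance from the search result alone.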
Exposed through the runtime's model-specific MCP tool system:
Find files by natural language query. Returns ranked matches with confidence scores, top tokens, and import lists.
{"tool": "glyphh_search", "arguments": {"query": "auth token validation", "top_k": 5}}
Confidence gate: below threshold returns ASK with candidates, never silent wrong routing.
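
The confidence gate can be pictured as a simple threshold check. In this sketch the threshold value, the `"OK"` status name, and the response shape are all hypothetical; only the `ASK` outcome with candidates comes from the description above.

```python
def gate(results, threshold=0.5):
    """Apply a confidence gate to ranked results.

    results: list of (confidence, file_path) tuples, best match first.
    The default threshold is a placeholder, not Glyphh's actual value.
    """
    if not results:
        return {"status": "ASK", "candidates": []}
    best_conf, best_path = results[0]
    if best_conf < threshold:
        # Not confident enough: hand back the candidates instead of guessing.
        return {"status": "ASK", "candidates": [path for _, path in results]}
    return {"status": "OK", "match": best_path, "confidence": best_conf}
```

The point of the design is that a weak match is surfaced as a question with options rather than returned as if it were a confident answer.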
Find files semantically related to a given file. Use before editing to understand blast radius.
{"tool": "glyphh_related", "arguments": {"file_path": "src/services/auth.py", "top_k": 5}}
Index statistics: total files, extension breakdown.
The drift.py module computes semantic drift between file versions:
| Drift | Label | Meaning |
|---|---|---|
| 0.00–0.10 | cosmetic | Formatting, comments, rename |
| 0.10–0.30 | moderate | Logic update, new function |
| 0.30–0.60 | significant | Behavioral change, new dependency |
| 0.60–1.00 | architectural | Rewrite, interface change |
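
A minimal sketch of how a drift score could map onto the labels in the table. It assumes drift is defined as 1 minus cosine similarity between the old and new file vectors, clamped to [0, 1], and that each boundary value belongs to the higher band; drift.py may define both differently.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def drift(old_vec, new_vec) -> float:
    """Assumed definition: 1 - cosine similarity, clamped to [0, 1]."""
    return min(1.0, max(0.0, 1.0 - cosine(old_vec, new_vec)))


def drift_label(value: float) -> str:
    """Map a drift score to the label bands from the table."""
    if value < 0.10:
        return "cosmetic"
    if value < 0.30:
        return "moderate"
    if value < 0.60:
        return "significant"
    return "architectural"
```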
# Recompile only files changed in the last commit
python compile.py . --incremental
# Recompile files changed in a specific commit
python compile.py . --diff abc123
# Recompile when commit was in a child repo / submodule
python compile.py /path/to/monorepo --incremental --diff-repo /path/to/monorepo/child-repo
The index is updated automatically after every commit via the Claude Code
PostToolUse hook (see Claude Code hooks below).
For non-Claude workflows, a git post-commit hook is included at
hooks/post-commit.
Indexes: .py, .ts, .tsx, .js, .jsx, .java, .cpp, .c, .h,
.go, .rs, .rb, .cs, .swift, .sql, .graphql, .yaml, .json,
.sh, .css, .html, .svelte, .vue, .md, .proto, .tf, and more.
Skips: .git, node_modules, __pycache__, dist, build, vendor,
target, and other build/cache directories.
Max file size: 500 KB. Binary files auto-skipped.
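
The filtering rules above could be expressed along these lines. The extension and directory sets only contain the entries named in this README (the real lists are longer), 1 KB is assumed to mean 1024 bytes, and the NUL-byte check is a common heuristic standing in for whatever binary detection compile.py actually uses.

```python
from pathlib import Path

# Only the entries listed in this README; the real sets include more.
INDEXED_EXTENSIONS = {
    ".py", ".ts", ".tsx", ".js", ".jsx", ".java", ".cpp", ".c", ".h",
    ".go", ".rs", ".rb", ".cs", ".swift", ".sql", ".graphql", ".yaml",
    ".json", ".sh", ".css", ".html", ".svelte", ".vue", ".md", ".proto", ".tf",
}
SKIP_DIRS = {".git", "node_modules", "__pycache__", "dist", "build", "vendor", "target"}
MAX_BYTES = 500 * 1024  # assumes 1 KB = 1024 bytes


def looks_binary(sample: bytes) -> bool:
    """Heuristic: a NUL byte in the first chunk suggests a binary file."""
    return b"\x00" in sample


def should_index(path: Path, root: Path) -> bool:
    """Decide whether a file under root is eligible for indexing."""
    rel = path.relative_to(root)
    if any(part in SKIP_DIRS for part in rel.parts[:-1]):
        return False
    if path.suffix.lower() not in INDEXED_EXTENSIONS:
        return False
    if path.stat().st_size > MAX_BYTES:
        return False
    with path.open("rb") as fh:
        return not looks_binary(fh.read(8192))
```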
By default Claude Code prompts for permission each time it calls an MCP tool.
To allow Glyphh tools silently, add them to .claude/settings.json in your
project:
{
  "permissions": {
    "allow": [
      "mcp__glyphh__glyphh_search",
      "mcp__glyphh__glyphh_related",
      "mcp__glyphh__glyphh_drift",
      "mcp__glyphh__glyphh_risk"
    ]
  }
}
Or use a wildcard to allow all tools from the Glyphh server:
{
  "permissions": {
    "allow": [
      "mcp__glyphh__*"
    ]
  }
}
The first matching rule wins: Glyphh tools run silently while everything else still prompts.
Two hooks are included to integrate Glyphh with Claude Code:
- enforce-glyphh-search.sh (PreToolUse): blocks Grep and Glob calls, redirecting Claude to glyphh_search instead
- post-commit-compile.sh (PostToolUse): runs compile.py --incremental after every git commit to keep the index up to date
The post-commit hook takes a source directory as its first argument. This
is the root of the codebase you want indexed. The hook fires on any git commit that happens inside that directory — whether the commit is in the
source directory itself, a child repo, or a submodule.
When a commit lands in a child repo, the hook passes --diff-repo to
compile.py so it diffs the correct repo while still compiling against the
source directory root.
Add both hooks to .claude/settings.json in your project:
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Grep|Glob",
        "hooks": [
          {
            "type": "command",
            "command": "/path/to/model-code/hooks/enforce-glyphh-search.sh"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "/path/to/model-code/hooks/post-commit-compile.sh /path/to/source/dir"
          }
        ]
      }
    ]
  }
}
Replace /path/to/model-code with wherever you cloned this repo and
/path/to/source/dir with the root of the codebase to index.
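
For readers who want to see what the two hooks decide, here is a Python sketch of the logic. The shipped hooks are shell scripts and may work differently; this assumes Claude Code passes each hook a JSON event on stdin with `tool_name` and `tool_input` fields, and that a PreToolUse hook exiting with code 2 blocks the call and reports its stderr back to Claude. The function names and the block message are hypothetical.

```python
import json
import sys

BLOCKED_TOOLS = {"Grep", "Glob"}


def read_event(stream=sys.stdin) -> dict:
    """Parse the hook event that Claude Code writes to stdin."""
    return json.load(stream)


def pre_tool_decision(event: dict):
    """Return (exit_code, message) for a PreToolUse event.

    Exit code 2 asks Claude Code to block the call; 0 lets it proceed.
    """
    if event.get("tool_name") in BLOCKED_TOOLS:
        return 2, "Use glyphh_search instead of Grep/Glob."
    return 0, ""


def is_git_commit(event: dict) -> bool:
    """True when a PostToolUse Bash event ran a git commit."""
    if event.get("tool_name") != "Bash":
        return False
    command = event.get("tool_input", {}).get("command", "")
    return "git commit" in command
```

A PostToolUse script built on this would call `compile.py --incremental` only when `is_git_commit` returns true, so ordinary Bash calls add no indexing overhead.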
| Variable | Default | Description |
|---|---|---|
| GLYPHH_RUNTIME_URL | http://localhost:8002 | Runtime endpoint |
| GLYPHH_TOKEN | Auto-resolved from CLI session | Auth token |
| GLYPHH_ORG_ID | Auto-resolved from CLI session | Org ID |
| GLYPHH_PYTHON | /opt/homebrew/anaconda3/bin/python | Python interpreter (must have requests) |
| GLYPHH_HOOK_DISABLE | (none) | Set to 1 to temporarily disable the hook |
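
If you script against the same settings, the variables above can be read with their documented defaults as in this sketch. `HookConfig` and `load_config` are hypothetical names; a `None` token or org ID here simply means "fall back to the CLI session", which this sketch does not implement.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class HookConfig:
    runtime_url: str
    token: Optional[str]   # None means: resolve from the CLI session
    org_id: Optional[str]  # None means: resolve from the CLI session
    python: str
    disabled: bool


def load_config(env=None) -> HookConfig:
    """Read Glyphh hook settings from the environment (or a dict for testing)."""
    env = os.environ if env is None else env
    return HookConfig(
        runtime_url=env.get("GLYPHH_RUNTIME_URL", "http://localhost:8002"),
        token=env.get("GLYPHH_TOKEN"),
        org_id=env.get("GLYPHH_ORG_ID"),
        python=env.get("GLYPHH_PYTHON", "/opt/homebrew/anaconda3/bin/python"),
        disabled=env.get("GLYPHH_HOOK_DISABLE") == "1",
    )
```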
cd glyphh-models/code
PYTHONPATH=../../glyphh-runtime pytest tests/ -v