Skip to content

lipex360x/sourcemap-indexer

Repository files navigation

sourcemap-indexer

Python uv License Tests Coverage Ruff mypy

Index any codebase into SQLite and enrich file metadata via an LLM — so an AI assistant can understand large projects through SQL queries instead of reading every file.


Index

# Section
1 How it works
2 Prerequisites
3 Installation
4 Quickstart
5 Commands
6 Configuration
7 Ignoring files
8 Custom layers
9 Project metadata
10 AI assistant skill
11 Post-commit hook
12 SQLite schema
13 Dev setup
14 Code quality

1. How it works

sourcemap-indexer runs in three phases. Each phase writes into the same SQLite database, adding a new layer of information on top of what the previous phase produced:

flowchart LR
    A["sourcemap init<br/><i>one-time</i>"] --> B[("index.db<br/>empty schema")]
    B --> C["sourcemap walk<br/><i>after code changes</i>"]
    C --> D[("index.db<br/>paths, languages,<br/>hashes, line counts")]
    D --> E["sourcemap enrich<br/><i>calls an LLM</i>"]
    E --> F[("index.db<br/>+ purpose, layer,<br/>tags, side effects")]
Loading

Note

init and walk are fully offline — no LLM required. Only enrich calls an external model.

Phase 1 — sourcemap init

Creates the directory structure needed by the other commands:

your-project/
├── .sourcemap/
│   ├── index.db          ← SQLite database (all metadata lives here)
│   ├── index.yaml        ← YAML snapshot of the last walk (intermediate file)
│   ├── layers.yaml       ← user-defined layer names (optional)
│   ├── project.yaml      ← project metadata shown by brief (optional)
│   └── logs/             ← LLM debug logs (only when SOURCEMAP_LLM_LOG=1)
└── .sourcemapignore      ← gitignore-syntax exclusion rules

Note

The output directory defaults to .sourcemap/ and can be changed via SOURCEMAP_MAPS_DIR. See Environment variables.

Note

init is idempotent — safe to run multiple times. It never overwrites an existing .sourcemapignore or database.

Phase 2 — sourcemap walk

Scans the project tree and updates the database in three internal steps:

  1. Scan — traverses all files (respecting .gitignore and .sourcemapignore), collects path, language, line count, size, content hash, and last-modified timestamp. On the second and subsequent runs, files whose mtime and size match the SQLite record are skipped entirely — only changed files are read and re-hashed. This makes walk scale to large codebases: a 10 000-file tree with 5 changed files reads 5 files instead of 10 000.
  2. Write — serializes the result to index.yaml inside .sourcemap/ (human-readable snapshot of every tracked file; planned for removal in a future release once SQLite becomes the sole source of truth)
  3. Sync — reads index.yaml and reconciles the SQLite database:
    • New file → inserted with needs_llm = true
    • File changed (hash diff) → updated with needs_llm = true
    • File removed → soft-deleted (kept in DB with deleted_at timestamp)
    • File unchanged → skipped
What index.yaml looks like
version: 1
generated_at: 1745000000
root: /path/to/your-project
files:
  - path: src/auth/login.ts
    language: ts
    lines: 82
    size_bytes: 2104
    content_hash: a3f1...
    last_modified: 1744900000
  - path: src/auth/logout.ts
    ...

This file is checked in to source control optionally — it gives a plain-text audit trail of what was indexed.

What you get without an LLM

After walk, the database already holds language, line count, size, and hash for every file. Run sourcemap stats to see the structural breakdown:

╭─ Sync ──────────────────────────────────────────────────────────╮
│ Inserted: 298   Updated: 0   Soft-deleted: 0                    │
╰─────────────────────────────────────────────────────────────────╯
╭─ Stats ─────────────────────────────────────────────────────────╮
│ Root   /your/project                                            │
│ LLM    not configured                                           │
│ Total: 298      Enriched: 0      Pending: 298                   │
│ ○○○○○○○○○○○○○○○○○○○○  0%                                        │
╰─────────────────────────────────────────────────────────────────╯
╭─ By layer ──────────────────────────────────────────────────────╮
│   unknown   298  ○○○○○○○○○○○○○○○○○○○○                           │
╰─────────────────────────────────────────────────────────────────╯
╭─ By language ───────────────────────────────────────────────────╮
│   py      114  ○○○○○○○○○○○○○○○○○○○○                             │
│   tsx      46  ○○○○○○○○                                         │
│   ts       43  ○○○○○○○○                                         │
│   md        9  ○○                                               │
│   sql       9  ○○                                               │
│   yaml      8  ○                                                │
│   json      7  ○                                                │
╰─────────────────────────────────────────────────────────────────╯
              ● all enriched  |  ● has pending  |  ○ not yet enriched

All files start at layer unknown — layers are assigned by the LLM during enrich. Language detection is immediate and requires no enrichment.

Phase 3 — sourcemap enrich

For every file marked needs_llm = true, enrichment:

  1. Reads the file content from disk
  2. Sends path + language + content to the LLM with a structured prompt
  3. Stores the LLM response back into SQLite:
Field What it contains
purpose One-sentence description of what the file does
layer Architectural layer (domain, infra, application, cli, lib, …)
stability core, stable, experimental, or deprecated
tags Semantic keywords (e.g. authentication, rate-limiting)
side_effects I/O boundaries (network, writes_fs, git, spawns_process)
invariants Key behavioral contracts the file upholds

After enrichment, needs_llm is cleared and llm_hash is set to the content hash at the time of enrichment — so future walks can detect drift.

Important

Enrichment calls the LLM for every pending file. For large codebases, use --limit N to process in batches and avoid timeouts or rate limits.

Note

Set SOURCEMAP_LLM_LOG=1 to record every LLM request and response to a timestamped YAML file. Logs land in .sourcemap/logs/ by default (or inside the directory set by SOURCEMAP_MAPS_DIR). Each enrich session produces one file (llm-YYYYMMDD-HHMMSSffffff.yaml) containing one YAML document per enriched file — useful for debugging prompts or auditing model output.


2. Prerequisites

Requirement Version Notes
uv any Used for installation and tool management
Python 3.11+ Managed automatically by uv tool install
An OpenAI-compatible LLM Required only for sourcemap enrich

Note

uv tool install pulls the correct Python version automatically. You do not need to install Python separately.

Important

sourcemap enrich calls an LLM. Without a reachable endpoint (SOURCEMAP_LLM_URL), walk and stats work fine — only enrichment is blocked.

Installing uv

macOS

curl -LsSf https://astral.sh/uv/install.sh | sh

Or via Homebrew:

brew install uv

Linux

curl -LsSf https://astral.sh/uv/install.sh | sh

Add ~/.local/bin to your PATH if not already present (the installer will prompt you).

Windows

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

Or via WinGet:

winget install --id=astral-sh.uv -e

After installation, restart your terminal and verify with uv --version.


3. Installation

uv tool install "git+https://github.com/lipex360x/sourcemap-indexer.git@main"

To upgrade:

uv tool upgrade sourcemap-indexer

To uninstall:

uv tool uninstall sourcemap-indexer

The binary lives at ~/.local/bin/sourcemap. The tool environment is at ~/.local/share/uv/tools/sourcemap-indexer/.


4. Quickstart

cd <your-project>
sourcemap init    # create .sourcemap/, .sourcemapignore, index.db
sourcemap walk    # scan files and sync into SQLite
sourcemap enrich  # call LLM to annotate each file
sourcemap stats   # auto-walks first, then shows totals and pending files

Note

sourcemap stats automatically runs walk before displaying data — no need to run walk manually before stats.


5. Commands

All commands are invoked as sourcemap <command>.

Setup

Command Description
init Create the maps directory, .sourcemapignore, and index.db
walk Scan files and sync metadata into SQLite

Enrichment

Command Description
enrich Send pending files to the LLM
stale List files whose content changed since the last enrich run

enrich flags:

Flag Description
--limit N Process at most N files per run
--force Re-enrich already enriched files
--file <path> Target a single specific file
--layer <layer> Filter by architectural layer
--language <lang> Filter by language
-m "<msg>" Inject an extra instruction into the LLM prompt
--with-context Inject depth-1 import context from indexed dependencies into the prompt (Python, TypeScript, JavaScript, TSX; off by default)
--export-llm-prompt Write the active prompt to a .md file before running (defaults to maps dir/prompt.md)
--output <path> Destination .md file for --export-llm-prompt

--with-context resolves each file's direct imports (depth 1 only), looks up their purpose from the SQLite index, and prepends a context block to the LLM prompt:

Context from direct imports:
- src/domain/cart.py: validates cart items and calculates totals
- src/infra/payment.py: handles Stripe API calls

Pending files are automatically sorted by their dependency graph (topological order) before enrichment — leaf files are processed first. This means --with-context produces non-empty context blocks in a single pass, even on a fresh index.

Supported languages: Python, TypeScript, JavaScript, TSX. For TS/JS/TSX, the extractor returns extension candidates (.ts, .tsx, .js, .jsx, index.ts, index.tsx) and the index disambiguates automatically. export … from re-exports and tsconfig path aliases are not resolved.

Constraints: depth 1 only (no transitive traversal); context is capped at 2000 characters — imports beyond the budget are dropped silently; unknown languages and imports not yet indexed produce no context (silent degradation).

Exploration

Command Description
brief Single-call project briefing — architecture, domain files, tags, side effects, risk areas (includes project metadata when .sourcemap/project.yaml is present)
brief --verbose (or -v) Same as brief plus a Files by layer section listing every enriched file with its 1-line purpose, grouped by layer — use when aggregate counts hide the concept you are looking for (common on documentation-heavy projects)
chapters Table of contents — enriched files grouped by layer and sorted by path (ideal for documentation-heavy projects)
contracts Invariants grouped by layer and file — the semantic contracts captured during enrichment
validate CI gate — verify every file on disk is indexed. Outputs PASS:sourcemap-db (exit 0) or one MISSING:path per unindexed file (exit 1). Run after walk in pre-commit hooks
profile Language distribution, inferred layers, test ratio, top files by size
stats Auto-runs walk; counts by layer and language; bar width = relative file count; green = enriched, yellow = pending
overview Layer × language matrix
domain Enriched domain-layer files with their purpose
effects Files with network or git side effects
tags Top 30 semantic tags by frequency
unstable Experimental or deprecated files
find Search files by tag, layer, or language
show <path> Full metadata for a specific file
query "<sql>" Free-form SQL against the index database

chapters and contracts flags:

Flag Description
--layer L Restrict the output to a single layer

stats flags:

Flag Description
--files List pending files below the counts
--page N Paginate the pending list (requires --files)

find flags:

Flag Description
--tag T Filter by semantic tag
--layer L Filter by architectural layer
--language L Filter by language

Maintenance

Command Description
reset Delete the index (offers a timestamped backup before wiping)
restore Restore index.db from a previously saved .bak file
install-skill Copy the skill file to your AI assistant's skills directory (--target <dir>)

6. Configuration

sourcemap enrich reads a .env file from the project root before resolving env vars. Variables already present in the shell environment take precedence.

Pick a provider

Provider Auth Cost Best for
http API key (Bearer) per-token OpenAI / OpenRouter / z.ai / local LM Studio / any OpenAI-compatible endpoint
claude-cli Claude Code login Claude.ai subscription Claude Code subscribers
opencode OpenCode CLI config depends on configured backend OpenCode users routing through their existing setup
gemini-cli Google OAuth free tier (~1 000 req/day) Gemini free-tier users

Set SOURCEMAP_LLM_PROVIDER to one of the above (default: http).

Common variables

These apply to every provider:

Variable Default Description
SOURCEMAP_LLM_PROVIDER http Backend selector — see table above
SOURCEMAP_LLM_LOG (off) 1 writes LLM request/response logs to logs/ inside the maps directory
SOURCEMAP_PAGE_SIZE 20 Number of pending files shown per page in stats
SOURCEMAP_MAPS_DIR .sourcemap Output directory for index.db, index.yaml, layers.yaml, and logs — relative to project root or absolute
SOURCEMAP_IMPORT_LLM_PROMPT (off) Path to a .md file — enrich reads it and sends its contents as the system prompt instead of the built-in default. Must have .md extension

Tip

Typical workflow: run sourcemap enrich --export-llm-prompt once to dump the default prompt, edit the generated file, then set SOURCEMAP_IMPORT_LLM_PROMPT to its path for subsequent runs.

After choosing a provider, copy the matching .env template below and run:

sourcemap enrich --limit 10

http — OpenAI-compatible endpoint (default)

Provider-specific variables:

Variable Required Description
SOURCEMAP_LLM_URL yes Endpoint URL (any OpenAI-compatible chat completions API)
SOURCEMAP_LLM_MODEL yes Model name passed to the endpoint
SOURCEMAP_LLM_API_KEY depends Bearer token — required by hosted providers, not needed for local servers

.env — OpenAI:

# .env  (add to .gitignore)
SOURCEMAP_LLM_PROVIDER=http
SOURCEMAP_LLM_URL=https://api.openai.com/v1/chat/completions
SOURCEMAP_LLM_MODEL=gpt-4o
SOURCEMAP_LLM_API_KEY=sk-...

.env — OpenRouter (free tier available):

# .env  (add to .gitignore)
SOURCEMAP_LLM_PROVIDER=http
SOURCEMAP_LLM_URL=https://openrouter.ai/api/v1/chat/completions
SOURCEMAP_LLM_MODEL=openai/gpt-oss-120b:free
SOURCEMAP_LLM_API_KEY=sk-or-v1-...
SOURCEMAP_LLM_LOG=1

.env — z.ai:

# .env  (add to .gitignore)
SOURCEMAP_LLM_PROVIDER=http
SOURCEMAP_LLM_URL=https://api.z.ai/api/coding/paas/v4/chat/completions
SOURCEMAP_LLM_MODEL=glm-5.1
SOURCEMAP_LLM_API_KEY=your-api-key

.env — local (LM Studio, no API key needed):

# .env  (add to .gitignore)
SOURCEMAP_LLM_PROVIDER=http
SOURCEMAP_LLM_URL=http://localhost:1234/v1/chat/completions
SOURCEMAP_LLM_MODEL=your-loaded-model-name

claude-cli — Claude Code subscription

Note

Runs via claude -p (Claude Code CLI). Requires Claude Code installed and authenticated — does not work with other claude CLI tools.

When SOURCEMAP_LLM_PROVIDER=claude-cli, the SOURCEMAP_LLM_URL, SOURCEMAP_LLM_MODEL, and SOURCEMAP_LLM_API_KEY variables are ignored — you can keep them in .env without conflict.

Provider-specific variables:

Variable Default Description
SOURCEMAP_LLM_CLI_MODEL (Claude default) Model — e.g. claude-haiku-4-5-20251001, claude-sonnet-4-6, claude-opus-4-7
SOURCEMAP_LLM_CLI_EFFORT (Claude default) Thinking budget — low, medium, high, xhigh, max

Setup:

npm install -g @anthropic-ai/claude-code
claude auth login

.env:

# .env  (add to .gitignore)
SOURCEMAP_LLM_PROVIDER=claude-cli
SOURCEMAP_LLM_CLI_MODEL=claude-sonnet-4-6
SOURCEMAP_LLM_CLI_EFFORT=high
SOURCEMAP_LLM_LOG=1

opencode — OpenCode CLI

Note

Runs via opencode run. Requires OpenCode installed and configured with at least one model provider.

When SOURCEMAP_LLM_PROVIDER=opencode, the SOURCEMAP_LLM_URL, SOURCEMAP_LLM_MODEL, SOURCEMAP_LLM_API_KEY, and SOURCEMAP_LLM_CLI_EFFORT variables are ignored.

Provider-specific variables:

Variable Default Description
SOURCEMAP_LLM_CLI_MODEL (OpenCode default) Any model ID recognised by your OpenCode config — e.g. anthropic/claude-sonnet-4-6, openrouter/openai/gpt-oss-120b:free

Setup:

npm install -g opencode-ai

.env:

# .env  (add to .gitignore)
SOURCEMAP_LLM_PROVIDER=opencode
SOURCEMAP_LLM_CLI_MODEL=openrouter/openai/gpt-oss-120b:free
SOURCEMAP_LLM_LOG=1

Note

When routing through OpenRouter, set your API key in OpenCode's own provider config — sourcemap passes the prompt to opencode run and does not forward SOURCEMAP_LLM_API_KEY to it.

gemini-cli — Google Gemini free tier

Note

Runs via gemini -p (Gemini CLI) authenticated with a personal Google account. The default model gemini-3-flash-preview is recommended — gemini-2.5-pro exists but the free tier exhausts quota quickly under enrichment workloads.

When SOURCEMAP_LLM_PROVIDER=gemini-cli, the SOURCEMAP_LLM_URL, SOURCEMAP_LLM_MODEL, SOURCEMAP_LLM_API_KEY, and SOURCEMAP_LLM_CLI_EFFORT variables are ignored.

Provider-specific variables:

Variable Default Description
SOURCEMAP_LLM_CLI_MODEL (Gemini default — gemini-3-flash-preview) Model — e.g. gemini-3-flash-preview, gemini-2.5-pro

Setup:

brew install gemini-cli
gemini   # interactive — sign in with your Google account on first run

.env:

# .env  (add to .gitignore)
SOURCEMAP_LLM_PROVIDER=gemini-cli
SOURCEMAP_LLM_LOG=1

Note

sourcemap always passes --skip-trust to gemini so headless runs are not blocked by the trusted-folder gate. Stderr noise from gemini (housekeeping warnings, IDE-companion errors) is ignored — only the response field of the JSON output is used. Quota errors (exhausted your capacity) short-circuit to gemini-cli-quota-exhausted instead of waiting for the binary's internal retry loop.


7. Ignoring files

.sourcemapignore uses the same syntax as .gitignore. Both files are read automatically — no extra config needed.

Built-in defaults (always excluded)
node_modules/   .git/         .venv/        __pycache__/
dist/           build/        .next/        .turbo/
coverage/       .sourcemap/   *.pyc         *.min.js
*.lock          *.db          *.sqlite      *.map

If you change SOURCEMAP_MAPS_DIR, add your custom directory here too so it is not indexed.

Add project-specific patterns to .sourcemapignore:

# exclude by extension
*.png
*.jpg
*.svg
*.ico
*.woff2

# exclude directories
secrets/
storybook-static/
public/assets/

# exclude specific files
src/generated/schema.ts

Pattern rules:

Pattern Effect
*.png All .png files anywhere in the tree
assets/ Entire directory (trailing slash = directory)
src/generated/ Subdirectory under a specific path
# at line start Comment — line is ignored

8. Custom layers

By default, the LLM assigns one of the built-in layers (domain, infra, application, cli, lib, config, hook, doc, test, unknown). If your project uses a different architecture, you can declare additional layer names in .sourcemap/layers.yaml:

layers:
  - presentation
  - gateway
  - jobs

sourcemap init creates the file with a commented-out example. Any layer name listed here is treated as valid — the LLM can assign it and find --layer will match it.

Note

Built-in layers always remain valid. layers.yaml only adds names; it does not replace the defaults.

Tip

After adding new layers, run sourcemap enrich --force --layer unknown to re-classify files that were previously unrecognised.

Documentation-heavy projects

Default layers are code-oriented. For projects that are primarily documentation (blueprints, standards, specifications), declare layers that match the document taxonomy so brief, chapters, and find --layer work at the right granularity. Example for a governance/blueprint repository:

layers:
  - foundations    # principles, philosophy, testing mandates
  - enforcement    # tools, scripts, cognitive-load limits
  - operations     # logging, error contracts, roadmap
  - shared         # stack-invariant configuration
  - stacks         # per-stack tool mappings
  - meta           # manifest-like documents

After declaring the layers, run sourcemap enrich --force so the LLM reclassifies every file using the richer taxonomy.

Note

The system prompt is extended automatically when custom layers are declared. The LLM is told to prefer a user-defined layer over a generic default (doc, config, unknown) whenever the file's top-level directory name matches a custom layer.

If a mismatch slips through anyway — e.g. the model chose doc for a file under foundations/sourcemap enrich prints a highlighted Layer mismatches section at the end of the run. No need to open logs or query the DB: the warning lists each path with its chosen and expected layer. Re-run sourcemap enrich --force --file <path> to retry.


9. Project metadata

brief can display a short metadata header when .sourcemap/project.yaml exists. All fields are optional — any field left out is skipped from the output:

name: engineering-blueprint
version: 1
purpose: Language-agnostic foundation for building software with Claude-assisted workflows.
audience:
  - claude
  - engineer
license: MIT

audience accepts a list or a single string. version accepts any scalar (string, integer, etc.) and is rendered verbatim.

When the file is absent, brief renders without a project section — no regressions for existing projects.

Note

project.yaml is purely informational. It does not change enrichment behaviour and is never sent to the LLM.


10. AI assistant skill

Install the bundled skill file so your AI assistant can query the index directly:

# Claude Code
sourcemap install-skill --target ~/.claude/skills

# Any other tool — point to its skills directory
sourcemap install-skill --target <your-tool-skills-dir>

11. Post-commit hook (auto-walk on every commit)

bash scripts/bash/install-hook.sh

Installs a post-commit hook that runs sourcemap walk after every commit, keeping the index current.

Note

Enrichment is not automatic — it calls the LLM and can be slow. Run sourcemap enrich manually when you want updated metadata.


12. SQLite schema

One core table (items) holds a row per file. Three satellite tables store the multi-valued LLM output (a file has many tags, many side effects, many invariants):

erDiagram
    items ||--o{ tags : has
    items ||--o{ side_effects : has
    items ||--o{ invariants : has

    items {
        int id PK
        string path
        string name
        string language
        string layer
        string stability
        string purpose
        int lines
        int size_bytes
        string content_hash
        string llm_hash
        bool needs_llm
        timestamp deleted_at
        timestamp llm_at
    }
    tags {
        int item_id FK
        string tag
    }
    side_effects {
        int item_id FK
        string effect
    }
    invariants {
        int item_id FK
        string invariant
    }
Loading

Walk fills: path, name, language, lines, size_bytes, content_hash, needs_llm, deleted_at. Enrich fills: purpose, layer, stability, llm_hash, llm_at, plus rows in tags / side_effects / invariants.

Layers: domain | infra | application | cli | hook | lib | config | doc | test | unknown — plus any user-defined names declared in .sourcemap/layers.yaml

Side effects: writes_fs | spawns_process | network | git | environ


13. Dev setup

git clone https://github.com/lipex360x/sourcemap-indexer.git
cd sourcemap-indexer
uv sync
uv run pytest

14. Code quality

Every commit passes a pre-commit pipeline that enforces the following gates:

Using validate in pre-commit hooks

sourcemap validate is designed as a CI gate: it checks that every file on disk was indexed by the last walk run. Pair them in a pre-commit hook:

sourcemap walk --root "$PROJECT_ROOT"
sourcemap validate --root "$PROJECT_ROOT"

Exit codes: 0 = all files indexed, 1 = one or more files missing from the index. Output is machine-parseable: PASS:sourcemap-db on success, MISSING:<path> per unindexed file on failure.

Automated gates (pre-commit / pre-push)

Tool What it checks Config
ruff Style, imports, simplification (SIM), returns (RET), bugbear (B), upgrades (UP), security (S) pyproject.toml [tool.ruff.lint]
ruff format Consistent formatting (replaces Black) pyproject.toml [tool.ruff]
McCabe complexity No function exceeds cyclomatic complexity 5 (C901) pyproject.toml [tool.ruff.lint.mccabe]
mypy Full strict type checking — no Any, no untyped functions pyproject.toml [tool.mypy]
bandit Deep security scan — severity/confidence filtering, broader rule set pyproject.toml [tool.bandit]
vulture Dead code detection — unused functions and variables
pylint C0103 Naming convention enforcement — no abbreviations (msg, cfg, err, …) pyproject.toml [tool.pylint]
pytest + coverage Test suite must pass at ≥ 95% line coverage pyproject.toml [tool.pytest]

Testing strategy

  • TDD mandatory — every behaviour is covered by a test written before the implementation (Red → Green)
  • No mocks on persistence — tests hit a real in-memory SQLite database (":memory:"); concurrency tests use a file-based DB via tmp_path (:memory: cannot be shared between threads)
  • No mocks on the filesystem — tests use tmp_path fixtures with real files
  • Integration tests run the full CLI via typer.testing.CliRunner end-to-end
  • Coverage minimum: 95% — enforced both by pytest and by the pre-push hook

Design decisions

Decision Why
Either[str, T] monad Explicit error propagation without exceptions — every fallible function returns Left(error_token) or Right(value). No hidden control flow.
Layer = str (not StrEnum) User-defined layers loaded from layers.yaml are unknown at import time. A str alias accepts any value; validation happens at the application boundary in run_enrich.
No comments in source Names carry meaning. Comments that explain what code does rot as code evolves; the only permitted comments are for non-obvious why — hidden constraints, workarounds, subtle invariants.
Single output directory (.sourcemap/) Config (layers.yaml, ignore) and data (index.db, index.yaml, logs/) live under one root. No two directories for the same concern.
_DEFAULT_LAYERS | user_layers The full valid-layer set is the union of built-in defaults and user-defined additions, computed at startup and passed through to run_enrich and LlmClient.
BEGIN IMMEDIATE + WAL in init_db Migration apply is wrapped in a BEGIN IMMEDIATE transaction so two concurrent processes (e.g. parallel walk + enrich) cannot both pass the "already applied?" check and duplicate a migration. WAL journal mode reduces SQLITE_BUSY errors under concurrent readers.

About

Codebase indexer powered by local LLM

Topics

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors