code-symbol-index

Tree-sitter backed symbol index and code navigation for tools that need fast, bounded, LLM-friendly answers over a local codebase.

It provides a small Python API and a single CLI command:

code-symbol-index

The default CLI output is readable text. Add --json to query commands when a machine-readable response is needed.

Features

  • Disk-backed SQLite index at .code-symbol-index/index.sqlite
  • Incremental indexing by mtime_ns + size
  • .gitignore aware file discovery
  • UTF-8 text file filtering
  • Mainstream language parsing through tree-sitter-language-pack
  • Symbol search, inspect, references, implementors, file outline, and index status
  • Bounded outputs designed for coding LLM context windows

This is syntactic code navigation, not a language server. It does not provide type-aware rename safety or full semantic call graph accuracy.
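As a rough illustration of the "incremental indexing by mtime_ns + size" feature, the sketch below compares a stored (mtime_ns, size) pair against the file's current os.stat result. The function names and the in-memory `recorded` dict are hypothetical stand-ins, not the library's internals.

```python
import tempfile
from pathlib import Path


def fingerprint(path: Path) -> tuple[int, int]:
    """Return (mtime_ns, size), the pair compared on each refresh."""
    st = path.stat()
    return st.st_mtime_ns, st.st_size


def changed_files(paths, recorded):
    """Yield paths whose fingerprint is missing or differs from the recorded one."""
    for path in paths:
        if recorded.get(str(path)) != fingerprint(path):
            yield path


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "app.py"
    src.write_text("def foo():\n    return 1\n", encoding="utf-8")

    # Hypothetical in-memory stand-in for fingerprints stored in the index.
    recorded = {str(src): fingerprint(src)}
    unchanged = list(changed_files([src], recorded))

    # Rewriting the file with different content changes its size,
    # so it is selected for re-indexing.
    src.write_text("def foo():\n    return 12345\n", encoding="utf-8")
    stale = [p.name for p in changed_files([src], recorded)]

print(unchanged, stale)
```

A check like this never reads file contents, which keeps it cheap, but an edit that preserves both size and modification time would go unnoticed.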

Install

Install the CLI as a uv tool:

uv tool install code-symbol-index

Or install from a local checkout:

uv tool install .

For local development with editable imports and tests:

uv venv .venv
uv pip install --python .venv/bin/python -e '.[dev]'

Then:

code-symbol-index --version

Quick Start

Build or refresh the index:

code-symbol-index index --root /path/to/repo

Check whether indexed tools are available:

code-symbol-index status --root /path/to/repo
code-symbol-index status --root /path/to/repo --check

Search symbols:

code-symbol-index search Tool --root /path/to/repo --limit 20
code-symbol-index search Tool Agent Runner --root /path/to/repo
code-symbol-index search Tool --root /path/to/repo --kind class,function --path src --exact-only

Inspect one symbol:

code-symbol-index inspect Tool --root /path/to/repo
code-symbol-index inspect Tool.method_name --root /path/to/repo
code-symbol-index inspect Tool --root /path/to/repo --anchors

Outline a file:

code-symbol-index outline src/app.py --root /path/to/repo
code-symbol-index outline src/app.py --root /path/to/repo --symbol Tool

Codex Skill

Install the Codex skill so LLM coding agents can discover and use code-symbol-index automatically:

code-symbol-index install-skill

To override the Codex home directory or force overwrite an existing skill:

code-symbol-index install-skill --codex-home ~/.codex --force

The command writes SKILL.md to $CODEX_HOME/skills/code-symbol-index/, or ~/.codex/skills/code-symbol-index/ when CODEX_HOME is not set.

Once installed, the agent will know the skill rules for symbol search, inspection, references, file outlines, incremental updates, and index status checks.

CLI

code-symbol-index languages
code-symbol-index --version
code-symbol-index version
code-symbol-index index --root /path/to/repo
code-symbol-index update src/app.py src/lib.py --root /path/to/repo
code-symbol-index status --root /path/to/repo
code-symbol-index status --root /path/to/repo --check
code-symbol-index status --root /path/to/repo --check --max-pending-files 20
code-symbol-index search Tool --root /path/to/repo
code-symbol-index search Tool Agent Runner --root /path/to/repo
code-symbol-index search Tool --root /path/to/repo --kind class,function --path src --exact-only
code-symbol-index inspect Tool --root /path/to/repo
code-symbol-index inspect Tool --root /path/to/repo --path src --exact-only
code-symbol-index inspect Tool --root /path/to/repo --anchors
code-symbol-index outline src/app.py --root /path/to/repo
code-symbol-index outline src/app.py --root /path/to/repo --symbol Tool
code-symbol-index refs Tool --root /path/to/repo --limit 20 --offset 0
code-symbol-index impls Greeter --root /path/to/repo --kind trait --limit 20 --offset 0
code-symbol-index clean --root /path/to/repo
code-symbol-index install-skill

JSON is available for structured consumers:

code-symbol-index search Tool --root /path/to/repo --json
code-symbol-index inspect Tool --root /path/to/repo --json
code-symbol-index inspect Tool --root /path/to/repo --anchors --json
code-symbol-index outline src/app.py --root /path/to/repo --json
code-symbol-index refs Tool --root /path/to/repo --json
code-symbol-index impls Tool --root /path/to/repo --json
code-symbol-index status --root /path/to/repo --json

Output Formats

Search returns candidates only, never source:

query: Tool
count: 2
limit: 20
has_more: false

symbols:
  - id: python:class:Tool:nanocode.py:1284:1330
    name: Tool
    kind: class
    file: nanocode.py
    range: 1284:1330
    signature: class Tool:
    score: exact
    language: python

For multiple search queries:

queries:
  - Tool
  - Agent
count: 2
limit: 20
has_more: false

symbols:
  - id: python:class:Tool:nanocode.py:1284:1330
    name: Tool
    kind: class
    file: nanocode.py
    range: 1284:1330
    signature: class Tool:
    score: exact
    matched_query: Tool

Inspect returns bounded source with stable 0-based line ranges:

symbol:
  id: python:function:foo:src/app.py:120:123
  name: foo
  kind: function
  file: src/app.py
  range: 120:123
  signature: def foo():
summary:
  imports: 2
  members: 0
  callers: 1
  callees: 1
  references: 3
  implementors: 0
imports:
  - range: 0:1
    statement: import os
source:
  status: full
  range: 120:123
  shown_range: 120:123
  total_lines: 3

  120 |def foo():
  121 |    if ok:
  122 |        return 1

Use inspect --anchors or inspect_text(..., anchors=True) to emit hashline source anchors from the current file contents:

source:
  status: full
  range: 120:123
  shown_range: 120:123
  total_lines: 3
  note: Use line:hash as edit anchor; code starts after |

120:a1b2c3d4|def foo():
121:d4e5f6a7|    if ok:
122:f6a7b8c9|        return 1

JSON inspect with anchors=True includes source_anchor with path, start_line, end_line, start_anchor, end_anchor, and lines[{line, hash, text}]. Hashes are computed from current file contents at output time.
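To make the line:hash|code layout concrete, here is a minimal sketch that produces lines in that shape and re-checks one anchor. The hash function is an assumption (the first 8 hex characters of SHA-1 over the line text); the tool's actual hashing scheme is not documented here and may differ.

```python
import hashlib


def line_hash(text: str) -> str:
    # Assumption: 8 hex chars of SHA-1 over the line text; the real hash may differ.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:8]


def anchor_lines(source: str, start: int, end: int) -> list[str]:
    """Format lines[start:end] as 'line:hash|code' using 0-based line numbers."""
    lines = source.splitlines()
    return [
        f"{number}:{line_hash(lines[number])}|{lines[number]}"
        for number in range(start, min(end, len(lines)))
    ]


source = "import os\n\ndef foo():\n    if ok:\n        return 1\n"
anchored = anchor_lines(source, 2, 5)
print("\n".join(anchored))

# An editing tool can re-hash the current line and refuse a stale anchor.
line_no, rest = anchored[0].split(":", 1)
expected_hash, code = rest.split("|", 1)
still_valid = line_hash(source.splitlines()[int(line_no)]) == expected_hash
```

Because the hash is derived from the current line text, an anchor stops matching as soon as the line is edited, which is what makes it usable as an edit guard.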

Outline returns file structure without source or ids:

file: nanocode.py
range: 0:9060
count: 142

outline:
1284:1330 | class Tool:
1289:1292 |     def cli_args(cls, args):
1312:1325 |     def tool_schema(cls):
9023:9060 | def main(argv=None):

Status is fast by default and does not scan the directory:

index:
  status: ready
  root: /path/to/repo
  files: 128
  symbols: 4820
  languages: python, typescript
  language_breakdown:
    - python: 80 files (62.5%)
    - typescript: 48 files (37.5%)
  pending_changes: unknown

Use --check to scan the directory and compute staleness:

index:
  status: stale
  root: /path/to/repo
  files: 128
  symbols: 4820
  pending_changes: 3
  pending_files:
    - src/app.py
    - src/new_feature.py
  reason: files changed after last index update

pending_files is bounded by --max-pending-files and is only computed with --check.

Query Rules

inspect accepts only symbol-like input:

  • ClassName
  • function_name
  • ClassName.method_name
  • symbol_prefix

It rejects natural language, file paths, and directory paths. Use outline for file paths.
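A caller that wants to pre-screen input before calling inspect could use a check along these lines. The regular expression and the `classify_query` helper are hypothetical; the tool's own validation rules may be stricter or looser.

```python
import re

# Hypothetical pattern: one identifier, optionally followed by dotted members.
SYMBOL_LIKE = re.compile(r"[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*")


def classify_query(query: str) -> str:
    """Return 'path', 'symbol', or 'rejected' for a raw query string."""
    if "/" in query or "\\" in query:
        return "path"      # better served by outline
    if SYMBOL_LIKE.fullmatch(query):
        return "symbol"    # acceptable input for inspect
    return "rejected"      # natural language, empty input, and so on


for q in ["Tool", "Tool.method_name", "src/app.py", "where is Tool defined"]:
    print(f"{q!r}: {classify_query(q)}")
```

Note that a bare file name such as app.py is indistinguishable from a dotted symbol in this sketch, so it only catches paths that contain a separator.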

search accepts A|B|C as a non-regex OR shorthand. --kind accepts one kind or comma-separated kinds, --path filters to a file or directory, and --exact-only disables prefix/fuzzy matches. The same filters are available in the Python API as kind=, path=, and exact_only=True.

Python indexes top-level constants, top-level variables, and top-level dictionary keys as symbols. Dictionary keys use kind=dict_key and the parent assignment as container.
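The rule above can be illustrated with the standard-library ast module. This is a swapped-in technique for illustration only: the tool itself parses with tree-sitter, and the kind names "constant" and "variable" are assumptions (only dict_key appears in this README).

```python
import ast


def top_level_symbols(source: str):
    """Return (name, kind, container) tuples for top-level assignments and dict keys."""
    found = []
    for node in ast.parse(source).body:
        # Only plain module-level assignments; annotated ones are skipped here.
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if not isinstance(target, ast.Name):
                continue
            # Assumed convention: ALL_CAPS names are treated as constants.
            kind = "constant" if target.id.isupper() else "variable"
            found.append((target.id, kind, None))
            if isinstance(node.value, ast.Dict):
                for key in node.value.keys:
                    # Only literal string keys are named; ** entries have key None.
                    if isinstance(key, ast.Constant) and isinstance(key.value, str):
                        found.append((key.value, "dict_key", target.id))
    return found


sample = (
    "MAX_RETRIES = 3\n"
    'settings = {"timeout": 30, "debug": False}\n'
    "\n"
    "def main():\n"
    "    local = 1\n"
)
symbols = top_level_symbols(sample)
for entry in symbols:
    print(entry)
```

The assignment inside main() is not reported, since only module-level statements are visited, and each dictionary key carries the name of its parent assignment as container.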

All line ranges are start:end, 0-based, with end exclusive.
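Since the ranges are 0-based and end-exclusive, they map directly onto Python slicing. The helper names below are illustrative and are not part of the package.

```python
def parse_range(text: str) -> tuple[int, int]:
    """Split 'start:end' into two integers."""
    start, end = (int(part) for part in text.split(":"))
    return start, end


def lines_in_range(source: str, text: str) -> list[str]:
    """Return the lines covered by a 0-based, end-exclusive range."""
    start, end = parse_range(text)
    return source.splitlines()[start:end]


def to_editor_lines(text: str) -> tuple[int, int]:
    """Convert to the 1-based, inclusive numbering most editors display."""
    start, end = parse_range(text)
    return start + 1, end


source = "\n".join(f"line {i}" for i in range(10))
shown = lines_in_range(source, "3:6")
editor = to_editor_lines("120:123")
print(shown, editor)
```

A range such as 120:123 therefore covers three lines, which an editor would show as lines 121 through 123.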

Python API

import code_symbol_index as csi

csi.index("/path/to/repo")
csi.update(["src/app.py", "src/lib.py"], root="/path/to/repo")

print(csi.status_text("/path/to/repo"))
print(csi.search_text("Tool", root="/path/to/repo"))
print(csi.search_text("Tool|Agent", root="/path/to/repo", kind="class,function", path="src"))
print(csi.inspect_text("Tool", root="/path/to/repo"))
print(csi.inspect_text("Tool", root="/path/to/repo", path="src", exact_only=True))
print(csi.inspect_text("Tool", root="/path/to/repo", anchors=True))
print(csi.outline_text("src/app.py", root="/path/to/repo"))
print(csi.outline_text("src/app.py", root="/path/to/repo", symbol="Tool"))

symbols = csi.search("Tool", root="/path/to/repo", format="object")
symbols = csi.search(["Tool", "Agent", "Runner"], root="/path/to/repo")
search_payload = csi.search("Tool", root="/path/to/repo", format="json")
search_text = csi.search("Tool", root="/path/to/repo", format="text")
inspection = csi.inspect("Tool", root="/path/to/repo")
anchored = csi.inspect("Tool", root="/path/to/repo", format="json", anchors=True)
references = csi.refs("Tool", root="/path/to/repo", limit=20, offset=0)

For repeated queries, reuse a repository handle:

repo = csi.Repository("/path/to/repo")
repo.update(["src/app.py"])
print(repo.search_text("Tool"))
print(repo.inspect_text("Tool"))
print(repo.outline_text("src/app.py"))

Refresh and update accept an optional progress callback:

def on_progress(event, *, done=0, total=0, path=None):
    print(event, done, total, path)

repo = csi.Repository("/path/to/repo", progress=on_progress)
repo.refresh()
repo.update(["src/app.py"], progress=on_progress)

Stable progress events are scan, start, file, and finish.
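A callback only needs to match the signature shown above, so a callable object works as well as a function. The sketch below tallies events; `fake_refresh` is a hypothetical driver used to exercise the callback without an index, and the order in which it emits events is an assumption.

```python
from collections import Counter


class ProgressLog:
    """Callable that counts progress events and remembers the last file seen."""

    def __init__(self):
        self.counts = Counter()
        self.last_path = None

    def __call__(self, event, *, done=0, total=0, path=None):
        self.counts[event] += 1
        if event == "file":
            self.last_path = path


def fake_refresh(paths, progress):
    # Hypothetical driver; the real emitter and its event ordering may differ.
    progress("scan")
    progress("start", total=len(paths))
    for done, path in enumerate(paths, start=1):
        progress("file", done=done, total=len(paths), path=path)
    progress("finish", done=len(paths), total=len(paths))


log = ProgressLog()
fake_refresh(["src/app.py", "src/lib.py"], log)
print(dict(log.counts), log.last_path)
```

Keeping all parameters after event keyword-only with defaults means the callback tolerates events that omit done, total, or path.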

To refresh the index during application startup without blocking startup:

thread = csi.refresh_async("/path/to/repo", progress=on_progress)

refresh_async creates its own Repository inside the background thread. Do not share a Repository instance across threads.
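One plausible reason for this rule, offered as an assumption rather than a statement about the library's internals, is that the index is SQLite-backed and Python's sqlite3 connections are bound to the thread that created them by default. The sketch below uses only the standard library to show the failure mode and the create-inside-the-thread pattern.

```python
import sqlite3
import threading

shared = sqlite3.connect(":memory:")  # created on the main thread
errors = []
results = []


def use_shared_connection():
    # With the default check_same_thread=True this raises ProgrammingError.
    try:
        shared.execute("SELECT 1")
    except sqlite3.ProgrammingError as exc:
        errors.append(type(exc).__name__)


def use_own_connection():
    # Creating the handle inside the worker thread avoids the problem.
    own = sqlite3.connect(":memory:")
    try:
        results.append(own.execute("SELECT 1").fetchone()[0])
    finally:
        own.close()


for target in (use_shared_connection, use_own_connection):
    worker = threading.Thread(target=target)
    worker.start()
    worker.join()

shared.close()
print(errors, results)
```

The same pattern applies to any per-thread resource: construct it in the thread that will use it, and pass only plain data such as the root path across threads.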

Queries require an existing index. Run code-symbol-index index or code_symbol_index.index() first. Queries do not sync automatically unless called with --sync or sync=True. After external file edits, call code_symbol_index.update(paths, root=...) or Repository.update(paths) to refresh only those files; deleted or newly ignored paths are removed from the index.

Top-level query APIs accept format="object" | "text" | "json":

  • object returns Python dataclasses/lists and is the default.
  • text returns the same readable format as the *_text helpers.
  • json returns JSON-safe Python dict/list data.

search accepts one query, A|B|C, or a list of symbol names/prefixes. Multiple queries are OR-ed, are not regexes, and share one total limit. Search text and JSON formats include has_more when more matches exist beyond limit.
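The multi-query semantics can be sketched over a plain list of names. Everything below is illustrative: `normalize_queries` and `search_names` are hypothetical helpers, and simple prefix matching stands in for the tool's real scoring.

```python
def normalize_queries(query):
    """Accept 'A', 'A|B|C', or a list of those and return a flat list of terms."""
    items = [query] if isinstance(query, str) else list(query)
    terms = []
    for item in items:
        terms.extend(part.strip() for part in item.split("|") if part.strip())
    return terms


def search_names(names, query, limit=20):
    """OR the terms together with plain prefix matching and one shared limit."""
    terms = normalize_queries(query)
    matches = []
    for name in names:
        # '|' is a literal separator, not a regex operator.
        term = next((t for t in terms if name.startswith(t)), None)
        if term is not None:
            matches.append({"name": name, "matched_query": term})
    return {"symbols": matches[:limit], "has_more": len(matches) > limit}


names = ["Tool", "ToolRegistry", "Agent", "Runner", "Parser"]
page = search_names(names, "Tool|Agent", limit=2)
print(page)
```

With limit=2 the three matches are truncated to two and has_more is true, which tells a caller to narrow the query or raise the limit rather than assume the list is complete.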

Development

make install
make check
make smoke
make clean

Python API List

Index lifecycle:

  • index(root=".", *, language=None, progress=None) -> Repository
  • update(paths, *, root=".", language=None, progress=None) -> Repository (CLI: code-symbol-index update <paths...> --root <repo>)
  • refresh_async(root=".", *, language=None, db_path=None, progress=None, daemon=True) -> threading.Thread
  • install_skill(*, target="codex", codex_home=None, force=False) -> Path
  • clean(root=".") -> None
  • status(root=".", *, language=None, db_path=None, check=False, max_pending_files=50, format="object") -> IndexStatus | str | dict
  • status_text(root=".", *, language=None, db_path=None, check=False, max_pending_files=50) -> str

Queries:

  • search(query: str | list[str], *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, sync=False, format="object") -> list[Symbol] | str | dict
  • search_text(query: str | list[str], *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, sync=False) -> str
  • inspect(query, *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, anchors=False, sync=False, format="object", ...) -> Inspection | str | dict
  • inspect_text(query, *, root=".", kind=None, language=None, path=None, exact_only=False, anchors=False, sync=False, ...) -> str
  • refs(query, *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, offset=0, sync=False, format="object") -> Page | str | dict
  • impls(query, *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, offset=0, sync=False, format="object") -> Page | str | dict
  • outline(path, *, root=".", symbol=None, max_symbols=200, sync=False, format="object") -> Page | str | dict
  • outline_text(path, *, root=".", symbol=None, max_symbols=200, sync=False) -> str

Repository handle:

  • Repository(root=".", *, languages=None, include=None, exclude=None, db_path=None)
  • Repository.refresh(*, progress=None) -> Repository
  • Repository.update(paths=None, *, progress=None) -> Repository
  • Repository.search(...), search_text(...)
  • Repository.inspect(...), inspect_text(...)
  • Repository.refs(...), impls(...)
  • Repository.outline(...), outline_text(...)
  • Repository.clean() -> None

Data classes:

  • Symbol
  • Reference
  • Page
  • Inspection
  • InspectOptions
  • IndexStatus

About

A simple tree-sitter based code symbol index and search tool, fully AI-maintained, providing API and CLI interfaces.
