Parseltongue

v1.2.0 - Parse once, query forever. A local HTTP backend that makes any LLM agent understand your codebase.

# Index your codebase
parseltongue pt01-folder-to-cozodb-streamer ./my-project --db "rocksdb:mycode.db"

# Start the HTTP server (default port: 7777)
parseltongue pt08-http-code-query-server --db "rocksdb:mycode.db"

# Query from your LLM agent
curl http://localhost:7777/codebase-statistics-overview-summary

12 languages: Rust, Python, JavaScript, TypeScript, Go, Java, C, C++, Ruby, PHP, C#, Swift

The Problem

graph LR
    subgraph "Without Parseltongue"
        A[LLM Agent] -->|grep/read files| B[500K tokens]
        B --> C[Context overflow]
        C --> D[Poor reasoning]
    end

    style A fill:#FFB6C1
    style B fill:#FFB6C1
    style C fill:#FFB6C1
    style D fill:#FFB6C1

Developers and LLM agents cannot easily understand codebases. They resort to grep, which:

Returns raw text (no semantic understanding)
Consumes excessive tokens
Misses relationships between code entities
Requires re-parsing on every query

The Solution

graph LR
    subgraph "With Parseltongue"
        A[LLM Agent] -->|HTTP query| B[3K tokens]
        B --> C[Focused context]
        C --> D[Better reasoning]
    end

    style A fill:#90EE90
    style B fill:#90EE90
    style C fill:#90EE90
    style D fill:#90EE90

Code is a graph, not text. Parseltongue:

Parses your codebase once (tree-sitter, 12 languages)
Stores entities + dependencies in a graph database (CozoDB)
Serves an HTTP API that any LLM agent can query

Impact Analysis

"If I change this function, what breaks?"

With grep:

flowchart LR
    A[grep -r 'authenticate'] --> B[51 matches]
    B --> C[500K tokens]
    C --> D[No dependency info]
    D --> E[Manual work]
    style E fill:#ffcccc,stroke:#cc0000

With Parseltongue:

flowchart LR
    A[blast-radius API] --> B[302 entities]
    B --> C[2K tokens]
    C --> D[14 direct + 288 transitive]
    D --> E[Graph answer]
    style E fill:#ccffcc,stroke:#00cc00

One query:

curl "http://localhost:7777/blast-radius-impact-analysis?entity=rust:fn:authenticate:src/auth.rs:10-50&hops=2"

{"total_affected": 302, "direct_callers": 14, "transitive": 288}

Quick Start

Step 1: Index Your Codebase

parseltongue pt01-folder-to-cozodb-streamer ./my-project --db "rocksdb:mycode.db"

Output:

Running Tool 1: folder-to-cozodb-streamer
  Database: rocksdb:mycode.db

Streaming Summary:
Total files found: 108
Files processed: 92
Entities created: 216 (CODE only)
  └─ CODE entities: 216
  └─ TEST entities: 982 (excluded for optimal LLM context)

✓ Indexing completed

Step 2: Start the HTTP Server

parseltongue pt08-http-code-query-server --db "rocksdb:mycode.db"

Output:

Parseltongue HTTP Server
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

HTTP Server running at: http://localhost:7777

┌─────────────────────────────────────────────────────────────────┐
│  Add to your LLM agent: PARSELTONGUE_URL=http://localhost:7777  │
└─────────────────────────────────────────────────────────────────┘

Quick test:
  curl http://localhost:7777/server-health-check-status

Step 3: Query from Your Agent

# Health check
curl http://localhost:7777/server-health-check-status

# Codebase overview
curl http://localhost:7777/codebase-statistics-overview-summary

# Search for functions
curl "http://localhost:7777/code-entities-search-fuzzy?q=authenticate"

# What calls this function?
curl "http://localhost:7777/reverse-callers-query-graph?entity=rust:fn:process:src_lib_rs:50-100"

# What breaks if I change this?
curl "http://localhost:7777/blast-radius-impact-analysis?entity=rust:fn:new:src_storage_rs:10-30&hops=3"

# Get optimal context for LLM
curl "http://localhost:7777/smart-context-token-budget?focus=rust:fn:main:src_main_rs:1-50&tokens=4000"

Jobs To Be Done

User Job	HTTP Endpoint	Token Cost
"Is the server running?"	`GET /server-health-check-status`	~35
"Give me codebase overview"	`GET /codebase-statistics-overview-summary`	~100
"List all endpoints"	`GET /api-reference-documentation-help`	~500
"List all entities"	`GET /code-entities-list-all`	~2K
"Find functions named X"	`GET /code-entities-search-fuzzy?q=X`	~500
"Get entity details"	`GET /code-entity-detail-view?key=X`	~200
"What calls this?"	`GET /reverse-callers-query-graph?entity=X`	~500
"What does this call?"	`GET /forward-callees-query-graph?entity=X`	~500
"List all edges"	`GET /dependency-edges-list-all`	~3K
"What breaks if I change X?"	`GET /blast-radius-impact-analysis?entity=X&hops=3`	~2K
"Any circular dependencies?"	`GET /circular-dependency-detection-scan`	~1K
"Where is the complexity?"	`GET /complexity-hotspots-ranking-view?top=10`	~500
"What modules exist?"	`GET /semantic-cluster-grouping-list`	~1K
"Give me optimal context"	`GET /smart-context-token-budget?focus=X&tokens=4000`	~4K

HTTP API Reference (15 Endpoints)

Core Endpoints

Endpoint	Description
`GET /server-health-check-status`	Server health check
`GET /codebase-statistics-overview-summary`	Entity/edge counts, languages
`GET /api-reference-documentation-help`	Full API documentation

Entity Endpoints

Endpoint	Description
`GET /code-entities-list-all`	All entities
`GET /code-entities-list-all?entity_type=function`	Filter by type
`GET /code-entity-detail-view?key=X`	Single entity details
`GET /code-entities-search-fuzzy?q=pattern`	Fuzzy search by name

Graph Query Endpoints

Endpoint	Description
`GET /dependency-edges-list-all`	All dependency edges
`GET /reverse-callers-query-graph?entity=X`	Who calls X?
`GET /forward-callees-query-graph?entity=X`	What does X call?
`GET /blast-radius-impact-analysis?entity=X&hops=N`	What breaks if X changes?

Analysis Endpoints

Endpoint	Description
`GET /circular-dependency-detection-scan`	Find circular dependencies
`GET /complexity-hotspots-ranking-view?top=N`	Complexity ranking
`GET /semantic-cluster-grouping-list`	Semantic module groups

Context Optimization

Endpoint	Description
`GET /smart-context-token-budget?focus=X&tokens=N`	Context selection within token budget

Real-World Example Queries (Dogfooded on Parseltongue)

These examples were run against Parseltongue's own codebase (217 entities, 3027 dependency edges).

Example 1: Understanding a New Codebase

# 1. Get codebase overview
curl http://localhost:7777/codebase-statistics-overview-summary | jq '.data'

Actual Response:

{
  "code_entities_total_count": 217,
  "test_entities_total_count": 0,
  "dependency_edges_total_count": 3027,
  "languages_detected_list": ["rust"],
  "database_file_path": "rocksdb:parseltongue.db"
}

# 2. Find complexity hotspots (most called functions)
curl "http://localhost:7777/complexity-hotspots-ranking-view?top=5" | jq '.data.hotspots'

Actual Response:

[
  {"rank": 1, "entity_key": "rust:fn:new:unknown:0-0", "inbound_count": 215, "total_coupling": 215},
  {"rank": 2, "entity_key": "rust:fn:unwrap:unknown:0-0", "inbound_count": 163, "total_coupling": 163},
  {"rank": 3, "entity_key": "rust:fn:to_string:unknown:0-0", "inbound_count": 139, "total_coupling": 139},
  {"rank": 4, "entity_key": "rust:fn:Ok:unknown:0-0", "inbound_count": 101, "total_coupling": 101},
  {"rank": 5, "entity_key": "rust:fn:Some:unknown:0-0", "inbound_count": 62, "total_coupling": 62}
]

Note: unknown:0-0 indicates stdlib/external calls (HashMap::new, unwrap, etc.)

# 3. Check for circular dependencies
curl http://localhost:7777/circular-dependency-detection-scan | jq '.data'

Actual Response:

{"has_cycles": false, "cycle_count": 0, "cycles": []}

Example 2: Impact Analysis Before Refactoring

# 1. Find who calls CozoDbStorage::new() (reverse dependencies)
curl "http://localhost:7777/reverse-callers-query-graph?entity=rust:method:new:__crates_parseltongue-core_src_storage_cozo_client_rs:38-54" | jq '.data.total_count'

Actual Response: 215 callers!

# 2. Get full blast radius (2-hop transitive impact)
curl "http://localhost:7777/blast-radius-impact-analysis?entity=rust:method:new:__crates_parseltongue-core_src_storage_cozo_client_rs:38-54&hops=2" | jq '.data'

Actual Response:

{
  "source_entity": "rust:method:new:__crates_parseltongue-core_src_storage_cozo_client_rs:38-54",
  "hops_requested": 2,
  "total_affected": 302,
  "by_hop": [
    {"hop": 1, "count": 214, "entities": ["rust:fn:build_cli:...", "rust:fn:start_http_server_blocking_loop:...", "..."]},
    {"hop": 2, "count": 88, "entities": ["rust:fn:main:...", "rust:fn:handle_blast_radius_impact_analysis:...", "..."]}
  ]
}

Insight: Changing CozoDbStorage::new() affects 302 entities transitively!

Example 3: Finding and Exploring Code

# 1. Search for storage-related entities
curl "http://localhost:7777/code-entities-search-fuzzy?q=storage" | jq '.data.total_count'

Actual Response: 36 matching entities (struct, impl, methods)

# 2. Get full source code of CozoDbStorage::new()
curl "http://localhost:7777/code-entity-detail-view?key=rust:method:new:__crates_parseltongue-core_src_storage_cozo_client_rs:38-54" | jq '.data'

Actual Response:

{
  "key": "rust:method:new:__crates_parseltongue-core_src_storage_cozo_client_rs:38-54",
  "file_path": "./crates/parseltongue-core/src/storage/cozo_client.rs",
  "entity_type": "method",
  "language": "rust",
  "code": "    pub async fn new(engine_spec: &str) -> Result<Self> {\n        let (engine, path) = if engine_spec.contains(':') {\n            let parts: Vec<&str> = engine_spec.splitn(2, ':').collect();\n            (parts[0], parts[1])\n        } else {\n            (engine_spec, \"\")\n        };\n        let db = DbInstance::new(engine, path, Default::default())\n            .map_err(|e| ParseltongError::DatabaseError {...})?;\n        Ok(Self { db })\n    }"
}

# 3. See what CozoDbStorage::new() calls (forward dependencies)
curl "http://localhost:7777/forward-callees-query-graph?entity=rust:method:new:__crates_parseltongue-core_src_storage_cozo_client_rs:38-54" | jq '.data.callees'

Actual Response:

[
  {"to_key": "rust:fn:Ok:unknown:0-0", "edge_type": "Calls"},
  {"to_key": "rust:fn:collect:unknown:0-0", "edge_type": "Calls"},
  {"to_key": "rust:fn:contains:unknown:0-0", "edge_type": "Calls"},
  {"to_key": "rust:fn:default:unknown:0-0", "edge_type": "Calls"},
  {"to_key": "rust:fn:map_err:unknown:0-0", "edge_type": "Calls"},
  {"to_key": "rust:fn:new:unknown:0-0", "edge_type": "Calls"},
  {"to_key": "rust:fn:splitn:unknown:0-0", "edge_type": "Calls"},
  {"to_key": "rust:fn:to_string:unknown:0-0", "edge_type": "Calls"}
]

Example 4: Find HTTP Handlers

# Search for all handler functions
curl "http://localhost:7777/code-entities-search-fuzzy?q=handler" | jq '.data.total_count'

Actual Response: 124 handler-related entities (functions, modules, structs)

Example 5: Smart Context for LLM Agents

# Get optimal context within 2000 token budget
curl "http://localhost:7777/smart-context-token-budget?focus=rust:method:new:__crates_parseltongue-core_src_storage_cozo_client_rs:38-54&tokens=2000" | jq '.data'

Actual Response:

{
  "focus_entity": "rust:method:new:__crates_parseltongue-core_src_storage_cozo_client_rs:38-54",
  "token_budget": 2000,
  "tokens_used": 816,
  "entities_included": 8,
  "context": [
    {"entity_key": "rust:fn:Ok:unknown:0-0", "relevance_score": 0.95, "relevance_type": "direct_callee"},
    {"entity_key": "rust:fn:collect:unknown:0-0", "relevance_score": 0.95, "relevance_type": "direct_callee"},
    {"entity_key": "rust:fn:contains:unknown:0-0", "relevance_score": 0.95, "relevance_type": "direct_callee"},
    {"entity_key": "rust:fn:default:unknown:0-0", "relevance_score": 0.95, "relevance_type": "direct_callee"},
    {"entity_key": "rust:fn:map_err:unknown:0-0", "relevance_score": 0.95, "relevance_type": "direct_callee"},
    {"entity_key": "rust:fn:new:unknown:0-0", "relevance_score": 0.95, "relevance_type": "direct_callee"},
    {"entity_key": "rust:fn:splitn:unknown:0-0", "relevance_score": 0.95, "relevance_type": "direct_callee"},
    {"entity_key": "rust:fn:to_string:unknown:0-0", "relevance_score": 0.95, "relevance_type": "direct_callee"}
  ]
}

Smart Context Algorithm:

Direct callers: score 1.0
Direct callees: score 0.95
Transitive deps: score 0.7 - (0.1 × depth)
Greedy knapsack selection until budget exhausted

Entity Key Format

Entity keys follow this pattern:

language:entity_type:entity_name:file_path:line_range

Example: rust:fn:authenticate:src_auth_rs:10-50

Language: rust
Type: fn (function)
Name: authenticate
File: src/auth.rs (slashes become underscores)
Lines: 10-50

Tip: When using entity keys in URLs with query parameters, colons work fine:

curl "http://localhost:7777/reverse-callers-query-graph?entity=rust:fn:process:src_lib_rs:1-20"

Response Format

All endpoints return consistent JSON:

{
  "success": true,
  "endpoint": "/blast-radius-impact-analysis",
  "data": {
    "source_entity": "rust:fn:process:src_lib_rs:1-20",
    "total_affected": 14,
    "by_hop": [{"hop": 1, "count": 5, "entities": [...]}]
  },
  "tokens": 234
}

The tokens field helps LLMs understand context budget impact.

CLI Options

parseltongue pt08-http-code-query-server [OPTIONS]

Option	Description	Default
`--port <PORT>`	HTTP port	7777
`--db <PATH>`	Database path	`mem` (in-memory)
`--verbose`	Enable verbose logging	false

Database format: Always use rocksdb: prefix for persistent databases:

--db "rocksdb:mycode.db"     # Correct
--db "mycode.db"              # Wrong

Languages Supported

Language	Extensions	Entity Types
Rust	`.rs`	fn, struct, enum, trait, impl, mod
Python	`.py`	def, class, async def
JavaScript	`.js`, `.jsx`	function, class, arrow functions
TypeScript	`.ts`, `.tsx`	function, class, interface, type
Go	`.go`	func, type, struct, interface
Java	`.java`	class, interface, method, enum
C	`.c`, `.h`	function, struct, typedef
C++	`.cpp`, `.hpp`	function, class, struct, template
Ruby	`.rb`	def, class, module
PHP	`.php`	function, class, trait
C#	`.cs`	class, struct, interface, method
Swift	`.swift`	func, class, struct, protocol

Edge Types

Edge Type	Direction	Meaning
`Calls`	downward	Function invocation
`Uses`	downward	Type/constant reference
`Implements`	upward	Trait implementation
`Extends`	upward	Inheritance
`Contains`	downward	Structural containment

Data Granularity

Parseltongue stores two types of data with different granularity levels:

Entities (Fine-Grained)

Entities are parsed at function/method/struct level with full source locations:

Entity Type	Count Example	Description
`struct`	66	Struct definitions
`function`	53	Free functions
`method`	46	Methods on impl blocks
`module`	40	Module declarations
`impl`	8	Implementation blocks
`enum`	4	Enum definitions

Entity Key Format: language:type:name:file_path:start_line-end_line

rust:method:new:__crates_core_src_storage_rs:38-54

Dependency Edges (File-to-Symbol)

Edges track file-level outgoing dependencies to symbol-level targets:

from_key: rust:file:__crates_core_src_entities_rs:1-1   (file level)
to_key:   rust:fn:new:unknown:0-0                        (symbol level)

External/Stdlib References (`unknown:0-0`)

When code calls external functions (stdlib, crate dependencies), the target has unknown:0-0 as its source location because parseltongue cannot locate the source:

Pattern	Meaning	Example
`rust:fn:new:unknown:0-0`	Stdlib `new()` calls	`HashMap::new()`, `Vec::new()`
`rust:fn:unwrap:unknown:0-0`	Stdlib `unwrap()` calls	`result.unwrap()`
`rust:fn:Ok:unknown:0-0`	Result enum variant	`Ok(value)`
`rust:module:SomeType:0-0`	External type reference	Type from another crate

Why this matters: The complexity hotspots endpoint shows rust:fn:new:unknown:0-0 with 215 callers - this means 215 places in your codebase call new() on various types.

Architecture

4-Word Naming Convention: All functions and endpoints use exactly 4 words:

serve-http-code-backend          # 4 words
blast-radius-impact-analysis     # 4 words
code-entities-search-fuzzy       # 4 words

Single Binary: ~50MB, zero runtime dependencies.

Installation

# Download (one command)
curl -L https://github.com/that-in-rust/parseltongue-dependency-graph-generator/releases/download/v1.2.0/parseltongue -o parseltongue && chmod +x parseltongue

# Verify
./parseltongue --version
# parseltongue 1.2.0

Optional: Add to PATH for global access:

sudo mv parseltongue /usr/local/bin/

Releases: https://github.com/that-in-rust/parseltongue-dependency-graph-generator/releases

License

MIT License - See LICENSE file

Section 02 Recipe Book

Parseltongue Ultimate Workflow Recipe Book

For LLM Agents: Strategic Code Graph Analysis Patterns

Version 2.0 — Synthesized from research + source code analysis
Purpose: Reference document for LLMs using Parseltongue to reason about codebases

Part 0: The Meta-Strategy

The Fundamental Insight

Parseltongue transforms code from unstructured text into a queryable graph. The 15 endpoints are not random—they form a coherent system:

┌─────────────────────────────────────────────────────────────────┐
│                    PARSELTONGUE ENDPOINT TAXONOMY               │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  DISCOVERY          TRAVERSAL           ANALYSIS                │
│  ─────────          ─────────           ────────                │
│  • statistics       • reverse-callers   • blast-radius          │
│  • list-all         • forward-callees   • circular-deps         │
│  • search-fuzzy     • edges-list        • complexity-hotspots   │
│  • entity-detail                        • semantic-clusters     │
│                                                                 │
│                     INTELLIGENCE                                │
│                     ────────────                                │
│                     • smart-context-token-budget                │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

The Golden Rule: Discovery → Traversal → Analysis → Intelligence

Every workflow follows this progression. Never jump to Analysis without Discovery first.

Part 1: Core Workflow Patterns

Pattern 1.1: The Orientation Sequence (First Contact)

When: You encounter an unfamiliar codebase
Goal: Build mental model in <5 minutes

# STEP 1: Vital Signs (10 tokens)
curl http://localhost:7777/server-health-check-status

# STEP 2: Scale Assessment (100 tokens)
curl http://localhost:7777/codebase-statistics-overview-summary
# → Decision Point:
#   • <500 entities: Small, can list-all
#   • 500-5000 entities: Medium, use search-fuzzy
#   • >5000 entities: Large, use semantic-clusters first

# STEP 3: Architectural Health (1K tokens)
curl http://localhost:7777/circular-dependency-detection-scan
# → Critical: If has_cycles=true, investigate BEFORE any changes

# STEP 4: Complexity Landscape (500 tokens)
curl "http://localhost:7777/complexity-hotspots-ranking-view?top=10"
# → Note: unknown:0-0 entries are stdlib calls (ignore for architecture)
# → Focus on YOUR code entities in top 10

# STEP 5: Module Structure (1K tokens)
curl http://localhost:7777/semantic-cluster-grouping-list
# → Algorithm: Label Propagation (from source code)
# → Groups entities by bidirectional edge connectivity
# → High internal_edges / low external_edges = cohesive module

Output: You now know:

How big is this codebase?
Is the architecture clean (no cycles)?
Where are the complexity hotspots?
What are the natural module boundaries?

Pattern 1.2: The Surgical Investigation (Bug Hunting)

When: You have an error/symptom and need root cause
Goal: Trace backward from symptom to cause

# STEP 1: Locate the symptom
curl "http://localhost:7777/code-entities-search-fuzzy?q=ERROR_TEXT_OR_FUNCTION"
# → Get entity key(s) related to the error

# STEP 2: Understand the context
curl "http://localhost:7777/code-entity-detail-view?key=ENTITY_KEY"
# → Read the actual code where error occurs

# STEP 3: Trace callers (who triggers this?)
curl "http://localhost:7777/reverse-callers-query-graph?entity=ENTITY_KEY"
# → Source code reveals: Uses fuzzy matching on function name
# → Even if key doesn't match exactly, it finds related calls

# STEP 4: Check temporal coupling (hidden relationships)
curl "http://localhost:7777/temporal-coupling-hidden-deps?entity=ENTITY_KEY"
# → KILLER INSIGHT: Shows files that change together WITHOUT code edges
# → If a config file shows high coupling but no code edge → likely cause

# STEP 5: Assess investigation scope
curl "http://localhost:7777/blast-radius-impact-analysis?entity=SUSPECT_ENTITY&hops=2"
# → Algorithm: BFS traversal with fuzzy key matching
# → hops=1: Direct callers only
# → hops=2: Callers + callers-of-callers
# → hops=3: Standard investigation depth

Decision Tree:

Error Found
    │
    ▼
Search for error location
    │
    ├─── Found unique match ──► Get entity detail ──► Trace callers
    │
    └─── Found multiple matches ──► Check complexity hotspots
                                    (bugs cluster in complex code)

Pattern 1.3: The Safe Refactor Protocol

When: You need to modify code without breaking things
Goal: Quantify risk before making changes

# STEP 1: Identify the target
curl "http://localhost:7777/code-entities-search-fuzzy?q=FUNCTION_TO_CHANGE"
# → Get exact entity key

# STEP 2: Direct impact (fan-in)
curl "http://localhost:7777/reverse-callers-query-graph?entity=ENTITY_KEY"
# → Count: This is the MINIMUM number of places to update

# STEP 3: Transitive impact (blast radius)
curl "http://localhost:7777/blast-radius-impact-analysis?entity=ENTITY_KEY&hops=3"
# → Algorithm from source: BFS with visited set, respects max hops
# → RISK SCORE = hop1_count × 3 + hop2_count × 2 + hop3_count × 1

# STEP 4: Check for cycle involvement
curl http://localhost:7777/circular-dependency-detection-scan
# → Algorithm: DFS with three-color marking (WHITE/GRAY/BLACK)
# → If your entity is IN a cycle, refactoring is HIGH RISK

# STEP 5: Hidden coupling check
curl "http://localhost:7777/temporal-coupling-hidden-deps?entity=ENTITY_KEY"
# → If files show high co-change with NO code edge:
#   These files MUST be updated together even if code doesn't require it

# STEP 6: Generate comprehensive context for AI assistance
curl "http://localhost:7777/smart-context-token-budget?focus=ENTITY_KEY&tokens=6000"
# → Algorithm from source:
#   • Direct callers: score 1.0
#   • Direct callees: score 0.95
#   • Transitive depth N: score = 0.7 - (0.1 × N)
#   • Greedy knapsack selection until budget exhausted

Risk Quantification Matrix:

Metric	Low Risk	Medium Risk	High Risk
Direct callers	0-5	6-15	>15
Blast radius (3 hops)	0-20	21-50	>50
In cycle?	No	Adjacent	Yes
Temporal coupling violations	0	1-2	>2

Part 2: Advanced Composite Workflows

Pattern 2.1: The Architecture Audit

When: Tech debt assessment, sprint planning, major refactor planning
Goal: Comprehensive health score with prioritized action items

# PHASE 1: Structural Health
curl http://localhost:7777/codebase-statistics-overview-summary
curl http://localhost:7777/circular-dependency-detection-scan
curl http://localhost:7777/semantic-cluster-grouping-list

# PHASE 2: Complexity Analysis
curl "http://localhost:7777/complexity-hotspots-ranking-view?top=30"

# PHASE 3: For EACH top-10 hotspot (excluding unknown:0-0):
for hotspot in TOP_10_HOTSPOTS:
    curl "http://localhost:7777/blast-radius-impact-analysis?entity=${hotspot}&hops=2"
    curl "http://localhost:7777/reverse-callers-query-graph?entity=${hotspot}"
    curl "http://localhost:7777/forward-callees-query-graph?entity=${hotspot}"
    
    # Calculate: fan_in + fan_out = total_coupling
    # If fan_in > 10 AND fan_out > 10 → GOD CLASS smell

Scoring Formula:

Architecture Health Score = 100 - penalties

Penalties:
  - Each cycle: -15 points
  - Each god class (fan_in > 10 AND fan_out > 10): -5 points
  - Hotspot in top 5 with blast_radius > 50: -3 points
  - Cluster with external_edges > internal_edges: -2 points

Pattern 2.2: The Feature Archaeology

When: Understanding how an existing feature works
Goal: Complete call tree from entry point to data layer

# STEP 1: Find the entry point
curl "http://localhost:7777/code-entities-search-fuzzy?q=FEATURE_NAME"
# → Look for handlers, controllers, or public APIs

# STEP 2: Forward trace (depth-first feature walk)
CURRENT=ENTRY_POINT
VISITED=[]
CALL_TREE=[]

while CURRENT not fully explored:
    curl "http://localhost:7777/forward-callees-query-graph?entity=${CURRENT}"
    # → Add each callee to CALL_TREE
    # → For each callee that's YOUR code (not unknown:0-0):
    #     Recurse into it
    
# STEP 3: Identify the layers
curl http://localhost:7777/semantic-cluster-grouping-list
# → Map each entity in CALL_TREE to its cluster
# → Typical pattern: Handler → Service → Repository → External

# STEP 4: Get complete context for understanding
curl "http://localhost:7777/smart-context-token-budget?focus=ENTRY_POINT&tokens=8000"

Output: A complete map like:

handle_user_login (cluster: handlers)
  └── validate_credentials (cluster: services)
        └── query_user_by_email (cluster: repositories)
              └── new:unknown:0-0 (external: database)
        └── verify_password_hash (cluster: crypto)
              └── argon2:unknown:0-0 (external: argon2)

Pattern 2.3: The Pre-Deployment Risk Assessment

When: Before releasing a set of changes
Goal: Risk score and required test coverage

# For each changed file/entity:

TOTAL_RISK = 0
REQUIRED_TESTS = []

for changed_entity in CHANGES:
    # Get blast radius
    result = curl "http://localhost:7777/blast-radius-impact-analysis?entity=${changed_entity}&hops=2"
    TOTAL_RISK += result.total_affected * 0.5
    
    # Check if it's a hotspot
    hotspots = curl "http://localhost:7777/complexity-hotspots-ranking-view?top=50"
    if changed_entity in hotspots.top_20:
        TOTAL_RISK += 10  # Hotspot penalty
    
    # Get temporal coupling
    temporal = curl "http://localhost:7777/temporal-coupling-hidden-deps?entity=${changed_entity}"
    for coupling in temporal.hidden_dependencies:
        if not coupling.has_code_edge:
            if coupling.coupled_entity not in CHANGES:
                REQUIRED_TESTS.append(coupling.coupled_entity)
                TOTAL_RISK += 5  # Missing coupled change

# Check for new cycles
cycles = curl http://localhost:7777/circular-dependency-detection-scan
if cycles.has_cycles:
    TOTAL_RISK += 50  # Critical: cycles detected

# Final assessment
if TOTAL_RISK > 100:
    RECOMMENDATION = "HIGH RISK - Requires senior review"
elif TOTAL_RISK > 50:
    RECOMMENDATION = "MEDIUM RISK - Extended testing required"
else:
    RECOMMENDATION = "LOW RISK - Standard review process"

Part 3: LLM Integration Patterns

Pattern 3.1: Intelligent Code Explanation

Goal: Explain code with full relational context

# STEP 1: Get the code itself
entity = curl "http://localhost:7777/code-entity-detail-view?key=TARGET"

# STEP 2: Get relationship context (who uses it, what it uses)
callers = curl "http://localhost:7777/reverse-callers-query-graph?entity=TARGET"
callees = curl "http://localhost:7777/forward-callees-query-graph?entity=TARGET"

# STEP 3: Get cluster context (what module is this)
clusters = curl http://localhost:7777/semantic-cluster-grouping-list
# → Find which cluster contains TARGET

# STEP 4: Get optimized context
context = curl "http://localhost:7777/smart-context-token-budget?focus=TARGET&tokens=4000"

LLM Prompt Template:

You are explaining code from a {entity.language} codebase.

## Context
Module: {cluster_name} (Purpose: {inferred_from_cluster_entities})
Callers ({callers.total_count}): {top_3_callers_summary}
Callees ({callees.total_count}): {top_3_callees_summary}

## Related Code
{smart_context.context}

## Target Code
{entity.code}

Explain:
1. What this code does
2. Why it exists (based on its callers)
3. How it fits into the larger system
4. Any potential issues or improvements

Pattern 3.2: AI-Assisted Debugging

Goal: Generate hypotheses with supporting evidence

# STEP 1: Locate error context
error_entity = curl "http://localhost:7777/code-entities-search-fuzzy?q=ERROR_LOCATION"

# STEP 2: Get backward trace (potential causes)
callers = curl "http://localhost:7777/reverse-callers-query-graph?entity=ERROR_ENTITY"

# STEP 3: Check for hidden dependencies
temporal = curl "http://localhost:7777/temporal-coupling-hidden-deps?entity=ERROR_ENTITY"

# STEP 4: Get complexity context
hotspots = curl "http://localhost:7777/complexity-hotspots-ranking-view?top=20"

# STEP 5: Generate rich context
context = curl "http://localhost:7777/smart-context-token-budget?focus=ERROR_ENTITY&tokens=6000"

LLM Prompt Template:

I'm debugging: {symptom_description}

## Error Location
{error_entity.code}

## Upstream Code (Potential Causes)
{for caller in callers.top_5}
- {caller.from_key}: calls this via {caller.edge_type}
{endfor}

## Hidden Dependencies (Files that change together)
{for dep in temporal.hidden_dependencies if not dep.has_code_edge}
⚠️ {dep.coupled_entity} - changes together but NO code connection
   Insight: {dep.insight}
{endfor}

## Complexity Note
{if error_entity in hotspots}
This code is in the TOP {rank} complexity hotspots - bugs often cluster here
{endif}

## Related Code
{smart_context}

Generate 3-5 hypotheses for the root cause, ordered by likelihood.
For each hypothesis, cite specific evidence from the context above.

Pattern 3.3: Refactoring Assistance

Goal: Plan safe refactoring with full impact awareness

# STEP 1: Full impact analysis
blast = curl "http://localhost:7777/blast-radius-impact-analysis?entity=TARGET&hops=3"

# STEP 2: Architecture context
clusters = curl http://localhost:7777/semantic-cluster-grouping-list
cycles = curl http://localhost:7777/circular-dependency-detection-scan

# STEP 3: Hidden coupling (what ELSE needs to change)
temporal = curl "http://localhost:7777/temporal-coupling-hidden-deps?entity=TARGET"

# STEP 4: Maximum context
context = curl "http://localhost:7777/smart-context-token-budget?focus=TARGET&tokens=8000"

LLM Prompt Template:

I want to refactor: {target_entity.key}
Goal: {refactoring_goal}

## Impact Analysis
- Direct callers: {blast.by_hop[0].count}
- 2-hop impact: {blast.by_hop[1].count if exists}
- 3-hop impact: {blast.by_hop[2].count if exists}
- TOTAL affected: {blast.total_affected}

## Architecture Constraints
Module: {cluster_name}
Cycle involvement: {yes/no, list if yes}

## Code Context
{smart_context}

Create a step-by-step refactoring plan that:
1. Maintains backward compatibility for {blast.by_hop[0].count} direct callers
2. Respects module boundary of {cluster_name}
3. Does NOT introduce new cycles
4. Includes changes to hidden dependencies: {temporal.hidden_deps_list}
5. Specifies test coverage requirements

Part 4: Decision Frameworks

When to Use Each Endpoint

Question Type	Primary Endpoint	Supporting Endpoints
"How big is this?"	`/codebase-statistics-overview-summary`	-
"Where is X?"	`/code-entities-search-fuzzy`	`/code-entity-detail-view`
"What is X?"	`/code-entity-detail-view`	-
"Who uses X?"	`/reverse-callers-query-graph`	`/blast-radius-impact-analysis`
"What does X use?"	`/forward-callees-query-graph`	-
"What breaks if I change X?"	`/blast-radius-impact-analysis`	`/reverse-callers-query-graph`
"Is architecture healthy?"	`/circular-dependency-detection-scan`	`/complexity-hotspots-ranking-view`
"What's most complex?"	`/complexity-hotspots-ranking-view`	`/blast-radius-impact-analysis`
"What are the modules?"	`/semantic-cluster-grouping-list`	-
"Give me context for LLM"	`/smart-context-token-budget`	All others as input

Hop Depth Guidelines for Blast Radius

Scenario	Hops	Rationale
Quick check	1	Direct callers only
Standard refactor	2	Immediate ecosystem
Major API change	3	Full transitive closure
Architecture analysis	4+	Complete picture

Warning: If impact exceeds 100 at hops=2, you're touching critical infrastructure.

Token Budget Guidelines for Smart Context

Task	Tokens	Why
Quick explanation	2000	Focused, single entity
Code review	4000	Include immediate neighbors
Debugging	6000	Need hypothesis space
Refactoring	8000	Maximum context for safe changes

Part 5: Anti-Patterns (What NOT to Do)

❌ Anti-Pattern 1: Jumping to Blast Radius

Wrong:

# Don't start here!
curl "http://localhost:7777/blast-radius-impact-analysis?entity=GUESS&hops=3"

Right:

# Always search first to get exact entity key
curl "http://localhost:7777/code-entities-search-fuzzy?q=function_name"
# THEN use the returned key for blast radius

❌ Anti-Pattern 2: Ignoring unknown:0-0

Wrong: Treating rust:fn:new:unknown:0-0 as a real entity to investigate

Right: These are stdlib/external calls. Filter them out when analyzing YOUR code:

hotspots = curl "http://localhost:7777/complexity-hotspots-ranking-view?top=20"
your_hotspots = [h for h in hotspots if 'unknown:0-0' not in h.entity_key]

❌ Anti-Pattern 4: Ignoring Token Counts

Wrong: Dumping all endpoint results into LLM context

Right: Every endpoint returns a tokens field. Track it:

total_tokens = 0
total_tokens += stats.tokens      # ~100
total_tokens += entities.tokens   # ~2000
total_tokens += blast.tokens      # ~2000
# Check against your LLM's context limit!

Part 6: The Complete Decision Flowchart

START
  │
  ▼
┌─────────────────────────────────┐
│ What's your goal?               │
└─────────────────────────────────┘
  │
  ├─── New codebase ──────────────► Pattern 1.1: Orientation Sequence
  │
  ├─── Fix a bug ─────────────────► Pattern 1.2: Surgical Investigation
  │
  ├─── Refactor safely ───────────► Pattern 1.3: Safe Refactor Protocol
  │
  ├─── Tech debt assessment ──────► Pattern 2.1: Architecture Audit
  │
  ├─── Understand a feature ──────► Pattern 2.2: Feature Archaeology
  │
  ├─── Pre-release check ─────────► Pattern 2.3: Risk Assessment
  │
  └─── Generate LLM context ──────► Pattern 3.x: LLM Integration

Appendix A: Entity Key Cheatsheet

Format: language:type:name:file_path:start_line-end_line

Example	Meaning
`rust:fn:main:src_main_rs:8-45`	Function `main` in `src/main.rs` lines 8-45
`rust:struct:Parser:lib_rs:10-25`	Struct `Parser` definition
`rust:method:new:storage_rs:38-54`	Method `new` on some impl block
`rust:fn:new:unknown:0-0`	External/stdlib `new()` call

Appendix B: Algorithm Quick Reference

Endpoint	Algorithm	Complexity
`/circular-dependency-detection-scan`	DFS with 3-color marking	O(V + E)
`/semantic-cluster-grouping-list`	Label Propagation (LPA)	O(E × iterations)
`/blast-radius-impact-analysis`	BFS with hop tracking	O(V + E) per hop
`/smart-context-token-budget`	Greedy knapsack	O(E + V log V)
`/complexity-hotspots-ranking-view`	Edge counting + sort	O(E + V log V)

Parse once, query forever.

This document is a reference for LLM agents using Parseltongue to reason about code with graphs, not text.

Parse once, query forever.

Parseltongue: Making LLMs reason about code with graphs, not text.

Name		Name	Last commit message	Last commit date
Latest commit History 793 Commits
.claude		.claude
.parseltongue/workspaces/self_analysis_20251126_150927		.parseltongue/workspaces/self_analysis_20251126_150927
crates		crates
dependency_queries		dependency_queries
entity_queries		entity_queries
zzArchive		zzArchive
.gitignore		.gitignore
CLANG_VS_TREESITTER_COMPARISON.md		CLANG_VS_TREESITTER_COMPARISON.md
CLAUDE.md		CLAUDE.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
FAQsToParseltongueV01.md		FAQsToParseltongueV01.md
README.md		README.md

that-in-rust/parseltongue-dependency-graph-generator

Folders and files

Latest commit

History

Repository files navigation

Parseltongue

The Problem

The Solution

Impact Analysis

Quick Start

Step 1: Index Your Codebase

Step 2: Start the HTTP Server

Step 3: Query from Your Agent

Jobs To Be Done

HTTP API Reference (15 Endpoints)

Core Endpoints

Entity Endpoints

Graph Query Endpoints

Analysis Endpoints

Context Optimization

Real-World Example Queries (Dogfooded on Parseltongue)

Example 1: Understanding a New Codebase

Example 2: Impact Analysis Before Refactoring

Example 3: Finding and Exploring Code

Example 4: Find HTTP Handlers

Example 5: Smart Context for LLM Agents

Entity Key Format

Response Format

CLI Options

Languages Supported

Edge Types

Data Granularity

Entities (Fine-Grained)

Dependency Edges (File-to-Symbol)

External/Stdlib References (unknown:0-0)

Architecture

Installation

License

Section 02 Recipe Book

Parseltongue Ultimate Workflow Recipe Book

For LLM Agents: Strategic Code Graph Analysis Patterns

Part 0: The Meta-Strategy

The Fundamental Insight

The Golden Rule: Discovery → Traversal → Analysis → Intelligence

Part 1: Core Workflow Patterns

Pattern 1.1: The Orientation Sequence (First Contact)

Pattern 1.2: The Surgical Investigation (Bug Hunting)

Pattern 1.3: The Safe Refactor Protocol

Part 2: Advanced Composite Workflows

Pattern 2.1: The Architecture Audit

Pattern 2.2: The Feature Archaeology

Pattern 2.3: The Pre-Deployment Risk Assessment

Part 3: LLM Integration Patterns

Pattern 3.1: Intelligent Code Explanation

Pattern 3.2: AI-Assisted Debugging

Pattern 3.3: Refactoring Assistance

Part 4: Decision Frameworks

When to Use Each Endpoint

Hop Depth Guidelines for Blast Radius

Token Budget Guidelines for Smart Context

Part 5: Anti-Patterns (What NOT to Do)

❌ Anti-Pattern 1: Jumping to Blast Radius

❌ Anti-Pattern 2: Ignoring unknown:0-0

❌ Anti-Pattern 4: Ignoring Token Counts

Part 6: The Complete Decision Flowchart

Appendix A: Entity Key Cheatsheet

Appendix B: Algorithm Quick Reference

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases 15

Packages 0

Uh oh!

Contributors 3

Uh oh!

Languages

External/Stdlib References (`unknown:0-0`)

Packages