bug: codedb_symbol body=true returns only the signature line — Symbol.line_end is never populated #224

@ocordeiro

Problem

codedb_symbol with body=true is documented to return the full source of the matched symbol so agents can avoid a follow-up codedb_read. In practice it returns only the signature line.

Example against a real TypeScript file:

codedb_symbol { name: "handleResendWebhook", body: true }

Response:

1 results for 'handleResendWebhook':
  backend/src/services/resendWebhookService.ts:105 (function)
  // export async function handleResendWebhook(event, svixId): Promise<...> {
  105 | export async function handleResendWebhook(event, svixId): Promise<...> {

The actual function is 55 lines (L105–L159). Only line 105 is returned. Reproduced across recordBounce, checkAndSuspend, handleBounced, validateNewsletterForSending, suspendBouncedSubscribers — every multi-line function behaves the same way, regardless of language.

Root cause

Every per-language parser in src/explore.zig writes:

.line_start = line_num,
.line_end   = line_num,

The line_end field has existed on Symbol (src/explore.zig:33) since the first commit (c7b28d4, 2026-03-03), but it has never been populated with the real end of the block; it has always been set to the signature line. This affects parseZigLine, parsePythonLine, parseTsLine, parseRustLine, parsePhpLine, parseGoLine, and parseRubyLine (~35 occurrences of .line_end = line_num).

The body=true flag was added in 8051d12 (2026-03-05, "feat: add agent-friendly MCP enhancements"), wiring codedb_symbol through explorer.getSymbolBody(path, line_start, line_end, ...) in src/mcp.zig:783. That call assumes a correct line_end and extracts [line_start..line_end] via extractLines — which, with line_end == line_start, returns a one-line slice. The feature commit did not add a regression test that would catch this.
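
To make the failure mode concrete, here is a minimal sketch of a 1-based, inclusive line-range extractor. extractLinesSketch is a hypothetical stand-in written for this issue, not the actual extractLines in src/explore.zig, whose signature and edge-case handling may differ. It only illustrates why line_end == line_start can never yield more than the signature line:

```zig
const std = @import("std");

// Hypothetical stand-in for extractLines (assumption: 1-based, inclusive range).
// Returns the slice of `content` covering lines line_start..line_end,
// or null if the range is invalid or extends past the end of the file.
fn extractLinesSketch(content: []const u8, line_start: u32, line_end: u32) ?[]const u8 {
    if (line_start == 0 or line_end < line_start) return null;
    var line: u32 = 1;
    var begin: ?usize = null;
    var i: usize = 0;
    while (i <= content.len) : (i += 1) {
        // Remember the byte offset where the first requested line starts.
        if (line == line_start and begin == null) begin = i;
        const at_end = i == content.len;
        if (at_end or content[i] == '\n') {
            // The end of the last requested line closes the slice.
            if (line == line_end) {
                return if (begin) |b| content[b..i] else null;
            }
            line += 1;
        }
    }
    return null;
}

test "sketch: line_end == line_start yields only the signature line" {
    const src = "pub fn foo() u32 {\n    return 1;\n}";
    try std.testing.expectEqualStrings("pub fn foo() u32 {", extractLinesSketch(src, 1, 1).?);
    try std.testing.expectEqualStrings(src, extractLinesSketch(src, 1, 3).?);
}
```

With a correct line_end the same call returns the whole function, which is why the fix can leave the extraction path untouched.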

Every language added after that commit (feat: add Go and Ruby language support (#151), feat(explore): add PHP/Laravel language support (#87), feat: add Rust symbol parsing to codedb_outline) replicated the same .line_end = line_num pattern, propagating the bug.

Failing Test

Branch (on fork): issue-224-failing-test · commit 9aeca60.

Test added to src/tests.zig:

test "issue-224: codedb_symbol body=true returns only signature — line_end never populated" {
    var arena = std.heap.ArenaAllocator.init(testing.allocator);
    defer arena.deinit();
    const alloc = arena.allocator();

    var explorer = Explorer.init(alloc);

    // Multi-line function: signature on line 1, body on lines 2..4, closing brace on line 5.
    try explorer.indexFile("t.zig",
        \\pub fn foo() u32 {
        \\    const a: u32 = 1;
        \\    const b: u32 = 2;
        \\    return a + b;
        \\}
    );

    const results = try explorer.findAllSymbols("foo", alloc);
    defer alloc.free(results);
    try testing.expect(results.len == 1);

    const sym = results[0].symbol;
    try testing.expectEqual(@as(u32, 1), sym.line_start);
    // With the bug, line_end == line_start (== 1). After the fix, it must reach
    // the closing brace on line 5.
    try testing.expectEqual(@as(u32, 5), sym.line_end);

    // Full-body extraction via getSymbolBody — the exact path codedb_symbol body=true
    // takes — must contain every body line, not just the signature.
    const body = (try explorer.getSymbolBody("t.zig", sym.line_start, sym.line_end, alloc)) orelse
        return error.TestUnexpectedResult;
    try testing.expect(std.mem.indexOf(u8, body, "pub fn foo()") != null);
    try testing.expect(std.mem.indexOf(u8, body, "const a: u32 = 1;") != null);
    try testing.expect(std.mem.indexOf(u8, body, "const b: u32 = 2;") != null);
    try testing.expect(std.mem.indexOf(u8, body, "return a + b;") != null);
}

Reproduce from upstream main:

# From a clone of the upstream repo
git remote add ocordeiro https://github.com/ocordeiro/codedb.git
git fetch ocordeiro issue-224-failing-test
git checkout ocordeiro/issue-224-failing-test
zig build test 2>&1 | grep "issue-224"

Fails with:

error: 'tests.test.issue-224: codedb_symbol body=true returns only signature — line_end never populated' failed: expected 5, found 1
src/tests.zig:5397:5: in test.issue-224: ... (test)
    try testing.expectEqual(@as(u32, 5), sym.line_end);

All other 323 tests pass.

Expected

For every multi-line symbol kind, Symbol.line_end must point to the closing delimiter of the block (last } for brace languages, last body line for Python, end keyword for Ruby). codedb_symbol body=true and explorer.getSymbolBody(...) should then return the full source of the symbol in a single call, eliminating the follow-up codedb_read that the flag was added to avoid.

Single-line kinds (.import, .variable, .constant, .comment_block, .type_alias, .macro_def) should remain with line_end == line_start.
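
The single-line rule can be expressed as a small predicate. This is a sketch under assumptions: SymbolKind here is a hypothetical subset containing only kinds named in this issue, and isSingleLineKind is an illustrative name, not a function from the codebase.

```zig
const std = @import("std");

// Hypothetical subset of symbol kinds; the real enum in src/explore.zig
// likely has more members.
const SymbolKind = enum {
    function,
    struct_def,
    import,
    variable,
    constant,
    comment_block,
    type_alias,
    macro_def,
};

// Kinds for which line_end should stay equal to line_start.
fn isSingleLineKind(kind: SymbolKind) bool {
    return switch (kind) {
        .import, .variable, .constant, .comment_block, .type_alias, .macro_def => true,
        .function, .struct_def => false,
    };
}

test "sketch: imports stay single-line, functions do not" {
    try std.testing.expect(isSingleLineKind(.import));
    try std.testing.expect(!isSingleLineKind(.function));
}
```

Using an exhaustive switch means a newly added kind would fail to compile until it is classified one way or the other.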

Fix

The fix is a post-processing pass in indexFileInner (src/explore.zig), inserted between the end of the streaming parse loop and the self.mu.lock() call, while content is still in scope. The parsers themselves are not modified: a single helper, computeSymbolEnds(content, &outline), walks the already-populated outline.symbols and fills in line_end per language:

  • Brace languages (Zig, C, C++, TypeScript, JavaScript, Rust, Go, PHP): scan forward from line_start, tracking brace depth and skipping "..." strings, // line comments, and /* */ block comments, until the first opened { is closed. The scan bails out after 10 lines without an opener, so forward declarations and abstract methods stay single-line.
  • Python: indent-based. The body is the first contiguous run of lines whose indent is strictly greater than the signature's. Tolerates multi-line signatures, blank lines, and comments inside the body.
  • Ruby: indent-based, with line_end then snapped to a standalone end at the signature's indent.
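
The indent-based rule for Python can be sketched as follows. This is a simplified illustration, not the code on the fix branch: findIndentEnd and indentOf are hypothetical names, tabs and spaces are counted alike, and multi-line signatures (which the real helper is said to tolerate) are not handled here.

```zig
const std = @import("std");

// Number of leading space/tab characters on a line.
fn indentOf(line: []const u8) usize {
    var n: usize = 0;
    while (n < line.len and (line[n] == ' ' or line[n] == '\t')) n += 1;
    return n;
}

// Hypothetical sketch: returns the 1-based last line of the block whose
// signature is on `line_start`, or `line_start` if no body follows.
fn findIndentEnd(content: []const u8, line_start: u32) u32 {
    var it = std.mem.splitScalar(u8, content, '\n');
    var line_no: u32 = 0;
    var sig_indent: usize = 0;
    var end: u32 = line_start;
    while (it.next()) |raw| {
        line_no += 1;
        if (line_no < line_start) continue;
        const indent = indentOf(raw);
        if (line_no == line_start) {
            sig_indent = indent;
            continue;
        }
        // Blank lines and comment-only lines neither end nor extend the body.
        if (indent == raw.len or raw[indent] == '#' or raw[indent] == '\r') continue;
        // The first code line at or below the signature's indent ends the body.
        if (indent <= sig_indent) break;
        end = line_no;
    }
    return end;
}

test "sketch: Python def spans its indented body" {
    const src =
        \\def f(x):
        \\    # comment
        \\    y = x + 1
        \\
        \\    return y
        \\def g():
        \\    pass
    ;
    try std.testing.expectEqual(@as(u32, 5), findIndentEnd(src, 1));
}
```

Tracking the last line that actually extended the body keeps trailing blank lines out of the reported range.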

Naturally single-line kinds are short-circuited. Struct/enum/union defs already use dedicated .*_def kinds, so they flow through brace scanning correctly.
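
The brace-scanning rule can be sketched in the same spirit. Again this is an assumption-laden illustration rather than computeSymbolEnds itself: findBraceEnd is a hypothetical name, only "..." strings and the two comment forms are skipped (character literals and Zig multi-line strings are not), and the 10-line bail-out is interpreted as "no opener seen within 10 lines of the signature".

```zig
const std = @import("std");

// Hypothetical sketch: returns the 1-based line on which the first `{`
// opened at or after `line_start` is closed, or `line_start` if no block
// is found.
fn findBraceEnd(content: []const u8, line_start: u32) u32 {
    const max_lines_without_opener: u32 = 10;
    var line: u32 = 1;
    var depth: u32 = 0;
    var opened = false;
    var i: usize = 0;
    while (i < content.len) : (i += 1) {
        const c = content[i];
        if (c == '\n') {
            line += 1;
            // Give up on declarations that never open a block.
            if (!opened and line >= line_start + max_lines_without_opener) return line_start;
            continue;
        }
        if (line < line_start) continue;
        if (c == '/' and i + 1 < content.len and content[i + 1] == '/') {
            // Line comment: skip up to, but not including, the newline.
            while (i + 1 < content.len and content[i + 1] != '\n') i += 1;
        } else if (c == '/' and i + 1 < content.len and content[i + 1] == '*') {
            // Block comment: skip to the closing "*/", still counting lines.
            i += 2;
            while (i + 1 < content.len and !(content[i] == '*' and content[i + 1] == '/')) : (i += 1) {
                if (content[i] == '\n') line += 1;
            }
            i += 1;
        } else if (c == '"') {
            // String literal: skip to the closing quote, honoring backslash escapes.
            i += 1;
            while (i < content.len and content[i] != '"') : (i += 1) {
                if (content[i] == '\\') {
                    i += 1;
                } else if (content[i] == '\n') {
                    line += 1;
                }
            }
        } else if (c == '{') {
            depth += 1;
            opened = true;
        } else if (c == '}' and opened) {
            depth -= 1;
            if (depth == 0) return line;
        }
    }
    return line_start;
}

test "sketch: Zig fn from the failing test ends on its closing brace" {
    const src =
        \\pub fn foo() u32 {
        \\    const a: u32 = 1;
        \\    const b: u32 = 2;
        \\    return a + b;
        \\}
    ;
    try std.testing.expectEqual(@as(u32, 5), findBraceEnd(src, 1));
}
```

Note that this sketch would still attach a following function's block to a bodiless declaration that sits fewer than 10 lines above it; whether the real helper guards against that is not stated in this issue.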

This design was chosen over refactoring each parseXLine:

  1. The parsers are streaming per-line — giving them access to the rest of the file would require restructuring all 7.
  2. A single fix location covers every current language and any future one.
  3. Nothing about symbol detection — which has test coverage — is touched.

Side effects checked:

  • extractLines (src/explore.zig) already supports real line_end values (1-based, inclusive) — unchanged.
  • getSymbolBody and the codedb_symbol handler in src/mcp.zig:783 — unchanged.
  • codedb_outline rendering reads only line_start — unaffected.
  • Snapshots persisted on disk still contain the old line_end == line_start values; users will need to re-index to pick up correct ranges after the fix. Worth a note in the changelog.

Fix branch (on fork): fix/issue-224-symbol-line-end · commit 8919fb8.

zig build test --summary all: 333/333 passing with 9 new tests covering the regression plus one per language (Zig fn, Zig struct_def, Python def, Python class, TS function, Rust fn, Go func, PHP function, Ruby def) and an abstract-signature edge case.

End-to-end verification

Smoke-tested against the fix binary by calling codedb_symbol name=computeSymbolEnds body=true via MCP — returned all 36 lines of the function body (L1939–L1974 in src/explore.zig), not just the signature line.
