C#: tuple return types with a trailing suffix ((int, int)[], (int, int)?, (int, int)[][]) silently dropped — returnType tuple branch has no trailing-suffix slot; distinct from #241 #328

@Widthdom

Summary

The C# symbol extractor's returnType alternation accepts a "plain tuple" return (int, string) Foo() via the \([^)]+\) branch, and accepts a "tuple inside a generic" like Task<(int, string)> only after the fix in #241. But a third shape is silently dropped in both the current code and the proposed #241 fix:

Tuple with a trailing suffix: [], [,], [][], or ?. The tuple matches the first alternative of returnType, but nothing after \) in that alternative consumes a trailing [, ], or ?, so the regex hits the mandatory \s+(?<name>\w+) anchor against a [ or ? and the whole method/property line fails.

Concrete shapes dropped on v1.10.0:

public (int, int)[]   A()  => new (int, int)[0];                  // DROPPED — tuple array
public (int x, int y)[] B()=> new (int x, int y)[0];              // DROPPED — named tuple array
public (int, int)?    C()  => null;                               // DROPPED — nullable tuple
public (int x, int y)? D() => null;                               // DROPPED — named nullable tuple
public (int, int)[]   Ap { get; set; }                            // DROPPED — tuple-array auto-property
public (int, int)?    Np { get; set; }                            // DROPPED — nullable-tuple auto-property
public (int, int)[][] E()  => new (int, int)[0][];                // DROPPED — jagged tuple array
public (int, int)[]   Fp   => new (int, int)[0];                  // DROPPED — expression-bodied property

These are not rare. Tuple arrays are the idiomatic shape for "collection of paired results" when the user doesn't want to define a record (CSV parsers, matrix-row enumerators, coordinate lists). Nullable tuples are the canonical TryFind-style return in modern C# 10+ code ((User user, Error err)? result). Every one is silently absent from symbols / definition / inspect / outline.

This is distinct from #241: #241's fix widens the identifier-class branch to allow a balanced (...) embedded inside generic type parameters (Dictionary<string, (int, int)>). The present issue is about a suffix after the tuple branch itself ((int, int)[] / (int, int)?), which #241's suggested regex does not address — its first alternative is still a bare \([^()]+\) with no trailing [\w\[\]?]* slot.

Repro

CDIDX=/root/.local/bin/cdidx
mkdir -p /tmp/dogfood/cs-tuple-suffix
cat > /tmp/dogfood/cs-tuple-suffix/T.cs <<'EOF'
namespace Demo;

public class Svc
{
    // === Tuple with trailing []/?/[][]  — expected to be captured, all DROPPED ===
    public (int, int)[]        A()  => new (int, int)[0];              // line 6
    public (int x, int y)[]    B()  => new (int x, int y)[0];          // line 7
    public (int, int)?         C()  => null;                           // line 8
    public (int x, int y)?     D()  => null;                           // line 9
    public (int, int)[][]      E()  => new (int, int)[0][];            // line 10
    public (int, int)[] Ap { get; set; } = System.Array.Empty<(int, int)>();
    public (int, int)? Np { get; set; }                                // line 12
    public (int, int)[] Fp => new (int, int)[0];                       // line 13

    // === Baseline — captured today ===
    public (int, int)          Plain() => (0, 0);                       // line 16
}
EOF
"$CDIDX" index /tmp/dogfood/cs-tuple-suffix --rebuild >/dev/null
"$CDIDX" symbols --db /tmp/dogfood/cs-tuple-suffix/.cdidx/codeindex.db --path T.cs

Observed (actual):

function   Plain                                    T.cs:16
class      Svc                                      T.cs:3-17
namespace  Demo                                     T.cs:1
(3 symbols in 1 files)

Only the plain tuple return is captured. All 8 tuple-with-suffix members (5 methods, 1 auto-property with an initializer, 1 nullable auto-property, 1 expression-bodied property) are dropped.

Downstream effects:

  • definition A / B / C / D / E returns "No symbols found."; the same happens on any library with tuple-array or nullable-tuple APIs (e.g. (double x, double y)[] coordinate buffers, (int id, string name)? optional lookup results).
  • outline of a CSV parser / matrix helper / coordinate-math library understates property/method surface.
  • references on the tuple-typed property is empty because the symbol doesn't exist.
  • symbols --kind property --count on a struct that exposes (T, U)[] data is biased low.

Suspected root cause (from reading the source)

src/CodeIndex/Indexer/SymbolExtractor.cs:94 — the C# method regex:

new("function",  new Regex(
    @"^\s*(?!(?:await|return|throw|yield|var|typeof|sizeof|nameof|default|if|for|foreach|while|switch|catch|lock|using|case|else|when|break|continue|goto)\b)"
  + @"(?:(?<visibility>public|private|protected\s+internal|private\s+protected|protected|internal)\s+)?"
  + @"(?:(?:static|sealed|partial|readonly|unsafe|extern|virtual|override|abstract|async|new|file)\s+)*"
  + @"(?<returnType>\([^)]+\)|(?:global::)?[\w?.<>\[\],:]+)\s+(?<name>\w+)\s*(?:<[^>]+>\s*)?\(",
    RegexOptions.Compiled),
    BodyStyle.Brace, "visibility", "returnType"),

The returnType alternation:

\([^)]+\)  |  (?:global::)?[\w?.<>\[\],:]+
  • Branch 1 (\([^)]+\)): eats exactly one (...) group with no nested parentheses. Nothing follows. So (int, int)[] matches only the (int, int) prefix, leaving [] A() on the tape. The next \s+(?<name>\w+) expects whitespace but sees [, so the match fails.
  • Branch 2 (identifier class): characters allowed are \w ? . < > [ ] , :. ( and ) are both excluded, so the branch refuses to start on (int, int)[].

(int, int)? (nullable tuple), (int, int)[] (tuple array), (int, int)[,] (rectangular tuple array), and (int, int)[][] (jagged tuple array) all die on the same boundary.
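A minimal sketch of that boundary, assuming a cut-down version of the method regex: only the optional public visibility and the returnType/name tail are kept, and the keyword lookahead and modifier groups are omitted. The variable names are illustrative and not taken from SymbolExtractor.cs.

```csharp
using System;
using System.Text.RegularExpressions;

// Cut-down stand-in for the method regex (assumption: the keyword lookahead
// and the modifier groups are omitted; the returnType alternation is verbatim).
var current = new Regex(
    @"^\s*(?:public\s+)?"
  + @"(?<returnType>\([^)]+\)|(?:global::)?[\w?.<>\[\],:]+)"
  + @"\s+(?<name>\w+)\s*(?:<[^>]+>\s*)?\(");

string[] lines =
{
    "public (int, int) Plain() => (0, 0);",               // baseline
    "public (int, int)[] A() => new (int, int)[0];",      // tuple array
    "public (int, int)? C() => null;",                    // nullable tuple
    "public (int, int)[][] E() => new (int, int)[0][];",  // jagged tuple array
};

foreach (var line in lines)
{
    var m = current.Match(line);
    var name = m.Groups["name"].Value;
    var type = m.Groups["returnType"].Value;
    // Branch 1 stops at ')' and the following \s+ cannot consume '[' or '?'.
    Console.WriteLine(m.Success ? $"captured {name} : {type}" : $"dropped  {line}");
}
```

Under these assumptions only Plain should be reported as captured; the three suffixed lines fail at the \s+ that follows \).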

Same gap on properties:

  • :100 — property { get/set/init }: uses the single-alternative (?:global::)?[\w?.<>\[\],:]+ (no tuple branch at all).
  • :103 — expression-bodied property: same single-alternative class.

So even a plain (int, int) return without suffix fails on a property — only methods have the tuple branch. Property (int, int) P { get; } is also dropped (not included in this repro, but trivially reproducible).
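The property case can be sketched the same way. The source only states that the property regexes use the single-alternative identifier class, so the rest of the pattern below (optional public, name, opening brace) is a hypothetical simplification and not the actual regex at :100.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical cut-down property regex: only the returnType class is taken
// from the issue text; the visibility, name and '{' parts are assumptions.
var propertyLike = new Regex(
    @"^\s*(?:public\s+)?"
  + @"(?<returnType>(?:global::)?[\w?.<>\[\],:]+)\s+(?<name>\w+)\s*\{");

string[] props =
{
    "public int[] Q { get; }",       // identifier type: matches
    "public (int, int) P { get; }",  // plain tuple: '(' is not in the class
};

foreach (var line in props)
{
    var m = propertyLike.Match(line);
    var name = m.Groups["name"].Value;
    Console.WriteLine(m.Success ? $"captured {name}" : $"dropped  {line}");
}
```

With no tuple branch in the alternation, the tuple-typed property has nowhere to start matching, regardless of any suffix.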

The fix that covers this issue (and as a bonus also covers the plain-tuple property case) is to widen the returnType alternation so Branch 1 accepts trailing suffixes:

(?<returnType>
    (?: \([^)]+\) | (?:global::)?[\w?.<>\[\],:]+ )
    (?: \?            # nullable
      | \[[\],\s]*\]  # array / rectangular array
    )*
)

(Written across lines for readability; as a one-liner this becomes ((?:\([^)]+\)|(?:global::)?[\w?.<>\[\],:]+)(?:\?|\[[\],\s]*\])*).) Combined with #241's suggestion of letting (...) appear embedded in the identifier class, this covers both the tuple-with-suffix shapes listed above and the tuple-inside-generic shapes from #241.

The char class in Branch 2 already allows [/]/?, so the suffix loop does not need to be repeated for Branch 2 on pure identifier types — it is specifically the tuple alternative that was missing the suffix.
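A quick check of the widened alternation, again as a sketch on a cut-down method regex (assumption: keyword lookahead and modifier groups omitted, variable names illustrative). The returnType group is the one-liner quoted above with the group name restored.

```csharp
using System;
using System.Text.RegularExpressions;

// Cut-down method regex with the widened returnType: either branch may be
// followed by any number of '?' or '[...]' suffixes.
var widened = new Regex(
    @"^\s*(?:public\s+)?"
  + @"(?<returnType>(?:\([^)]+\)|(?:global::)?[\w?.<>\[\],:]+)(?:\?|\[[\],\s]*\])*)"
  + @"\s+(?<name>\w+)\s*(?:<[^>]+>\s*)?\(");

string[] lines =
{
    "public (int, int) Plain() => (0, 0);",
    "public (int, int)[] A() => new (int, int)[0];",
    "public (int x, int y)? D() => null;",
    "public (int, int)[][] E() => new (int, int)[0][];",
    "public (int, int)[,] Rect() => new (int, int)[0, 0];",
};

foreach (var line in lines)
{
    var m = widened.Match(line);
    var name = m.Groups["name"].Value;
    var type = m.Groups["returnType"].Value;
    Console.WriteLine(m.Success ? $"{name}: {type}" : $"dropped: {line}");
}
```

In this sketch every line should match, with the suffix included in the captured returnType (for example (int, int)[][] for E). The real regexes carry more context, so the fixtures in SymbolExtractorTests.cs remain the authoritative check.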

Suggested direction

  1. Apply the suffix widening at SymbolExtractor.cs:94 (method), :100 (property with get/set/init), :103 (expression-bodied property), :116 (explicit interface impl), and :118 (indexer). The property/indexer regexes additionally need the tuple branch added to their alternation, since today they only have the identifier branch.
  2. Coordinate with #241 (C#: methods returning a generic over a tuple type, such as Task<(int, string)> or Dictionary<string, (int x, int y)>, are dropped from the symbol index): both issues touch the same regex lines, and a single patch that handles both embedded-paren-in-generic and tuple-with-trailing-suffix is cleaner than landing them in two separate commits.
  3. Add SymbolExtractorTests.cs fixtures for each suffix combination on both methods and properties:
    • (int, int)[] Arr(), (int, int)? Null(), (int, int)[][] Jag(), (int, int)[,] Rect()
    • (int, int)[] Arr { get; }, (int, int)? Null { get; }
    • Regression: (int, int) Plain() still captures.
  4. After the fix lands, verify cdidx symbols --db <...> on the above fixture produces 11 rows (the 8 tuple-with-suffix members, Plain, the class, and the namespace).

Why it matters

  • Tuple arrays are the natural shape for "ordered list of paired values" without defining a new type — common in graph libraries ((int from, int to)[] Edges), parsers ((Token tok, int pos)[]), numerical code ((double x, double y)[] coordinate arrays), and CSV readers.
  • Nullable tuples are the canonical TryFind-style return in modern C# ((User user, Error err)? TryGet(...) or a lightweight (double x, double y)? result from a root-finding routine). Every Try* API that wants two outputs without out parameters uses this shape.
  • Jagged tuple arrays appear in 2D result buffers and in sparse representations.
  • All silent drops — no warning, no degraded confidence — so users first notice when definition returns zero on a method they can see in their editor.

Cross-language note

  • Swift has tuple types with suffix options ((Int, Int)?, [(Int, Int)]); Swift's return-type regex is in its own row set and not affected by the C# extractor.
  • Rust tuple types use (T, U) and arrays use [T; N] — a Rust tuple-array is [(T, U); N], shape-different from C#. Separate pattern.
  • Kotlin has no built-in tuple syntax (uses Pair<A, B> / Triple<A, B, C>), so this regex gap does not apply.
  • TypeScript has tuple-like arrays [T, U][] and optional tuples [T, U] | null; the TS extractor row set would need separate verification.

Fix is C#-scoped.

Scope

  • src/CodeIndex/Indexer/SymbolExtractor.cs:94, :100, :103, :116, :118 — widen returnType alternation to accept trailing []/[,]/? suffixes after a tuple group; additionally add the tuple branch to property / indexer / explicit-interface regexes.
  • tests/CodeIndex.Tests/SymbolExtractorTests.cs — fixtures for each suffix variant on both methods and properties.
  • DEVELOPER_GUIDE.md language-pattern table — note tuple-with-suffix is covered for C#.

Related

Environment

  • cdidx v1.10.0 (/root/.local/bin/cdidx, trimmed release build).
  • Repro fixture above at /tmp/dogfood/cs-tuple-suffix/T.cs; cdidx languages shows csharp with yes/yes.
  • Platform: linux-x64 cloud container.
  • Filed from a cloud Claude Code session per CLOUD_BOOTSTRAP_PROMPT.md.
