The C# symbol extractor's returnType alternation accepts a "plain tuple" return (int, string) Foo() via the \([^)]+\) branch, and accepts a "tuple inside a generic" like Task<(int, string)> only after the fix in #241. But a third shape is silently dropped in both the current code and the proposed #241 fix:
Tuple with a trailing suffix — [], [,], [][], or ?. The tuple matches the first alternative of returnType, but there is no character after \) in the alternation to consume trailing [/]/?, so the regex hits the mandatory \s+(?<name>\w+) anchor against a [ or ? and the whole method/property line fails.
These are not rare. Tuple arrays are the idiomatic shape for "collection of paired results" when the user doesn't want to define a record (CSV parsers, matrix-row enumerators, coordinate lists). Nullable tuples are the canonical TryFind-style return in modern C# 10+ code ((User user, Error err)? result). Every one is silently absent from symbols / definition / inspect / outline.
This is distinct from #241: #241's fix widens the identifier-class branch to allow a balanced (...) embedded inside generic type parameters (Dictionary<string, (int, int)>). The present issue is about a suffix after the tuple branch itself ((int, int)[] / (int, int)?), which #241's suggested regex does not address — its first alternative is still a bare \([^()]+\) with no trailing [\w\[\]?]* slot.
Repro
CDIDX=/root/.local/bin/cdidx
mkdir -p /tmp/dogfood/cs-tuple-suffix
cat > /tmp/dogfood/cs-tuple-suffix/T.cs <<'EOF'
namespace Demo;

public class Svc
{
    // === Tuple with trailing []/?/[][] — expected to be captured, all DROPPED ===
    public (int, int)[] A() => new (int, int)[0];                 // line 6
    public (int x, int y)[] B() => new (int x, int y)[0];         // line 7
    public (int, int)? C() => null;                               // line 8
    public (int x, int y)? D() => null;                           // line 9
    public (int, int)[][] E() => new (int, int)[0][];             // line 10
    public (int, int)[] Ap { get; set; } = System.Array.Empty<(int, int)>();
    public (int, int)? Np { get; set; }                           // line 12
    public (int, int)[] Fp => new (int, int)[0];                  // line 13

    // === Baseline — captured today ===
    public (int, int) Plain() => (0, 0);                          // line 16
}
EOF
"$CDIDX" index /tmp/dogfood/cs-tuple-suffix --rebuild >/dev/null
"$CDIDX" symbols --db /tmp/dogfood/cs-tuple-suffix/.cdidx/codeindex.db --path T.cs
Observed (actual):
function Plain T.cs:16
class Svc T.cs:3-17
namespace Demo T.cs:1
(3 symbols in 1 files)
On v1.10.0, only the plain tuple return is captured. All 8 tuple-with-suffix members in the fixture (5 methods, 1 auto-property with an initializer, 1 nullable auto-property, 1 expression-bodied property) are dropped.
Downstream effects:
definition A / B / C / D / E returns "No symbols found." The same happens on any library with tuple-array or nullable-tuple APIs (e.g. (double x, double y)[] coordinate buffers, (int id, string name)? optional lookup results).
outline of a CSV parser / matrix helper / coordinate-math library understates property/method surface.
references on the tuple-typed property is empty because the symbol doesn't exist.
symbols --kind property --count on a struct that exposes (T, U)[] data is biased low.
Suspected root cause (from reading the source)
src/CodeIndex/Indexer/SymbolExtractor.cs:94 — the C# method regex:
Branch 1 (\([^)]+\)): eats exactly one balanced (...) group. Nothing follows. So (int, int)[] matches only the (int, int) prefix, leaving [] A() on the tape. Next \s+(?<name>\w+) expects whitespace but sees [; fail.
Branch 2 (identifier class): characters allowed are \w ? . < > [ ] , :. ( and ) are both excluded, so the branch refuses to start on (int, int)[].
(int, int)? (nullable tuple), (int, int)[] (tuple array), (int, int)[,] (rectangular tuple array), and (int, int)[][] (jagged tuple array) all die on the same boundary.
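The boundary failure can be reproduced outside cdidx with a rough sketch. Only the two returnType branches and the \s+(?<name>\w+) anchor are taken from the text above; the surrounding "public ... name(" scaffold and the names CURRENT_RE and matches_current are assumptions made for illustration, not the real regex at SymbolExtractor.cs:94. It uses grep -P (PCRE), which accepts the same (?<name>...) group syntax.

```shell
# Simplified stand-in for the method regex (assumption): one mandatory
# "public" modifier, the two returnType branches, then the name anchor.
CURRENT_RE='^\s*public\s+(?<returnType>\([^)]+\)|(?:global::)?[\w?.<>\[\],:]+)\s+(?<name>\w+)\s*\('

# Exit status 0 when the line matches, 1 otherwise.
matches_current() {
  printf '%s\n' "$1" | grep -qP "$CURRENT_RE"
}

for line in \
  'public (int, int) Plain() => (0, 0);' \
  'public (int, int)[] A() => new (int, int)[0];' \
  'public (int, int)? C() => null;' \
  'public (int, int)[][] E() => new (int, int)[0][];'
do
  if matches_current "$line"; then
    echo "captured: $line"
  else
    echo "dropped:  $line"
  fi
done
```

Under these assumptions only the Plain line matches: in the other three, Branch 1 stops right after the closing parenthesis and \s+ then meets [ or ?.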
Same gap on properties:
:100 — property { get/set/init }: uses the single-alternative (?:global::)?[\w?.<>\[\],:]+ (no tuple branch at all).
:103 — expression-bodied property: same single-alternative class.
So even a plain (int, int) return without suffix fails on a property — only methods have the tuple branch. Property (int, int) P { get; } is also dropped (not included in this repro, but trivially reproducible).
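The property-side gap can be sketched the same way. The type character class below is the one quoted for :100; everything around it (the mandatory "public", the trailing \{, and the names PROP_RE and is_property_hit) is a hypothetical simplification, since the full property regex is not shown here.

```shell
# Simplified stand-in for the { get/set/init } property regex (assumption):
# identifier-only type class, no tuple branch.
PROP_RE='^\s*public\s+(?<type>(?:global::)?[\w?.<>\[\],:]+)\s+(?<name>\w+)\s*\{'

# Exit status 0 when the line matches, 1 otherwise.
is_property_hit() {
  printf '%s\n' "$1" | grep -qP "$PROP_RE"
}

is_property_hit 'public int[] Q { get; }'              && echo 'captured: int[] Q'
is_property_hit 'public (int, int) P { get; }'         || echo 'dropped:  (int, int) P'
is_property_hit 'public (int, int)[] Ap { get; set; }' || echo 'dropped:  (int, int)[] Ap'
```

Because ( is not in the class, the type group cannot even start on a tuple, with or without a suffix.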
The fix that covers this issue (and as a bonus also covers the plain-tuple property case) is to widen the returnType alternation so Branch 1 accepts trailing suffixes:
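(Written across lines for readability; in a one-liner this becomes ((?:\([^)]+\)|(?:global::)?[\w?.<>\[\],:]+)(?:\?|\[[\],\s]*\])*).) Combined with #241's suggestion of letting (...) appear embedded in the identifier class, this covers:
(int, int) M() ✓ (already works)
(int, int)[] M() ✓ (this issue)
(int, int)? M() ✓ (this issue)
(int, int)[][] M() ✓ (this issue)
(int, int)[,] M() ✓ (this issue, rectangular)
Task<(int, int)> M() ✓ (needs #241's embedded-paren fix)
Task<(int, int)>[] M() ✓ (both fixes applied)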
The char class in Branch 2 already allows [/]/?, so the suffix loop does not need to be repeated for Branch 2 on pure identifier types — it is specifically the tuple alternative that was missing the suffix.
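As a quick sanity check of the widened group, the one-liner can be dropped into a simplified scaffold and run through grep -P. The returnType group is copied from the one-liner above; the "public ... name(" wrapper and the names WIDE_RE and matches_widened are assumptions for illustration only.

```shell
# Widened returnType group from the issue's one-liner, inside a simplified
# "public <returnType> <name>(" scaffold (the scaffold is an assumption).
WIDE_RE='^\s*public\s+(?<returnType>(?:\([^)]+\)|(?:global::)?[\w?.<>\[\],:]+)(?:\?|\[[\],\s]*\])*)\s+(?<name>\w+)\s*\('

# Exit status 0 when the line matches, 1 otherwise.
matches_widened() {
  printf '%s\n' "$1" | grep -qP "$WIDE_RE"
}

for line in \
  'public (int, int) M()' \
  'public (int, int)[] M()' \
  'public (int, int)? M()' \
  'public (int, int)[][] M()' \
  'public (int, int)[,] M()' \
  'public Task<(int, int)> M()'
do
  if matches_widened "$line"; then
    echo "match:    $line"
  else
    echo "no match: $line"
  fi
done
```

In this sketch the five tuple shapes match, while Task<(int, int)> M() still does not, since ( remains outside the identifier class until #241's embedded-paren change is applied as well.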
Suggested direction
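Apply the suffix widening at SymbolExtractor.cs:94 (method), :100 (property with get/set/init), :103 (expression-bodied property), :116 (explicit interface impl), and :118 (indexer). The property/indexer regexes additionally need the tuple branch added to their alternation, since today they only have the identifier branch.
Coordinate with #241: both issues touch the same regex lines, and a single patch that handles both embedded-paren-in-generic and tuple-with-trailing-suffix is cleaner than landing them in two separate commits.
Add SymbolExtractorTests.cs fixtures for each suffix combination on both methods and properties:
(int, int)[] Arr(), (int, int)? Null(), (int, int)[][] Jag(), (int, int)[,] Rect()
(int, int)[] Arr { get; }, (int, int)? Null { get; }
(int, int) Plain() still captures.
After the fix lands, verify cdidx symbols --db <...> on the above fixture produces 11 rows (8 tuple-with-suffix members + Plain + class + namespace).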
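One way to make that verification mechanical is a small filter over the symbols output. This is only a sketch: missing_members is a hypothetical helper, and it assumes rows keep the "<kind> <name> <file>:<line>" layout shown in the Observed block above.

```shell
# Reads `cdidx symbols` output on stdin and prints every fixture member
# that has no row. Assumes the name is the second whitespace-separated field.
missing_members() {
  out=$(cat)
  for name in A B C D E Ap Np Fp Plain; do
    printf '%s\n' "$out" \
      | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
      || echo "$name"
  done
}

# Intended use against the repro fixture (not run here):
#   "$CDIDX" symbols --db /tmp/dogfood/cs-tuple-suffix/.cdidx/codeindex.db --path T.cs | missing_members
```

Fed the Observed output from the repro, it lists A, B, C, D, E, Ap, Np, and Fp; after a complete fix it should print nothing.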
Why it matters
Tuple arrays are the natural shape for "ordered list of paired values" without defining a new type — common in graph libraries ((int from, int to)[] Edges), parsers ((Token tok, int pos)[]), numerical code ((double x, double y)[] coordinate arrays), and CSV readers.
Nullable tuples are the canonical TryFind-style return in modern C# ((User user, Error err)? TryGet(...) or a lightweight (double x, double y)? result from a root-finding routine). Every Try* API that wants two outputs without out parameters uses this shape.
Jagged tuple arrays appear in 2D result buffers and in sparse representations.
All silent drops — no warning, no degraded confidence — so users first notice when definition returns zero on a method they can see in their editor.
Cross-language note
Swift has tuple types with suffix options ((Int, Int)?, [(Int, Int)]); Swift's return-type regex is in its own row set and not affected by the C# extractor.
Rust tuple types use (T, U) and arrays use [T; N] — a Rust tuple-array is [(T, U); N], shape-different from C#. Separate pattern.
Kotlin has no built-in tuple syntax (uses Pair<A, B> / Triple<A, B, C>), so this regex gap does not apply.
TypeScript has tuple-like arrays [T, U][] and optional tuples [T, U] | null; the TS extractor row set would need separate verification.
Fix is C#-scoped.
Scope
src/CodeIndex/Indexer/SymbolExtractor.cs:94, :100, :103, :116, :118 — widen returnType alternation to accept trailing []/[,]/? suffixes after a tuple group; additionally add the tuple branch to property / indexer / explicit-interface regexes.
tests/CodeIndex.Tests/SymbolExtractorTests.cs — fixtures for each suffix variant on both methods and properties.
DEVELOPER_GUIDE.md language-pattern table — note tuple-with-suffix is covered for C#.
Related
#241: C# generic-over-tuple return (Task<(int, string)>). Overlapping regex site, different characterization. Landing both together is natural.
#222: C# generic return types with spaces (same regex line, same modifier-slot family).
#234: C# pointer / function-pointer return types (same regex line, same "char class too narrow" family).
#224: C# ref / ref readonly return types (same regex family, modifier slot).
#223: C# event / delegate with space in generic (same extractor).
#228: C# 13 partial properties (property regex, adjacent gap).
#327: C# readonly property modifier (property regex, adjacent gap).
Environment
cdidx at /root/.local/bin/cdidx (trimmed release build).
Fixture: /tmp/dogfood/cs-tuple-suffix/T.cs; cdidx languages shows csharp with yes/yes.
CLOUD_BOOTSTRAP_PROMPT.md.