C# methods / properties with contextual-keyword modifier (partial, required, readonly) + tuple-suffix return type emit phantom function partial / function required / function readonly rows via ctor-regex fallback #349
Summary
When a C# method or property has the shape visibility<space>KEYWORD<space>(TupleOrTupleSuffix) Name and the primary method / property regex fails to match (because its returnType slot cannot accept a tuple with a trailing ? / [] suffix; see #328 / #338), the lower-priority constructor regex at SymbolExtractor.cs:97 claims the line, taking the visibility keyword as the visibility and the modifier keyword itself (partial, required, readonly) as the constructor name. The result is a phantom function partial / function required / function readonly row whose "column" is the modifier keyword, not any real identifier.
// line 6 → phantom function partial (real P1 dropped)
public partial (int, int)? P1();
// line 14 → phantom function required (real R1 dropped)
public required (int, int) R1 { get; init; }
// line 35 → phantom function readonly (real M dropped)
public readonly (int, int)? M() => null;
This is a new phantom family distinct from the existing method-regex-backtrack phantoms (function static #342, function readonly #336, function delegate #340, function const #346) because the mechanism is different: those come from the method regex visibility-group backtracking inside :94. The phantoms here come from the ctor regex at :97 matching public\s+\w+\s*\( on a line where the method/property regex failed.
Repro
CDIDX=/root/.local/bin/cdidx
mkdir -p /tmp/dogfood/cs-modifier-phantom
cat > /tmp/dogfood/cs-modifier-phantom/M.cs <<'EOF'
namespace ModifierPhantom;

// `partial` + tuple-suffix: method regex fails → ctor regex grabs `partial` as name
public partial class A
{
    public partial (int, int)? P1();
    public partial (int, int)[] P2();
    public partial System.Collections.Generic.List<(int, int)> P3();  // no phantom; see note
}

// `required` + tuple property: property regex fails → ctor regex grabs `required`
public class B
{
    public required (int, int) R1 { get; init; }
    public required (int, int)? R2 { get; init; }
}

// Plain tuple-suffix methods without an extra modifier: silent drop, NO phantom
public class C
{

    public (int, int)? M1() => null;

    public (int, int)[] M2() => null;

    public (int, int) M3() => (0, 0);  // baseline tuple, CAPTURED
}

// `readonly` on method inside `readonly struct` + tuple-suffix: ctor regex grabs `readonly`
public class D
{
    public readonly struct E
    {

        public readonly (int, int)? M() => null;
    }
}
EOF
"$CDIDX" index /tmp/dogfood/cs-modifier-phantom --rebuild
"$CDIDX" symbols --db /tmp/dogfood/cs-modifier-phantom/.cdidx/codeindex.db
Observed:
class A M.cs:4-9
class B M.cs:12-16
class C M.cs:19-27
class D M.cs:30-37
struct E M.cs:32-36
function M3 M.cs:26 ← baseline tuple captured
function partial M.cs:6 ← phantom (real P1)
function partial M.cs:7 ← phantom (real P2)
function readonly M.cs:35 ← phantom (real M)
function required M.cs:14 ← phantom (real R1)
function required M.cs:15 ← phantom (real R2)
namespace ModifierPhantom M.cs:1
Note that P3 (public partial System.Collections.Generic.List<(int, int)> P3();) does not produce a phantom because the ctor regex public\s+\w+\s*\( fails — after partial the next token is System (letter), not (. Phantoms only emit when the character immediately after the modifier keyword's trailing space is (.
Similarly, plain tuple-suffix methods (M1, M2) drop silently without a phantom because their shape is public (...) with no \w+ between public and ( — the ctor regex needs a name token between visibility and the open paren.
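The two notes above can be checked in isolation. The sketch below is a simplified stand-in for the ctor row, built only from the public\s+\w+\s*\( shape quoted in this issue. The real pattern at SymbolExtractor.cs:97 is not reproduced here and may differ, and CtorShapeDemo / ClaimedName are hypothetical names used for illustration.

```csharp
#nullable enable
using System;
using System.Text.RegularExpressions;

// Demo: feed a few lines from the repro to the simplified ctor shape.
string[] lines =
{
    "public partial (int, int)? P1();",
    "public partial System.Collections.Generic.List<(int, int)> P3();",
    "public (int, int)? M1() => null;",
    "public Foo() { }",
};
foreach (var line in lines)
{
    var claimed = CtorShapeDemo.ClaimedName(line) ?? "(no match)";
    Console.WriteLine(claimed + " <- " + line);
}

public static class CtorShapeDemo
{
    // ASSUMPTION: simplified stand-in for the ctor row, not the real pattern.
    // Shape: visibility keyword, one identifier token, optional whitespace, "(".
    private static readonly Regex CtorShape =
        new Regex(@"^\s*(?<visibility>public)\s+(?<name>\w+)\s*\(");

    // Returns the token the ctor shape would claim as the constructor name,
    // or null when the shape does not match the line at all.
    public static string? ClaimedName(string line)
    {
        var m = CtorShape.Match(line);
        return m.Success ? m.Groups["name"].Value : null;
    }
}
```

Under this stand-in, only lines where ( directly follows the single token after public are claimed: the P1 line yields partial, the real constructor yields Foo, and the P3 and M1 lines are not matched.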
Suspected root cause
src/CodeIndex/Indexer/SymbolExtractor.cs:97 (instance constructor regex).
This regex greedily matches any line starting with a visibility keyword + a single identifier token + (. It assumes that by this point in the PatternCache, the earlier, more specific rows (method :94, property :100/:103, indexer :118, explicit interface :116) have been tried and failed.
The problem is that none of those earlier rows' returnType slots accept a tuple with a trailing ? / [] / * suffix. When the tuple-suffix method/property regex fails:
public partial (int, int)? P1(); — method regex :94 tries returnType=(int, int) tuple-alt, then needs \s+name, but the next char is ?. Fails.
Ctor regex :97 tries visibility=public, name=partial, then \s*\( — the space after partial and the ( of the tuple satisfy it. Match wins.
A symbol is emitted with name="partial", visibility="public", BodyStyle.Brace.
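The fallback sequence in these steps can be sketched as a first-match-wins loop over ordered rows. Both patterns below are simplified assumptions, not the patterns in SymbolExtractor.cs: the hypothetical MethodRow accepts a bare tuple as returnType but no trailing ? / [] suffix, and the property, indexer, and explicit-interface rows are omitted. FallbackDemo and Extract are illustrative names.

```csharp
#nullable enable
using System;
using System.Text.RegularExpressions;

// Demo: the tuple-suffix line falls through to the ctor row; the baseline tuple does not.
foreach (var line in new[]
{
    "public partial (int, int)? P1();",
    "public (int, int) M3() => (0, 0);",
    "public Foo() { }",
})
{
    var hit = FallbackDemo.Extract(line);
    Console.WriteLine(hit is null
        ? "(dropped) <- " + line
        : hit.Value.Row + " row, name=" + hit.Value.Name + " <- " + line);
}

public static class FallbackDemo
{
    // ASSUMPTION: simplified method row. returnType is either a flat tuple or a
    // plain type token, and must be followed by whitespace and then the name.
    private static readonly Regex MethodRow = new Regex(
        @"^\s*(?<visibility>public)\s+(?:(?:partial|readonly|required)\s+)*" +
        @"(?<returnType>\([^)]*\)|[\w.<>]+)\s+(?<name>\w+)\s*\(");

    // ASSUMPTION: simplified ctor row using the shape quoted in this issue.
    private static readonly Regex CtorRow =
        new Regex(@"^\s*(?<visibility>public)\s+(?<name>\w+)\s*\(");

    // Tries the rows in priority order and returns the first match.
    public static (string Row, string Name)? Extract(string line)
    {
        foreach (var (row, regex) in new[] { ("method", MethodRow), ("ctor", CtorRow) })
        {
            var m = regex.Match(line);
            if (m.Success)
            {
                return (row, m.Groups["name"].Value);
            }
        }
        return null;
    }
}
```

In this sketch the ? after the tuple is what makes the method row fail on the P1 line, so the ctor row gets a chance and reports partial as the name. The same line without the suffix would be taken by the method row first.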
The ctor regex has no guard against:
Contextual keywords that are not valid constructor names (partial, required, readonly, async, sealed, virtual, override, abstract, new, file, unsafe, extern, static).
Cases where the \s*\( is the start of a tuple literal rather than a parameter list (no way to tell from the ctor regex's limited shape).
For the real constructor case this regex is intended to handle (public Foo() { }), the name Foo would be the class name. So an extra guard that rejects contextual keyword names would not break legitimate ctor matches.
Suggested direction
(A) Add a keyword-name negative lookahead to the ctor regex at :97:
The negative lookahead rejects modifier keywords and other C# contextual keywords as ctor names. Real ctors use the class identifier, which is never a keyword.
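A minimal sketch of direction (A), assuming the ctor row has roughly the public\s+\w+\s*\( shape quoted above. The keyword list is the one given under "The ctor regex has no guard against"; the pattern itself, and the names GuardedCtorDemo / CtorName, are illustrative assumptions and not the actual change to SymbolExtractor.cs:97.

```csharp
#nullable enable
using System;
using System.Text.RegularExpressions;

// Demo: keyword names are rejected, real ctor names still match.
foreach (var line in new[]
{
    "public partial (int, int)? P1();",
    "public required (int, int) R1 { get; init; }",
    "public Foo() { }",
    "public partialView() { }",
})
{
    Console.WriteLine((GuardedCtorDemo.CtorName(line) ?? "(no match)") + " <- " + line);
}

public static class GuardedCtorDemo
{
    // ASSUMPTION: simplified ctor row plus a negative lookahead on the name slot.
    // The trailing \b keeps identifiers that merely start with a keyword
    // (for example "partialView") eligible as constructor names.
    private static readonly Regex GuardedCtor = new Regex(
        @"^\s*(?<visibility>public)\s+" +
        @"(?!(?:partial|required|readonly|async|sealed|virtual|override|abstract|new|file|unsafe|extern|static)\b)" +
        @"(?<name>\w+)\s*\(");

    // Returns the constructor name, or null when the line is rejected.
    public static string? CtorName(string line)
    {
        var m = GuardedCtor.Match(line);
        return m.Success ? m.Groups["name"].Value : null;
    }
}
```

The guard only changes what the ctor row refuses to claim. The real P1 / R1 / M symbols are still dropped until the upstream rows accept tuple-suffix return types.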
(B) Fix the upstream tuple-suffix support in :94 / :100 / :103. #328 / #338 already track this. Once the method/property regex accepts (TupleAlt)(?:\?|\[\])* as returnType, the failing-fallback path to ctor regex goes away for tuple-suffix cases. This is the structurally correct fix but spans multiple regex rows.
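To illustrate direction (B), the sketch below wires the (TupleAlt)(?:\?|\[\])* fragment into a simplified returnType slot. TupleAlt here is an assumed flat-tuple pattern (no nested tuples), the surrounding row is a hypothetical reduction of the method/property rows, and TupleSuffixDemo / Parse are illustrative names.

```csharp
#nullable enable
using System;
using System.Text.RegularExpressions;

// Demo: tuple return types with ? / [] suffixes now expose the real member name.
foreach (var line in new[]
{
    "public partial (int, int)? P1();",
    "public partial (int, int)[] P2();",
    "public required (int, int)? R2 { get; init; }",
    "public (int, int) M3() => (0, 0);",
})
{
    var hit = TupleSuffixDemo.Parse(line);
    Console.WriteLine(hit is null
        ? "(no match) <- " + line
        : hit.Value.Name + " : " + hit.Value.ReturnType);
}

public static class TupleSuffixDemo
{
    // ASSUMPTION: flat tuple only; nested tuples would need a richer pattern.
    private const string TupleAlt = @"\([^()]*\)";

    // ASSUMPTION: reduced row that stops after the member name, so it covers
    // both the method and the property shapes from the repro.
    private static readonly Regex TupleMember = new Regex(
        @"^\s*public\s+(?:(?:partial|readonly|required)\s+)*" +
        @"(?<returnType>" + TupleAlt + @"(?:\?|\[\])*)\s+(?<name>\w+)");

    // Returns the tuple return type (including suffixes) and the member name.
    public static (string ReturnType, string Name)? Parse(string line)
    {
        var m = TupleMember.Match(line);
        if (!m.Success)
        {
            return null;
        }
        return (m.Groups["returnType"].Value, m.Groups["name"].Value);
    }
}
```

The suffix group is repeated with *, so (int, int)[][] is accepted as well as a bare tuple. Pointer (*) and multi-dimensional ([,]) suffixes are outside the quoted fragment and are not handled in this sketch.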
Preferred: (A) in addition to (B). (A) hardens the ctor regex against the class of phantom regardless of which upstream regex happens to fail. (B) fixes the immediate families in #328/#338/#347. Both should ship; (A) is a single-line change that remains useful after (B) lands because other upstream regex failures (e.g. unusual where clauses, C# 14 extension blocks, future syntax) would trigger the same ctor-regex-grabs-keyword pattern.
Why it matters
Phantom function partial / function required / function readonly rows pollute symbols, definition, outline, hotspots, and unused. A project with N partial method declarations containing tuple-suffix returns emits N phantoms, each at the declaration line.
Real symbols are lost. Alongside each phantom, the real method/property is silently dropped. definition P1 --exact returns No definitions found. on the repro above.
callers / callees / impact miss the real symbols. Call graphs terminate prematurely because the real definitions aren't indexed; references still land as raw call-site matches but can't be resolved to a defined symbol.
Visible in every modern C# codebase with required members (C# 11+ DTOs), partial methods with richer return types (Razor / source generators / minimal APIs), or readonly members on readonly structs.
Silent. No warning is emitted; the phantom looks like a normal function row unless the user notices the name is a C# keyword.
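Because the phantoms are silent, one way to spot them is to scan symbol rows for function names that are modifier keywords. The sketch below assumes rows are whitespace-separated as kind, name, location, matching the Observed listing above; it does not call cdidx, and PhantomScan / SuspectRows are hypothetical names. The keyword list is the one from the root-cause section.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Demo: rows copied from the Observed listing (annotations removed).
var rows = new[]
{
    "class A M.cs:4-9",
    "function M3 M.cs:26",
    "function partial M.cs:6",
    "function readonly M.cs:35",
    "function required M.cs:14",
    "namespace ModifierPhantom M.cs:1",
};
foreach (var suspect in PhantomScan.SuspectRows(rows))
{
    Console.WriteLine("suspect: " + suspect);
}

public static class PhantomScan
{
    // Keywords that cannot be real constructor or method names.
    private static readonly HashSet<string> ModifierKeywords = new HashSet<string>(StringComparer.Ordinal)
    {
        "partial", "required", "readonly", "async", "sealed", "virtual", "override",
        "abstract", "new", "file", "unsafe", "extern", "static",
    };

    // ASSUMPTION: each row is "kind name location", separated by spaces or tabs.
    // Returns the rows whose kind is "function" and whose name is a keyword.
    public static List<string> SuspectRows(IEnumerable<string> rows)
    {
        var suspects = new List<string>();
        foreach (var row in rows)
        {
            var parts = row.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length >= 2 && parts[0] == "function" && ModifierKeywords.Contains(parts[1]))
            {
                suspects.Add(row);
            }
        }
        return suspects;
    }
}
```

A check like this only flags the phantom rows. It cannot recover the dropped real symbols, which is why the fix belongs in the extractor.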
Cross-language note
C# — documented here.
Java — Java's ctor regex shape is similar but Java has no equivalent modifier keywords (partial, required, readonly-as-member-modifier) that combine with tuple-like return syntax. Java tuples don't exist, and Record primary ctors are a class-level construct. Not affected by this specific mechanism.
Kotlin / Swift / Rust — different ctor/syntax, not affected.
Scope
src/CodeIndex/Indexer/SymbolExtractor.cs:97 — add keyword-name negative lookahead to the ctor regex.
tests/CodeIndex.Tests/SymbolExtractorTests.cs — fixtures asserting no function partial / function required / function readonly phantoms for the shapes in the repro, and regression for real same-line ctors (public Foo() { } still captured).
Related upstream regexes that, when fixed, remove the triggering failure:
:94 method regex: tuple-suffix returnType (#328, #338).
:100, :103 property regex: tuple returnType slot lacks tuple-alt altogether (#333).
Related
#328: C# tuple return types with trailing suffix (int, int)[], (int, int)?, (int, int)[][] silently dropped on method row. Upstream root of this phantom family.
#338: C# tuple return types on method row (adjacent to #328).
#333: C# property with tuple return silently dropped (upstream root for required + tuple phantom).
#347: generic-attribute on type parameter method dropped (different upstream trigger, same "silent drop while ctor regex may pick up phantom" family; it didn't emit a phantom in #347 because the line shape didn't match public\s+\w+\s*\().
#342 / #336 / #340 / #346: method-regex visibility-backtrack phantoms (function static, function readonly, function delegate, function const). Same "phantom function row with modifier keyword as name" symptom, different mechanism (method regex backtracking vs ctor regex claiming the line).
#331: C# wrapped : base(...) / : this(...) phantom function base / function this. Different mechanism (returnType char class includes :), different manifestation.
#348: wrapped-modifier ctor silent drop. Adjacent: ctor regex sensitivity to modifier placement.
Environment
[...] /root/.local/bin/cdidx).
[...] /tmp/dogfood/cs-modifier-phantom/M.cs.
[...] CLOUD_BOOTSTRAP_PROMPT.md.