C#: readonly properties silently dropped AND readonly get => accessor bodies misclassified as phantom property get rows — readonly missing from property regex modifier slot #327

@Widthdom

Summary

readonly is a valid property/accessor modifier on C# 8+ struct members (and implicit on readonly struct / readonly ref struct members), but the C# property regex modifier slot omits it. This causes two distinct visible failures from the same root cause:

  1. Every readonly property is silently dropped. All three shapes — expression-bodied (public readonly int X => _v;), auto-property (public readonly int X { get; }), and accessor-body (public readonly int X { get => _v; }) — never surface in symbols / definition / outline / inspect.
  2. Per-accessor readonly get/set => … lines are captured as phantom property get (or property set) rows. In a property whose declaration line is already lost to a separate bug (Allman-style brace on the next line, tracked in #229: properties declared with { on the next line are silently dropped), the standalone accessor line readonly get => _v; still matches the expression-bodied property regex: readonly is consumed as the returnType and get as the name. A bogus property get appears in the symbol table, pointing at the accessor line.

The second effect is independent of #229: even on a file that has zero Allman-style properties, any per-accessor readonly get/set => expr; line on its own emits a phantom property get row. The first effect is independent of #228 (which covers partial, not readonly) and independent of #224 (which covers ref / ref readonly returns, not the readonly modifier slot).

readonly methods are captured today because the method regex at SymbolExtractor.cs:94 already includes readonly in its modifier slot — the bug is specifically that the two property regexes at :100 and :103 do not.

Repro

CDIDX=/root/.local/bin/cdidx
mkdir -p /tmp/dogfood/cs-readonly-prop
cat > /tmp/dogfood/cs-readonly-prop/R.cs <<'EOF'
namespace Demo;

public struct S
{
    private int _v;

    // 1) readonly + expression-bodied property — DROPPED
    public readonly int A => _v;

    // 2) readonly + auto-property (same-line brace) — DROPPED
    public readonly int B { get; }

    // 3) readonly + expression-bodied accessor — DROPPED
    public readonly int C { get => _v; }

    // 4) Allman-style Mixed (dropped by #229) …
    public int Mixed
    {
        // 5) … but the accessor line here surfaces as a PHANTOM `property get`.
        readonly get => _v;
        set => _v = value;
    }

    // Baselines — captured:
    public int D { get; set; }                    // plain auto-property
    public readonly int GetD() => D;              // readonly method (captured)
    public readonly int Sum(int x) => _v + x;     // readonly method with args (captured)
}
EOF
"$CDIDX" index /tmp/dogfood/cs-readonly-prop --rebuild >/dev/null
"$CDIDX" symbols --db /tmp/dogfood/cs-readonly-prop/.cdidx/codeindex.db --path R.cs

Observed (actual):

function   GetD            R.cs:27
property   D               R.cs:26
function   Sum             R.cs:28
property   get             R.cs:20    ← PHANTOM: from `readonly get => _v;`
struct     S               R.cs:3-29
namespace  Demo            R.cs:1

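Missing:

  • property A, property B, property C (the three readonly properties, repro cases 1–3).
  • property Mixed (dropped by #229, not by this bug).

Extra:

  • property get (R.cs:20) — phantom, should not exist.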

Downstream effects:

  • definition A / B / C all return "No symbols found." The same happens on any codebase that puts explicit readonly markers on struct properties.
  • outline of a file like System.Numerics.Vector<T> (heavy with readonly properties) lists ~30–50% of its property surface.
  • symbols --kind property --count on System.Span<T> / ReadOnlySpan<T> / Memory<T> / any performance-oriented library that prefers readonly struct undercounts by the number of explicit-readonly properties.
  • symbols get --exact returns phantom matches on any file that uses the readonly get => … accessor modifier.
  • inspect get can suggest that "get" is a member of the surrounding struct, polluting AI navigation.

Suspected root cause (from reading the source)

src/CodeIndex/Indexer/SymbolExtractor.cs:100 — property with get/set/init:

new("property",  new Regex(
    @"^\s*(?:(?<visibility>public|private|protected\s+internal|private\s+protected|protected|internal)\s+)?"
  + @"(?:(?:static|virtual|override|abstract|sealed|new|required)\s+)*"
  + @"(?<returnType>(?:global::)?[\w?.<>\[\],:]+)\s+(?<name>\w+)\s*\{\s*(?:get|set|init)",
    RegexOptions.Compiled),
    BodyStyle.Brace, "visibility", "returnType"),

SymbolExtractor.cs:103 — expression-bodied property:

new("property",  new Regex(
    @"^\s*(?:(?<visibility>public|private|protected\s+internal|private\s+protected|protected|internal)\s+)?"
  + @"(?:(?:static|virtual|override|abstract|sealed|new|required)\s+)*"
  + @"(?<returnType>(?:global::)?[\w?.<>\[\],:]+)\s+(?<name>\w+)\s*=>\s*",
    RegexOptions.Compiled),
    BodyStyle.None, "visibility", "returnType"),

Both modifier alternations are static|virtual|override|abstract|sealed|new|required. Missing: readonly. Compare with the method regex at :94:

(?:(?:static|sealed|partial|readonly|unsafe|extern|virtual|override|abstract|async|new|file)\s+)*

readonly is there, which is exactly why public readonly int GetD() => D; (reported at R.cs:27 in the observed output) is captured.

Effect 1 — dropped symbol

Walkthrough for public readonly int A => _v; against :103:

  • public → visibility matched.
  • modifier slot: readonly is not in the alternation → 0 modifiers consumed.
  • (?<returnType>[\w?.<>\[\],:]+) → matches readonly.
  • \s+(?<name>\w+) → matches int (name=int).
  • \s*=>\s* → expects => but finds A; match fails.

The whole line is rejected, no symbol produced.
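
The walkthrough above can be checked in isolation. This is a minimal sketch that copies the :103 pattern as quoted in this issue into a standalone program; the variable names are illustrative and nothing here is taken from the repository beyond the pattern text.

```csharp
using System;
using System.Text.RegularExpressions;

// Expression-bodied property pattern, copied from the :103 excerpt in this issue.
var exprProperty = new Regex(
    @"^\s*(?:(?<visibility>public|private|protected\s+internal|private\s+protected|protected|internal)\s+)?"
  + @"(?:(?:static|virtual|override|abstract|sealed|new|required)\s+)*"
  + @"(?<returnType>(?:global::)?[\w?.<>\[\],:]+)\s+(?<name>\w+)\s*=>\s*",
    RegexOptions.Compiled);

// Baseline without readonly: the pattern accepts it.
bool plainMatches = exprProperty.IsMatch("    public int A => _v;");

// Same declaration with readonly: every backtracking path ends at a token
// that is not "=>", so the line is rejected.
bool readonlyMatches = exprProperty.IsMatch("    public readonly int A => _v;");

Console.WriteLine($"plain: {plainMatches}, readonly: {readonlyMatches}");
// prints "plain: True, readonly: False"
```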

Effect 2 — phantom symbol

Walkthrough for the accessor line readonly get => _v; against :103:

  • Leading whitespace eaten.
  • visibility: optional, not matched.
  • modifier slot: readonly not in the alternation → 0 modifiers consumed.
  • (?<returnType>[\w?.<>\[\],:]+) → matches readonly.
  • \s+(?<name>\w+) → matches get (name=get).
  • \s*=>\s* → matches =>.

Match succeeds; symbol recorded as property get. There is no containing-property gate, no "did we just see a { opening this member?" check, so the accessor line is taken to be a top-level property declaration all on its own.
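
The same standalone approach shows which groups capture what on the accessor line. Again a sketch: the pattern is copied from the :103 excerpt, and the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// Expression-bodied property pattern, copied from the :103 excerpt in this issue.
var exprProperty = new Regex(
    @"^\s*(?:(?<visibility>public|private|protected\s+internal|private\s+protected|protected|internal)\s+)?"
  + @"(?:(?:static|virtual|override|abstract|sealed|new|required)\s+)*"
  + @"(?<returnType>(?:global::)?[\w?.<>\[\],:]+)\s+(?<name>\w+)\s*=>\s*",
    RegexOptions.Compiled);

// The accessor line from the repro (R.cs:20), indentation included.
Match accessor = exprProperty.Match("        readonly get => _v;");
string capturedType = accessor.Groups["returnType"].Value;
string capturedName = accessor.Groups["name"].Value;

Console.WriteLine($"success={accessor.Success} returnType={capturedType} name={capturedName}");
// prints "success=True returnType=readonly name=get"
```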

What the one-token fix covers

Adding readonly to the modifier slot in both :100 and :103:

(?:(?:static|virtual|override|abstract|sealed|new|required|readonly)\s+)*

For public readonly int A => _v;, readonly is now consumed by the modifier slot, leaving int A => _v;. returnType matches int, name matches A, => matches. Match succeeds; property captured. Effect 1 is fixed by the one-token addition.

For the phantom readonly get => _v; line, the addition alone is not sufficient. With readonly consumed by the modifier slot, the remaining text is get => _v;, which has no returnType name => shape (there is no second identifier between get and =>), so that attempt fails. But the modifier slot is a (...)* group and is also allowed to match zero times: the engine backtracks, returnType matches readonly again, name matches get, and => matches. The phantom survives.

Both effects share the root cause, but only Effect 1 falls out of the one-token addition. Effect 2 also needs the guard described in item 3 of the suggested direction below.
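
The widened alternation is worth checking against both repro lines before relying on it, because the modifier group is `*`-quantified and the regex engine may fall back to zero modifiers. The sketch below applies the :103 pattern with readonly added; the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// :103 pattern with readonly added to the modifier alternation.
var widened = new Regex(
    @"^\s*(?:(?<visibility>public|private|protected\s+internal|private\s+protected|protected|internal)\s+)?"
  + @"(?:(?:static|virtual|override|abstract|sealed|new|required|readonly)\s+)*"
  + @"(?<returnType>(?:global::)?[\w?.<>\[\],:]+)\s+(?<name>\w+)\s*=>\s*",
    RegexOptions.Compiled);

// Effect 1: the readonly property is now captured with the right type and name.
Match prop = widened.Match("    public readonly int A => _v;");
string propType = prop.Groups["returnType"].Value;
string propName = prop.Groups["name"].Value;
Console.WriteLine($"property: {prop.Success} {propType} {propName}");
// prints "property: True int A"

// Effect 2: consuming readonly as a modifier leaves "get => _v;", which fails,
// so the engine backtracks to zero modifiers and readonly becomes the returnType again.
Match accessor = widened.Match("        readonly get => _v;");
string accType = accessor.Groups["returnType"].Value;
string accName = accessor.Groups["name"].Value;
Console.WriteLine($"accessor: {accessor.Success} {accType} {accName}");
// prints "accessor: True readonly get"
```

The second result suggests the widening fixes the dropped properties but does not remove the phantom row unless it is paired with a guard on the accessor keywords.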

Suggested direction

  1. Extend the modifier alternation in both property regexes at SymbolExtractor.cs:100 and :103 to include readonly. Also consider unsafe and extern, which are valid property modifiers (e.g. unsafe int* P { get; }, in combination with #234, which covers pointer / function-pointer return types being dropped from the symbol index). file is valid only on type declarations, not on members, so including it is harmless parity with the method regex at :94 rather than a requirement.
    (?:(?:static|virtual|override|abstract|sealed|new|required|readonly|unsafe|extern|file)\s+)*
  2. Land the same modifier widening in the indexer regex at :118 (already includes static|virtual|override|abstract|sealed|new — add readonly).
  3. Add an independent guard so per-accessor readonly get/set => expr; lines don't drive the expression-bodied property regex on their own, as a defense-in-depth against future modifier slots creating similar phantoms: either require the returnType to be followed by an identifier distinct from get/set/init, or skip the expression-bodied property regex when the preceding non-blank line ends in {.
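  4. Add SymbolExtractorTests.cs fixtures asserting that the three readonly property forms (expression-bodied, auto-property, accessor-body) are captured and that a standalone readonly get => _v; accessor line produces no phantom property get row.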
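
The first option in item 3 (requiring the identifier after returnType to differ from get/set/init) could be sketched with a negative lookahead on the name group. This is an assumption about how the guard might be written, not code from the repository; the variable name guarded is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// :103 pattern with readonly in the modifier slot and a hypothetical guard:
// the name may not be one of the accessor keywords get / set / init.
var guarded = new Regex(
    @"^\s*(?:(?<visibility>public|private|protected\s+internal|private\s+protected|protected|internal)\s+)?"
  + @"(?:(?:static|virtual|override|abstract|sealed|new|required|readonly)\s+)*"
  + @"(?<returnType>(?:global::)?[\w?.<>\[\],:]+)\s+(?<name>(?!(?:get|set|init)\b)\w+)\s*=>\s*",
    RegexOptions.Compiled);

// The readonly property from the repro is still captured.
Match prop = guarded.Match("    public readonly int A => _v;");
string propName = prop.Groups["name"].Value;

// The accessor line no longer matches: with readonly as the returnType, the
// lookahead rejects "get" as the name, and no other split of the line fits.
bool accessorMatches = guarded.IsMatch("        readonly get => _v;");

Console.WriteLine($"property name={propName}, accessor matches={accessorMatches}");
// prints "property name=A, accessor matches=False"
```

One trade-off of this form: get, set, and init are contextual keywords, so a property literally named get would also be rejected. The second option in item 3 (checking the preceding line) avoids that at the cost of tracking state across lines.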

Cross-language note

The property-modifier-slot gap is C#-specific. Closest analogues:

The phantom-from-accessor-line effect does generalize: any language whose per-accessor body syntax (e.g. a hypothetical readonly get => x;) coincidentally matches the "expression-bodied property" shape would be susceptible. Today the direct impact is C# only.

Scope

  • src/CodeIndex/Indexer/SymbolExtractor.cs:100, :103, :118 — add readonly (and plausibly unsafe|extern|file) to modifier alternations.
  • tests/CodeIndex.Tests/SymbolExtractorTests.cs — fixtures for the three readonly property forms and for the accessor-line phantom suppression.
  • DEVELOPER_GUIDE.md language-pattern reference table — note that C# properties support readonly modifier.

Related

Environment

  • cdidx v1.10.0 (/root/.local/bin/cdidx, trimmed release).
  • Repro fixture above; cdidx languages shows csharp with yes/yes.
  • Platform: linux-x64 cloud container.
  • Filed from a cloud Claude Code session per CLOUD_BOOTSTRAP_PROMPT.md.
