C#: events/delegates with spaces in the generic type (event Action<string, int> Foo) — event name becomes int, delegate is silently dropped #223

@Widthdom

Summary

The C# event and delegate patterns use \S+ (a single run of non-whitespace characters) to absorb the type between the event / delegate keyword and the symbol name. This works for single-token types (Action, void, Task<int>) but breaks on multi-argument generics because \S+ stops at the first whitespace, which in idiomatic code is the space after the , inside <K, V>. Result:

  1. event Action<string, int> NamedEvent; → indexed as event kind named int.
  2. event Func<string, int, bool> Filter; → also indexed as event int. Two events collide on the same bogus name.
  3. public delegate Task<Dictionary<string, int>> LoadAsync(); → not captured at all (the pattern fails on the mismatched terminator).

This is adjacent to #222 (methods dropping when the return type has a space in generics) but a distinct extractor pattern with a different failure mode: silent wrong name for events, silent disappearance for delegates.

Repro

CDIDX=/root/.local/bin/cdidx
mkdir -p /tmp/dogfood/csev
cat > /tmp/dogfood/csev/D.cs <<'EOF'
namespace App;
using System;

public delegate Task<int>                         GetIdAsync(string user);          // OK
public delegate Task<Dictionary<string, int>>     LoadAsync();                      // DROPPED
public delegate TResult                           Func<T1, T2, TResult>(T1 a, T2 b); // OK (no space in return)

public class Pub
{
    public event Action<string, int> NamedEvent;                  // name becomes "int"
    public event Func<string, int, bool> Filter;                  // name becomes "int"
    public event EventHandler<ChangedArgs> Changed;               // OK (single-arg generic)
    public event Action OnReady;                                  // OK (non-generic)
}
EOF
"$CDIDX" /tmp/dogfood/csev --db /tmp/dogfood/csev.db
"$CDIDX" symbols --db /tmp/dogfood/csev.db

Actual output (abridged):

event      Changed                                  D.cs:17
delegate   Func                                     D.cs:7
delegate   GetIdAsync                               D.cs:5
event      OnReady                                  D.cs:20
event      int                                      D.cs:13    # should be NamedEvent
event      int                                      D.cs:14    # should be Filter; collides with NamedEvent
# (LoadAsync delegate entirely missing)

symbols --name NamedEvent --exact returns nothing. symbols --name int returns two rows of events — neither of which is semantically named int.

Suspected root cause (from reading the source)

src/CodeIndex/Indexer/SymbolExtractor.cs:105 (delegate):

new("delegate", new Regex(
    @"^\s*(?:(?<visibility>...)?\s+)?(?:(?:static|unsafe)\s+)?delegate\s+\S+\s+(?<name>\w+)\s*[\(<]",
    ...), BodyStyle.None, "visibility"),

src/CodeIndex/Indexer/SymbolExtractor.cs:107 (event):

new("event", new Regex(
    @"^\s*(?:(?<visibility>...)?\s+)?(?:(?:static)\s+)?event\s+\S+\s+(?<name>\w+)",
    ...), BodyStyle.None, "visibility"),

Walkthrough on event Action<string, int> NamedEvent;:

  • event matches.
  • \S+ greedy-matches Action<string, (stops at the space after ,).
  • \s+ matches the space.
  • (?<name>\w+) matches int.
  • Regex is done — > NamedEvent; is ignored.
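
The event walkthrough above can be reproduced outside cdidx with a short sketch. The pattern body is copied from the issue text, but the visibility alternation is elided in the source, so the `public|private|protected|internal` list below is an assumption, and `EventPatternDemo` / `ExtractName` are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EventPatternDemo
{
    // Assumption: the visibility alternation is a stand-in for the elided "..." in the source.
    // Everything after it mirrors the current event pattern, including the \S+ type tokenizer.
    private static readonly Regex EventPattern = new Regex(
        @"^\s*(?:(?<visibility>public|private|protected|internal)?\s+)?(?:(?:static)\s+)?event\s+\S+\s+(?<name>\w+)");

    // Returns the captured symbol name, or "(no match)" when the pattern fails.
    public static string ExtractName(string line)
    {
        Match m = EventPattern.Match(line);
        return m.Success ? m.Groups["name"].Value : "(no match)";
    }

    public static void Main()
    {
        string[] lines =
        {
            "    public event Action<string, int> NamedEvent;",     // captured as "int"
            "    public event Func<string, int, bool> Filter;",     // captured as "int"
            "    public event EventHandler<ChangedArgs> Changed;",  // captured as "Changed"
            "    public event Action OnReady;",                     // captured as "OnReady"
        };
        foreach (string line in lines)
        {
            Console.WriteLine($"{ExtractName(line),-12} <- {line.Trim()}");
        }
    }
}
```

Both multi-argument generics produce the same name, which is the collision seen in the symbols output.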

Walkthrough on delegate Task<Dictionary<string, int>> LoadAsync();:

  • \S+ matches Task<Dictionary<string, (first whitespace-stop).
  • \s+ matches the space.
  • (?<name>\w+) matches int.
  • \s*[\(<] expects ( or <; next char is >. Fails → row dropped entirely.
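
The delegate case can be checked the same way. As with any standalone reproduction of these patterns, the visibility alternation is an assumption (it is elided in the source), and `DelegatePatternDemo` / `ExtractName` are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DelegatePatternDemo
{
    // Assumption: the visibility alternation is a stand-in for the elided "..." in the source.
    // The trailing [\(<] terminator is what turns the misread name into a dropped row.
    private static readonly Regex DelegatePattern = new Regex(
        @"^\s*(?:(?<visibility>public|private|protected|internal)?\s+)?(?:(?:static|unsafe)\s+)?delegate\s+\S+\s+(?<name>\w+)\s*[\(<]");

    // Returns the captured symbol name, or "(no match)" when the pattern fails.
    public static string ExtractName(string line)
    {
        Match m = DelegatePattern.Match(line);
        return m.Success ? m.Groups["name"].Value : "(no match)";
    }

    public static void Main()
    {
        string[] lines =
        {
            "public delegate Task<int> GetIdAsync(string user);",           // "GetIdAsync"
            "public delegate Task<Dictionary<string, int>> LoadAsync();",   // "(no match)"
            "public delegate TResult Func<T1, T2, TResult>(T1 a, T2 b);",   // "Func"
        };
        foreach (string line in lines)
        {
            Console.WriteLine($"{ExtractName(line),-12} <- {line}");
        }
    }
}
```

No amount of backtracking rescues the LoadAsync line: every shorter \S+ candidate is followed by a non-whitespace character, so \s+ cannot match anywhere else.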

Suggested direction

Replace the naive \S+ tokenizer with a generic-aware returnType class, matching the approach used for methods in #222. A minimal patch that keeps the existing shape:

// Type token: any identifier-ish run that may contain matched generic bracket pairs with commas and whitespace
// (non-greedy; outer anchor does the heavy lifting)
static readonly string CsTypeToken = @"(?:\([^)]+\)|(?:global::)?[\w?.<>\[\],:\s]+?)";

new("delegate", new Regex(
    @"^\s*(?:(?<visibility>...)?\s+)?(?:(?:static|unsafe)\s+)?delegate\s+" + CsTypeToken + @"\s+(?<name>\w+)\s*[\(<]",
    ...), BodyStyle.None, "visibility"),

new("event", new Regex(
    @"^\s*(?:(?<visibility>...)?\s+)?(?:(?:static)\s+)?event\s+" + CsTypeToken + @"\s+(?<name>\w+)\s*[;={]",
    ...), BodyStyle.None, "visibility"),

Note the event regex also gains an anchor at the name's trailing position (;, =, or {). This is what disambiguates the trailing name from a middle-of-generic-arg identifier. Without this anchor, the lazy type class would stop at the first whitespace followed by an identifier and capture int again.
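
A quick way to sanity-check the suggested patterns against the repro lines is sketched below. `CsTypeToken` is copied from the patch; the `Visibility` fragment is an assumed stand-in for the elided alternation, and `SuggestedPatternDemo` / `ExtractName` are hypothetical names, not part of cdidx.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SuggestedPatternDemo
{
    // Copied from the suggested patch.
    private const string CsTypeToken = @"(?:\([^)]+\)|(?:global::)?[\w?.<>\[\],:\s]+?)";

    // Assumption: stand-in for the elided visibility alternation.
    private const string Visibility = @"(?:(?<visibility>public|private|protected|internal)?\s+)?";

    private static readonly Regex DelegatePattern = new Regex(
        @"^\s*" + Visibility + @"(?:(?:static|unsafe)\s+)?delegate\s+" + CsTypeToken + @"\s+(?<name>\w+)\s*[\(<]");

    private static readonly Regex EventPattern = new Regex(
        @"^\s*" + Visibility + @"(?:(?:static)\s+)?event\s+" + CsTypeToken + @"\s+(?<name>\w+)\s*[;={]");

    // Tries the delegate pattern, then the event pattern; returns the captured name.
    public static string ExtractName(string line)
    {
        foreach (Regex pattern in new[] { DelegatePattern, EventPattern })
        {
            Match m = pattern.Match(line);
            if (m.Success) return m.Groups["name"].Value;
        }
        return "(no match)";
    }

    public static void Main()
    {
        string[] lines =
        {
            "public delegate Task<Dictionary<string, int>> LoadAsync();",   // "LoadAsync"
            "public delegate TResult Func<T1, T2, TResult>(T1 a, T2 b);",   // "Func"
            "    public event Action<string, int> NamedEvent;",             // "NamedEvent"
            "    public event Func<string, int, bool> Filter;",             // "Filter"
            "    public event Action OnReady;",                             // "OnReady"
        };
        foreach (string line in lines)
        {
            Console.WriteLine($"{ExtractName(line),-12} <- {line.Trim()}");
        }
    }
}
```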

Alternatively, rewrite both patterns to use balanced-angle-bracket tokenization (as suggested in #222).
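
A rough sketch of what balanced-angle-bracket tokenization could look like with .NET balancing groups follows. Everything here is an assumption for illustration: the `BalancedType` token, the visibility stand-in, and the class and method names are hypothetical, and tuple return types are not covered. One input where it seems to matter is a nested generic in a non-first argument, such as delegate Task<Foo, Bar<int>> X(): with the lazy CsTypeToken, Bar is followed by < and so satisfies the delegate terminator, while the balanced token consumes the whole type first.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedTypeDemo
{
    // Assumption: stand-in for the elided visibility alternation.
    private const string Prefix =
        @"^\s*(?:(?<visibility>public|private|protected|internal)?\s+)?(?:(?:static|unsafe)\s+)?delegate\s+";

    // Lazy token copied from the suggested patch, kept here for comparison.
    private const string LazyType = @"(?:\([^)]+\)|(?:global::)?[\w?.<>\[\],:\s]+?)";

    // Hypothetical balanced token: dotted identifier, optional <...> list whose nested
    // brackets are tracked on the "open" stack, then any number of ? or [] suffixes.
    private const string BalancedType =
        @"(?:global::)?[\w.]+" +
        @"(?:<(?>[^<>]+|(?<open><)|(?<-open>>))*(?(open)(?!))>)?" +
        @"(?:\?|\[[\s,]*\])*";

    private const string Suffix = @"\s+(?<name>\w+)\s*[\(<]";

    private static readonly Regex LazyDelegate = new Regex(Prefix + LazyType + Suffix);
    private static readonly Regex BalancedDelegate = new Regex(Prefix + BalancedType + Suffix);

    private static string NameOf(Regex pattern, string line)
    {
        Match m = pattern.Match(line);
        return m.Success ? m.Groups["name"].Value : "(no match)";
    }

    public static string LazyName(string line) => NameOf(LazyDelegate, line);

    public static string BalancedName(string line) => NameOf(BalancedDelegate, line);

    public static void Main()
    {
        string[] lines =
        {
            "public delegate Task<Dictionary<string, int>> LoadAsync();",
            "public delegate TResult Func<T1, T2, TResult>(T1 a, T2 b);",
            "public delegate Task<Foo, Bar<int>> X();",
        };
        foreach (string line in lines)
        {
            Console.WriteLine($"lazy={LazyName(line),-10} balanced={BalancedName(line),-10} <- {line}");
        }
    }
}
```

The two tokens agree on the first two lines; on the third, the lazy token yields Bar and the balanced one yields X.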

Why it matters

  • Events with multi-argument Action<T1, T2> / Func<T1, T2, TResult> handlers are a mainstream .NET pattern: they appear across WPF, Blazor, Orleans, MediatR, and many other eventing APIs.
  • The failure mode here is worse than "drop": events are indexed under a nonsense, colliding name (int), which actively pollutes symbols, hotspots, unused, and any name-based search. An AI tool asked "find all events" gets int with two source rows.
  • definition NamedEvent returns no rows, so an AI agent is misled into thinking the event doesn't exist.
  • LoadAsync (and any delegate returning Task<Dictionary<K, V>> / Task<Tuple<...>>) silently vanishes.

Cross-language note

Java has analogous functional-interface usage (Function<String, Integer>, BiConsumer<K, V>), but Java interfaces / methods are handled by the regular method regex, not a dedicated event/delegate pattern, so this specific bug is C#-only. However, the underlying tokenization problem (a type character class such as [\w?.<>\[\],:]+ that does not allow \s) is the same family as #222, and a shared generic-type helper would close both gaps in one change.

Scope

  • src/CodeIndex/Indexer/SymbolExtractor.cs:105-107 — rewrite delegate/event type tokenizers.
  • tests/CodeIndex.Tests/SymbolExtractorTests.cs — fixtures: event Action<K, V>, event Func<A, B, C>, event Action (no generic), event EventHandler<T>, delegate Task<Dict<K, V>> X(), delegate void X<T1, T2>(T1 a, T2 b).

Related

Environment

  • cdidx: v1.10.0 (tarball from GitHub releases).
  • Platform: linux-x64 container.
  • Filed from a cloud Claude Code session per CLOUD_BOOTSTRAP_PROMPT.md.
