Summary
The C# event and delegate patterns use \S+ (a single run of non-whitespace) to absorb the type between the event / delegate keyword and the symbol name. This works for single-token types (Action, void, Task<int>) but breaks on multi-argument generics because \S+ stops at the first whitespace, which in idiomatically formatted code is the space after the comma inside <K, V>. Result:
event Action<string, int> NamedEvent; → indexed as event kind named int.
event Func<string, int, bool> Filter; → also indexed as event int. Two events collide on the same bogus name.
public delegate Task<Dictionary<string, int>> LoadAsync(); → not captured at all (pattern fails on the mismatched terminator).
This is adjacent to #222 (methods dropping when the return type has a space in generics) but a distinct extractor pattern with a different failure mode: silent wrong name for events, silent disappearance for delegates.
Repro
CDIDX=/root/.local/bin/cdidx
mkdir -p /tmp/dogfood/csev
cat > /tmp/dogfood/csev/D.cs <<'EOF'
namespace App;
using System;
public delegate Task<int> GetIdAsync(string user); // OK
public delegate Task<Dictionary<string, int>> LoadAsync(); // DROPPED
public delegate TResult Func<T1, T2, TResult>(T1 a, T2 b); // OK (no space in return)
public class Pub
{
public event Action<string, int> NamedEvent; // name becomes "int"
public event Func<string, int, bool> Filter; // name becomes "int"
public event EventHandler<ChangedArgs> Changed; // OK (single-arg generic)
public event Action OnReady; // OK (non-generic)
}
EOF
"$CDIDX" /tmp/dogfood/csev --db /tmp/dogfood/csev.db
"$CDIDX" symbols --db /tmp/dogfood/csev.db
Actual output (abridged):
event Changed D.cs:17
delegate Func D.cs:7
delegate GetIdAsync D.cs:5
event OnReady D.cs:20
event int D.cs:13 # should be NamedEvent
event int D.cs:14 # should be Filter; collides with NamedEvent
# (LoadAsync delegate entirely missing)
symbols --name NamedEvent --exact returns nothing. symbols --name int returns two rows of events — neither of which is semantically named int.
Suspected root cause (from reading the source)
src/CodeIndex/Indexer/SymbolExtractor.cs:105 (delegate):
new("delegate", new Regex(
@"^\s*(?:(?<visibility>...)?\s+)?(?:(?:static|unsafe)\s+)?delegate\s+\S+\s+(?<name>\w+)\s*[\(<]",
...), BodyStyle.None, "visibility"),
src/CodeIndex/Indexer/SymbolExtractor.cs:107 (event):
new("event", new Regex(
@"^\s*(?:(?<visibility>...)?\s+)?(?:(?:static)\s+)?event\s+\S+\s+(?<name>\w+)",
...), BodyStyle.None, "visibility"),
Walkthrough on event Action<string, int> NamedEvent;:
event matches.
\S+ greedy-matches Action<string, (stops at the space after ,).
\s+ matches the space.
(?<name>\w+) matches int.
The regex is done; the trailing > NamedEvent; is ignored.
Walkthrough on delegate Task<Dictionary<string, int>> LoadAsync();:
\S+ matches Task<Dictionary<string, (first whitespace-stop).
\s+ matches the space.
(?<name>\w+) matches int.
\s*[\(<] expects ( or <; next char is >. Fails → row dropped entirely.
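The two walkthroughs can be reproduced in isolation with .NET's Regex class. This is a minimal sketch, not the extractor's code: it uses only the tail of each quoted pattern from the keyword onward, because the visibility/modifier prefix is elided in the excerpt above, and the helper name Capture is made up for the example. It assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Tails of the two quoted patterns, from the keyword onward. The elided
// visibility/modifier prefix is omitted, so these are approximations.
var oldEvent = new Regex(@"event\s+\S+\s+(?<name>\w+)");
var oldDelegate = new Regex(@"delegate\s+\S+\s+(?<name>\w+)\s*[\(<]");

// Hypothetical helper: the captured name, or "(no match)" if the pattern fails.
string Capture(Regex re, string line)
{
    Match m = re.Match(line);
    return m.Success ? m.Groups["name"].Value : "(no match)";
}

string namedEvent = Capture(oldEvent, "public event Action<string, int> NamedEvent;");
string filter = Capture(oldEvent, "public event Func<string, int, bool> Filter;");
string changed = Capture(oldEvent, "public event EventHandler<ChangedArgs> Changed;");
string loadAsync = Capture(oldDelegate, "public delegate Task<Dictionary<string, int>> LoadAsync();");
string getIdAsync = Capture(oldDelegate, "public delegate Task<int> GetIdAsync(string user);");

Console.WriteLine($"NamedEvent -> {namedEvent}");   // int
Console.WriteLine($"Filter     -> {filter}");       // int
Console.WriteLine($"Changed    -> {changed}");      // Changed
Console.WriteLine($"LoadAsync  -> {loadAsync}");    // (no match)
Console.WriteLine($"GetIdAsync -> {getIdAsync}");   // GetIdAsync
```

The delegate case fails outright because no shorter \S+ run is followed by whitespace, so backtracking has nowhere to go once [\(<] rejects the > after int.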
Suggested direction
Replace the greedy \S+ tokenizer with a generic-aware returnType class matching the approach used for methods in #222. A minimal patch that keeps the existing shape:
// Type token: an identifier-ish run that may contain generic brackets, commas, and whitespace.
// Brackets are not balance-checked; the match is lazy, so the anchor after the name decides where the type ends.
static readonly string CsTypeToken = @"(?:\([^)]+\)|(?:global::)?[\w?.<>\[\],:\s]+?)";
new("delegate", new Regex(
@"^\s*(?:(?<visibility>...)?\s+)?(?:(?:static|unsafe)\s+)?delegate\s+" + CsTypeToken + @"\s+(?<name>\w+)\s*[\(<]",
...), BodyStyle.None, "visibility"),
new("event", new Regex(
@"^\s*(?:(?<visibility>...)?\s+)?(?:(?:static)\s+)?event\s+" + CsTypeToken + @"\s+(?<name>\w+)\s*[;={]",
...), BodyStyle.None, "visibility"),
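Note that the event regex also gains an anchor after the name (;, =, or {). This is what distinguishes the real name from an identifier in the middle of the generic argument list: with the anchor in place, the candidate int is rejected because it is followed by >, and the lazy type token keeps extending until the captured name is followed by a terminator. Without the anchor, the lazy type token would stop at the first whitespace that is followed by an identifier and capture int again.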
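The effect of the trailing anchor can be checked directly. The sketch below copies CsTypeToken from the suggested patch and compares the event pattern with and without the anchor; as before, the elided visibility/modifier prefix is left out, the names noAnchor, withAnchor, newDelegate, and Name are invented for the example, and C# 9 top-level statements are assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// CsTypeToken is copied from the suggested patch; the elided prefix is omitted.
const string CsTypeToken = @"(?:\([^)]+\)|(?:global::)?[\w?.<>\[\],:\s]+?)";
var noAnchor = new Regex(@"event\s+" + CsTypeToken + @"\s+(?<name>\w+)");
var withAnchor = new Regex(@"event\s+" + CsTypeToken + @"\s+(?<name>\w+)\s*[;={]");
var newDelegate = new Regex(@"delegate\s+" + CsTypeToken + @"\s+(?<name>\w+)\s*[\(<]");

// Hypothetical helper: the captured name, or "" when the pattern does not match.
string Name(Regex re, string line) => re.Match(line).Groups["name"].Value;

string line = "public event Action<string, int> NamedEvent;";
string lazyOnly = Name(noAnchor, line);
string anchored = Name(withAnchor, line);
string filter = Name(withAnchor, "public event Func<string, int, bool> Filter;");
string load = Name(newDelegate, "public delegate Task<Dictionary<string, int>> LoadAsync();");

Console.WriteLine($"lazy, no anchor   -> {lazyOnly}");  // int
Console.WriteLine($"lazy, with anchor -> {anchored}");  // NamedEvent
Console.WriteLine($"Filter            -> {filter}");    // Filter
Console.WriteLine($"LoadAsync         -> {load}");      // LoadAsync
```

The delegate pattern already had a trailing [\(<], which is why swapping in the lazy token is enough there.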
Alternatively, rewrite both patterns to use balanced-angle-bracket tokenization (as suggested in #222).
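One possible reading of "balanced-angle-bracket tokenization" is .NET's balancing group construct, which lets the type token consume nested <...> pairs as a unit. The sketch below is an assumption about what such a rewrite could look like, not what #222 actually proposes; BalancedType and the group name open are invented here, tuple types and global:: prefixes are deliberately not handled, and the elided visibility/modifier prefix is omitted.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical type token: a dotted identifier, an optional balanced <...>
// argument list, then any number of ? or [] suffixes.
// (?<open><) pushes on each nested '<', (?<-open>>) pops on its matching '>',
// and (?(open)(?!)) fails the match if any '<' is left unclosed.
const string BalancedType =
    @"[\w.]+(?:<(?:[^<>]|(?<open><)|(?<-open>>))*(?(open)(?!))>)?(?:\?|\[\])*";

var eventRe = new Regex(@"event\s+" + BalancedType + @"\s+(?<name>\w+)");
var delegateRe = new Regex(@"delegate\s+" + BalancedType + @"\s+(?<name>\w+)\s*[\(<]");

// Hypothetical helper: the captured name, or "" when the pattern does not match.
string Name(Regex re, string line) => re.Match(line).Groups["name"].Value;

string namedEvent = Name(eventRe, "public event Action<string, int> NamedEvent;");
string onReady = Name(eventRe, "public event Action OnReady;");
string loadAsync = Name(delegateRe, "public delegate Task<Dictionary<string, int>> LoadAsync();");
string func = Name(delegateRe, "public delegate TResult Func<T1, T2, TResult>(T1 a, T2 b);");

Console.WriteLine($"{namedEvent} {onReady} {loadAsync} {func}");
// NamedEvent OnReady LoadAsync Func
```

Because the type token ends exactly where the outermost > closes, the event pattern in this variant does not depend on a trailing ;, =, or { anchor to find the name.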
Why it matters
- Events with multi-argument Action<T1, T2> / Func<T1, T2, TResult> handlers are a mainstream .NET pattern, used across WPF, Blazor, Orleans, MediatR, and other eventing APIs.
- The failure mode here is worse than a drop: events are indexed under a nonsensical, colliding name (int), which actively pollutes symbols, hotspots, unused, and any name-based search. An AI tool asked to "find all events" gets int with two source rows.
- definition NamedEvent returns no rows, so an AI agent is misled into thinking the event doesn't exist.
- LoadAsync (and any delegate returning Task<Dictionary<K, V>> / Task<Tuple<...>>) silently vanishes.
Cross-language note
Java has analogous functional-interface usage (Function<String, Integer>, BiConsumer<K, V>), but Java interfaces and methods are handled by the regular method regex, not a dedicated event/delegate pattern, so this specific bug is C#-only. However, the underlying tokenization problem (a type character class such as [\w?.<>\[\],:]+ that does not admit \s) is the same family as #222, and a shared generic-type helper would close both gaps in one change.
Scope
src/CodeIndex/Indexer/SymbolExtractor.cs:105-107 — rewrite delegate/event type tokenizers.
tests/CodeIndex.Tests/SymbolExtractorTests.cs — fixtures: event Action<K, V>, event Func<A, B, C>, event Action (no generic), event EventHandler<T>, delegate Task<Dict<K, V>> X(), delegate void X<T1, T2>(T1 a, T2 b).
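The fixture list can be turned into a small table-driven check. The sketch below is framework-free because the repository's test framework is not shown here. It runs the suggested CsTypeToken patterns (without the elided visibility/modifier prefix) against one line per fixture. The member names PairChanged, Combine, Ready, and Raised are placeholders, and C# 9 top-level statements are assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// Patterns from the suggested patch, minus the elided prefix.
const string CsTypeToken = @"(?:\([^)]+\)|(?:global::)?[\w?.<>\[\],:\s]+?)";
var patterns = new (string Kind, Regex Re)[]
{
    ("event", new Regex(@"event\s+" + CsTypeToken + @"\s+(?<name>\w+)\s*[;={]")),
    ("delegate", new Regex(@"delegate\s+" + CsTypeToken + @"\s+(?<name>\w+)\s*[\(<]")),
};

// One line per fixture from the list above, with the expected kind and name.
var fixtures = new (string Line, string Kind, string Name)[]
{
    ("public event Action<K, V> PairChanged;", "event", "PairChanged"),
    ("public event Func<A, B, C> Combine;", "event", "Combine"),
    ("public event Action Ready;", "event", "Ready"),
    ("public event EventHandler<T> Raised;", "event", "Raised"),
    ("public delegate Task<Dict<K, V>> X();", "delegate", "X"),
    ("public delegate void X<T1, T2>(T1 a, T2 b);", "delegate", "X"),
};

int failures = 0;
foreach (var (line, kind, name) in fixtures)
{
    string got = "(none)";
    foreach (var (k, re) in patterns)
    {
        Match m = re.Match(line);
        if (m.Success) { got = $"{k} {m.Groups["name"].Value}"; break; }
    }
    if (got != $"{kind} {name}")
    {
        failures++;
        Console.WriteLine($"FAIL: {line} -> {got}");
    }
}
Console.WriteLine(failures == 0 ? "all fixtures pass" : $"{failures} fixture(s) failed");
```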
Related
- #222: C# method returnType space-in-generics (same family).
- #160, #175, #219: C# extractor false-positives / false-negatives. #160 concerns int and void appearing as function names on function-pointer typedefs and function-returning-pointer declarations; #175 concerns ? Foo(args) and : Foo(args) ternary continuation lines captured as function declarations; #219 concerns [assembly: Attr(args)] / [module: Attr(args)] lines falsely indexed as method definitions.
Environment
- cdidx: v1.10.0 (tarball from GitHub releases).
- Platform: linux-x64 container.
- Filed from a cloud Claude Code session per CLOUD_BOOTSTRAP_PROMPT.md.