JIT miscompiles permanently-hot reflection-built generic delegate site (multiple symptoms incl. OverflowException at MemoryMarshal.AsBytes, NullReferenceException in RuntimeType internals, AV at teardown) #128976

@marklam

Description

I found this problem while trying to optimize the PureHDF package to reduce reflection overhead. I've used AI tooling to try to capture the failure modes and build a standalone repo to demonstrate the problem.

When a single instance of a generic delegate built via reflection is
held alive by a cache and called millions of times, the JIT
intermittently produces bad code for the call path. The symptom is not
constant — across runs of the same binary on the same machine I've
observed at least six distinct failure modes from what is almost
certainly one underlying codegen bug:

| Symptom | Where |
| --- | --- |
| System.OverflowException at MemoryMarshal.AsBytes | checked(span.Length * sizeof(T)) with Length = 1, sizeof(T) = 12 |
| System.Exception: total file element count != total memory element count | An upstream ulong[1].Aggregate(...) returns a wrong value |
| System.InvalidOperationException from the wrong branch of an "is null" check | (buffer is null \|\| buffer.Equals(default)) is false when buffer is default(T) |
| System.NullReferenceException inside System.RuntimeType.ListBuilder<T>.Add(T) | Runtime/reflection internals stepped on |
| Unhandled System.NullReferenceException at a plain { get; } auto-property | get_Message() returns null for a non-null readonly field; "this" is corrupted |
| System.EntryPointNotFoundException at System.IDisposable.Dispose() during teardown | Method-table corruption visible during GC/finalization |
| Process abort with exit code 0xC0000005 during BDN AfterActualRun | Heap corruption that GC walks into |

All of them disappear with DOTNET_TieredCompilation=0, which strongly
suggests this is a tier-up codegen issue.

Reproduction Steps

Self-contained repro: https://github.com/marklam/tierproblems

A copy is in this comment for convenience (also see the full repro at
the URL above for the multi-target csproj, ~250 LOC of inlined helper
classes, and the README):

git clone https://github.com/marklam/TierProblems.git tier-problems
cd tier-problems
dotnet build -c Release
dotnet run -c Release --no-build --framework net8.0     # frequent hit
dotnet run -c Release --no-build --framework net10.0    # rare hit

The program builds a chain that mirrors PureHDF's read path:

  • An outer cached Reader<TResult> delegate per (TResult, TElement)
    pair, built once via MethodInfo.MakeGenericMethod(...).CreateDelegate(...)
    pointing at an instance generic method on the receiver class.

  • An inner cached DecodeDelegate<TElement> per (TElement, isRawMode)
    pair on the message instance, built once via
    MakeGenericMethod(...).Invoke(...) of a static helper that returns a
    static-local-function delegate of the form

    static void decode(IH5ReadStream source, Span<T> target)
        => source.ReadDataset(MemoryMarshal.AsBytes(target));

  • A hot loop that calls the outer cached delegate ~200M times on the
    same receiver, which routes through ReadCoreLevel1_generic<TResult, TElement>
    → ReadCoreLevel2<TElement> → the cached inner decoder → the static
    local function → MemoryMarshal.AsBytes.
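For readers who do not want to open the repo, the chain described above can be sketched roughly as follows. This is a minimal illustration, not the repro's or PureHDF's code: IReadStream, CountingStream, AttributeSketch, ReadCore, BuildReader, BuildDecoder and GetDecoder are all hypothetical names, the sketch only handles array results, and it keys the outer cache on TResult alone rather than on the (TResult, TElement) pair. It is not expected to trigger the bug by itself.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;
using System.Runtime.InteropServices;

// All names below are hypothetical stand-ins for the types in the repro.
public interface IReadStream { void ReadDataset(Span<byte> buffer); }

public sealed class CountingStream : IReadStream
{
    public long BytesRead;
    public void ReadDataset(Span<byte> buffer) { BytesRead += buffer.Length; buffer.Clear(); }
}

public delegate TResult Reader<TResult>(TResult buffer, IReadStream source);
public delegate void DecodeDelegate<T>(IReadStream source, Span<T> target) where T : struct;

public sealed class AttributeSketch
{
    private readonly ConcurrentDictionary<Type, Delegate> _readers = new();
    private readonly ConcurrentDictionary<Type, Delegate> _decoders = new();

    // Entry point: fetch the outer delegate from the cache, building it once on first use.
    public TResult Read<TResult>(TResult buffer, IReadStream source)
    {
        if (!_readers.TryGetValue(typeof(TResult), out Delegate? cached))
            cached = _readers.GetOrAdd(typeof(TResult), BuildReader<TResult>());
        return ((Reader<TResult>)cached)(buffer, source);
    }

    // Outer delegate: MakeGenericMethod + CreateDelegate over an instance generic method.
    private Reader<TResult> BuildReader<TResult>()
    {
        Type elementType = typeof(TResult).GetElementType()
            ?? throw new NotSupportedException("This sketch only handles array results.");
        MethodInfo closed = typeof(AttributeSketch)
            .GetMethod(nameof(ReadCore), BindingFlags.Instance | BindingFlags.NonPublic)!
            .MakeGenericMethod(typeof(TResult), elementType);
        return closed.CreateDelegate<Reader<TResult>>(this);
    }

    private TResult ReadCore<TResult, TElement>(TResult buffer, IReadStream source)
        where TElement : struct
    {
        if (!_decoders.TryGetValue(typeof(TElement), out Delegate? cached))
            cached = _decoders.GetOrAdd(typeof(TElement), BuildDecoder<TElement>());
        Span<TElement> target = (TElement[])(object)buffer!;
        ((DecodeDelegate<TElement>)cached)(source, target);
        return buffer;
    }

    // Inner delegate: MakeGenericMethod + Invoke of a static helper.
    private static DecodeDelegate<TElement> BuildDecoder<TElement>() where TElement : struct
    {
        MethodInfo helper = typeof(AttributeSketch)
            .GetMethod(nameof(GetDecoder), BindingFlags.Static | BindingFlags.NonPublic)!
            .MakeGenericMethod(typeof(TElement));
        return (DecodeDelegate<TElement>)helper.Invoke(null, null)!;
    }

    // The helper returns a delegate over a static local function.
    private static DecodeDelegate<T> GetDecoder<T>() where T : struct
    {
        return decode;

        static void decode(IReadStream source, Span<T> target)
            => source.ReadDataset(MemoryMarshal.AsBytes(target));
    }
}

public static class Program
{
    public static void Main()
    {
        var attribute = new AttributeSketch();
        var source = new CountingStream();
        var buffer = new int[1];
        for (int i = 0; i < 1_000_000; i++)
            attribute.Read(buffer, source);
        Console.WriteLine(source.BytesRead);
    }
}
```

Both delegates are created once and then reused for every call, so the same delegate instances and the same generic instantiations stay hot for the whole run, which is the situation the report describes.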

The try { } catch (Exception) { … } around each hot-loop call
captures most symptoms; some manifestations (mid-warmup
NullReferenceException from a corrupted this, the AV) skip the
catch and tear the process down.
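A per-call try/catch of this kind might look like the sketch below. HotLoop and Run are hypothetical names, and the output format only imitates the "HIT after ..." lines quoted under the sample stack traces; the actual harness in the repo may differ. Number formatting with N0 depends on the current culture.

```csharp
using System;
using System.Diagnostics;

public static class HotLoop
{
    // Calls `call` up to `iterations` times.
    // Returns the 1-based index of the first call that threw, or -1 if none did.
    public static long Run(Action call, long iterations)
    {
        var clock = Stopwatch.StartNew();
        for (long i = 1; i <= iterations; i++)
        {
            try
            {
                call();
            }
            catch (Exception ex)
            {
                // Managed exceptions land here; an access violation does not.
                Console.WriteLine($"HIT after {i:N0} calls in {clock.Elapsed.TotalSeconds:F1}s");
                Console.WriteLine($"  outer: {ex}");
                return i;
            }
        }
        return -1;
    }
}
```

A loop like this can only report failures that surface as managed exceptions inside the guarded call. A failure during an unguarded warm-up phase, or an access violation, still ends the process.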

Expected behavior

The hot loop should complete deterministically. MemoryMarshal.AsBytes
on a Span<T> of length 1 cannot overflow. A field-backed { get; }
auto-property on a non-null instance cannot return null. Equals on
default(T) against default(T) for a sequential struct cannot
return false.
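These expectations can be checked in isolation with a few lines. The sketch below assumes a 12-byte sequential struct made of three int fields; the field layout of the real Sample type is an assumption here, chosen only to match sizeof(T) = 12 from the symptom table. Invariants, ByteLengthOfOne and IsDefault are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical 12-byte struct; the real Sample type may have different fields.
[StructLayout(LayoutKind.Sequential)]
public struct Sample
{
    public int A;
    public int B;
    public int C;
}

public static class Invariants
{
    // AsBytes multiplies Length (1) by the element size (12); this cannot overflow an int.
    public static int ByteLengthOfOne()
    {
        Span<Sample> one = new Sample[1];
        return MemoryMarshal.AsBytes(one).Length;
    }

    // Same shape as the check quoted in the symptom table.
    public static bool IsDefault<T>(T buffer)
        => buffer is null || buffer.Equals(default(T));

    public static void Main()
    {
        Console.WriteLine(Unsafe.SizeOf<Sample>());       // 12
        Console.WriteLine(ByteLengthOfOne());             // 12
        Console.WriteLine(IsDefault(default(Sample)));    // True
    }
}
```

On a correctly behaving runtime these values are fixed no matter how many times the methods run, which is why the intermittent failures point at generated code rather than at program logic.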

Actual behavior

After a variable number of calls (anywhere from <1M to ~200M
observed), one of the symptoms above fires.

Repro hit-rate (15+ runs each, default tier-up)

13th Gen Intel Core i9-13900KS / Windows 11 (10.0.26200) / x64 / .NET SDK 10.0.300 / Server GC:

| Runtime | Sample (runs) | Failures | Rate |
| --- | --- | --- | --- |
| 8.0.27 (8.0.2726.22922) | 15 | 10 | ~67% |
| 10.0.8 | 45 | 3 | ~7% |
| 8.0.27 + DOTNET_TieredCompilation=0 | 6 | 0 | 0% |

So the bug is present in both .NET 8 and .NET 10 — .NET 10 just
misses the bad codegen path much more often (and is meaningfully faster
overall on the same loop, ~17s vs ~30s per 200M calls). Disabling
tier-up consistently eliminates it.

Sample stack traces

InvalidOperationException from buffer.Equals(default(TResult))
returning false for default(Sample):

HIT after 22,646,862 calls in 4.0s
  outer: System.InvalidOperationException: JIT corruption: the 'buffer is default' check returned false even though the caller passed default(TResult).

at TierProblems.Inlined.NativeAttribute.ReadCoreLevel1_generic[TResult,TElement](TResult buffer, IH5ReadStream source, UInt64[] memoryDims)
at TierProblems.Inlined.NativeAttribute.Read[T](UInt64[] memoryDims)

NullReferenceException in runtime reflection internals:

HIT after 156,419,186 calls in 24.0s
  outer: System.NullReferenceException: Object reference not set to an instance of an object.

at System.RuntimeType.ListBuilder`1.Add(T item)
at TierProblems.Inlined.NativeAttribute.ReadCoreLevel1_generic[TResult,TElement](TResult buffer, IH5ReadStream source, UInt64[] memoryDims)

NullReferenceException from a field-backed auto-property (on .NET 10
during the warmup loop, before the catch):

Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
   at TierProblems.Inlined.NativeAttribute.get_Message()
   at TierProblems.Inlined.NativeAttribute.GetDecoderAndFileElementCount[TElement]()
   at TierProblems.Inlined.NativeAttribute.ReadCoreLevel1_generic[TResult,TElement](TResult buffer, IH5ReadStream source, UInt64[] memoryDims)

OverflowException at MemoryMarshal.AsBytes (the original symptom
that motivated this repro; observed against the unmodified PureHDF
codebase before the workaround):

System.OverflowException: Arithmetic operation resulted in an overflow.
at PureHDF.VOL.Native.DatatypeMessage.<GetDecodeInfoForUnmanagedMemory>g__decode|46_0[T](IH5ReadStream source, Span`1 target)
at PureHDF.VOL.Native.NativeAttribute.ReadCoreLevel1_generic[TResult,TElement](...)

Notes

  • A "shape-only" repro that mirrored the call structure but didn't
    carry PureHDF's per-call work (allocation of SystemMemoryStream,
    MemoryManager<T> virtual GetSpan(), ulong[1], new T[1], the
    reflection chain into RuntimeHelpers.IsReferenceOrContainsReferences<T>)
    ran ~7× faster per call and never reproduced the bug at 1B+ calls.
    Inlining the actual classes verbatim (which slows the loop to ~6M
    calls/sec) brings the bug back. The bug appears to be sensitive to
    how much work the JIT compiles into the tier-up target, not just
    the abstract call shape — small/cheap targets may get folded enough
    to bypass it.
  • The bug fires while iterating only one cache entry, but reliably
    reproducing requires priming several (TResult, TElement) pairs in
    the outer cache during warm-up first. With a single entry primed it
    never reproduced in 200M+ calls.
  • A ProjectReference to the affected library (PureHDF) without
    calling any of its methods does not reproduce the bug — so this
    isn't a module-initialiser / .cctor / metadata-load issue. The
    corrupt codegen requires the affected methods to actually run and
    reach tier 1.

Regression?

No response

Known Workarounds

Set the environment variable DOTNET_TieredCompilation=0 before launching the process.
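When comparing runs with and without the workaround, it can help to have the process print what it is running under. The sketch below is one possible way to do that; RunInfo and Describe are hypothetical names. It only inspects the environment variable, so tiered-compilation settings supplied by other means (project file or runtime configuration) would not be reflected.

```csharp
using System;
using System.Runtime;
using System.Runtime.InteropServices;

public static class RunInfo
{
    // Summarises the runtime, architecture, GC mode and the workaround variable.
    public static string Describe()
    {
        string tiered = Environment.GetEnvironmentVariable("DOTNET_TieredCompilation") ?? "(unset)";
        return $"{RuntimeInformation.FrameworkDescription}, " +
               $"{RuntimeInformation.ProcessArchitecture}, " +
               $"ServerGC={GCSettings.IsServerGC}, " +
               $"DOTNET_TieredCompilation={tiered}";
    }

    public static void Main() => Console.WriteLine(Describe());
}
```

Logging this line at the start of each run makes it easier to attribute a hit or a clean run to the right row of the hit-rate table.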

Configuration

  • Windows 11 Pro (10.0.26200.8524), x64
  • 13th Gen Intel Core i9-13900KS, 24c/24t
  • .NET SDK 10.0.300
  • Runtimes tested: 8.0.27, 10.0.8
  • Server GC enabled (<ServerGarbageCollection>true</ServerGarbageCollection>); behaviour unchanged with workstation GC on a brief check.

Other information

No response

Labels

area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
