Optimize memory usage when building the blob heap. by teo-tsirpanis · Pull Request #127304 · dotnet/runtime

teo-tsirpanis · 2026-04-22T23:01:16Z

Background

When building the blob heap, MetadataBuilder keeps track of the blobs added, to avoid adding them multiple times. Originally, this was done with a Dictionary<ImmutableArray<byte>, BlobHandle> and a custom comparer that compared the keys by value. This approach had the disadvantage of always allocating an ImmutableArray<byte> whenever GetOrAddBlob was called with anything other than an immutable array. #81059 improved the situation and eliminated most allocations when the blob already exists. However, several optimization opportunities remain in how we build the blob heap:

We still get an allocation when we call GetOrAddBlob with a multi-chunk BlobBuilder, even if the blob already existed.
Adding a new blob to the heap still ends up making an allocation.
Unlike other heap types, the blob heap gets written in random order, which requires allocating a contiguous memory block as large as the size of the entire blob heap. This subverts BlobBuilder's pooling and chunking facilities, and leads to an LOH allocation.

This PR fixes all of the above.
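For readers unfamiliar with the deduplication being optimized, here is a minimal sketch of the behavior as seen through the public API. It only uses existing `MetadataBuilder` and `BlobBuilder` members; the class and method names (`BlobDedupDemo`, and so on) are made up for illustration and are not part of the PR.

```csharp
using System;
using System.Reflection.Metadata;
using System.Reflection.Metadata.Ecma335;

// Hypothetical demo class; not part of System.Reflection.Metadata.
static class BlobDedupDemo
{
    // Adding equal content twice (from different arrays) yields the same handle.
    public static bool SameHandleForEqualBytes()
    {
        var builder = new MetadataBuilder();
        BlobHandle first = builder.GetOrAddBlob(new byte[] { 1, 2, 3 });
        BlobHandle second = builder.GetOrAddBlob(new byte[] { 1, 2, 3 });
        return first == second;
    }

    // The BlobBuilder overload deduplicates against the same heap.
    public static bool SameHandleAcrossOverloads()
    {
        var builder = new MetadataBuilder();
        BlobHandle fromArray = builder.GetOrAddBlob(new byte[] { 1, 2, 3 });

        var blob = new BlobBuilder();
        blob.WriteBytes(new byte[] { 1, 2, 3 });
        BlobHandle fromBuilder = builder.GetOrAddBlob(blob);
        return fromArray == fromBuilder;
    }

    // Different content yields a different handle.
    public static bool DistinctHandleForDifferentBytes()
    {
        var builder = new MetadataBuilder();
        BlobHandle a = builder.GetOrAddBlob(new byte[] { 1, 2, 3 });
        BlobHandle b = builder.GetOrAddBlob(new byte[] { 4, 5, 6 });
        return a != b;
    }
}
```

The observable results are the same before and after this PR; what changes is how many allocations each call makes internally.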

Changes

Instead of keeping track of each blob as an ImmutableArray<byte> and writing the blob heap at the end, we write the blob heap to a BlobBuilder as each blob gets added, and keep track of each blob by its position within that BlobBuilder.

In order to do that, BlobBuilder was extended to support writing data that can later be referenced using a BlobBuilder.Segment struct. This functionality is internal-only; it slightly alters some invariants of BlobBuilder, but is invisible to external consumers. Segment-addressable data is written in chunks of increasing size, up to 8K bytes, matching the behavior of StringBuilder. This chunking logic will be made user-configurable and expanded to all BlobBuilder APIs as part of #100418.
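To give a feel for "chunks of increasing size, up to 8K bytes", here is a rough sketch of one possible growth policy. The doubling rule, the starting size passed by the caller, and the name `ChunkGrowthSketch` are all assumptions for illustration; only the 8K cap comes from the description above, and the actual policy in the PR may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; it illustrates a capped growth policy, not the PR's code.
static class ChunkGrowthSketch
{
    // Cap taken from the PR description (8K bytes).
    public const int MaxChunkSize = 8 * 1024;

    // Assumed policy: each new chunk doubles the previous one until the cap is hit.
    public static List<int> ChunkSizes(int firstChunkSize, int chunkCount)
    {
        var sizes = new List<int>(chunkCount);
        int size = Math.Min(firstChunkSize, MaxChunkSize);
        for (int i = 0; i < chunkCount; i++)
        {
            sizes.Add(size);
            size = Math.Min(size * 2, MaxChunkSize);
        }
        return sizes;
    }
}
```

With a first chunk of 256 bytes, this policy produces 256, 512, 1024, 2048, 4096, 8192, and then stays at 8192. Because already written chunks are never moved or resized, a position inside one of them stays valid, which is presumably what makes a segment reference stable.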

Afterwards, BlobDictionary was updated to use BlobBuilder.Segment as its key type, and append to the BlobBuilder to get a segment when a blob does not already exist. Also, the modern .NET implementation of BlobDictionary was significantly simplified by making use of the AlternateLookup API.
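The idea of keying a dictionary by a position in an already written buffer, while looking entries up by raw bytes, can be sketched with the AlternateLookup API (.NET 9 and C# 13 or later). This is a simplified illustration under several assumptions: `Segment`, `SegmentComparer`, and `BlobIndexSketch` are hypothetical stand-ins, a `List<byte>` replaces the chunked BlobBuilder, and the length prefix that real blob heap entries carry is omitted.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical stand-in for BlobBuilder.Segment: a position inside the shared heap.
readonly record struct Segment(int Start, int Length);

// Compares segments by the bytes they point at, and lets spans act as alternate keys.
sealed class SegmentComparer(List<byte> heap) :
    IEqualityComparer<Segment>,
    IAlternateEqualityComparer<ReadOnlySpan<byte>, Segment>
{
    private ReadOnlySpan<byte> Bytes(Segment s) =>
        CollectionsMarshal.AsSpan(heap).Slice(s.Start, s.Length);

    private static int Hash(ReadOnlySpan<byte> bytes)
    {
        var hash = new HashCode();
        hash.AddBytes(bytes);
        return hash.ToHashCode();
    }

    public bool Equals(Segment x, Segment y) => Bytes(x).SequenceEqual(Bytes(y));
    public int GetHashCode(Segment s) => Hash(Bytes(s));

    public bool Equals(ReadOnlySpan<byte> alternate, Segment other) =>
        alternate.SequenceEqual(Bytes(other));
    public int GetHashCode(ReadOnlySpan<byte> alternate) => Hash(alternate);

    // Called by the dictionary only when a new key is inserted:
    // the bytes are appended to the heap and the new position becomes the key.
    public Segment Create(ReadOnlySpan<byte> alternate)
    {
        var segment = new Segment(heap.Count, alternate.Length);
        heap.AddRange(alternate);
        return segment;
    }
}

// Hypothetical index that appends each distinct blob to the heap exactly once.
sealed class BlobIndexSketch
{
    private readonly List<byte> _heap = new();
    private readonly Dictionary<Segment, int> _ids;

    public BlobIndexSketch() => _ids = new Dictionary<Segment, int>(new SegmentComparer(_heap));

    public int HeapSize => _heap.Count;

    public int GetOrAdd(ReadOnlySpan<byte> blob)
    {
        var lookup = _ids.GetAlternateLookup<ReadOnlySpan<byte>>();
        if (lookup.TryGetValue(blob, out int id))
        {
            return id; // already present: nothing is written
        }
        id = _ids.Count;
        lookup[blob] = id; // inserts via SegmentComparer.Create
        return id;
    }
}
```

In this sketch a lookup for an existing blob neither copies the bytes nor allocates a key, and a new blob is copied once, straight into the heap storage that will eventually be serialized.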

TODO

Benchmark

dotnet-policy-service · 2026-04-22T23:02:33Z

Tagging subscribers to this area: @dotnet/area-system-reflection-metadata
See info in area-owners.md if you want to be subscribed.

Copilot

Pull request overview

This PR refactors how System.Reflection.Metadata builds the #Blob heap to reduce allocations and avoid a large contiguous buffer allocation by writing blob data incrementally into a BlobBuilder and deduplicating by referencing written segments.

Changes:

Write #Blob heap content incrementally into a dedicated HeapBlobBuilder as blobs are added, and compute heap sizes from _blobBuilder.Count.
Extend BlobBuilder with internal “Segment” APIs to allow later referencing of previously written data for deduplication.
Update BlobDictionary to use BlobBuilder.Segment keys (and AlternateLookup on .NET) instead of ImmutableArray<byte> keys.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

File	Description
src/libraries/System.Reflection.Metadata/src/System/Reflection/Metadata/Ecma335/MetadataBuilder.cs	Switches serialized heap size accounting to use `_blobBuilder.Count`.
src/libraries/System.Reflection.Metadata/src/System/Reflection/Metadata/Ecma335/MetadataBuilder.Heaps.cs	Reworks blob heap accumulation/writing to use `_blobBuilder` and removes the “write blob heap at end” path.
src/libraries/System.Reflection.Metadata/src/System/Reflection/Metadata/Ecma335/BlobDictionary.cs	Changes blob dedup dictionary to key by `BlobBuilder.Segment` and uses `AlternateLookup` on .NET.
src/libraries/System.Reflection.Metadata/src/System/Reflection/Metadata/BlobWriterImpl.cs	Adds span-based compressed-integer writer used by segment writing.
src/libraries/System.Reflection.Metadata/src/System/Reflection/Metadata/BlobBuilder.cs	Adjusts invariants / chunk expansion behavior to support segment-writing scenarios.
src/libraries/System.Reflection.Metadata/src/System/Reflection/Metadata/BlobBuilder.Segment.cs	New internal segment-writing implementation and `Segment` struct for stable references.
src/libraries/System.Reflection.Metadata/src/System/Reflection/Internal/Utilities/Hash.cs	Refactors FNV hashing to add an “accumulate” helper.
src/libraries/System.Reflection.Metadata/src/System.Reflection.Metadata.csproj	Includes the new `BlobBuilder.Segment.cs` file (and normalizes the first line).

Copilot · 2026-04-22T23:10:35Z


-            _blobs.GetOrAdd(ReadOnlySpan<byte>.Empty, ImmutableArray<byte>.Empty, default, out _);
-            _blobHeapSize = 1;
+            _blobs = new BlobDictionary(_blobBuilder, 32);


The initial capacity for _blobs dropped from 1024 to 32. If the blob heap commonly contains hundreds/thousands of unique blobs (as the previous default implied), this will cause more dictionary resizes and allocations. Consider keeping the previous capacity (or deriving it from an existing heuristic) unless there’s data showing 32 is sufficient.

Suggested change

-            _blobs = new BlobDictionary(_blobBuilder, 32);
+            _blobs = new BlobDictionary(_blobBuilder, 1024);

This can be discussed. Other heaps used multiples of 1024 as their capacity in bytes, not elements. Now that we can set the capacity of the blob heap being built, I moved the use of 1024 there, and set the dictionary's initial capacity to $\sqrt{1024} = 32$ elements.

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

teo-tsirpanis · 2026-04-23T20:10:59Z

@EgorBot -arm

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Buffers.Binary;
using System.Reflection.Metadata;
using System.Reflection.Metadata.Ecma335;

BenchmarkSwitcher.FromAssembly(typeof(BlobHeapBenchmarks).Assembly).Run(args);

[MemoryDiagnoser]
public class BlobHeapBenchmarks
{
    const int BlobSize = 20;

    [Benchmark]
    [Arguments(2_000)]
    [Arguments(20_000)]
    public int Run(int blobCount)
    {
        var mdBuilder = new MetadataBuilder();
        byte[] buffer = new byte[BlobSize];
        for (int i = 0; i < blobCount; i++)
        {
            BinaryPrimitives.WriteInt32LittleEndian(buffer, i);
            _ = mdBuilder.GetOrAddBlob(buffer);
        }
        var mdRootBuilder = new MetadataRootBuilder(mdBuilder, suppressValidation: true);
        BlobBuilder output = new BlobBuilder();
        mdRootBuilder.Serialize(output, 0, 0);
        return output.Count;
    }
}

Optimize memory usage when building the blob heap.

9480359

Copilot AI review requested due to automatic review settings April 22, 2026 23:01

github-actions Bot added the area-System.Reflection.Metadata label Apr 22, 2026

dotnet-policy-service Bot added the community-contribution label (indicates that the PR has been added by a community member) Apr 22, 2026

Copilot started reviewing on behalf of teo-tsirpanis April 22, 2026 23:02

Copilot AI reviewed Apr 22, 2026

Address Copilot feedback.

4064724

jkotas added the tenet-performance label (performance related issue) Apr 23, 2026

Fix typos.

6cb03db

Copilot AI review requested due to automatic review settings April 23, 2026 17:01

Copilot started reviewing on behalf of teo-tsirpanis April 23, 2026 17:02

Copilot AI reviewed Apr 23, 2026

Comment thread src/libraries/System.Reflection.Metadata/src/System/Reflection/Metadata/BlobBuilder.Segment.cs

Free the placeholder empty head chunk after having finished writing the blob heap.

40cf6f1

teo-tsirpanis force-pushed the srm-blob-heap-opt branch from 782f239 to 40cf6f1 Compare April 23, 2026 17:12

EgorBot mentioned this pull request Apr 23, 2026

Benchmarks for dotnet/runtime#127304 (for @teo-tsirpanis) EgorBot/Benchmarks#145

Open

build-analysis Bot mentioned this pull request Apr 23, 2026

Android arm32 device not found (armeabi-v7a architecture unavailable) #125440

Open

teo-tsirpanis wants to merge 4 commits into dotnet:main from teo-tsirpanis:srm-blob-heap-opt

3 participants
