Fix xplat dump test OOM by splitting Helix payloads #127108

Open
max-charlamb wants to merge 1 commit into main from dev/max-charlamb/xplat-dump-correlation-payload

@max-charlamb
Member

Summary

Fix OOM in xplat dump tests by splitting Helix payloads so that multi-platform dumps are not zipped into a single MemoryStream-backed archive.

Problem

The Helix SDK uses DirectoryPayload.DoUploadAsync, which zips each payload directory into a MemoryStream-backed ZipArchive. When xplat dump tests include dumps from all platforms (linux-x64, linux-arm64, windows-x64, windows-x86, windows-arm64), the combined archive exceeds the ~2GB byte[] limit of MemoryStream, causing an OutOfMemoryException regardless of process bitness.
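To make the arithmetic concrete, here is a minimal Python sketch. The per-platform sizes are hypothetical placeholders, not measurements from this PR, and the limit is approximated as Int32.MaxValue bytes (the exact .NET array limit is slightly lower). It only illustrates how one combined archive can cross the limit even when no single platform's archive does.

```python
# Approximate ceiling for a single .NET byte[] (Int32.MaxValue bytes).
# The real runtime limit is slightly lower; this is an illustrative bound.
BYTE_ARRAY_LIMIT = 2**31 - 1

MIB = 1024**2

# Hypothetical zipped dump sizes per platform, in bytes.
# These are placeholders for illustration, not measurements from this PR.
hypothetical_zip_sizes = {
    "linux-x64": 500 * MIB,
    "linux-arm64": 500 * MIB,
    "windows-x64": 500 * MIB,
    "windows-x86": 500 * MIB,
    "windows-arm64": 500 * MIB,
}


def exceeds_limit(size_in_bytes: int) -> bool:
    """Return True if a buffer of this size cannot fit in one byte[]."""
    return size_in_bytes > BYTE_ARRAY_LIMIT


# One archive holding every platform versus one archive per platform.
combined = sum(hypothetical_zip_sizes.values())
print("combined archive exceeds limit:", exceeds_limit(combined))
for platform, size in hypothetical_zip_sizes.items():
    print(platform, "exceeds limit:", exceeds_limit(size))
```

Because the limit applies to the backing array rather than to available memory, a 64-bit process with plenty of RAM would hit the same ceiling.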

Fix

Split the Helix payload into:

  • Correlation payload: Test DLLs (shared across all work items, uploaded once)
  • Per-work-item payloads: Platform-specific dumps (one platform per work item)

This keeps each individual zip well under the 2GB limit and also reduces total upload size since test binaries are no longer duplicated per work item.
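The resulting layout can be sketched in Python as a plain data model. This is not the Helix SDK API or the actual .proj contents: the `WorkItem` and `HelixPlan` types, the `plan_payloads` helper, the work item names, the example paths, and the assumption that dumps live under one subdirectory per platform are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class WorkItem:
    """One Helix work item and the directory it uploads (hypothetical model)."""
    name: str
    payload_dir: str


@dataclass(frozen=True)
class HelixPlan:
    """Shared correlation payload plus per-platform work items (hypothetical model)."""
    correlation_payload: str
    work_items: tuple


def plan_payloads(test_payload, dump_payload_base, platforms):
    """Build a plan with one shared test payload and one dump payload per platform.

    Assumes dumps are laid out as <dump_payload_base>/<platform>.
    """
    work_items = tuple(
        WorkItem(
            name=f"dumps-{platform}",
            payload_dir=str(PurePosixPath(dump_payload_base) / platform),
        )
        for platform in platforms
    )
    return HelixPlan(correlation_payload=test_payload, work_items=work_items)


PLATFORMS = ["linux-x64", "linux-arm64", "windows-x64", "windows-x86", "windows-arm64"]

# Example paths are placeholders, not the paths used by the pipeline.
plan = plan_payloads("artifacts/tests", "artifacts/dumps", PLATFORMS)
print("correlation payload:", plan.correlation_payload)
for item in plan.work_items:
    print(item.name, "->", item.payload_dir)
```

In this model the test binaries appear exactly once, and each work item's upload is bounded by the size of a single platform's dumps.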

Testing

  • Build 1384837 confirmed OOM is resolved: 4/6 xplat jobs passed (the 2 failures are 32-bit host issues unrelated to this change, tracked separately)

The xplat dump tests were hitting OutOfMemoryException when the Helix SDK
tried to zip all platforms' dumps into a single MemoryStream-backed
ZipArchive, exceeding the 2GB byte[] limit.

Fix by splitting payloads:
- Test DLLs: sent as a correlation payload (shared, uploaded once)
- Dumps: sent as per-work-item payloads (one platform per work item)

This avoids the single large zip and reduces redundant uploads since test
DLLs are no longer duplicated across work items.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses out-of-memory failures in the cDAC x-plat dump Helix test flow by avoiding creation of a single massive in-memory zip payload when multiple platforms’ dumps are included together.

Changes:

  • Split Helix payloads so platform dumps are uploaded as per-work-item payloads (one platform per work item).
  • Upload test binaries as a Helix correlation payload shared across all work items.
  • Update the runtime-diagnostics pipeline to pass the new payload parameters (TestPayload, DumpPayloadBase) to the Helix project.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/native/managed/cdac/tests/DumpTests/cdac-dump-xplat-test-helix.proj | Moves test assets to correlation payload and switches each work item to upload only its platform’s dump directory. |
| eng/pipelines/runtime-diagnostics.yml | Updates Helix send parameters to provide separate test and dump payload paths for the x-plat dump test stage. |
