[release/8.0] Fix OOM in BigInteger OuterLoop tests causing SIGKILL on Linux #126011

Open
Copilot wants to merge 3 commits into release/8.0 from copilot/fix-system-runtime-numerics-tests-crash

Conversation

Contributor

Copilot AI commented Mar 24, 2026

main PR

Description

System.Runtime.Numerics.Tests outerloop tests were crashing with exit code 137 (SIGKILL) on Linux. LargeValueLogTests and DoubleExplicitCastFromLargeBigIntegerTests shifted a BigInteger by int.MaxValue / 10 bits (~214M bits, about 27MB) per iteration in nested loops (up to 4×3 = 12 times), creating values of up to ~107MB each. With 2 parallel test threads in a Docker container, this reliably triggered the OOM killer.

On main, PR #102874 added BigInteger.MaxLength which causes these shifts to throw OverflowException instead of allocating. release/8.0 has no such cap, so allocations succeed and exhaust container memory.

Changes:

  • log.cs / LargeValueLogTests: Replace int.MaxValue / 10 with 1 << 24 in both the shift operation and the expected log value calculation. Test correctness is preserved — the values are still well above the BigInteger.Log precision threshold.
  • cast_from.cs / DoubleExplicitCastFromLargeBigIntegerTests: Replace int.MaxValue / 10 with 1 << 24. 2^(1<<24) still far exceeds double.MaxValue ≈ 2^1024, so infinity assertions remain valid.

Peak per-iteration allocation drops from ~107MB to ~8MB, eliminating the OOM in constrained container environments.
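The size figures above can be checked with a little arithmetic. The following is a minimal sketch, not code from the PR: the class and method names are hypothetical, the big-loop limit of 4 is assumed from the "4×3" loop bounds mentioned in the description, and the byte counts cover only the magnitude bits of the value, ignoring any BigInteger object overhead or temporary copies made during the shift.

```csharp
using System;
using System.Numerics;

public static class ShiftSizeSketch
{
    // Bytes needed to hold a value with the given number of bits (rounded up).
    public static long BytesForBits(long bits) => (bits + 7) / 8;

    // Size of the value after `iterations` cumulative shifts of `shiftBits` each.
    public static long PeakBytes(int shiftBits, int iterations) =>
        BytesForBits((long)shiftBits * iterations);

    public static void Main()
    {
        const int oldShift = int.MaxValue / 10; // 214,748,364 bits
        const int newShift = 1 << 24;           // 16,777,216 bits
        const int bigLoopLimit = 4;             // assumption, taken from "4×3"

        Console.WriteLine($"old peak bytes: {PeakBytes(oldShift, bigLoopLimit)}");
        Console.WriteLine($"new peak bytes: {PeakBytes(newShift, bigLoopLimit)}");

        // A single shift by the new amount (about 2 MB) is already far beyond
        // the range of double, so the explicit cast yields positive infinity.
        BigInteger big = BigInteger.One << newShift;
        Console.WriteLine(double.IsPositiveInfinity((double)big));
    }
}
```

Under these assumptions the old constant works out to roughly 107 MB after four cumulative shifts and the new one to roughly 8 MB, which matches the figures quoted in the description.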

Customer Impact

Persistent 100% failure rate of System.Runtime.Numerics.Tests outerloop on all four Linux legs (x64, arm64, musl-x64, musl-arm64) in every scheduled release/8.0 build. Known issue hit count: 56 times in one month.

Regression

No — this is a long-standing test design issue exposed by container memory limits, not a regression introduced in a recent release.

Testing

Test-only changes. The fix reduces allocation sizes while keeping the test assertions semantically correct. The library itself is not modified.

Risk

Very low. Changes are confined to test helper functions; no product code is touched. The assertions remain correct with smaller (but still large) values.

Package authoring signed off?

IMPORTANT: If this change touches code that ships in a NuGet package, please make certain that you have added any necessary package authoring and gotten it explicitly reviewed.

This is a test-only change; no NuGet package authoring is required.



@dotnet-policy-service
Contributor

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

Copilot AI changed the title from "[WIP] Fix System.Runtime.Numerics.Tests crash with exit code 137" to "[release/8.0] Fix OOM in BigInteger OuterLoop tests causing SIGKILL on Linux" Mar 24, 2026
Copilot AI requested a review from danmoseley March 24, 2026 04:15
@danmoseley danmoseley changed the base branch from main to release/8.0 March 24, 2026 04:21
@danmoseley danmoseley force-pushed the copilot/fix-system-runtime-numerics-tests-crash branch from a69e62a to 058392a Compare March 24, 2026 04:25
@danmoseley danmoseley marked this pull request as ready for review March 24, 2026 04:26
Copilot AI review requested due to automatic review settings March 24, 2026 04:26
@danmoseley
Member

@tannergooding I believe that if you are OK with us taking this, we need you to add the "Servicing approved" label (per the test-only change backport flow).

@danmoseley
Member

This is in 8.0 but it continues to cause noise as we continue to run tests there.

Contributor

Copilot AI left a comment

Pull request overview

This PR adjusts System.Runtime.Numerics.Tests BigInteger OuterLoop tests on release/8.0 to avoid extremely large BigInteger left-shifts that were triggering Linux OOM/SIGKILL in constrained/parallel test environments.

Changes:

  • Reduce the “very large” shift size in LargeValueLogTests from int.MaxValue / 10 to 1 << 24 and update the expected-log calculation accordingly.
  • Reduce the “very large” shift size in DoubleExplicitCastFromLargeBigIntegerTests from int.MaxValue / 10 to 1 << 24, preserving the infinity assertions while avoiding huge allocations.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File | Description
src/libraries/System.Runtime.Numerics/tests/BigInteger/log.cs | Lowers per-iteration BigInteger growth in large Log tests to prevent OOM while keeping assertions meaningful.
src/libraries/System.Runtime.Numerics/tests/BigInteger/cast_from.cs | Lowers per-iteration BigInteger growth in large double cast tests to prevent OOM while still producing ±Infinity.

Comment on lines +142 to +143
/// Test Log Method on Very Large BigInteger more than (1 &lt;&lt; (1 &lt;&lt; 24)) by base 2
/// Tested BigInteger are: pow(2, startShift + smallLoopShift * [1..smallLoopLimit] + (1 &lt;&lt; 24) * [1..bigLoopLimit])

Copilot AI Mar 24, 2026

The XML doc now reads like a C# expression (1 << (1 << 24)), which is easy to misinterpret. Consider rephrasing to a clearer mathematical description (e.g., 2^(1<<24) or “1 shifted left by 16,777,216 bits”) so readers don’t confuse the shift-count with the shifted value.

Suggested change
- /// Test Log Method on Very Large BigInteger more than (1 &lt;&lt; (1 &lt;&lt; 24)) by base 2
- /// Tested BigInteger are: pow(2, startShift + smallLoopShift * [1..smallLoopLimit] + (1 &lt;&lt; 24) * [1..bigLoopLimit])
+ /// Test Log method on very large BigInteger values greater than 2^(1 &lt;&lt; 24), that is, greater than 1 shifted left by 16,777,216 bits, using base 2.
+ /// Tested BigInteger values are 2^(startShift + smallShift * [1..smallShiftLoopLimit] + (1 &lt;&lt; 24) * [1..bigShiftLoopLimit]).

Copilot uses AI. Check for mistakes.
Comment on lines +160 to +164
  temp = temp << (1 << 24);
  double expected =
      (double)startShift +
      smallShift * (double)(i + 1) +
-     (int.MaxValue / 10) * (double)(j + 1);
+     (1 << 24) * (double)(j + 1);

Copilot AI Mar 24, 2026

(1 << 24) is a non-obvious tuning constant used both for the shift and expected-value calculation. Consider introducing a named const int (and optionally a brief comment explaining the choice) to make future adjustments less error-prone.
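As an illustration of this suggestion only, a named constant could be shared between the shift and the expected-value calculation roughly as follows. The constant name LargeBitShift is the one mentioned later in this conversation; the class and helper method names are hypothetical, and this is not the code on the branch, which was later reverted to the inline literal.

```csharp
using System;
using System.Numerics;

public static class NamedShiftSketch
{
    // Bits added per big-loop iteration. 1 << 24 bits is about 2 MB per shift,
    // chosen to keep allocations small in memory-constrained CI containers.
    public const int LargeBitShift = 1 << 24;

    // Grows the value by one big-loop step.
    public static BigInteger Grow(BigInteger value) => value << LargeBitShift;

    // Expected base-2 log after the (i + 1)-th small shift and (j + 1)-th big shift,
    // mirroring the expression in the diff above.
    public static double ExpectedLog2(int startShift, int smallShift, int i, int j) =>
        (double)startShift +
        smallShift * (double)(i + 1) +
        LargeBitShift * (double)(j + 1);
}
```

With a single constant, changing the shift size cannot leave the shift and the expected value out of step.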

Comment on lines +709 to +710
/// Test cast to Double on Very Large BigInteger more than (1 &lt;&lt; (1 &lt;&lt; 24))
/// Tested BigInteger are: +/-pow(2, startShift + smallLoopShift * [1..smallLoopLimit] + (1 &lt;&lt; 24) * [1..bigLoopLimit])

Copilot AI Mar 24, 2026

The XML doc uses (1 << (1 << 24)), which reads like code and is easy to misinterpret. Consider rephrasing to a clearer mathematical description (e.g., 2^(1<<24) or “1 shifted left by 16,777,216 bits”).

Suggested change
- /// Test cast to Double on Very Large BigInteger more than (1 &lt;&lt; (1 &lt;&lt; 24))
- /// Tested BigInteger are: +/-pow(2, startShift + smallLoopShift * [1..smallLoopLimit] + (1 &lt;&lt; 24) * [1..bigLoopLimit])
+ /// Test cast to Double on very large BigInteger values with magnitude greater than 2^(1 &lt;&lt; 24)
+ /// Tested BigInteger values are: +/-2^(startShift + smallLoopShift * [1..smallLoopLimit] + (1 &lt;&lt; 24) * [1..bigLoopLimit])

Comment on lines +725 to 726
temp = temp << (1 << 24);
VerifyDoubleExplicitCastFromBigInteger(double.PositiveInfinity, temp);

Copilot AI Mar 24, 2026

(1 << 24) is a non-obvious tuning constant that controls how large the BigInteger gets. Consider introducing a named const int (shared with the doc comment if appropriate) so the intent is clearer and the value can be adjusted in one place.

@danmoseley
Member

@copilot address feedback

…or bit shift

Co-authored-by: danmoseley <6385855+danmoseley@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/763c2d49-b5a3-4483-ae1e-484fe34e0b4f
Contributor Author

Copilot AI commented Mar 25, 2026

@copilot address feedback

Done in cabb434. I've addressed all four review items:

  • Clarified the XML doc comments in both log.cs and cast_from.cs to use 2^(1 << 24) notation with an explicit mention of "16,777,216 bits"
  • Introduced a named const int LargeBitShift = 1 << 24 in both methods with a comment explaining the rationale (keeping allocations ~2 MB vs ~107 MB to avoid OOM in CI containers)

@danmoseley
Member

@copilot revert back to the first commit. I commented on the wrong one -- for backports we don't want changes to the code.

Contributor Author

Copilot AI commented Mar 25, 2026

@copilot revert back to the first commit. I commented on the wrong one -- for backports we don't want changes to the code.

Reverted in 1b5e9db — the branch now matches the first commit (058392a) with only the core fix (replacing int.MaxValue / 10 with 1 << 24) and no additional code changes.

Development

Successfully merging this pull request may close these issues.

[outerloop] System.Runtime.Numerics.Tests crash with exit code 137 in linux debug coreclr

4 participants