Skip to content

Clear entire range on decommit#128032

Closed
BenV wants to merge 1 commit into
dotnet:mainfrom
BenV:patch-1
Closed

Clear entire range on decommit#128032
BenV wants to merge 1 commit into
dotnet:mainfrom
BenV:patch-1

Conversation

@BenV
Copy link
Copy Markdown

@BenV BenV commented May 11, 2026

Fixes: #127892 (comment)

Detailed analysis from @cshung here: https://cshung.github.io/posts/memory-corruption-5/

This logic was also very recently touched by #127328 so we may want @pavelsavara to look as well, but I suspect we would see the same heap corruption that was happening with large pages in WASM builds as well.

I didn't add a comment for the change because it felt a bit odd to reference never_decommit_p or huge pages when the code was being removed, but I could add something to try and prevent this optimization from being re-introduced if you all feel appropriate (or feel free to just edit an appropriate comment in)

Thanks again for taking this seriously and helping us track this down!

CC: @mangod9

@dotnet-policy-service dotnet-policy-service Bot added the community-contribution Indicates that the PR has been added by a community member label May 11, 2026
@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Comment thread src/coreclr/gc/memory.cpp
decommit_succeeded_p));
if (require_clearing_memory_p)
{
uint8_t* clear_end = never_decommit_p ? heap_segment_used (region) : heap_segment_committed (region);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenV I am not 100% sure the actual problem is at this place. Last week I've tried to add clearing to all places where we skip it for large pages. And I've found that if I keep it at all the other places except of this one, it still passed the test you've shared. I am going to look into it more today, but that seems to indicate that the clearing we are adding here may be just fixing what should have been done at another place.
I didn't have a chance to read through @cshung's analysis yet though, I am going to do that first.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cshung after reading your analysis (thank you for that!), I wonder - why is clearing the memory here better than clearing the necessary part after the compaction / replanning instead? Wouldn't we end up zeroing less memory that way?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good, let me know what you find and if you need anything else from us. Thanks again.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janvorli, the change will stop the bleeding for now, we can definitely try to optimize to zero less memory.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am admittedly biased but I agree it would be great to get the simple fix in and optimize later (particularly if we could backport this to .NET 10) - I'd rather zero a bit of extra memory if it is guaranteed to avoid corruption as GCLargePages is essentially a landmine right now. That being said if some additional investigation reveals a similarly simple fix elsewhere I would be all for that as well.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing, thanks @cshung - if @janvorli is able to verify it in .NET 10 I'm assuming we should close this out in favor of that change instead, right?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenV, @cshung I can confirm that @cshung's change combined with the gc_heap::decommit_heap_segment early out change I've mentioned above passed the test that @BenV has shared.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janvorli Excellent, so would the plan be to backport your proposed gc_heap::decommit_heap_segment combined with #128217 to .NET 10 (and close this out)?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's the plan.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenV, @cshung I've just merged the #128217 and I'll prepare the backport later today or tomorrow.

Copy link
Copy Markdown
Member

@janvorli janvorli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@BenV
Copy link
Copy Markdown
Author

BenV commented May 18, 2026

Closing this in favor of #128217

@BenV BenV closed this May 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-GC-coreclr community-contribution Indicates that the PR has been added by a community member

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GC heap corruption with GCLargePages (Part 2: no aggressive GC required)

4 participants