Increase mmf file sizes based on empirical data for roslyn itself. #73108

Merged
merged 1 commit into from
Apr 20, 2024

Conversation

CyrusNajmabadi
Member

No description provided.

@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner April 19, 2024 19:38
@dotnet-issue-labeler dotnet-issue-labeler bot added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Apr 19, 2024

/// <summary>
/// The size in bytes of a memory mapped file created to store multiple temporary objects.
/// </summary>
/// <remarks>
/// <para>This value was arbitrarily chosen and appears to work well. Can be changed if data suggests
/// something better.</para>
/// <para>This value (8mb) creates roughly 35 memory mapped files (around 300MB) to store the contents of all of
Contributor


roughly 35

This isn't the 32 in the multiplication below?

Knowing nothing about this code, is there a reason that the MultiFileBlockSize is increasing? Could it still be the same by changing the 32 to a 16 since you doubled SingleFileThreshold?

Member Author


It is not the 32. Each large block we allocate is currently 4 MB. This doubles that to halve the total number of MMFs we need for Roslyn itself.

Member Author


Knowing nothing about this code, is there a reason that the MultiFileBlockSize is increasing?

Yes. It's basically saying: we'd like each large block to hold at least 32 large files (or a lot more if we're working with small files).

It's a balance of not wanting to preallocate something huge, or something that grows in massive chunks (like, say, 256 MB at a time), while also keeping the handle count low and not creating an excessive number of actual memory mapped files.

These numbers strike a good balance. Growing 8 MB at a time is fine, and we can still pack most files into those chunks.
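
The sizing relationship described in this reply can be sketched as follows. This is a minimal illustration, not the Roslyn source: the names `SingleFileThreshold` and `MultiFileBlockSize` come from the review comments, the concrete values (128 KiB to 256 KiB, 4 MiB to 8 MiB) are inferred from the discussion, and `ItemsPerBlock`, `BlocksNeeded`, and the 280 MiB total are hypothetical. The estimate also ignores any space lost at the end of a block.

```csharp
// Illustrative only: values inferred from the review thread, not copied from Roslyn.
public static class MmfSizing
{
    // Assumed meaning: items larger than this get a memory mapped file of their own.
    public const long OldSingleFileThreshold = 128 * 1024; // 128 KiB
    public const long NewSingleFileThreshold = 256 * 1024; // 256 KiB

    // Hypothetical name for the "32" multiplier mentioned in the review.
    public const int ItemsPerBlock = 32;

    // Each shared block holds at least 32 threshold-sized items.
    public const long OldMultiFileBlockSize = OldSingleFileThreshold * ItemsPerBlock; // 4 MiB
    public const long NewMultiFileBlockSize = NewSingleFileThreshold * ItemsPerBlock; // 8 MiB

    // Number of blocks needed to hold totalBytes, rounding up.
    public static long BlocksNeeded(long totalBytes, long blockSize)
        => (totalBytes + blockSize - 1) / blockSize;
}
```

Under these assumptions, 280 MiB of data needs 35 blocks at 8 MiB and 70 blocks at 4 MiB, which is the halving of the MMF count mentioned above.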

Member

@sharwell sharwell Apr 22, 2024


There are two additional risks here:

  1. We use bump-pointer allocation to fit new data into these blocks. The larger the item allowed, the larger the potential gap of unused space is at the end of a segment. By doubling the block size at the same time as doubling the threshold, I'm hoping these balance out.

  2. We do not use compaction as part of our "GC strategy" for these blocks. If any single segment within the block is retained, the entire block will not be eligible for GC. If after running VS for a long period of time, you observe a significant number of MMF blocks of size 128KiB < size < 256KiB, this would indicate a potential problem with this change because many of those blocks could have increased in size to 8MiB (by being the last segment to retain a block).

    Another indicator of the same problem would be a significant number of 4MiB MMF blocks being retained by only a single segment of size 0 < size < 128KiB. Prior to this change, these segments would only end up retaining 4MiB, but this change would cause them to retain 8MiB.
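
Both risks can be modeled with a small sketch. This is a hypothetical model of a single shared block, not the Roslyn implementation: `BumpBlock` and all of its members are invented names. It shows bump-pointer allocation leaving an unused tail when an item does not fit, and a block staying alive for as long as any one segment in it is still in use (there is no compaction).

```csharp
using System;

// Hypothetical model of one multi-item block; names are illustrative assumptions.
public sealed class BumpBlock
{
    private readonly long _capacity;
    private long _offset;       // next free byte (the "bump pointer"); never moves back
    private int _liveSegments;  // segments handed out and not yet released

    public BumpBlock(long capacity) => _capacity = capacity;

    // Bytes at the end of the block that have not been handed out.
    public long UnusedTail => _capacity - _offset;

    // Without compaction, the block is reclaimable only when no segment is live.
    public bool CanBeReclaimed => _liveSegments == 0;

    // Returns the start offset of the new segment, or -1 when it does not fit.
    // On -1 the caller would start a new block, leaving UnusedTail as a gap.
    public long TryAllocate(long size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        if (size > UnusedTail) return -1;

        long start = _offset;
        _offset += size;
        _liveSegments++;
        return start;
    }

    // Marks one segment as no longer in use; its bytes are not reused.
    public void Release()
    {
        if (_liveSegments == 0) throw new InvalidOperationException("No live segments.");
        _liveSegments--;
    }
}
```

In this model, a single 100 KiB segment that is never released keeps its entire 8 MiB block from being reclaimed, which is the retention pattern the second risk describes.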

Member Author


We use bump-pointer allocation to fit new data into these blocks. The larger the item allowed, the larger the potential gap of unused space is at the end of a segment. By doubling the block size at the same time as doubling the threshold, I'm hoping these balance out.

I didn't see any problematic gaps measuring this. And, as you said, both are doubled. So the expected %-loss should stay the same.
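
The "both are doubled" argument can be checked with a rough worst-case bound. This sketch assumes that only items of at most the threshold size go into shared blocks, so the unused tail of a block is always smaller than the threshold; `TailWaste` and `WorstCaseFraction` are hypothetical names, and the sizes are the ones inferred from this thread. It bounds the worst case only and says nothing about the measured average.

```csharp
// Illustrative bound, assuming items placed in shared blocks are at most `threshold` bytes.
public static class TailWaste
{
    // A block is abandoned when an item of at most `threshold` bytes does not fit,
    // so at most threshold - 1 bytes can be left unused at its end.
    public static double WorstCaseFraction(long threshold, long blockSize)
        => (double)(threshold - 1) / blockSize;
}
```

With a 128 KiB threshold and 4 MiB blocks, or a 256 KiB threshold and 8 MiB blocks, the bound is just under 1/32 (about 3.1%) in both cases.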

Prior to this change, these segments would only end up retaining 4MiB, but this change would cause them to retain 8MiB.

Both of these are so small as to be irrelevant, afaict. There is no realistic usage pattern I can think of, either, that would be a problem with the new numbers without already having been a problem with the old ones.

Contributor

@ToddGrun ToddGrun left a comment


:shipit:

@CyrusNajmabadi CyrusNajmabadi merged commit 03a462d into dotnet:main Apr 20, 2024
25 checks passed
@CyrusNajmabadi CyrusNajmabadi deleted the mmfWork branch April 20, 2024 00:16
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone Apr 20, 2024
@@ -65,33 +66,24 @@ internal sealed partial class TemporaryStorageService : ITemporaryStorageService

/// <summary>
/// The most recent memory mapped file for creating multiple storage units. It will be used via bump-pointer
- /// allocation until space is no longer available in it.
+ /// allocation until space is no longer available in it. Access should be synchronized on <see cref="_gate"/>
Member


📝 The old form of this comment was preferred.

- /// <para>Access should be synchronized on <see cref="_gate"/>.</para>
- /// </remarks>
+ /// <summary>The name of the current memory mapped file for multiple storage units. Access should be synchronized on
+ /// <see cref="_gate"/></summary>
Member


📝 The access condition is not part of the description of this field. The old form of this comment was preferred.

@CyrusNajmabadi
Member Author

@jasonmalinowski For review when you get back.

@dibarbet dibarbet modified the milestones: Next, 17.11 P1 Apr 29, 2024
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead

4 participants