Skip to content

Follow-up: memory-based activation shedding #9577

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Jun 23, 2025

Conversation

DeagleGross
Copy link
Member

@DeagleGross DeagleGross commented Jun 18, 2025

Adding logs and refactoring configuration of #9532.
also testing whether the memory pressure works against CI agents here.

Microsoft Reviewers: Open in CodeFlow

@DeagleGross DeagleGross self-assigned this Jun 18, 2025
@@ -242,7 +240,7 @@ public List<ICollectibleGrainContext> ScanStale()
if (!activation.IsValid)
{
// This is not an error scenario because the activation may have become invalid between the time
// we captured a snapshot in 'DequeueQuantum' and now. We are not be able to observe such changes.
// we captured a snapshot in 'Dequeue_grainCollectionOptions.CollectionQuantum' and now. We are not be able to observe such changes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a find-and-replace error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rollbacked

public EnvironmentStatistics GetEnvironmentStatistics()
{
EnvironmentStatistics stats = new();
if (!stats.IsValid())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wont this always be false?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left this code intentionally in this way to let people know that we try with stats, and we fallback to realStatisticsProvider in case something is wrong with the stats.

At this point we should probably just remove all "fake" stat providers - because why do we need them if we are ending up using the real one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean next time this code is touched, all people will do is fake the stats somehow, and they will be valid, but faked. In case something went wrong stats are replaced

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason fake stats are used is for testing - i.e, to present some condition like high memory or CPU usage and verify that the behavior is expected.

But if a branch can never be taken, leaving it in the code is misleading. If I read some code and there is a branch, I expect that sometimes the branch will be taken, and sometimes it will not, depending on some condition. In this case, new EnvironmentStatistics().IsValid() is always false, so the branch will always be taken, and therefore the whole method may as well just return the real stats to begin with rather than misdirect the reader.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we can remove the entire type because there is an IEnvironmentStatisticsProvider available by default in Orleans now. That wasn't true when this test was written but the test wasn't updated since.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test can be changed to:

/// <summary>
/// Ensures <see cref="LoadSheddingValidator"/> fails when LoadSheddingLimit greater than 100.
/// </summary>
[Fact]
public async Task SiloBuilder_LoadSheddingValidatorAbove100ShouldFail()
{
    await Assert.ThrowsAsync<OrleansConfigurationException>(async () =>
    {
        await new HostBuilder().UseOrleans((ctx, siloBuilder) =>
        {
            siloBuilder
                .UseLocalhostClustering()
                .Configure<ClusterOptions>(options => options.ClusterId = "someClusterId")
                .Configure<EndpointOptions>(options => options.AdvertisedIPAddress = IPAddress.Loopback)
                .ConfigureServices(services => services.AddSingleton<IMembershipTable, NoOpMembershipTable>())
                .Configure<LoadSheddingOptions>(options =>
                {
                    options.LoadSheddingEnabled = true;
                    options.CpuThreshold = 101;
                });
        }).RunConsoleAsync();
    });
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed completely noop stats providers, now only meaningful set of stats or just taking the real ones.

@ReubenBond ReubenBond changed the title follow-up: memory shedding Follow-up: memory-based activation shedding Jun 23, 2025
@ReubenBond ReubenBond merged commit 0b3e926 into main Jun 23, 2025
32 of 35 checks passed
@ReubenBond ReubenBond deleted the dmkorolev/memory-shedding-2 branch June 23, 2025 18:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants