
[API Proposal]: GC.RefreshMemoryLimit #70601

Closed
cshung opened this issue Jun 10, 2022 · 29 comments
Labels
api-approved API was approved in API review, it can be implemented api-suggestion Early API idea and discussion, it is NOT ready for implementation area-GC-coreclr
Milestone

Comments

@cshung
Member

cshung commented Jun 10, 2022

Background and motivation

In a cloud service scenario, demand comes and goes. To be cost-effective, it is important that our services can scale up and scale down on resource consumption as the demand fluctuates.

While the GC is capable of de-committing unused memory (using an aggressive GC manually or waiting until the automatic aging de-commit kicks in), there is no guarantee that the application will not allocate more as requests come back. It is much safer to allow the container to shrink its memory limit after a scale-down happens, so that the operator can put more instances on the box without worrying about running out of memory just because the GC heuristics believe there is more memory than there actually is.

As of now, we can change the container limit as we wish, but the GC won't notice. Here I propose an API to notify the GC when that happens.

API Proposal

namespace System;
public static class GC
{
        /// <summary>
        ///
        /// Instructs the Garbage Collector to reconfigure itself by detecting the various memory limits on the system.
        ///
        /// In addition to actual physical memory limit and container limit settings, these configuration settings can be overwritten:
        ///
        /// - GCHeapHardLimit
        /// - GCHeapHardLimitPercent
        /// - GCHeapHardLimitSOH
        /// - GCHeapHardLimitLOH
        /// - GCHeapHardLimitPOH
        /// - GCHeapHardLimitSOHPercent
        /// - GCHeapHardLimitLOHPercent
        /// - GCHeapHardLimitPOHPercent
        ///
        /// Instead of updating the environment variables (which will not be re-read), these are overridden by setting a ulong value in the AppContext.
        ///
        /// For example, you can use AppContext.SetData("GCHeapHardLimit", (ulong)(100 * 1024 * 1024)) to override the GCHeapHardLimit to 100 MB.
        ///
        /// As a limitation, on 32-bit systems (e.g. x86, ARM) or macOS, if a heap hard limit is not already set, we will not establish a new one.
        ///
        /// As of now, this API is feature preview only and subject to changes as necessary.
        ///
        /// </summary>
        ///
        /// <returns> A status code indicating the operation's result:
        /// 0 - The refresh operation succeeded
        /// 1 - The operation failed because the new memory limit is too low
        /// 2 - The new hard limit configuration settings are invalid
        ///
        /// We reserve the right to add more status codes as the implementation evolves.
        ///
        /// </returns>
        [System.Runtime.Versioning.RequiresPreviewFeaturesAttribute("RefreshMemoryLimit is in preview.")]
        public static int RefreshMemoryLimit();
}

API Usage

GC.RefreshMemoryLimit();
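A fuller usage sketch, combining the AppContext override described above with the refresh call (the 100 MB value is illustrative, and the status handling assumes the code mapping from the doc comment):

```csharp
// Sketch: shrink the heap hard limit at runtime, then ask the GC to re-read limits.
// Note the value must be boxed as a ulong, or the override is ignored.
AppContext.SetData("GCHeapHardLimit", (ulong)(100 * 1024 * 1024));

int status = GC.RefreshMemoryLimit();
switch (status)
{
    case 0:
        break; // refresh succeeded
    case 1:
        throw new InvalidOperationException("New memory limit is too low.");
    case 2:
        throw new InvalidOperationException("Hard limit configuration is invalid.");
}
```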

Alternative Designs

An alternative design is to have some system mechanism that does this automatically when the container memory limit changes. This would involve:

  1. Non-trivial work with various containers that we support, and
  2. No way to communicate if the container size reduction is infeasible (*).

The API is meant to be used by people who host services. Therefore it makes sense to make sure this API is CLR Hosting API friendly.

Risks

(*) We reserve the right to refuse the container size reduction when it is infeasible (it might be too tight, leaving the GC no room to maneuver).

But the operator might have already reduced the container size - hmm - we need to do something about that.

Certain things can't easily change - it is easy to say that we want to reduce the number of cores as well, but changing the number of heaps would be hard to implement.

Here is a proof-of-concept prototype, feel free to try it out and leave us feedback.

@cshung cshung added api-suggestion Early API idea and discussion, it is NOT ready for implementation area-GC-coreclr labels Jun 10, 2022
@cshung cshung added this to the Future milestone Jun 10, 2022
@ghost

ghost commented Jun 10, 2022

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.


@zackliu

zackliu commented Jun 13, 2022

Reloading memory is awesome. I support the manual API instead of auto refreshing.

@davidni
Contributor

davidni commented Jul 18, 2022

Related: Environment.ProcessorCount suffers from a similar issue. The result is cached on first access and the runtime assumes the number of processors cannot change, which is false. Relevant source (I think): util.cpp:GetCurrentProcessCpuCount().

See also: Environment.ProcessorCount behavior on Windows

@normj

normj commented Mar 17, 2023

This API would be important for us at AWS. For .NET AWS Lambda functions, the .NET runtime does not understand the memory limits put in place by the Lambda runtime. If we had this method, our .NET Lambda startup code could translate the AWS_LAMBDA_FUNCTION_MEMORY_SIZE environment variable set by the Lambda service into a value for DOTNET_GCHeapHardLimit and then call GC.RefreshMemoryLimit().

@cshung
Member Author

cshung commented Apr 10, 2023

@normj, as I tried to implement the logic to read DOTNET_GCHeapHardLimit during RefreshMemoryLimit, I hit issues with CLRConfig. In particular, the current config subsystem is optimized for loading environment variables only once during process startup; fixing that would require ugly hacks and impact the whole subsystem.

After discussing with @AaronRobinsonMSFT, we believe the better approach is to communicate these values through AppContext.SetData API here, will that work for you?

@normj

normj commented Apr 13, 2023

@cshung If I understand correctly I could detect what the actual memory limit is based on the Lambda runtime environment and then call something like the following

long lambdaMemoryLimit = determineLambdaMemory();
AppContext.SetData("GCHeapHardLimit", lambdaMemoryLimit);

That seems cleaner than setting an in-process environment variable and then pinging the .NET runtime to go take another look at the environment variables.

@normj

normj commented Apr 13, 2023

Out of curiosity, assuming the memory value being set is still significantly greater than what is currently being consumed, will there be any performance cost in setting a new memory limit?

@cshung
Member Author

cshung commented Apr 15, 2023

Out of curiosity assuming the memory value being set is still significantly greater than what is currently being consumed will there be any performance cost in setting a new memory limit?

In the short term - the act of setting the limit itself won't cost much, all it does is walk the heap data structures and set some integer fields.

In the long term - obviously - how the application performs after switching the limits depends on what the application does.

One way to think about the problem is to think of this API as switching gears. How fast the car runs depends on a lot of things: how much gas you give (i.e. allocations), whether you are running up a slope (i.e. survivals), and what gear you are in. Your car probably won't slow down right away when you shift down, but hopefully, eventually, it will as the car travels farther and the pedal is off.

Ideally, after refreshing the memory limit and allowing a reasonable amount of time for the situation to settle, the application should behave as if the memory limit had been specified during startup.

While that's the ideal situation, the reality may or may not match it. The best way to tell is to actually experiment with it.

@normj

normj commented Apr 17, 2023

Thanks for the info @cshung. In AWS Lambda we are most sensitive to anything that affects startup, which also includes bootstrapping the Lambda runtime. When we have this API we will make the call very early in our startup process, when there shouldn't be much memory usage yet. So I'm hoping that telling the .NET runtime the true memory limit early in the startup phase will have a negligible effect on startup performance.

@cshung
Member Author

cshung commented Apr 19, 2023

@normj, @davidni, @zackliu: A private implementation of the API has been merged into the main branch as #83707 and will be available as part of .NET 8 Preview 4.

This is currently implemented as a private API to allow trying it out without committing to the API shape just yet. Please try it out and let me know if you hit any issues.

@normj

normj commented Apr 19, 2023

@cshung My use cases for needing this API are on Linux, so the current limitation of the PR being Windows-only will make it hard to meaningfully try this out.

@cshung
Member Author

cshung commented Apr 19, 2023

@cshung My use cases for needing this API are on Linux so the current limitation for the PR being for Windows only will make it hard to meaningfully try this out.

The code should work on all platforms since the change is not platform-specific. In fact, it compiles and passes all the existing tests we have on all platforms we support. The PR comment was inaccurate and I have updated it; it really meant that I haven't got a setup to test on Linux/macOS yet. If you could help verify that it actually works on Linux for your scenario, that would be great.

@normj

normj commented Apr 19, 2023

@cshung Great! I'll give it a try when preview 4 is released

@cshung cshung modified the milestones: Future, 8.0.0 May 1, 2023
@cshung cshung added the api-ready-for-review API is ready for review, it is NOT ready for implementation label May 1, 2023
@bartonjs
Member

bartonjs commented May 9, 2023

Video

  • We generally believe that the API should return an enum, instead of having the docs describe integer returns. But, given the proposal that it should actually use HRESULT on the basis that it's expected to be called by native hosts (and HRESULT is the host interface response, even on non-Windows), int seems OK.
namespace System;
public static partial class GC
{
    [System.Runtime.Versioning.RequiresPreviewFeaturesAttribute("RefreshMemoryLimit is in preview.")]
    public static int RefreshMemoryLimit();
}

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels May 9, 2023
@jkotas
Member

jkotas commented May 9, 2023

it should actually be using HRESULT on the basis that it's expected to be called by native hosts (and HRESULT is the host interface response, even on non-Windows), int seems OK.

This is not how we design hosting APIs these days. Anything people need for hosting, we design as a first-class managed API first and then tell people to wrap it in an interop wrapper for hosting.

If we add this API in the current shape, it creates a bad precedent for introducing unnatural-looking APIs any time somebody wants to call something from an unmanaged host.

@bartonjs
Member

bartonjs commented May 9, 2023

@jkotas So you think the enum approach is better, then?

@jkotas
Member

jkotas commented May 9, 2023

The return value indicates error conditions. The API should return void, and appropriate exception (e.g. InvalidOperationException) should be thrown for error conditions. We just had an offline discussion about it. I assume that you have no issues with changing the return value to void and indicate errors by throwing exceptions, correct?
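A sketch of the void-returning shape being discussed (hypothetical; the approved shape above returns int):

```csharp
namespace System;
public static partial class GC
{
    // Hypothetical alternative: no status code; errors surface as exceptions,
    // e.g. InvalidOperationException when the new limit is too low or the
    // hard-limit configuration is invalid.
    public static void RefreshMemoryLimit();
}
```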

@stephentoub
Member

I assume that you have no issues with changing the return value to void and indicate errors by throwing exceptions, correct?

That was the first thing I asked for in the API review meeting :) but was told we couldn't do that because common usage would be from native code via hosting APIs and exception throwing would be problematic for that. If that's not the case, please do make it void and throw for errors.

@cshung
Member Author

cshung commented May 9, 2023

We are still planning to call this API through the hosting API, but we are working toward a scheme that would allow the caller to catch the exception via some sort of injected managed wrapper function.

Once we have this sorted out, we can move toward using exceptions instead of error codes.

@normj

normj commented May 18, 2023

I'm trying to test this out as part of Preview 4 but I'm not seeing any changes. I suspect I'm calling AppContext.SetData incorrectly.

At the start of my app I run this method, which uses reflection to call the current private _RefreshMemoryLimit. I can see that method.Invoke is being called:

    private static void AdjustMemorySettings()
    {
        AppContext.SetData("System.GC.HeapHardLimit", 209715200);

        var gcType = typeof(GC);
        var method = gcType.GetMethod("_RefreshMemoryLimit", BindingFlags.NonPublic | BindingFlags.Static);
        if (method == null)
            throw new Exception("Failed to find _RefreshMemoryLimit method through reflection");

        var result = method.Invoke(null, new object[] { });
        Console.WriteLine("Result of Refresh: " + result);
    }

But then later when I call GC.GetGCMemoryInfo().TotalAvailableMemoryBytes I'm not seeing any change in the returned available memory. I have tried both System.GC.HeapHardLimit and GCHeapHardLimit as the key for SetData. The method.Invoke for _RefreshMemoryLimit returns 0. This is running in AWS Lambda, which means Linux and no containers for this example.

@cshung
Member Author

cshung commented May 19, 2023

Thanks for trying it out!

I think you need AppContext.SetData("System.GC.HeapHardLimit", (ulong)209715200);

@normj

normj commented May 19, 2023

@cshung No luck still. I used your SetData call and even added a GC.Collect for good measure, but GC.GetGCMemoryInfo().TotalAvailableMemoryBytes is still not using the value I passed to SetData. I have tested this now on both Linux and Windows with the same behavior: the SetData call is ignored.

My current version of adjusting memory

    private static void AdjustMemorySettings()
    {
        AppContext.SetData("System.GC.HeapHardLimit", (ulong)209715200);

        var gcType = typeof(GC);
        var method = gcType.GetMethod("_RefreshMemoryLimit", BindingFlags.NonPublic | BindingFlags.Static);
        if (method == null)
            throw new Exception("Failed to find _RefreshMemoryLimit method through reflection");

        var result = method.Invoke(null, new object[] { });
        GC.Collect();
        Console.WriteLine("Result of Refresh: " + result);
    }

@cshung
Member Author

cshung commented May 19, 2023

Sorry for the confusion, I think I missed a couple of things.
The key should be GCHeapHardLimit instead of System.GC.HeapHardLimit.

When I ran it locally, I hit a bad bug that crashes on the first GC. The problem is that I accidentally swapped an assignment statement. I will get it fixed as soon as I can.

@normj

normj commented May 20, 2023

Using GCHeapHardLimit did the trick, and I see GC.GetGCMemoryInfo().TotalAvailableMemoryBytes reporting the adjusted memory. Sadly I also hit the crash from the GC.Collect call, but I'm able to get far enough in my testing that it isn't blocking.

When we add support for .NET 8 in AWS Lambda, I can add to our runtime startup code something like the following, which looks for the AWS Lambda memory size and configures the .NET runtime with that value unless the user has already configured a value less than that.

private static void AdjustMemorySettings()
{
    int lambdaMemoryInMb;
    if (!int.TryParse(Environment.GetEnvironmentVariable("AWS_LAMBDA_FUNCTION_MEMORY_SIZE"), out lambdaMemoryInMb))
        return;

    ulong memoryInBytes = (ulong)lambdaMemoryInMb * 1048576;

    // If the user has already configured the hard heap limit to something lower than what is available,
    // then make no adjustments, to honor the user's setting.
    if ((ulong)GC.GetGCMemoryInfo().TotalAvailableMemoryBytes < memoryInBytes)
        return;

    AppContext.SetData("GCHeapHardLimit", memoryInBytes);

    // Replace with non-reflection version once RefreshMemoryLimit is public
    var gcType = typeof(GC);
    var method = gcType.GetMethod("_RefreshMemoryLimit", BindingFlags.NonPublic | BindingFlags.Static);
    if (method == null)
        throw new Exception("Failed to find _RefreshMemoryLimit method through reflection");

    method.Invoke(null, new object[] { });
}

@cshung
Member Author

cshung commented May 20, 2023

@normj, the bug is fixed in this PR and it is merged. It should be available in the next daily build.

@mrsharm
Member

mrsharm commented Jun 22, 2023

@cshung will validate this with another customer.

@mangod9
Member

mangod9 commented Aug 2, 2023

@cshung assume this issue is ok to close now?

@normj

normj commented Aug 2, 2023

If it matters, I was planning to implement the above code for real in AWS Lambda based on the latest previews this month, to make sure everything works as expected.

@mangod9
Member

mangod9 commented Aug 3, 2023

Closing as done.

@mangod9 mangod9 closed this as completed Aug 3, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Sep 3, 2023