
GC does not release memory easily on kubernetes cluster in workstation mode #49317

Closed
Kiechlus opened this issue Mar 8, 2021 · 71 comments
Labels
area-GC-coreclr · needs-author-action (An issue or pull request that requires more info or actions from the author.) · question (Answer questions and provide assistance, not an issue with source code or documentation.)
Comments

@Kiechlus

Kiechlus commented Mar 8, 2021

Description

  • Application is a simple API for up/download of (large) files from blob storage
  • Deployed on a Kubernetes cluster
  • GC in workstation mode, as we need memory to be released for monitoring, scaling, etc. (see the config sketch under Configuration below)
  • In VS locally this works just fine
    • Directly or a few minutes after a 1 GB download, memory is released
    • It is always a Gen 2 collection; smaller generations do not release memory
  • However, on the cluster memory can stay up for hours:

[memory graph]

Configuration

  • .NET Core 3.1 app
  • Running in a Docker container with a 2Gi memory limit
  • AKS cluster, machines with 4 cores and 16 GB RAM
  • htop inside container: [htop screenshot]
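
For reference, a minimal sketch of how workstation GC is typically enabled in the project file (shown for illustration; the actual configuration may live in runtimeconfig.json instead):

<!-- csproj: opt out of server GC; workstation GC trims the heap more eagerly -->
<PropertyGroup>
  <ServerGarbageCollection>false</ServerGarbageCollection>
  <ConcurrentGarbageCollection>true</ConcurrentGarbageCollection>
</PropertyGroup>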

Other information

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-GC-coreclr and untriaged (New issue has not been triaged by the area owner) labels Mar 8, 2021
@ghost

ghost commented Mar 8, 2021

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.


@Kiechlus Kiechlus changed the title from "GC does not release memory easily on kubernetes cluster in workspace mode" to "GC does not release memory easily on kubernetes cluster in workstation mode" Mar 8, 2021
@mangod9 mangod9 added this to the 6.0.0 milestone Mar 8, 2021
@mangod9 mangod9 added the question (Answer questions and provide assistance, not an issue with source code or documentation.) label and removed the untriaged (New issue has not been triaged by the area owner) label Mar 8, 2021
@mangod9
Member

mangod9 commented Mar 8, 2021

Hey @Kiechlus, do you observe that it eventually gets collected, or doesn't unless there is memory pressure on the K8s cluster?

@Kiechlus
Author

Kiechlus commented Mar 8, 2021

@mangod9 When I issue another download of a 1 GB file, the memory does not rise, so it must have been collected. But without such pressure it just stays as is.
On very few occasions we even got an OOM exception in such a scenario, but cannot reproduce it. (Pod memory limit is 2 Gi.)
Unfortunately I haven't found an easy way so far to log Gen 2 collection events.
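
(For reference, one way to log Gen 2 collections in-process is an EventListener on the runtime's event source. A sketch; the keyword mask and payload indices follow the runtime's GC ETW events, but treat them as an assumption to verify:)

using System;
using System.Diagnostics.Tracing;

// Logs every GC start with its generation, so Gen 2 collections become visible in stdout.
sealed class GcLogger : EventListener
{
    protected override void OnEventSourceCreated(EventSource source)
    {
        if (source.Name == "Microsoft-Windows-DotNETRuntime")
            EnableEvents(source, EventLevel.Informational, (EventKeywords)0x1); // GC keyword
    }

    protected override void OnEventWritten(EventWrittenEventArgs e)
    {
        if (e.EventName == "GCStart_V2")
            Console.WriteLine($"GC start: gen {e.Payload?[1]}, reason {e.Payload?[2]}");
    }
}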

@Maoni0
Member

Maoni0 commented Mar 8, 2021

@Kiechlus
Author

Kiechlus commented Mar 8, 2021

@Maoni0 will those traces help to analyse the issue? In that case we can try to get them.
We were playing around with https://docs.microsoft.com/de-de/dotnet/core/diagnostics/dotnet-trace
but did not really know what to do with the outcome.
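
(For reference, the GC-focused invocations dotnet-trace supports; the process id is a placeholder:)

dotnet-trace ps                                              # find the target process id
dotnet-trace collect --process-id <pid> --profile gc-collect # low-overhead GC events only
dotnet-trace collect --process-id <pid> --profile gc-verbose # adds allocation detail (high overhead)

The resulting .nettrace file can be opened in PerfView, where the GCStats view summarizes each collection.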

@Maoni0
Member

Maoni0 commented Mar 8, 2021

@Kiechlus yes, this is always the first step for diagnosing memory perf problems.

@Kiechlus
Author

Kiechlus commented Mar 9, 2021

Hi @Maoni0
Please find attached the trace from our cluster: trace-cluster2.zip

Setup

  • Start the service
  • Download a 900 MB file
  • Wait a few seconds
  • Download again

Observations

  • Memory is not freed after the download finishes, even though the API controller has returned 200
    • More than 50% of memory allocated
    • Pod has a 2 Gi memory limit
  • When downloading again, memory is freed (Gen 2 collection)
  • Afterwards it stays up again

Expected outcome

  • We would like to see memory freed immediately after the API returns 200
  • Because we monitor and scale based on resource consumption

@L-Dogg

L-Dogg commented Apr 21, 2021

Hi, my team is struggling with a similar issue; we're currently working on gathering some traces from our app. The premise, however, is exactly the same: we're uploading a large file into Azure Blob Storage and, while on the local dev env everything works fine and after some time a full GC is invoked, on our k8s cluster we get frequent OOMs.
We set workstation GC mode for this app and tried to tinker with LatencyMode and LOH compaction mode but, alas, with no luck.
Currently I'm planning to investigate our code since I suspect the issue originates there, but maybe you have some insights. @Kiechlus can you share whether you managed to fix the issue?

@davidfowl
Member

Can somebody provide some sample code for what the download or upload looks like? There are some known issues with ASP.NET Core's memory pool not releasing memory that might be the case here, but it's possible that the code could be tweaked to avoid the memory bloat in the first place.

@Kiechlus
Author

Kiechlus commented Apr 21, 2021

Hi @L-Dogg, we are still facing this issue.
@davidfowl, this is internal code in one service and some libs, I cannot just share it. Could you give us some hints on what to look for? I can then share the relevant parts for sure.

@L-Dogg

L-Dogg commented Apr 21, 2021

Thanks for your replies. We're trying to prepare a minimal example; I just hope it will be enough to reproduce this behaviour.
@Kiechlus which version of Azure.Storage.Blobs do you use?

@davidfowl
Member

@Kiechlus can you collect a gc-verbose trace to see where your allocations are coming from? Are the connections HTTPS connections?

@Maoni0
Member

Maoni0 commented Apr 21, 2021

@Kiechlus somehow I missed this issue... sorry about that. I just took a look at the trace you collected. It does release memory; if you open the trace in PerfView and open the GCStats view you'll see this:

[GCStats screenshot]

At GC#10, the memory usage went from 831mb to 9.8mb, but there are allocations on the LOH again which made the memory go up again. What would be your desired behavior? You are not under high memory pressure, so the GC doesn't need to aggressively shrink the heap size.

@davidfowl
Member

@Kiechlus It seems like you're churning the LOH; why is that? Are you using Streams or are you allocating large buffers?

@L-Dogg

> we're uploading a large file into Azure Blob Storage and, while on the local dev env everything works fine and after some time a full GC is invoked, on our k8s cluster we get frequent OOMs.

Are you using streams or are you allocating big arrays? Also, are you using IFormFile or the MultipartReader?

@Kiechlus
Author

Kiechlus commented Apr 22, 2021

@davidfowl We are allocating the stream like this: var targetStream = new MemoryStream(fileLength);, where fileLength can be several GB. We have found that if we create the stream without an initial capacity, it ends up allocating much more memory than the actual file size.
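
(The growth behavior behind that observation can be seen in a minimal sketch; the 900 MB figure mirrors the download size mentioned earlier:)

using System;
using System.IO;

class Program
{
    static void Main()
    {
        using var ms = new MemoryStream(); // no initial capacity
        var chunk = new byte[64 * 1024];
        long written = 0;
        while (written < 900L * 1024 * 1024) // simulate buffering a 900 MB download
        {
            ms.Write(chunk, 0, chunk.Length);
            written += chunk.Length;
        }
        // The backing buffer doubles on each resize, so the final capacity can be
        // up to ~2x the content size in the worst case, and every discarded
        // intermediate buffer was a fresh large-object-heap allocation.
        Console.WriteLine($"Length: {ms.Length:N0}, Capacity: {ms.Capacity:N0}");
    }
}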

@Maoni0 You are right, we ran into OOM only on very few occasions, so memory is freed under pressure. But what we would need is for memory to be freed immediately after the controller returns and the stream is deallocated.

The needs in a Kubernetes cluster are different. There are, e.g., three big machines, and Kubernetes schedules many pods onto them. Based on different metrics it creates or destroys pods (horizontal pod autoscaling) @egorchabala.

If a pod does not free memory even though it could, Kubernetes cannot use that memory for scheduling other pods, and the autoscaling does not work. Memory monitoring also becomes more difficult.

Is there any way to make the GC release memory as soon as possible, even when there is no high memory pressure? Do you still need a different trace or anything of the like?

@L-Dogg we are currently using version 11.2.2.

@davidfowl
Member

> @davidfowl We are allocating the stream like this: var targetStream = new MemoryStream(fileLength);, where fileLength can be several GB. We have found that if we create the stream without an initial capacity, it ends up allocating much more memory than the actual file size.

Don't do this. This is the source of your problems. Don't buffer a gigabyte in memory. Why aren't you streaming?

@Kiechlus
Author

Kiechlus commented Apr 22, 2021

@davidfowl We are using this client-side encryption: https://docs.microsoft.com/de-de/azure/storage/common/storage-client-side-encryption?tabs=dotnet.
We never figured out how to stream the HTTP response to the (browser) client. Is it possible? Does it need adaptations on the browser clients? They are not controlled by us.

@davidfowl
Member

> We never figured out how to stream the HTTP response to the (browser) client. Is it possible? Does it need adaptations on the browser clients? They are not controlled by us.

Without seeing any code snippets it's obviously harder to recommend something, but I would imagine you have something like this:

See this last step:

// Download and decrypt the encrypted contents from the blob.
MemoryStream targetStream = new MemoryStream(fileLength);
blob.DownloadTo(targetStream);

Then something is presumably copying the targetStream to the HttpResponse? If so, avoid the temporary stream and just copy it to the response directly.

@Kiechlus
Author

Hi @davidfowl, thanks for your reply!
Yes, in a blobstore-related library we have exactly the code you described.

This is consumed by the service, goes through some layers, and in the end in the Controller it is:

var result = new FileStreamResult(serviceResponse.Content.DownloadStream, System.Net.Mime.MediaTypeNames.Application.Octet);
result.FileDownloadName = serviceResponse.Content.Filename;

return result;

I'm still not sure how to avoid writing to some temporary stream, but it would be great if we could solve this.

@davidfowl
Member

Where is the temporary stream doing all the buffering?

@Kiechlus
Author

Kiechlus commented Apr 22, 2021

@davidfowl Do you mean this?

BlobRequestOptions optionsWithRetryPolicy = new BlobRequestOptions
{
    EncryptionPolicy = encryptionPolicy,
    RetryPolicy = new Microsoft.Azure.Storage.RetryPolicies.LinearRetry(
        TimeSpan.FromSeconds(errorRetryTime),
        errorRetryAttempts),
    StoreBlobContentMD5 = true,
};
StorageCredentials credentials = new StorageCredentials(accountName, accountKey);
BlobClient = new CloudBlobClientWrapper(new Uri(storageUrl), credentials, optionsWithRetryPolicy);

...

var container = BlobClient.GetContainerReference(...);
CloudBlockBlob destBlob = container.GetBlockBlobReference(blobId);
var targetStream = new MemoryStream(fileLength);

var downloadTask = destBlob.DownloadToStreamAsync(
    target: targetStream,
    accessCondition: null,
    options: null,
    operationContext: new OperationContext(),
    cancellationToken: cancellationToken);

return await ReadWriteRetryPolicy.ExecuteAsync(
    (context, token) => downloadTask,
    pollyContext,
    cancellationToken);

@davidfowl
Member

Why isn't this code taking in the target Stream?

@Kiechlus
Author

Kiechlus commented Apr 22, 2021

@davidfowl Do you mean in the Controller we should do something like this?

MemoryStream targetStream = new MemoryStream(fileLength);
blobLib.DownloadTo(targetStream);
return new FileStreamResult(targetStream);

If this makes a difference we will for sure try.

@davidfowl
Member

The controller should look like this:

Response.ContentLength = fileLength;
await blobLib.DownloadToAsync(Response.Body);
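
Spelled out as a full action, that looks roughly like the sketch below (blobLib, GetBlobInfoAsync, fileLength and fileName are placeholders for whatever your service layer actually exposes):

[HttpGet("download/{blobId}")]
public async Task<IActionResult> Download(string blobId, CancellationToken cancellationToken)
{
    // Resolve the metadata first so the client still gets a Content-Length header.
    var (fileLength, fileName) = await blobLib.GetBlobInfoAsync(blobId, cancellationToken);

    Response.ContentLength = fileLength;
    Response.ContentType = System.Net.Mime.MediaTypeNames.Application.Octet;
    Response.Headers["Content-Disposition"] = $"attachment; filename=\"{fileName}\"";

    // Write the blob directly into the response body: no intermediate MemoryStream,
    // so the payload never has to fit in the managed heap all at once.
    await blobLib.DownloadToAsync(Response.Body, cancellationToken);
    return new EmptyResult();
}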

@oruchreis

> hi @oruchreis I have been checking my twitter DM and did not see anything from you. could you please send it to @mangod9's work email (it's in his profile)? also this
>
> > even though I use the GC.Collect method with Aggressive and LOH compact mode, it doesn't release the unused memory to the operating system. GC.Collect only reduces the size of the heap. With the settings below, an application with a heap of 500mb shows 5-6GB on the k8s dashboard
>
> doesn't sound right - if it is memory on the GC heap that is not used, it should be released unless you have pins that prevent us from releasing free memory in between them. if you have a dump that exhibits this, please definitely send it to us. also looping in @cshung.

I've sent an email to @mangod9 with the dump link. I can also create a new dump and trace and send them if you request.

Best Regards.
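
(For reference, the call pattern described in the quote above, as a sketch; GCCollectionMode.Aggressive requires .NET 7 or later:)

using System.Runtime;

// Request a full, blocking, compacting collection that also compacts the LOH
// and asks the GC to hand as much memory as possible back to the OS.
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(GC.MaxGeneration, GCCollectionMode.Aggressive, blocking: true, compacting: true);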

@ghost ghost added the no-recent-activity label Jun 23, 2023
@ghost

ghost commented Jun 23, 2023

This issue has been automatically marked no-recent-activity because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no-recent-activity.

@oruchreis

We are still struggling with high memory usage, but only in Linux k8s environments. I would appreciate it if this issue is not closed until the problem is solved.

@ghost ghost removed the no-recent-activity label Jul 3, 2023
@ghost ghost added the no-recent-activity label Jul 17, 2023
@ghost

ghost commented Jul 17, 2023

This issue has been automatically marked no-recent-activity because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no-recent-activity.

@ghost ghost added the in-pr (There is an active PR which will close this issue when it is merged) label Jul 29, 2023
@mangod9 mangod9 modified the milestones: 8.0.0, 9.0.0 Aug 11, 2023
@ghost ghost removed the no-recent-activity label Aug 11, 2023
@ghost ghost added the no-recent-activity label Aug 25, 2023
@ghost

ghost commented Aug 25, 2023

This issue has been automatically marked no-recent-activity because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no-recent-activity.

@ghost

ghost commented Sep 8, 2023

This issue will now be closed since it had been marked no-recent-activity but received no further activity in the past 14 days. It is still possible to reopen or comment on the issue, but please note that the issue will be locked if it remains inactive for another 30 days.

@ghost ghost closed this as completed Sep 8, 2023
@oruchreis

oruchreis commented Sep 14, 2023

After lots of research, dumping, and tracing, I suppose we've found some clues about this memory-not-being-released issue from some similar GitHub issues:

We've solved this issue with these malloc env settings:

- name: MALLOC_ARENA_MAX
  value: "2"
- name: MALLOC_TRIM_THRESHOLD_
  value: "85000"

We're using lots of dynamic code compilation with Roslyn as well as some IL-emit code. But I don't know why we have to set these malloc settings in a Linux container environment; there isn't any issue on Windows, by the way.
Also, I don't know whether these settings are the final solution or a workaround.
Any thoughts @Maoni0 @mangod9?

Best Regards.

@ghost ghost removed the no-recent-activity label Sep 14, 2023
@oruchreis
Copy link

By the way, I'm leaving the settings we used in the Linux container here in case someone else encounters this issue. But as mentioned in the GitHub issues I posted above, these settings will vary from application to application, so you need to experiment a bit with them. By playing with these settings, we were able to bring the production environment running in a Linux container closer to the Windows environment:

MALLOC_ARENA_MAX: 2
MALLOC_TRIM_THRESHOLD_: 85000
DOTNET_gcServer: 1
DOTNET_GCConserveMemory: 7
DOTNET_GCHeapHardLimitPercent: "0x5a" # 90%
DOTNET_GCHeapAffinitizeRanges: "0-3"

In Windows we only set DOTNET_gcServer: 1.

@mangod9
Member

mangod9 commented Sep 14, 2023

Adding @janvorli as well. Wonder if this is possibly related to W^X. @oruchreis, what .NET version were you using?

@oruchreis

Hi @mangod9, it's .NET 7 with the latest minor version. We had also tried the old libclrgc.so, which didn't have much effect. As a reminder, I described the whole scenario in my previous messages in this thread and sent you and Maoni the dump and trace as per your request.

@oruchreis

Should I create a new issue to better track these specific env settings and the problem of memory not being released?

@hoyosjs
Member

hoyosjs commented Sep 22, 2023

@oruchreis MALLOC_ARENA_MAX is a source of grief we've faced before. It's because GNU's libc can be very lazy about returning memory pages to the OS otherwise. That might improve throughput, but it costs memory pressure. It's also the reason someone reported that switching to Alpine helps: it's a completely different libc where the allocators behave differently.

@Leonardo-Ferreira

On my issue #90640 I was referred to DATAS, and it does seem very promising for this kind of problem...

Messing with MALLOC_ARENA_MAX and the other settings seemed too much of a risk.
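
(For reference, DATAS in .NET 8 is opted into with a single knob, shown here in the same env style as the settings above; a sketch:)

DOTNET_GCDynamicAdaptationMode: 1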

@ValdikSS

ValdikSS commented Oct 17, 2023

MALLOC_ARENA_MAX is just half of the story.

The general issue with glibc is that it does not return memory to the OS unless an arena is completely, contiguously empty. In other words, it does not punch holes in the allocated chunk automatically and requires manual calls to the malloc_trim() function, supposedly to prevent memory fragmentation.

Example C code and a description of the issue are available here, for example: https://stackoverflow.com/questions/38644578/understanding-glibc-malloc-trimming

This is a very common issue for applications which have huge peak memory consumption but low regular memory usage. It hit me in squid, pm2, and node.js.

The simplest solution is to use jemalloc, an alternative heap allocator; it's usually as easy as an LD_PRELOAD. Or use a distro which doesn't use glibc, such as Alpine with musl libc.
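
(For example, preloading jemalloc in a Kubernetes pod spec looks roughly like this sketch; the library path depends on the base image, this one is typical for Debian-based images with the libjemalloc2 package installed:)

env:
  - name: LD_PRELOAD
    value: /usr/lib/x86_64-linux-gnu/libjemalloc.so.2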

@oruchreis

oruchreis commented Oct 19, 2023

As I've mentioned above, I've tried many heap allocators like mimalloc, jemalloc, tcmalloc, etc., but they didn't have much effect. We're using those two malloc settings in production with .NET 7 right now, and fortunately we haven't noticed any memory pressure. I understand these values differ from app to app. I couldn't try Alpine yet, because we use some native libraries which don't work with musl; I'll try Alpine in the future when I recompile the native dependencies against musl. But this is a major issue for .NET, I think: we should not have to set native C-library configs such as MALLOC_ARENA_MAX. If it is necessary to set these settings in order to use .NET on a distro that uses glibc, shouldn't it be mentioned in the documentation?

@dotnet dotnet locked as resolved and limited conversation to collaborators Nov 18, 2023
@ghost ghost removed the in-pr (There is an active PR which will close this issue when it is merged) label Feb 27, 2024
This issue was closed.