
Large unmanaged memory growth (leak?) when upgrading from .NET 6 to 8 #95922

Closed
SamWilliamsGS opened this issue Dec 12, 2023 · 191 comments
Labels: area-GC-coreclr, tenet-performance (Performance related issue)

Comments

@SamWilliamsGS

Description

We have a few different services hosted on Kubernetes running on .NET. When we try to upgrade from .NET 6 to .NET 8, we see a steep but constant increase in memory usage, almost all of it in unmanaged memory. It seems to level off at around four times the memory usage in .NET 6, ignoring imposed memory limits, then continues to creep up more slowly depending on workload. So far we haven't seen an upper bound on the amount of unmanaged memory being leaked(?) here. Reproducing the problem in a minimal way has not been possible so far, but we do have lots of data gathered about it. 🙂

Configuration

.NET 8, from the docker image mcr.microsoft.com/dotnet/aspnet:8.0, running on x86-64 machines on AWS EC2.

Regression?

Yes, see data below. This issue does not occur on .NET 6, only on 8. We think it might be part of the GC changes from .NET 6 to 7. Give us a shout and we can try to narrow this down by running it on .NET 7.

Data

Initially we switched from .NET 6 to .NET 8 and monitored memory usage using Prometheus metrics. This is what the memory usage graphs look like. Both pods actually reached the 512MB limit we'd imposed and were restarted once at around 15:30. On .NET 6, memory usage had remained consistently around ~160MB, but as soon as we deployed the upgrade to .NET 8 the memory increased without limit; once we reverted to .NET 6, things went back to normal.
[graph: pod memory usage before and after the .NET 8 deployment]

We then tried increasing the available memory from 512MB to 1GB and re-deployed .NET 8. Memory increased rapidly as before, then levelled off at about 650MB and stayed that way until midnight. Service load increases drastically around that time, and memory grew again to about 950MB, where it stayed relatively level until the service was unwittingly redeployed by a coworker. At that point we reverted to .NET 6, and memory went back to the lower level. I think it would have passed the 1GB limit after another midnight workload, but we haven't tested that again (yet).
[graph: memory usage with the limit raised to 1GB]

After trying and failing to reproduce the issue using local containers, we re-deployed .NET 8 and attached the JetBrains dotMemory profiler to work out what was happening. This is the profile we collected, showing the unmanaged memory increases. Interestingly, the amount of managed memory actually goes down over time, with GCs becoming more frequent; presumably .NET knows the available memory is running low as the total approaches 1GB. There also seem to be some circumstances where .NET will not allocate from unmanaged memory, since the spikes near the left-hand side mirror each other for managed and unmanaged. We had to stop the profile before reaching the memory limit, since Kubernetes would have restarted the pod and the profile would have been lost.
[dotMemory profile: unmanaged memory growth over time]
And the Prometheus memory usage graph, for completeness (one pod is higher than the other because it was running the dotMemory profiler this time, and it drops because the profiler was detached):
[graph: Prometheus memory usage during profiling]

Analysis

The only issue we could find that looked similar was this one, which also affects aspnet services running in Kubernetes moving to .NET 7: #92490. As it's memory related, we suspect this might be to do with the GC changes going from .NET 6 to 7. We haven't been able to get a clean repro (or any repro outside our hosted environments) yet, but please let us know if there's anything we can do to help narrow this down. 🙂

@SamWilliamsGS SamWilliamsGS added the tenet-performance Performance related issue label Dec 12, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Dec 12, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Dec 12, 2023
@vcsjones vcsjones added area-GC-coreclr and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Dec 12, 2023
@ghost

ghost commented Dec 12, 2023

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

@MichalPetryka
Contributor

We think it might be part of the GC changes from .NET 6 to 7

Does setting export DOTNET_GCName=libclrgc.so (this reverts to the old GC) fix this issue?
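For example (a minimal sketch; in a Kubernetes deployment the variable would typically go in the container spec or Dockerfile rather than an interactive shell):

# libclrgc.so ships in the runtime directory and reverts to the previous GC behavior, per the suggestion above
export DOTNET_GCName=libclrgc.so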

@mangod9
Member

mangod9 commented Dec 12, 2023

It could also be related to this issue if it's continuous memory growth: #95362. Are you able to collect some GCCollectOnly traces so we can diagnose further?
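For anyone else following along, a minimal sketch of collecting such a trace with dotnet-trace (process id and output name are placeholders):

# install the tool once
dotnet tool install -g dotnet-trace

# collect a low-overhead GC-collect-only trace from the running process
dotnet-trace collect --process-id <pid> --profile gc-collect --output gctrace.nettrace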

@mangod9 mangod9 removed the untriaged New issue has not been triaged by the area owner label Dec 12, 2023
@mangod9 mangod9 added this to the 9.0.0 milestone Dec 12, 2023
@taylorjonl
Contributor

taylorjonl commented Dec 13, 2023

We are experiencing the same issue and are running on 7.0.4. From the graph below you can see that the native memory goes up while the GC heap size stays flat: [graph]
Then the pods recycle, either from a release or from manual intervention. I will attempt rolling back to the old GC, but it is peak season so management may not allow it. We are also in a security-hardened environment, so running any profilers and/or diagnostic tools would likely take an act of Congress, so I am eager to see how your testing goes since it appears you are able to do more profiling.

@SamWilliamsGS
Author

Sorry for the slow response here. Good to hear we're not the only ones seeing this @taylorjonl!

@MichalPetryka we tried the old GC setting but unfortunately no dice; the memory graphs look the same as before 😢. We'll try out more of the suggestions here after the new year, but I'm on holiday until then, so this issue might go quiet for a bit. Thanks everyone for the help so far!
[graph: memory usage with DOTNET_GCName=libclrgc.so, unchanged from before]

@MichalPetryka
Contributor

Maybe it's W^X or #95362 as mentioned before. Can you try export DOTNET_EnableWriteXorExecute=0?

@janvorli
Member

W^X should not cause unbounded native memory growth.
There are other sources of native growth I have seen while debugging similar issues for customers in the past few months:

It is also possible that the native memory leak is caused by a tiny GC memory leak - a case where a tiny managed object holds a large block of native memory alive. You would not see such a leak on the GC memory graph. That is also a possible cause related to OpenSSL, where the runtime uses SafeHandle-derived types to reference possibly large data structures - like client certificates - allocated by OpenSSL. I've seen cases where a certificate chain was up to 1GB large.

To try to figure out the culprit, it would be helpful to take a dump of the running process at a point when it has already consumed a large amount of memory and then investigate it using a debugger with SOS plugin or the dotnet-dump analyze command. I can provide more details on that.
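For reference, a minimal sketch of that workflow with dotnet-dump (process id and dump name are placeholders; the exact SOS commands depend on what you are chasing):

dotnet tool install -g dotnet-dump

# take a full memory dump of the running process
dotnet-dump collect --process-id <pid> --type Full

# open it and poke around with SOS commands, e.g.:
dotnet-dump analyze core_20231212_123456
#   > dumpheap -stat        # managed heap summary
#   > eeheap -gc            # GC heap segments
#   > maddress              # native memory region summary (available in recent SOS builds)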

Also, if you'd be able to create a repro that you'd be able to share with me, I'd be happy to look into it myself.

@SamWilliamsGS
Author

It could also be related to this issue if it's continuous memory growth: #95362. Are you able to collect some GCCollectOnly traces so we can diagnose further?

@mangod9 just to double check, since this is happening on Linux, would perfcollect with the -gccollectonly flag as described here work the same way? 🙂
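In case it helps others, a rough sketch of what that would look like (trace name is a placeholder; the runtime generally needs DOTNET_EnableEventLog=1 set before the app starts so that its LTTng events are emitted):

# grab the script and its dependencies
curl -OL https://aka.ms/perfcollect
chmod +x perfcollect
sudo ./perfcollect install

# collect only GC collection events, then stop with Ctrl+C
sudo ./perfcollect collect gctrace -gccollectonly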

@SamWilliamsGS
Author

Maybe it's W^X or #95362 as mentioned before. Can you try export DOTNET_EnableWriteXorExecute=0?

@MichalPetryka we gave disabling W^X a go today and unfortunately it made no difference. Is there a nightly docker image we could use to try out the TLS fix? 🙂 (I tried looking at the docs but got a bit mixed up with how backporting works in this repo, sorry!)

@am11
Member

am11 commented Jan 2, 2024

Is there a nightly docker image we could use to try out the TLS fix?

For .NET 9 daily build testing, the install script can be used as follows:

VERSION=9
DEST="$HOME/.dotnet$VERSION"

# recreate destination directory
rm -rf "$DEST"
mkdir "$DEST"

# download and install
curl -sSL https://dot.net/v1/dotnet-install.sh | bash /dev/stdin --quality daily --channel "$VERSION.0" --install-dir "$DEST"

# add nuget feed
cat > "$HOME/.nuget/NuGet/NuGet.Config" <<EOF
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" />
    <add key="dotnet$VERSION" value="https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet$VERSION/nuget/v3/index.json" />
  </packageSources>
</configuration>
EOF

PATH="$DEST":$PATH
DOTNET_ROOT="$DEST"
export PATH DOTNET_ROOT
# dotnet --info
# dotnet publish ..
# etc.

After changing net8.0 to net9.0 in *.csproj files, rebuild your Docker image. After testing, revert these changes and rebuild the Docker image again (to bring back net8.0).
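For example, a rough sketch of the retarget-and-rebuild step (project and image names are placeholders; assumes the daily SDK installed above is first on PATH):

# point the project at the daily runtime
sed -i 's/net8.0/net9.0/g' MyService.csproj

# rebuild and repackage
dotnet publish -c Release
docker build -t myservice:net9-daily .

# when done testing, revert the csproj change and rebuild to go back to net8.0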

@jtsalva

jtsalva commented Jan 3, 2024

We're also seeing the exact same issue

@janvorli do you still need any memory dumps? Happy to share privately

@janvorli
Member

janvorli commented Jan 3, 2024

@jtsalva that would be great! My email address is my github username at microsoft.com

@danmoseley
Member

@am11 I wonder whether that docker file might be useful to have in the docs in this repo.

@SamWilliamsGS
Author

[snip]

To try to figure out the culprit, it would be helpful to take a dump of the running process at a point when it has already consumed a large amount of memory and then investigate it using a debugger with SOS plugin or the dotnet-dump analyze command. I can provide more details on that.

Also, if you'd be able to create a repro that you'd be able to share with me, I'd be happy to look into it myself.

Thank you very much for the detailed response @janvorli, it's really appreciated 🙂 We have a profile from JetBrains' dotMemory and a few snapshots taken with dotTrace. Would those be helpful to you in lieu of a dotnet-dump? (We can probably also get the latter if that's better to have; we'll need to get clearance to send anything to you regardless.) I've had difficulty getting a clean repro for this since it only seems to happen when it's hosted on Kubernetes, but I'll keep working on it 😄

@janvorli
Member

janvorli commented Jan 3, 2024

@SamWilliamsGS I need a dump that contains a full dump of the process memory. I am not sure if the dotMemory profile contains that or not. In case you have sensitive data in the dump, I can just explain how to look at the interesting parts of the dump and you can do it yourself.

@SamWilliamsGS
Author

For .NET 9 daily build testing, the install script can be used as follows:

Thanks @am11. Looking at #95362 it looks like this fix was backported to .NET 8, is that correct? And does that mean we'd expect to see the fix already in the standard .NET 8 docker images at this point? 🙂

@martincostello
Member

The backport PR (#95439) has a milestone of 8.0.2, so I don't think you'll see it in non-daily Docker images for .NET 8 until February's servicing release.

@dlxeon

dlxeon commented Jan 4, 2024

@martincostello I'm curious: is there any place where information about the next planned servicing release date and its fixes is published?

So far I have only found this page, which refers to a "Patch Tuesday" release every month: https://dotnet.microsoft.com/en-us/platform/support/policy/dotnet-core#servicing
The GitHub milestone doesn't have a planned release date either: https://github.com/dotnet/runtime/milestone/133

We've updated some of our services to .NET 8 and faced similar unmanaged memory leaks.

@martincostello
Member

I'm afraid I don't know the definitive answer (I'm not a member of the .NET team); I just know from prior experience that there's typically a release on the second Tuesday of every month to coincide with Patch Tuesday. What makes it into those releases is often just detective work from looking at what's going on in GitHub PRs/issues, as releases with security fixes don't get worked on in public.

@nhart12

nhart12 commented Jan 5, 2024

Has anyone gained further insight into this? I'm also experiencing very similar issues and am working on getting some dumps and traces now to analyze further. It seems to be impacting some of our Kubernetes (k3s) services deployed to edge locations with tight memory constraints. Previously these services were on .NET 6 and were fairly stable with limited memory (ranges of 128MiB - 256MiB). Since uplifting them to .NET 8 we are seeing higher base memory usage plus quite frequent OOMKills, as memory seems to grow consistently over time with just k8s probes/health-checks running. Enabling DATAS and GCConserve = 9 does seem to greatly improve things (see the sketch below), but I still have tests that fail that used to pass on .NET 6. The tests in question all do some batch operations that require more memory than normal load, and with the higher usage in .NET 8 they just cause the pod to get OOMKilled.
It's hard to narrow down exact changes, since every .NET uplift also brings countless dependency uplifts. Most of these services also went from EF Core 6 to EF Core 8. I have tried some of the nightly docker images for aspnet 8 but am still seeing these tests fail when doing larger batch operations, so there must be some additional regression causing a higher memory footprint.
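For anyone wanting to try the same settings, a minimal sketch (assuming the knobs referred to are DOTNET_GCDynamicAdaptationMode for DATAS and DOTNET_GCConserveMemory; in Kubernetes these would normally go in the pod spec):

# enable DATAS (dynamic adaptation to application sizes)
export DOTNET_GCDynamicAdaptationMode=1

# ask the GC to trade some throughput for a smaller heap (0-9, 9 = most aggressive)
export DOTNET_GCConserveMemory=9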

@mangod9
Member

mangod9 commented Jan 5, 2024

as memory seems to consistently grow over time

can you quantify the amount of growth? Could be related to #95439 as suggested above.

@nhart12

nhart12 commented Jan 5, 2024

With DATAS enabled it doesn't seem to grow (or possibly it just has a more aggressive GC that compensates for the unmanaged leak?), but memory usage is simply higher. I'll try to collect some metrics next week. I'll likely have to revert some pods to .NET 6 to get some baselines to compare against, as we weren't watching it as closely until the OOMs.
Without DATAS enabled, one of our pods (which normally averages ~100MiB) would slowly, over a few hours, trickle more and more memory until getting killed at 128MiB.

Wouldn't the nightly aspnet images have the TLS leak fix in them?

@denislohachev1991

I'm also facing this problem. I used heaptrack as recommended by @janvorli in another discussion, and this is what I got. In my case, almost all reported leaks are in lib*.so files.
[screenshot: heaptrack results]
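For others who want to reproduce this kind of capture, a rough sketch of running a .NET app under heaptrack (paths and file names are placeholders):

# launch the app under heaptrack (it records every malloc/free in the process)
heaptrack dotnet /app/MyService.dll

# let it run, then stop the app; heaptrack writes something like heaptrack.dotnet.12345.zst

# inspect the recording
heaptrack_gui heaptrack.dotnet.12345.zst     # or: heaptrack --analyze <file>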

@Leonardo-Ferreira

I actually had the same problem, and I found out that it was related to the Azure Event Hub SDK: one of the guys was instantiating an EventHubProducerClient, sending one event, and disposing it. Even so, a leak was there. When we started reusing the client, the problem was resolved.

@denislohachev1991

@Leonardo-Ferreira Thank you for your attention and time, but as far as I know we do not use the Azure EventHub SDK

@rzikm
Member

rzikm commented Jul 4, 2024

@denislohachev1991 it is hard to glean any useful information from the screenshot you shared; can you share the trace/report file and say which tool it can be opened in? Also, knowing more about the application (e.g. how long the data collection was running, how much traffic it served) would be helpful when examining the trace.

@rzikm
Member

rzikm commented Jul 4, 2024

I recently noticed that static file downloads were very slow using http3 and I decided to disable it, leaving only h1 and h2, and both the download and memory problems were resolved.

@Leandropintogit, regarding HTTP/3, do you know what the target server was? We are running HTTP/3 benchmarks and are aware of some performance gaps compared to HTTP/2, but it should still be very usable. Since we wanted to focus a bit on HTTP/QUIC perf for .NET 9, we might want to investigate.

@Leandropintogit

I recently noticed that static file downloads were very slow using http3 and I decided to disable it, leaving only h1 and h2, and both the download and memory problems were resolved.

@Leandropintogit, regarding HTTP/3, do you know what the target server was? We are running HTTP/3 benchmarks and are aware of some performance gaps compared to HTTP/2, but it should still be very usable. Since we wanted to focus a bit on HTTP/QUIC perf for .NET 9, we might want to investigate.

What do you mean target server?

My setup:
.NET 8.0.6 with Kestrel
Debian Docker image running on Kubernetes at Google Cloud
8 vCPUs
32 GB RAM

RPS +/- 300

@denislohachev1991

@rzikm How can I share the trace file with you? I used the Heaptrack GUI.
[screenshot: Heaptrack GUI summary]
This is a simple site with a bot configured that simply requests the home page every 5 minutes. It also performs several background tasks, such as fetching emails from the database and sending them, but this site is configured for testing and there is no data for the background tasks to process. All static files are stored on S3 and requested via the AWS SDK for .NET.

@rzikm
Member

rzikm commented Jul 8, 2024

What do you mean target server?

@Leandropintogit I mean if you know what HTTP/3 implementation the other server is using (specifically, if it is running .NET as well); basically enough information that I can attempt to replicate your observations and investigate them.

@rzikm
Member

rzikm commented Jul 8, 2024

@denislohachev1991

Looking at the second screenshot, I am not 100% sure we're looking at a memory leak. The heaptrack tool works by tracking malloc/free calls and then reports all unfreed memory as leaks. But if you terminate the trace collection while the malloc'd memory is still actively being used, it will still get reported as a leak (i.e. it is a false positive). Based on the description of the workload, the numbers seem appropriate to me and may simply represent the steady state of the application.

To identify something as a leak with greater confidence, you need to either:

  • see a disproportionate amount of memory being allocated and not freed (we're talking tens or hundreds of MB), or
  • observe the suspected allocations long-term and see their number steadily increasing over time.

@denislohachev1991

@rzikm I'm not sure if this is due to a leak or if this is normal behavior. We have several instances of an application that, over time (it takes about a month or more), consume all the server memory. I was recently looking through the code and found several places where resources were not freed. After that, I started monitoring the test application: from ~200 MB at launch, it grows to ~450 MB after one day of running. But even these numbers are very different from running the application on a Windows server, where the application consumes ~200 MB. That's why I assumed the issue was a memory leak.

@rzikm
Member

rzikm commented Jul 8, 2024

I started monitoring the test application: from ~200 MB at launch, it grows to ~450 MB after one day of running.

Yep, that is a good indication of a leak. It would be good to run it with heaptrack long enough for those 100+ MB to show up in the report; that will make it easier to isolate the leak from the rest of the live memory. I suggest using dotnet-symbol on all .so files in the application directory (assuming a self-contained publish of the app) to download symbols (this will show better call stacks in heaptrack).

Another possible issue you may be hitting is #101552 (comment); see the linked comment for diagnosis steps and a possible workaround.

@janvorli
Member

janvorli commented Jul 8, 2024

@denislohachev1991 could you please get symbols for the .NET shared libraries, like libcoreclr.so etc.? Without the symbols, we cannot see where the allocations were coming from. You can fetch the symbol files using the dotnet-symbol tool: just call it on the related .so file and it will fetch its .so.dbg file into the same directory where the library is located. heaptrack should then be able to see them. You can use wildcards to fetch symbols for all the libxxxx.so files in the dotnet runtime location.
You can install dotnet-symbol using the dotnet tool install -g dotnet-symbol command.
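A minimal sketch of that (the runtime path below is where the official aspnet images put it; adjust the version directory to match yours):

dotnet tool install -g dotnet-symbol

# download .so.dbg files next to the runtime's native libraries
dotnet-symbol --symbols /usr/share/dotnet/shared/Microsoft.NETCore.App/8.0.*/lib*.so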

@denislohachev1991

@janvorli Hello. I did as advised; here are my settings for the self-contained publish of the application.
[screenshot: publish settings]
After that, I got the symbols for all *.so files.
[screenshot: downloaded symbol files]
I transferred all the files to the Linux server and launched the application under heaptrack. Now I'm waiting for the application to run for a long time. After that, I can provide you with the heaptrack.dotnet.13121.zst file if you need it. I also want to collect a dump with dotnet-dump collect after the application has been running overnight. Thanks for your time.

@denislohachev1991

denislohachev1991 commented Jul 9, 2024

After the application ran for about 2 hours I got the following heaptrack results.
[screenshots: heaptrack summary, call tree, and sorted allocations]
I don't know if this will be useful. Our system works like this: we have one server as a load balancer; nginx is installed on it and the certificates are stored on that server. We also have two servers where the application itself runs under Kestrel. nginx works as a proxy.

@rzikm
Member

rzikm commented Jul 9, 2024

This shows the same thing as your earlier report; those 37 MB "leaked" can very well be live memory.

To be able to see anything useful, we need a report where we can see the 200 MB increase you mentioned in your previous message:

I started monitoring the test application: from ~200 MB at launch, it grows to ~450 MB after one day of running.

Can you try running the collection for one day or more?

@janvorli
Member

janvorli commented Jul 9, 2024

@denislohachev1991 on glibc-based Linux distros, each thread consumes 8MB of memory for its stack by default. It looks like most of the memory in your log comes from that. You can try lowering that size, e.g. to 1.5MB, by setting DOTNET_DefaultStackSize=0x180000 and see if that reduces the memory size significantly (a minimal sketch follows). I would try that while running your app for ~2 hours as you did for the previous results, so that it is comparable.
Also, as @rzikm said, it would be great to let your app run longer after that, until consumption reaches the high numbers you were seeing before.
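A quick sketch of checking and applying that (pid is a placeholder; the thread count times the 8MB default gives a rough upper bound on stack reservations):

# how many threads does the process have?
ls /proc/<pid>/task | wc -l

# cap runtime-created thread stacks at 1.5MB (0x180000 bytes), set before starting the app
export DOTNET_DefaultStackSize=0x180000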

@denislohachev1991

@janvorli As you advised, I set DOTNET_DefaultStackSize=0x180000 and started the application.
[screenshot: memory usage after setting DOTNET_DefaultStackSize]
I'll watch how it consumes memory after these changes. I also launched another instance of the application under heaptrack for long-term monitoring.

@Leandropintogit

What do you mean target server?

@Leandropintogit I mean if you know what HTTP/3 implementation the other server is using (specifically, if it is running .NET as well); basically enough information that I can attempt to replicate your observations and investigate them.

Hi
There is no other server, just Kestrel listening on port 443 using the H1/H2/H3 protocols.

Before:
listenOptions.Protocols = HttpProtocols.Http1AndHttp2AndHttp3

After:
listenOptions.Protocols = HttpProtocols.Http1AndHttp2;

Tested using Chrome and Edge.

@denislohachev1991

I set the DOTNET_DefaultStackSize=0x180000 variable and monitored the application after that. The change had virtually no effect on the application's memory use. I also collected a heaptrack trace after 21 hours of running the application; having examined it, I don't see significant changes from the previous ones.
[screenshots: heaptrack results after 21 hours]
Therefore, it seems to me that this is normal application behavior, since an application on Linux consumes significantly more memory compared to Windows. This is just my assumption and I could be wrong.

@janvorli
Member

@denislohachev1991 could you please share the heaptrack log with me so that I can drill into it in more detail? It is strange that the env variable didn't have any effect.

@darthShadow

Slightly OT: isn't DOTNET_DefaultStackSize supposed to be specified without the preceding 0x, since it's implicitly hexadecimal?
At least that's what I have understood from previous comments in this and other threads, but I am not sure if the preceding 0x is ignored anyway.

@janvorli
Member

@darthShadow it doesn't matter; both ways work. We use strtoul to convert the env var contents to a number, and it can optionally accept the 0x prefix. See https://en.cppreference.com/w/cpp/string/byte/strtoul.

@janvorli
Member

@denislohachev1991 I've looked at the dump you've shared with me. It seems there was no permanent growth of memory consumption over time; there are a few spikes, but consumption stays about the same. Looking at the bottom-up tab in the heaptrack GUI, around 25MB comes from OpenSSL and about 14.5MB from coreclr's ClrMalloc, which is used by the C++ new and C malloc implementations. On Windows, HTTPS communication doesn't use OpenSSL and, IIRC, the memory it consumes is not attributed to a specific process, so you won't see it in the working set of the process.
Overall, there seems to be nothing wrong.

@denislohachev1991

@janvorli Thank you for your work and time spent.

@yaseen22

yaseen22 commented Aug 9, 2024

Thanks for this great thread. It gave us a lot of valuable insights.

We had the same issue: memory growth in our Kubernetes pods after migrating from .NET 6 to .NET 8. What worked for us was switching to the Alpine Linux distribution instead of the Debian one.

We tried adding the DOTNET_DefaultStackSize=0x180000 setting to our Debian images, but it didn't help.

Could you explain the root cause in a bit more detail: why does lowering the default stack size, or using Alpine Linux (which, from my understanding, has a lower default stack size), help fix the issue?

@nhart12

nhart12 commented Aug 9, 2024

Thanks for this great thread. It gave us a lot of valuable insights.

We had the same issue: memory growth in our Kubernetes pods after migrating from .NET 6 to .NET 8. What worked for us was switching to the Alpine Linux distribution instead of the Debian one.

We tried adding the DOTNET_DefaultStackSize=0x180000 setting to our Debian images, but it didn't help.

Could you explain the root cause in a bit more detail: why does lowering the default stack size, or using Alpine Linux (which, from my understanding, has a lower default stack size), help fix the issue?

If you are on the latest patch of .NET 8, which contains this fix: #100502,
you are likely just seeing a difference between glibc and musl. You can play around with the tunables for your application's needs on Debian via:
https://www.gnu.org/software/libc/manual/html_node/Memory-Allocation-Tunables.html

Lowering MALLOC_ARENA_MAX or MALLOC_TRIM_THRESHOLD_ will likely get you memory utilization similar to Alpine; a sketch follows.
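For example, a rough sketch of the kind of tuning meant here (values are illustrative starting points, not recommendations):

# limit the number of malloc arenas glibc creates (the default scales with CPU count)
export MALLOC_ARENA_MAX=2

# return freed memory to the OS more eagerly (threshold in bytes)
export MALLOC_TRIM_THRESHOLD_=131072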

@krishnap80

I am facing the same issue in a .NET 6 API as well. Has any solution been identified?

@mangod9
Member

mangod9 commented Sep 1, 2024

hey @krishnap80, most of the discussion on this issue was around .NET 8. Since this issue has been closed, I would suggest creating a new issue with details about your specific scenario. Ideally, please try to move to .NET 8 too, since 6 will soon be out of support. Thanks.

@krishnap80

krishnap80 commented Sep 4, 2024 via email
