[release/2.1] Determine memory load based on cgroup usage. #19650

Merged
1 commit merged into dotnet:release/2.1 on Aug 31, 2018

6 participants
@tmds
Member

commented Aug 24, 2018

Port #19518 to 2.1

CC @janvorli

Determine memory load based on cgroup usage. (#19518)
cgroup usage is what triggers OOM kills. It includes both the RSS and the
file cache of the cgroup.

The implementation was using only the process RSS to determine memory load.
RSS is lower than the cgroup usage, so the GC was not triggered soon enough,
which led to OOM kills.
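
To make the difference concrete, here is a minimal sketch of the two ways of computing memory load. It is not the CoreCLR code: the function name `memory_load_percent` and all the numbers are hypothetical, chosen only to show how counting file cache moves the load figure closer to the limit.

```python
def memory_load_percent(used_bytes: int, limit_bytes: int) -> int:
    """Return memory load as an integer percentage of the limit, capped at 100."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    return min(100, used_bytes * 100 // limit_bytes)


# Illustrative numbers only (not measurements from the original issue).
MIB = 1024 * 1024
limit = 512 * MIB        # memory limit assigned to the cgroup
rss = 300 * MIB          # memory the process itself has resident
file_cache = 180 * MIB   # page cache charged to the same cgroup

# Load as seen when only the process RSS is considered.
load_from_rss = memory_load_percent(rss, limit)

# Load as seen when the whole cgroup usage (RSS + file cache) is considered.
load_from_cgroup = memory_load_percent(rss + file_cache, limit)
```

With these assumed numbers the RSS-based load stays well below the cgroup-based load, so a GC that only reacts to a high load would start collecting later than the OOM killer's accounting warrants.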
@janvorli
Member

left a comment

LGTM, thank you!

@janvorli

Member

commented Aug 24, 2018

Description

.NET Core applications are killed by OpenShift because they exceed their assigned memory. OpenShift/Kubernetes informs the app of its limit via the sysfs limit_in_bytes. The OOM killer then monitors memory based on the sysfs usage_in_bytes. .NET Core uses /proc/self/statm instead, which includes only RSS and not the cached memory that usage_in_bytes counts.
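
For anyone who wants to compare the three values inside a container, the following is a rough sketch. It assumes a cgroup v1 memory controller mounted at `/sys/fs/cgroup/memory` with the files `memory.usage_in_bytes` and `memory.limit_in_bytes`; the mount point can differ between systems, and the helper names are made up for illustration.

```python
import os
from pathlib import Path

# Assumed cgroup v1 mount point; the real location can differ per system.
CGROUP_MEMORY_DIR = Path("/sys/fs/cgroup/memory")


def parse_statm_rss_bytes(statm_text: str, page_size: int) -> int:
    """The second field of /proc/self/statm is the resident set size in pages."""
    fields = statm_text.split()
    return int(fields[1]) * page_size


def read_cgroup_value(cgroup_dir: Path, name: str) -> int:
    """Read a single integer value such as memory.usage_in_bytes."""
    return int((cgroup_dir / name).read_text().strip())


def report(cgroup_dir: Path = CGROUP_MEMORY_DIR) -> dict:
    """Collect process RSS, cgroup usage and cgroup limit, all in bytes."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    statm_text = Path("/proc/self/statm").read_text()
    return {
        "rss_bytes": parse_statm_rss_bytes(statm_text, page_size),
        "usage_in_bytes": read_cgroup_value(cgroup_dir, "memory.usage_in_bytes"),
        "limit_in_bytes": read_cgroup_value(cgroup_dir, "memory.limit_in_bytes"),
    }
```

Calling `report()` inside a memory-limited container should show `usage_in_bytes` at or above `rss_bytes` whenever the cgroup holds file cache, which is the gap described above.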

Customer Impact

Customers’ applications are killed by OpenShift because they exceed their assigned memory.

Regression?

No
 

Risk

No

Original issue: #19060

@chrisgilbert


commented Aug 29, 2018

Thanks for fixing this! We think we've been hitting this issue in ECS, so I'll give a +1 for having it fixed in a 2.1 patch.

@jamshedd

Member

commented Aug 30, 2018

Approved for 2.1.5

@danmosemsft merged commit a25682c into dotnet:release/2.1 on Aug 31, 2018

17 checks passed

CROSS Check: Build finished.
CentOS7.1 x64 Checked Innerloop Build and Test: Build finished.
CentOS7.1 x64 Debug Innerloop Build: Build finished.
Linux-musl x64 Debug Build: Build finished.
OSX10.12 x64 Checked Innerloop Build and Test: Build finished.
Ubuntu arm Cross Checked Innerloop Build and Test: Build finished.
Ubuntu arm64 Cross Debug Innerloop Build: Build finished.
Ubuntu x64 Checked Innerloop Build and Test: Build finished.
Ubuntu x64 Formatting: Build finished.
WIP: ready for review
Windows_NT x64 Checked Innerloop Build and Test: Build finished.
Windows_NT x64 Formatting: Build finished.
Windows_NT x64 full_opt ryujit CoreCLR Perf Tests Correctness: Build finished.
Windows_NT x86 Checked Innerloop Build and Test: Build finished.
Windows_NT x86 Release Innerloop Build and Test: Build finished.
Windows_NT x86 full_opt ryujit CoreCLR Perf Tests Correctness: Build finished.
license/cla: All CLA requirements met.
@danmosemsft

Member

commented Aug 31, 2018

This is expected to go out in the October patch.

@danmosemsft modified the milestones: 2.1.x, 2.1.5 on Sep 13, 2018

@devKlausS


commented Oct 3, 2018

Will this solve #18044 as well?

@tmds

Member Author

commented Oct 3, 2018

@devKlausS it should, .NET Core now uses the same limit as the Linux OOM killer.

@devKlausS


commented Oct 3, 2018

@tmds thanks for the quick reply! I did a retest with 2.1.5 and I am afraid the issue is not solved yet. See my update in #18044 (comment). If you need more details, please let me know.
