ERL-525: better diagnostics for memory allocation failures #3489

Open
OTP-Maintainer opened this issue Nov 26, 2017 · 1 comment
Labels
enhancement, help wanted (Issue not worked on by OTP; help wanted from the community), priority:low, team:VM (Assigned to OTP team VM)

Comments

@OTP-Maintainer

Original reporter: mikael pettersson
Affected versions: OTP-18.3.4.2, OTP-17.5, R16B03-1
Component: erts
Migrated from: https://bugs.erlang.org/browse/ERL-525


When the VM terminates due to a memory allocation failure, it includes a small note at the start of the crash dump file with a message similar to "failed to allocate N bytes of type T".

Given the VM's layered and highly complex memory allocation framework, this is not enough to deduce the root cause of the allocation failure. In particular, if an OS-level call failed, the message should say which call failed and why. For example:

failed to allocate N bytes of type heap, due to mmap(<actual mmap params>) failing with ENOMEM in file F.c line L.

and similar for any failed mremap, sbrk, brk, malloc, etc.
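
A minimal sketch of what such a message could look like at the point of failure, assuming a POSIX system. The names `format_mmap_failure`, `checked_mmap` and `CHECKED_MMAP` are hypothetical and are not part of ERTS; the real allocator layers are considerably more involved.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper (not an ERTS function): format a diagnostic for a
 * failed mmap() call, including the call parameters, the errno value and
 * the source location of the failing call. */
static int format_mmap_failure(char *buf, size_t buflen,
                               const char *type_name, size_t size,
                               int prot, int flags, int err,
                               const char *file, int line)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen,
        "failed to allocate %zu bytes of type %s, due to "
        "mmap(NULL, %zu, prot=0x%x, flags=0x%x, -1, 0) failing with "
        "%s (errno %d) in file %s line %d",
        size, type_name, size, (unsigned)prot, (unsigned)flags,
        strerror(err), err, file, line);
}

/* Hypothetical wrapper: map anonymous memory and report the details of the
 * OS-level failure on stderr instead of a bare "out of memory". */
static void *checked_mmap(const char *type_name, size_t size,
                          const char *file, int line)
{
    int prot = PROT_READ | PROT_WRITE;
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p = mmap(NULL, size, prot, flags, -1, 0);

    if (p == MAP_FAILED) {
        int err = errno;   /* save errno before any other call can change it */
        char msg[512];
        format_mmap_failure(msg, sizeof msg, type_name, size,
                            prot, flags, err, file, line);
        fprintf(stderr, "%s\n", msg);
        return NULL;
    }
    return p;
}

/* Records the call site automatically. */
#define CHECKED_MMAP(type_name, size) \
    checked_mmap((type_name), (size), __FILE__, __LINE__)
```

A caller would write `void *p = CHECKED_MMAP("heap", n);`. The same pattern would apply to mremap, sbrk, brk and malloc, each reporting its own arguments.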

We have been chasing infrequent and highly non-deterministic crashes due to allocation failures for quite some time. Their non-deterministic nature, coupled with the fact that the hosts had plenty of free RAM when the VM claimed "out of memory", made it difficult to pinpoint the root cause. A more detailed message in the crash dump, like the one above, would have steered us in the right direction much earlier.

(The root cause turned out to be external to the VM, so the only issue with the VM is the lack of details in the allocation failure message.)
@OTP-Maintainer

rickard said:

This would be nice, but we unfortunately have quite a lot of other higher priority stuff in the pipe. I've added a ticket (OTP-14818) to our backlog though.

A pull-request implementing this is of course welcome.

I think the easiest way to accomplish this is to pass information about the allocation type, and about whether or not a failure is fatal, down to the primitive that maps memory. This info should fit in a `Uint32`. If the primitive fails, print out all the wanted info and terminate at that place.
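
One possible shape of this suggestion, as a rough sketch only. The bit layout, the `alloc_info_*` helpers, the `ALLOC_TYPE_*` numbers and `map_memory` are all assumptions made for illustration, not existing ERTS code; `Uint32` is defined here as a stand-in for the VM's own type.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

typedef uint32_t Uint32;  /* stand-in for the VM's own 32-bit unsigned type */

/* Hypothetical allocation type numbers. */
enum { ALLOC_TYPE_HEAP = 1, ALLOC_TYPE_BINARY = 2 };

/* Assumed layout: low 16 bits hold the type number, bit 16 the fatal flag. */
#define ALLOC_INFO_TYPE_MASK 0xFFFFu
#define ALLOC_INFO_FATAL_BIT (1u << 16)

static Uint32 alloc_info_pack(unsigned type, int fatal)
{
    assert(type <= ALLOC_INFO_TYPE_MASK);
    return (Uint32)type | (fatal ? ALLOC_INFO_FATAL_BIT : 0u);
}

static unsigned alloc_info_type(Uint32 info)
{
    return info & ALLOC_INFO_TYPE_MASK;
}

static int alloc_info_is_fatal(Uint32 info)
{
    return (info & ALLOC_INFO_FATAL_BIT) != 0;
}

/* Hypothetical mapping primitive: it receives the packed info, so when the
 * OS call fails it can print everything it knows and, if the failure is
 * fatal, terminate right there. */
static void *map_memory(size_t size, Uint32 info)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED) {
        int err = errno;   /* save errno before calling anything else */
        fprintf(stderr,
                "mmap of %zu bytes for allocation type %u failed: %s\n",
                size, alloc_info_type(info), strerror(err));
        if (alloc_info_is_fatal(info))
            abort();
        return NULL;
    }
    return p;
}
```

Upper layers would build the word once, for example `alloc_info_pack(ALLOC_TYPE_HEAP, 1)`, and pass it unchanged through each allocator layer down to the primitive.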


OTP-Maintainer added the enhancement, help wanted, team:VM, and priority:low labels on Feb 10, 2021