BEAM crashes with segmentation fault #7683
Comments
I've narrowed it down to a |
Scratch that, just encountered a segfault again and random values appearing in our ets tables :( |
It would help if you could provide a reproducer in Erlang. Looks like it's happening in |
Yeah, I'm trying to reproduce it, but I'm having a hard time in Elixir, let alone Erlang :( I now have a script that reproduces it fairly reliably on OTP26 but not on OTP24 and OTP25. We'll keep trying and keep posting updates here. Every time it crashes, though, it crashes with at |
Does anyone have advice or tips for debugging segfaults in Erlang? The |
Update, I now encountered an example where the crash reason was different:
(not sure if this helps) I'll attach a full log in the description. |
I've also updated the description: the crash also happens in production (Debian Bookworm running in Docker). |
Check the Types and Flavors section in the development guide. If you can reproduce the error in the debug emulator, that would help a lot as the resulting core file will contain a lot more information. |
Thanks @garazdawi! |
I've managed to reproduce it with a debug enabled beam. I used
I've attached in the description new dumps, but here are the two example thread crashes in question:
Let me know if there's anything else that I can do. |
Update: I can confirm again that the crashes occur on OTP 24 and OTP 25. However, on OTP 24 the reasons are different:
beam.debug.smp-2023-09-29-122001.txt
|
So, there is a term in the heap after GC that is corrupt. Can you make the beam.debug.smp and core file (from a Debian system if possible) available to me somehow? If you don't want to post them here, you can e-mail lukas@erlang.org. If you cannot give them to me, then we'll have to do this the slow way, with me posting gdb commands to dig out more information. |
I will try to replicate this on Debian and get back to you. But don't you also need the source code of the project to trigger this? This is currently the command I'm using to trigger it:
I can set up a Debian machine with the full source code and give you SSH access (the reproduction case does not involve any of our customer data), if that's an option. |
That would be even better. If we can get it to reproduce there we can hopefully use the beautiful tool rr and this will be fixed in no time. |
Nice! I'll get on this right away and let you know by email once I have this set up. |
(shameless plug) If you're able to reproduce it locally, then you're likely to be able to use the technique described at https://max-au.com/debugging-the-beam/ to run the BEAM under a debugger. |
If the body of a matchspec returned a flatmap with a variable ('$1', '$_', etc.) as one of the keys and the variable was not an immediate, the key term was not copied to the receiving process's heap. This could later corrupt the term in the table, as the GC could place move markers in it. Also fixed a bug in the stack estimation logic when a flatmap with all constant values, but not all constant keys, was encountered. Closes erlang#7683
If the body of a matchspec returned a hashmap with a variable ('$1', '$_', etc.) as one of the keys or values and the variable was not an immediate, the term was not copied to the receiving process's heap. This could later corrupt the term in the table, as the GC could place move markers in it. Also fixed an issue with the stack estimation logic when such a hashmap was encountered. Closes: erlang#7683
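To make the fix descriptions above more concrete, here is a minimal sketch of the kind of call they refer to: an `ets:select/2` whose match spec body builds a small map (a flatmap) with the match variable `'$1'` as a key, where the bound term is a tuple and therefore not an immediate. This is not the reporter's reproducer, which is not shown in the thread. The module name, table name, and key shape are all hypothetical, and it is an assumption that this exact pattern exercises the code path described. On affected releases it may or may not trigger the corruption.

```erlang
-module(ms_map_key_sketch).
-export([run/0]).

%% Hypothetical sketch: module, table, and key names are made up
%% for illustration and do not come from the issue.
run() ->
    Tab = ets:new(ms_map_key_sketch, [set, public]),
    %% Tuple keys are boxed terms, i.e. not immediates such as
    %% atoms or small integers.
    true = ets:insert(Tab, [{{key, N}, N} || N <- lists:seq(1, 1000)]),
    %% The match spec body returns a one-element map whose key is
    %% the match variable '$1' and whose value is '$2'.
    MatchSpec = [{{'$1', '$2'}, [], [#{'$1' => '$2'}]}],
    Maps = ets:select(Tab, MatchSpec),
    %% Force a garbage collection of the calling process while the
    %% selected maps are still live.
    true = erlang:garbage_collect(),
    %% Read the table back and check that its keys are intact.
    Keys = lists:sort([K || {K, _} <- ets:tab2list(Tab)]),
    Expected = [{key, N} || N <- lists:seq(1, 1000)],
    true = ets:delete(Tab),
    {length(Maps), Keys =:= Expected}.
```

Calling `ms_map_key_sketch:run()` in a loop on a debug emulator would be one way to check whether a given build still corrupts table terms with this pattern.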
Thanks, @garazdawi!! 🙌🏻 |
Describe the bug
We're randomly getting segmentation faults in our project, mostly when running an import script.
To Reproduce
I can share access to our private repo where it can be reproduced (with some luck :).
Here is the crash report:
beam.smp-2023-09-26-134305_ips.txt (text formatted)
beam.smp-2023-09-26-134305_ips_json.txt
More crash reports:
beam.smp-2023-09-27-222920.txt
Here is the crash report with debug enabled:
beam.debug.smp-2023-09-29-112754.txt
beam.debug.smp-2023-09-29-113729.txt
OTP 24.3.4.13 crash reports:
Expected behavior
No segmentation faults.
Affected versions
Debian Bookworm (Docker) and macOS; tried Elixir 1.15.6 with OTP 24, 25, and 26
Additional context
The app makes heavy use of ETS tables, and we suspect it has something to do with that.
Is there anything else we can do to debug where this is coming from?