Add Breakpad for crash dump generation #56014

Open
wants to merge 1 commit into
base: master

hhyyrylainen

@hhyyrylainen hhyyrylainen commented Dec 17, 2021

I started working on a simple Breakpad crash reporting integration to Godot related to this proposal: godotengine/godot-proposals#1896

Some current caveats:

  • I have not tested this with Godot 4. I originally made this against the 3.4 stable label and cherry-picked it to this branch (the other branch is here: https://github.com/Revolutionary-Games/godot/tree/crash_dumper_3.x). I only did a very quick compile fix and checked that Godot starts on this branch. There didn't seem to be a working sample project available in the asset library(?), so I couldn't test with a project.
  • There's some cleanup left to do regarding which files to include and which need to be compiled.
  • My local install of clang-format didn't work with the options in this repo, so I couldn't format the files with it (clang-format failed with code 1: /home/hhyyrylainen/Projects/godot/.clang-format:151:1: error: unknown key 'SpacesInLineCommentPrefix'). Update: upgrading to Fedora 35 fixed this by providing a newer clang version.
  • I noticed that the Windows crash reporter doesn't disable itself on destruction, and nothing else seems to disable it on shutdown either. Is that intended? It's not consistent with Linux. For my own testing I added a line to disable it in the destructor, and I have now added that code here.
  • I haven't yet tested whether the created crash dumps can be decoded with stackwalk (I'm going to work on my build scripts next to make this work; I'm not confident enough with the Godot build process yet to make a PR to the build scripts repo). Also, Windows is less tested than Linux. Windows crash dumps only work with the special MinGW-supporting stackwalk, but no special version is required in the Godot repo when building Godot, so this is just a minor inconvenience.
  • I don't have a mac to develop on, so support for that needs to be done later

@bruvzg

This comment has been minimized.

@hhyyrylainen hhyyrylainen marked this pull request as draft December 17, 2021 10:16
@hhyyrylainen
Author

Oh yeah, thanks. I misremembered where that button was supposed to be, and couldn't see it there...

modules/breakpad/SCsub Outdated
"src/client/ios/handler/ios_exception_minidump_generator.mm",
]

# if solaris:
Member

Are you using Godot on Illumos / Solaris? :D Cool!

Author

Well there are files for solaris in Breakpad, so I thought it would be at least theoretically useful to keep the list of files for solaris here.

@fire
Member

fire commented Dec 17, 2021

I have approved cicd.

modules/breakpad/SCsub Outdated
@Calinou Calinou added this to the 4.0 milestone Dec 17, 2021
@fire
Member

fire commented Dec 19, 2021

There are some more static check failures. Also, please try to collapse the commits by squashing. We prefer a single commit unless you have a logical reason to keep more (a few commits are allowed).

@hhyyrylainen
Author

I have run clang-format and corrected the one place where a file was referred to incorrectly in a copyright header. I haven't pushed those changes yet, as I wanted to check the Windows compile and determine whether some of the common files are even needed for any platform. I'll do a manual squash as the last thing once I don't need to tweak the code anymore, as that'll get rid of some intermediate files that would otherwise be in the git history.

@hhyyrylainen
Author

All issues detected by CI should now hopefully be fixed. I also removed the commented-out parts in the build config file and did some last tweaks. I also made sure it compiles on Windows, though I'm not super confident about how the string conversion should be used in 4.0; in 3.4 I think I was able to figure it out pretty well.

I think this now has all the changes on the engine side that are needed for crash dump generation, though I've not yet gotten to test the entire flow of creating a crash dump and being able to decode it. As an additional difficulty, Windows MinGW builds seem a bit problematic regarding extracting symbols, but this Breakpad fork seems to have a working tool for that: https://github.com/DaemonEngine/breakpad

@hhyyrylainen hhyyrylainen marked this pull request as ready for review December 19, 2021 15:26
@hhyyrylainen
Author

hhyyrylainen commented Dec 19, 2021

I've just run into a pretty big issue: even though I found a tool that can dump the symbols of a MinGW-created exe on Linux, the output doesn't seem to contain any symbols besides the standard library. It doesn't even detect that any of the Godot source files are included.
I tried to use a modified build script with LINKFLAGS=-Wl,--build-id=md5 CCFLAGS=-g CFLAGS=-g CXXFLAGS=-g, but that didn't result in any more symbols being found either. Is there some option I'm missing to get Windows debug symbols for Godot?
I don't want to fall back to compiling the Windows version on Windows, as that basically requires Windows 10 to run correctly (though I suppose officially Windows 8 is only getting one more year of updates from Microsoft), and I have not found a way to make it work on earlier versions. According to all Microsoft documentation, the Universal C Runtime that Visual Studio 2019 uses should be installable on older Windows versions, but I haven't seen anyone able to get that working...

@fire
Member

fire commented Dec 19, 2021

Let me get back to you, but I think I posted an llvm-mingw workflow.

@fire
Member

fire commented Dec 19, 2021

Here's the exact command I use.

PATH=/opt/llvm-mingw/bin:$PATH scons werror=no platform=windows target=release_debug -j`nproc` use_lto=no deprecated=no use_mingw=yes use_llvm=yes use_thinlto=yes LINKFLAGS=-Wl,-pdb= CCFLAGS='-g -gcodeview' debug_symbols=no

The important part is llvm-mingw and LINKFLAGS=-Wl,-pdb= CCFLAGS='-g -gcodeview'

@hhyyrylainen
Author

hhyyrylainen commented Dec 19, 2021

Kind of expected, but the Godot podman build image scripts don't install llvm, so I get this error with that:

sh: line 1: x86_64-w64-mingw32-clang++: command not found

It shouldn't be too difficult to modify my local copy to also install llvm in the build image to test out that way of making the build. First I'll try the thinlto option with the MinGW GCC to see if that has any effect.

Edit: turning off LTO and turning on ThinLTO seems to have increased the symbols file size to 100 MB, and it now seems to contain quite a lot of Godot symbols.

SConstruct Outdated
@@ -135,6 +135,7 @@ opts.Add(BoolVariable("opengl3", "Enable the OpenGL/GLES3 video driver", True))
opts.Add("custom_modules", "A list of comma-separated directory paths containing custom modules to build.", "")
opts.Add(BoolVariable("custom_modules_recursive", "Detect custom modules recursively for each specified path.", True))
opts.Add(BoolVariable("use_volk", "Use the volk library to load the Vulkan loader dynamically", True))
opts.Add(BoolVariable("breakpad_enabled", "Enable Breakpad crash dump creation.", False))
Member

The common pattern is "use_breakpad". Thoughts?

Author

I can change it easily enough.

Member

@fire fire Dec 21, 2021

Can you change this to "use_breakpad" unless you have reasoning against?

Author

Yes, I can. I got stuck yesterday trying to fix that problem I commented about (#56014 (comment)), and I didn't want to switch over to changing this as I have a bunch of experimental local changes.

Author

I renamed the scons option and the defines used in C++ as volk seemed to also use that format of preprocessor defines.

@hhyyrylainen
Author

I ran into another issue: in release mode on Linux, when Godot doesn't register the signal handlers and I instead leave that to Breakpad, crash handling doesn't work for some reason. I might need a workaround where the Godot handlers are always installed. Or perhaps the Breakpad version I got has a bug, but that would be a pretty serious slip by Google...

@fire
Member

fire commented Dec 21, 2021

How do I recreate your bug case? Maybe posting some info here can reveal the answer as I'm not in this area.

@hhyyrylainen
Author

After debugging with gdb I think I figured it out: the Mono runtime actually messes with the signals. When putting breakpoints on signal and sigaction, I noticed that the Mono initialization overrides the handlers that Breakpad installs. However, for some reason Mono's signal handlers don't interfere with handlers installed using the signal function, so only Breakpad is impacted, as it uses the more extensive sigaction function.

Here's where mono messes with the signal handlers:

(gdb) bt
#0  __GI___sigaction (sig=35, act=0x0, oact=0x7fffffffc2d0) at sigaction.c:26
#1  0x0000000000cdcd6d in mono_threads_suspend_search_alternative_signal ()
#2  0x0000000000cdcf65 in mono_threads_suspend_init_signals ()
#3  0x0000000000add5ba in mini_init ()
#4  0x0000000000a353eb in mono_jit_init_version ()
#5  0x0000000000d80bb6 in (anonymous namespace)::gd_initialize_mono_runtime ()
    at modules/mono/mono_gd/gd_mono.cpp:199
#6  GDMono::initialize (this=0x4ce3bf0) at modules/mono/mono_gd/gd_mono.cpp:378
#7  0x0000000001088d1a in CSharpLanguage::init (this=0x4257ec0)
    at modules/mono/csharp_script.cpp:113
#8  0x000000000250b7df in ScriptServer::init_languages ()
    at core/script_language.cpp:170
#9  0x0000000000d4f4bb in Main::setup2 (p_main_tid_override=<optimized out>)
    at main/main.cpp:1489
#10 0x0000000000d547a3 in Main::setup (execpath=<optimized out>, 
    argc=<optimized out>, argv=<optimized out>, p_second_phase=<optimized out>)
    at main/main.cpp:1232
#11 0x00000000009f8469 in main (argc=1, argv=0x7fffffffd9f8)
    at platform/x11/godot_x11.cpp:48

This means that a workaround of always registering the Godot crash handler's signal handlers and passing the signal on to Breakpad should work. I don't have enough time to test that today, though.
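
A side note on the signal versus sigaction observation: on POSIX systems both functions write to the same per-signal disposition, so a later sigaction call would be expected to replace a handler regardless of which function installed it. The following is only a minimal sketch for checking that on a given system. The names engine_handler, runtime_handler, current_handler and install_with_sigaction are hypothetical stand-ins (not Godot, Mono or Breakpad code), and SIGUSR1 is suggested only so the check is harmless to run.

```cpp
#include <cassert>
#include <csignal>
#include <signal.h>

using Handler = void (*)(int);

// Hypothetical stand-ins for the engine's and the runtime's handlers.
void engine_handler(int) {}
void runtime_handler(int) {}

// Query the handler currently installed for `sig` without changing it.
// Passing a null `act` makes sigaction() a read-only call.
Handler current_handler(int sig) {
	struct sigaction current = {};
	if (sigaction(sig, nullptr, &current) != 0) {
		return SIG_ERR;
	}
	return current.sa_handler;
}

// Replace the handler for `sig` through sigaction().
bool install_with_sigaction(int sig, Handler handler) {
	struct sigaction action = {};
	sigemptyset(&action.sa_mask);
	action.sa_handler = handler;
	return sigaction(sig, &action, nullptr) == 0;
}

// Returns true when a handler installed with signal() is visible through
// sigaction() and is then replaced by a later sigaction() call.
bool signal_handler_is_replaced(int sig) {
	if (signal(sig, engine_handler) == SIG_ERR) {
		return false;
	}
	if (current_handler(sig) != engine_handler) {
		return false;
	}
	if (!install_with_sigaction(sig, runtime_handler)) {
		return false;
	}
	return current_handler(sig) == runtime_handler;
}
```

Calling signal_handler_is_replaced(SIGUSR1) from a small test program would show whether a signal()-installed handler really survives a later sigaction() call on the platform in question.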

@hhyyrylainen
Author

Seems like not even adding code like:

#elif defined(BREAKPAD_ENABLED)

	signal(SIGSEGV, handle_crash);
	signal(SIGFPE, handle_crash);
	signal(SIGILL, handle_crash);

	initialize_breakpad(false);
#endif

works, even though that's the same as in debug mode. Note that the editor properly runs the crash handlers.

So there's definitely something in the Linux Mono runtime when running in release mode exported game that makes it override Godot's and also Breakpad's signal handlers. I doubt that even Godot's own crash handler was able to work in this mode, but then again it's always disabled in the release mode, so I guess no one has encountered this before.

It doesn't seem there's an easily tweakable place in gd_mono.cpp where this happens. So I might just have to put in a workaround in gd_mono.cpp (or some place higher up in the callstack), where I re-initialize Breakpad (on Linux) just after Mono has been loaded to fix the signals it messed up.

@hhyyrylainen
Author

Seems like I can't build the non-editor binary of 4.0 at all:

[ 73%] Linking Program        ==> bin/godot.linuxbsd.opt.64
/usr/bin/ld: modules/text_server_adv/libicu_builtin.linuxbsd.opt.64.a(stringtriebuilder.linuxbsd.opt.64.o): warning: relocation against `__cxa_pure_virtual' in read-only section `.rodata._ZTVN6icu_7017StringTrieBuilder15LinearMatchNodeE[_ZTVN6icu_7017StringTrieBuilder15LinearMatchNodeE]'
/usr/bin/ld: warning: creating DT_TEXTREL in a PDE
[ 73%] scons: done building targets.
[Time elapsed: 00:03:21.887]

so I can't currently test it at all, but I tested earlier that the editor does create a crash dump even on the 4.0 branch. And I just verified that it still does work.

@akien-mga akien-mga self-requested a review January 4, 2022 13:14
@hhyyrylainen
Author

Will this be reviewed again soon? I'm planning on releasing a game version that would have this in it in a few weeks. I'll use the 3.4 branch as a base but it would be a bit cleaner for me to make the custom branch if this was already merged into master and cherry-picked to the 3.x versions.

@mhilbrunner mhilbrunner self-requested a review January 24, 2022 13:03
@mhilbrunner
Member

mhilbrunner commented Jan 24, 2022

> Will this be reviewed again soon? I'm planning on releasing a game version that would have this in it in a few weeks. I'll use the 3.4 branch as a base but it would be a bit cleaner for me to make the custom branch if this was already merged into master and cherry-picked to the 3.x versions.

Perfectly understandable, but it is currently unlikely to be merged and cherry-picked into 3.x in the next few weeks as the core team is pretty busy with releasing Godot 4.0 alpha 1, and a change such as this will take time and multiple reviews/approvals by different people as it is pretty core. I'll try to review it soon, but you should assume this will take some time to get discussed/reviewed. (We also have quite some backlog of PRs to get down.)

I apologize for the inconvenience :) Hope your release goes well!

@hhyyrylainen
Author

hhyyrylainen commented Jan 24, 2022

Thanks.
I assumed as much after not getting a response for a few days. I already made the 3.4-based branch, which didn't have too many merge conflicts to solve: https://github.com/Revolutionary-Games/godot/tree/crash_dumper_3.4.2
It builds fine now that I found a workaround for godotengine/build-containers#101. It also looks like the breakpad dumper version I found doesn't work with that newer mingw version, which I still need to solve. But that isn't a problem with the code that is in this PR.

Edit: the mingw dumper seems to have issues with GCC 11 compiled binaries (so when the Godot build containers update to be based on Fedora 35 the issue will appear). I opened an issue for them about this: DaemonEngine/breakpad#9

@fire
Member

fire commented May 25, 2022

There seem to be some conflicts, but the GitHub interface doesn't show the exact list.

@hhyyrylainen
Author

It shows 4 files for me?

COPYRIGHT.txt
main/main.cpp
modules/mono/csharp_script.cpp
thirdparty/README.md 

I can fix the merge conflicts if this has a chance of getting merged once I do. I don't want to end up fixing merge conflicts over and over...

@fire
Member

fire commented May 26, 2022

Well, due to the lack of proposal consensus, I wanted to try it on my branch and see if it does what it says it does.

Wanted to ask for a rebase before trying to update the pr myself.

We have a lot of crashes in our game / game editor and I wanted to try this breakpad pr so I can have a more effective proposal for the core maintainers.

@hhyyrylainen
Author

I have now fixed the merge conflicts.

@fire
Member

fire commented May 28, 2022

Doing some initial review. The musl folder doesn't seem to be used, and maybe others aren't either.

@hhyyrylainen
Author

I won't say I 100% remember, but I'm at least somewhat sure there was some build error without that folder...

@hhyyrylainen
Author

I removed the musl folder and the one elf_reader file which seemed to be the only thing depending on it. Kind of looks like that was only necessary in the breakpad dumper binaries so the final game executable shouldn't need it.
Also I rebased this onto the latest master now.

@hhyyrylainen hhyyrylainen force-pushed the crash_dumper branch 2 times, most recently from d08ac3c to 8572550 August 9, 2022 10:13
Fixed compiling on Windows

Make crash dump printing work better on Windows

Tweaked Breakpad build

Fixed Breakpad build for Godot 4

Switched to WINDOWS_ENABLED

Removed some non buildable files listed in the files to build

Ran clang format

Fixed the other formatting issues detected by CI

Removed a comment and added clarifying comment on crash dump message

as to why it is printed twice on Windows

Make an ugly string conversion to make Windows build work

Tweaked the build configuration and formatted again

removed lss

Add lss properly

Reinitialize breakpad after mono initialization on Linux

otherwise the breakpad signal handlers are not active

Disable Windows crash handler on destruction similarly to Linux

Renamed breakpad_enabled to use_breakpad

Forgot to wrap one piece of code inside ifdef USE_BREAKPAD

Updated copyright years in the added files

Fix register types for breakpad

Fixed dir access

Removed musl and elf_reader

which was the only thing seemingly depending on it

Updated header guards

Removed the memdelete call
@hhyyrylainen
Author

With 4.0 out now, is there a chance that this could get merged at some point? If there's a chance I'll work through the merge conflicts and test everything again to check that this still works. I also now have a development mac mini so I could also try to get the mac side of things working if that means there's a better chance for this to be merged.

@mrTag
Contributor

mrTag commented Jun 6, 2023

Thank you @hhyyrylainen ! We added this PR to our game, Halls of Torment, shortly after our release, because players reported crashes and we had no way of tracking them down. We only activated it in the Windows version for now, so I can't speak for the other platforms, but it worked wonderfully! We added a simple Windows MessageBox that appears on a crash and explains how players can send us the crash dump. Players send us the dump files manually, and we have tracked down most crashes this way.

I think generating crash dumps is essential for a game engine, so this PR absolutely gets my stamp of approval! 😃

@fire
Member

fire commented Jun 6, 2023

4.1 is in feature freeze, but I will try my best to get support from the development team.

Since I want to try Breakpad, do you know how difficult it would be to maintain a branch against master, and whether it could be cherry-picked to 4.0?

@hhyyrylainen
Author

I suspect this is currently at least slightly broken for 4.x, as this was written for Mono. Other than updating the hooks a bit and testing that the dotnet runtime doesn't destroy the signal listeners, this shouldn't be too difficult to update.
If it does destroy the listeners, then a workaround similar to the one for Mono is needed. For Mono, re-registering the handlers after Mono was initialized worked fine.
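
The check-and-re-register idea could be sketched roughly as follows. This is not the code in this PR: crash_handler, runtime_handler, install_handler, current_handler and restore_if_replaced are hypothetical names, and the handlers are installed through the plain sa_handler field for brevity (Breakpad's real handlers may be registered differently). The idea is to query the current disposition after the runtime has initialized and reinstall only when it has changed.

```cpp
#include <cassert>
#include <csignal>
#include <signal.h>

using Handler = void (*)(int);

// Hypothetical stand-ins for the crash reporter's and the runtime's handlers.
void crash_handler(int) {}
void runtime_handler(int) {}

// Install `handler` for `sig` using sigaction().
bool install_handler(int sig, Handler handler) {
	struct sigaction action = {};
	sigemptyset(&action.sa_mask);
	action.sa_handler = handler;
	return sigaction(sig, &action, nullptr) == 0;
}

// Query the handler currently installed for `sig` without changing it.
Handler current_handler(int sig) {
	struct sigaction current = {};
	if (sigaction(sig, nullptr, &current) != 0) {
		return SIG_ERR;
	}
	return current.sa_handler;
}

// Re-install `expected` for `sig` if something else has replaced it.
// Returns true when a re-installation was necessary.
bool restore_if_replaced(int sig, Handler expected) {
	if (current_handler(sig) == expected) {
		return false;
	}
	install_handler(sig, expected);
	return true;
}
```

In an engine, something like restore_if_replaced(SIGSEGV, crash_handler) would be called right after the scripting runtime finishes initializing. The return value also doubles as a quick test of whether that runtime touched the handler at all.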

7 participants