Use breakpad + symbolic to generate and interpret minidump-format core dumps #4202

Open
siddontang opened this Issue Feb 13, 2019 · 5 comments

@siddontang
Contributor

siddontang commented Feb 13, 2019

Edit: We're going to try to integrate breakpad + symbolic to generate compact "minidumps" (via breakpad), and interpret them offline (via symbolic). Next step is to prototype breakpad and symbolic on a toy project to learn how to use them.

Feature Request

Is your feature request related to a problem? Please describe:

Sometimes TiKV hits a problem such as a segmentation fault and crashes immediately. Unfortunately, our official deployment through Ansible doesn't enable core dumps, because we worry that generating too many core dump files may exhaust disk space.

Even if we enable core dumps, the generated core files may be too large to send over the network, so we would have to debug them directly on the user's machine (which, of course, is not allowed in most users' environments).

Describe the feature you'd like:

Mostly we only want to know the panic backtrace. Instead of a full core file, we can use a minidump or just output the panic backtrace.

Teachability, Documentation, Adoption, Migration Strategy:

For minidumps, we can use https://github.com/google/breakpad; in Rust, we may try https://github.com/getsentry/symbolic.
Another way is to output the backtrace directly; see https://github.com/gby/libcrash and https://www.scribd.com/doc/3726406/Crash-N-Burn-Writing-Linux-application-fault-handlers.
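
As a rough illustration of the "output the backtrace directly" option, here is a minimal sketch that uses only the Rust standard library. It covers Rust panics only; a hard crash such as a segmentation fault would still need a signal handler along the lines of libcrash. The names `crash_report` and `install_panic_hook` are hypothetical and are not part of TiKV.

```rust
use std::backtrace::Backtrace;
use std::panic;

/// Builds the text to log for a panic: the message followed by a backtrace.
/// (Hypothetical helper, not an existing TiKV function.)
fn crash_report(message: &str) -> String {
    // force_capture collects the trace even when RUST_BACKTRACE is unset.
    let backtrace = Backtrace::force_capture();
    format!("panic: {message}\nbacktrace:\n{backtrace}")
}

/// Installs a process-wide panic hook that writes the report to stderr.
fn install_panic_hook() {
    panic::set_hook(Box::new(|info| {
        // The panic info implements Display (location plus message).
        eprintln!("{}", crash_report(&info.to_string()));
    }));
}

fn main() {
    install_panic_hook();
    // Simulate a failure; the hook runs before unwinding starts.
    let result = panic::catch_unwind(|| {
        panic!("simulated failure");
    });
    assert!(result.is_err());
}
```

Frames are only resolved to function names when the binary still carries symbol information, so this approach pulls against stripping the release binary.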

/cc @ethercflow

@brson

Contributor

brson commented Feb 14, 2019

The OP mentions panics, but this is really about hard crashes. That said, if we distributed builds with panic = abort, we could treat crashes and panics the same way, with minidumps.

It seems like using breakpad + symbolic should work well for TiKV.

Stripping debuginfo for breakpad would mostly fix #4107, and help with #4150.

This seems promising.

@brson brson added this to To do in Improve compile times via automation Feb 14, 2019

@siddontang

Contributor Author

siddontang commented Feb 14, 2019

@brson

Do you think it is fine to let contributors help us? Of course, we can mentor them.

@brson

Contributor

brson commented Feb 16, 2019

@siddontang yes I think this could be an interesting issue for a contributor. It needs some more description of the next steps though - I don't think the original description is quite enough to get started. Feel free to write more about specifically what we need to do next, or else I'll come back and think about it later.

@siddontang

Contributor Author

siddontang commented Feb 17, 2019

@brson

I think we can try to use breakpad or symbolic directly. I browsed the breakpad source code and found that it already registers a signal handler to generate a minidump directly, but we should verify whether it works in Rust.

So I think we should first try these in a Rust project, then introduce them to TiKV and make sure they work there too.

After that, we can use breakpad to extract the debug symbols and save them to a separate file (we don't need to include the symbol file in the release tarball), which reduces the binary size. When users hit a crash, they only need to send us the minidump files, and we can debug them on our local machines with the symbol file.
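
To make the "separate symbol file" step more concrete: Breakpad's `dump_syms` tool writes a text `.sym` file whose first line is a `MODULE <os> <arch> <debug id> <name>` record, and `minidump_stackwalk` looks symbols up under `<root>/<name>/<debug id>/<name>.sym`. The sketch below computes that path from the first line, assuming a Linux binary. The helper names and the debug id in `main` are made up for illustration.

```rust
use std::path::PathBuf;

/// Fields of the `MODULE` record that opens a Breakpad `.sym` file:
/// `MODULE <os> <arch> <debug id> <module name>`.
#[derive(Debug, PartialEq)]
struct ModuleRecord {
    os: String,
    arch: String,
    debug_id: String,
    name: String,
}

/// Parses the first line of a `.sym` file; returns None for any other record.
fn parse_module_record(line: &str) -> Option<ModuleRecord> {
    // The module name comes last and may contain spaces, so split into
    // at most five parts.
    let mut parts = line.trim_end().splitn(5, ' ');
    if parts.next()? != "MODULE" {
        return None;
    }
    let os = parts.next()?.to_string();
    let arch = parts.next()?.to_string();
    let debug_id = parts.next()?.to_string();
    let name = parts.next()?.to_string();
    if name.is_empty() {
        return None;
    }
    Some(ModuleRecord { os, arch, debug_id, name })
}

/// Where the symbol file should live so the stack walker can find it.
fn symbol_store_path(root: &str, record: &ModuleRecord) -> PathBuf {
    PathBuf::from(root)
        .join(&record.name)
        .join(&record.debug_id)
        .join(format!("{}.sym", record.name))
}

fn main() {
    // Hypothetical first line of a symbol file; the id is invented.
    let first_line = "MODULE Linux x86_64 6EDC6ACDB282125843FD59DA9C81BD830 tikv-server";
    let record = parse_module_record(first_line).expect("not a MODULE record");
    println!("{} / {}", record.os, record.arch);
    println!("{}", symbol_store_path("symbols", &record).display());
}
```

Because the debug id is part of the path, symbol files for many releases can sit side by side in one store, and a minidump from any of them resolves against the matching build.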

@brson

Contributor

brson commented Feb 22, 2019

Thanks @siddontang. It sounds like the next step is to prototype integrating breakpad and symbolic into a toy rust project.

@siddontang do you have time to mentor? I understand if not; we can look for somebody else.

@brson brson changed the title from "consider generating minidump or outputting backtrace when receives abort signal" to "Use breakpad + symbolic to generate and interpret minidump-format core dumps" Feb 22, 2019
