Introduce Crash Monitoring #22

Open
2 tasks
Yoric opened this issue Feb 5, 2016 · 5 comments

Comments

@Yoric

Yoric commented Feb 5, 2016

The Foxbox is supposed to run for several years. So we need to survive crashes – and fix them.

This means:

  • having a watchdog that can monitor the fxbox process(es) and relaunch them as needed;
  • having a crash reporter to upload crash stacks and metadata.

This bug will probably be split into sub-bugs once we have a clearer idea of how to do both.
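
As a rough illustration of the watchdog half of the list above, here is a minimal sketch of a supervisor loop using only the Rust standard library. The function name `supervise`, the restart limit, and the fixed backoff are assumptions made for this example, not something decided in this issue.

```rust
use std::io;
use std::process::Command;
use std::thread;
use std::time::Duration;

/// Runs `program` and relaunches it whenever it exits abnormally.
/// Returns the number of restarts once the child exits cleanly, or an
/// error once `max_restarts` relaunches have all failed.
/// (Hypothetical helper: name, limit and backoff policy are assumptions.)
fn supervise(
    program: &str,
    args: &[&str],
    max_restarts: u32,
    backoff: Duration,
) -> io::Result<u32> {
    let mut restarts = 0;
    loop {
        // Blocks until the child terminates.
        let status = Command::new(program).args(args).status()?;
        if status.success() {
            return Ok(restarts);
        }
        // On Unix, `code()` is `None` when the child was killed by a signal.
        eprintln!("child exited abnormally (exit code: {:?})", status.code());
        if restarts >= max_restarts {
            return Err(io::Error::new(
                io::ErrorKind::Other,
                "restart limit reached",
            ));
        }
        restarts += 1;
        thread::sleep(backoff);
    }
}
```

A caller might invoke this as `supervise("./foxbox", &[], 5, Duration::from_secs(1))`, where the binary path is a placeholder. A real watchdog would probably also want to hand the crash data to the reporter before relaunching.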

@Yoric
Author

Yoric commented Feb 5, 2016

Apparently, we should be able to reuse the Crash Monitor.

Filing a bug to be able to upload data to Socorro.

@Yoric
Author

Yoric commented Feb 5, 2016

Extracts from IRC:

7:10:21 PM - ted: https://chromium.googlesource.com/breakpad/breakpad/+/master/docs/linux_starter_guide.md
7:10:32 PM - ted: Yoric: if you're only targeting linux then it's pretty simple
7:11:07 PM - ted: use ExceptionHandler to generate dumps, do something in the callback to invoke another process to submit them (and maybe restart yourself)
7:11:26 PM - ted: then also run dump_syms on your binaries as part of your build and upload the results to the symbol server
7:22:35 PM - ted: well i don't really have an answer for rust

7:22:52 PM - ted: but you could write a little c++ file that did that and exposed a tiny C API around it
7:23:08 PM - ted: foo* set_exception_handler()
7:23:17 PM - ted: unset_exception_handler(foo*)
7:23:40 PM - ted: i feel like someone in #rust or #servo was looking into integrating breakpad
7:23:47 PM - ted: but i don't know where that got to
7:24:09 PM - ted: also the more basic question of what crash reporting means in rust code is not something i understand terribly well, tbh
7:24:21 PM - ted: handling OS-level crashes still makes sense
7:24:29 PM - ted: but dealing with unhandled panic! etc
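
For the unhandled `panic!` side mentioned at the end of the log, one possible approach is a panic hook. This is only a sketch: `install_panic_reporter` and the in-memory `sink` are hypothetical names, and a real reporter would presumably write to disk or hand off to another process. It only sees Rust panics; OS-level crashes such as a segfault would still need something like the Breakpad handler discussed above.

```rust
use std::panic;
use std::sync::{Arc, Mutex};

/// Installs a process-wide panic hook that records a one-line report
/// for every panic into `sink`. (Hypothetical helper for illustration.)
fn install_panic_reporter(sink: Arc<Mutex<Vec<String>>>) {
    panic::set_hook(Box::new(move |info| {
        // Panic payloads are usually `&str` or `String`.
        let message = if let Some(s) = info.payload().downcast_ref::<&str>() {
            s.to_string()
        } else if let Some(s) = info.payload().downcast_ref::<String>() {
            s.clone()
        } else {
            "<non-string panic payload>".to_string()
        };
        // Source location of the panic, when available.
        let location = info
            .location()
            .map(|l| format!("{}:{}", l.file(), l.line()))
            .unwrap_or_else(|| "<unknown>".to_string());
        // Skip recording if the mutex is poisoned rather than panic again.
        if let Ok(mut reports) = sink.lock() {
            reports.push(format!("panic at {}: {}", location, message));
        }
    }));
}
```

The hook runs before unwinding starts, so it also fires for panics that are later caught or that only kill a worker thread.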

@samgiles
Contributor

samgiles commented Feb 7, 2016

We'd need some sort of process manager/API within the FoxBox process too.

A large part of the tunnel_controller is managing the process running the HTTP tunnel to the cloud service. I'm wondering if there's some overlap here - will each device_adapter be a process as well?

@Yoric
Author

Yoric commented Feb 7, 2016

> will each device_adapter be a process as well?

This would make sense. It would make it easier to add/upgrade drivers without having to reboot the entire FoxBox feature.

@Yoric
Author

Yoric commented Feb 8, 2016

> Filing a bug to be able to upload data to Socorro.

OK, we apparently don't need any authorization to upload to Socorro. We just need to decide (internally) on a product ID.
