epoxy_client reboot retry logic is broken #98

Closed
nkinkade opened this issue Nov 17, 2020 · 1 comment · Fixed by #99

@nkinkade

epoxy_client is supposed to retry contacting ePoxy and, if it still cannot reach it after some time, reboot the machine. This reboot logic is broken, apparently because of something to do with sysrq not being enabled by default in Ubuntu:

https://github.com/m-lab/epoxy/blob/master/cmd/epoxy_client/main.go#L93

@nkinkade

On investigation, Ubuntu is in fact configured to support SYSRQ commands. I forced mlab1-lga0t to get stuck in stage1 (by deleting its ePoxy GCD entity), then manually booted the machine from the USB. After logging in, I saw that epoxy_client was failing to get a reply from ePoxy. I then SSH'd into the machine and ran:

root@mlab1-lga0t:~# cat /proc/sys/kernel/sysrq
176
root@mlab1-lga0t:~# echo 1 > /proc/sys/kernel/sysrq
root@mlab1-lga0t:~# echo b > /proc/sysrq-trigger

The machine promptly rebooted. The problem must be somewhere else.
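
For context on the `176` shown above: `/proc/sys/kernel/sysrq` is a bitmask, and 176 = 128 + 32 + 16, which per the Linux kernel's sysrq documentation allows reboot/poweroff, remount read-only, and sync. As far as I understand that documentation, the mask only restricts keyboard-invoked SysRq, while root writes to `/proc/sysrq-trigger` are always permitted. A small sketch of the decoding follows. `decodeSysrq` is a hypothetical helper for illustration, not part of epoxy_client, and the bit names are paraphrased from the kernel docs.

```go
package main

import "fmt"

// sysrqFlags lists the bitmask values of /proc/sys/kernel/sysrq and the
// function group each one enables (names paraphrased from the kernel docs).
var sysrqFlags = []struct {
	bit  int
	name string
}{
	{2, "console logging level"},
	{4, "keyboard control (SAK, unraw)"},
	{8, "debugging dumps of processes"},
	{16, "sync"},
	{32, "remount read-only"},
	{64, "signalling of processes"},
	{128, "reboot/poweroff"},
	{256, "nicing of all RT tasks"},
}

// decodeSysrq (hypothetical helper) returns the function groups enabled by v.
// A value of 1 is special and means "all functions"; 0 yields an empty result.
func decodeSysrq(v int) []string {
	if v == 1 {
		return []string{"all functions"}
	}
	var enabled []string
	for _, f := range sysrqFlags {
		if v&f.bit != 0 {
			enabled = append(enabled, f.name)
		}
	}
	return enabled
}

func main() {
	// 176 is the value reported on mlab1-lga0t in the transcript above.
	fmt.Println(decodeSysrq(176))
}
```

Since bit 128 is set in 176, reboot via SysRq was already permitted before the `echo 1`, which is consistent with the conclusion that the problem lies elsewhere.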
