New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add OnRestartFailed to reload, health and prometheus plugins #2659
Labels
Comments
miekg
added a commit
that referenced
this issue
May 4, 2019
Add OnReStartFailed which makes the health plugin stay up if the Corefile is corrupt and we revert to the previous version. Also needs a fix for the channel handling See #2659 Testing it will log the following when restarting with a corrupted Corefile ~~~ 2019-05-04T18:01:59.431Z [INFO] linux/amd64, go1.12.4, CoreDNS-1.5.0 linux/amd64, go1.12.4, [INFO] SIGUSR1: Reloading [INFO] Reloading [ERROR] Restart failed: Corefile:5 - Error during parsing: Unknown directive 'bdhfhdhj' [ERROR] SIGUSR1: starting with listener file descriptors: Corefile:5 - Error during parsing: Unknown directive 'bdhfhdhj' ~~~ After which the curl still works. This also needed a change to reset the channel used for the metrics go-routine which gets closed on shutdown, otherwise you'll see: ~~~ ^C[INFO] SIGINT: Shutting down panic: close of closed channel goroutine 90 [running]: github.com/coredns/coredns/plugin/health.(*health).OnFinalShutdown(0xc000089bc0, 0xc000063d88, 0x4afe6d) ~~~ Signed-off-by: Miek Gieben <miek@miek.nl>
ready and prometheus are next. |
yongtang
pushed a commit
that referenced
this issue
May 4, 2019
Add OnReStartFailed which makes the health plugin stay up if the Corefile is corrupt and we revert to the previous version. Also needs a fix for the channel handling See #2659 Testing it will log the following when restarting with a corrupted Corefile ~~~ 2019-05-04T18:01:59.431Z [INFO] linux/amd64, go1.12.4, CoreDNS-1.5.0 linux/amd64, go1.12.4, [INFO] SIGUSR1: Reloading [INFO] Reloading [ERROR] Restart failed: Corefile:5 - Error during parsing: Unknown directive 'bdhfhdhj' [ERROR] SIGUSR1: starting with listener file descriptors: Corefile:5 - Error during parsing: Unknown directive 'bdhfhdhj' ~~~ After which the curl still works. This also needed a change to reset the channel used for the metrics go-routine which gets closed on shutdown, otherwise you'll see: ~~~ ^C[INFO] SIGINT: Shutting down panic: close of closed channel goroutine 90 [running]: github.com/coredns/coredns/plugin/health.(*health).OnFinalShutdown(0xc000089bc0, 0xc000063d88, 0x4afe6d) ~~~ Signed-off-by: Miek Gieben <miek@miek.nl>
miekg
added a commit
that referenced
this issue
May 5, 2019
Add OnRestartFailed to the ready plugin and some various cleanups. Document slightly better how things are supposed to work with multiple `ready`'s in the multiple Server Blocks. All manually tested with this Corefile: ~~~ . { log ready } example.org { log chaos ready } ~~~ And then `kill -SIGUSR1` and curling the ready endpoint. This works well, the FailedReload is triggered by adding a syntax error in the Corefile. See #2659 Signed-off-by: Miek Gieben <miek@miek.nl>
miekg
added a commit
that referenced
this issue
May 5, 2019
Fix metrics endpoint on a failed reload, follows the same lines as the previous PRs, see for e.g. 076b8d4. Test with a Corefile with 2 server blocks and metrics enabled and then introducing a syntax error: ~~~ [ERROR] Restart failed: Corefile:5 - Error during parsing: Unknown directive 'jfkdjk' [ERROR] SIGUSR1: starting with listener file descriptors: Corefile:5 - Error during parsing: Unknown directive 'jfkdjk' ~~~ And then curl-ing the metrics endpoint. See #2659 and as this is the last one. Fixes: #2659 Getting this all right turns out to be tricky, also it's not easy testable which is something I should fix. Signed-off-by: Miek Gieben <miek@miek.nl>
yongtang
pushed a commit
that referenced
this issue
May 13, 2019
Fix metrics endpoint on a failed reload, follows the same lines as the previous PRs, see for e.g. 076b8d4. Test with a Corefile with 2 server blocks and metrics enabled and then introducing a syntax error: ~~~ [ERROR] Restart failed: Corefile:5 - Error during parsing: Unknown directive 'jfkdjk' [ERROR] SIGUSR1: starting with listener file descriptors: Corefile:5 - Error during parsing: Unknown directive 'jfkdjk' ~~~ And then curl-ing the metrics endpoint. See #2659 and as this is the last one. Fixes: #2659 Getting this all right turns out to be tricky, also it's not easy testable which is something I should fix. Signed-off-by: Miek Gieben <miek@miek.nl>
miekg
added a commit
that referenced
this issue
May 23, 2019
Add OnRestartFailed to the ready plugin and some various cleanups. Document slightly better how things are supposed to work with multiple `ready`'s in the multiple Server Blocks. All manually tested with this Corefile: ~~~ . { log ready } example.org { log chaos ready } ~~~ And then `kill -SIGUSR1` and curling the ready endpoint. This works well, the FailedReload is triggered by adding a syntax error in the Corefile. See #2659 Signed-off-by: Miek Gieben <miek@miek.nl>
miekg
added a commit
that referenced
this issue
Jun 9, 2019
Add OnRestartFailed to the ready plugin and some various cleanups. Document slightly better how things are supposed to work with multiple `ready`'s in the multiple Server Blocks. All manually tested with this Corefile: ~~~ . { log ready } example.org { log chaos ready } ~~~ And then `kill -SIGUSR1` and curling the ready endpoint. This works well, the FailedReload is triggered by adding a syntax error in the Corefile. See #2659 Signed-off-by: Miek Gieben <miek@miek.nl>
dna2github
pushed a commit
to dna2fork/coredns
that referenced
this issue
Jul 19, 2019
Add OnReStartFailed which makes the health plugin stay up if the Corefile is corrupt and we revert to the previous version. Also needs a fix for the channel handling See coredns#2659 Testing it will log the following when restarting with a corrupted Corefile ~~~ 2019-05-04T18:01:59.431Z [INFO] linux/amd64, go1.12.4, CoreDNS-1.5.0 linux/amd64, go1.12.4, [INFO] SIGUSR1: Reloading [INFO] Reloading [ERROR] Restart failed: Corefile:5 - Error during parsing: Unknown directive 'bdhfhdhj' [ERROR] SIGUSR1: starting with listener file descriptors: Corefile:5 - Error during parsing: Unknown directive 'bdhfhdhj' ~~~ After which the curl still works. This also needed a change to reset the channel used for the metrics go-routine which gets closed on shutdown, otherwise you'll see: ~~~ ^C[INFO] SIGINT: Shutting down panic: close of closed channel goroutine 90 [running]: github.com/coredns/coredns/plugin/health.(*health).OnFinalShutdown(0xc000089bc0, 0xc000063d88, 0x4afe6d) ~~~ Signed-off-by: Miek Gieben <miek@miek.nl>
dna2github
pushed a commit
to dna2fork/coredns
that referenced
this issue
Jul 19, 2019
Fix metrics endpoint on a failed reload, follows the same lines as the previous PRs, see for e.g. 076b8d4. Test with a Corefile with 2 server blocks and metrics enabled and then introducing a syntax error: ~~~ [ERROR] Restart failed: Corefile:5 - Error during parsing: Unknown directive 'jfkdjk' [ERROR] SIGUSR1: starting with listener file descriptors: Corefile:5 - Error during parsing: Unknown directive 'jfkdjk' ~~~ And then curl-ing the metrics endpoint. See coredns#2659 and as this is the last one. Fixes: coredns#2659 Getting this all right turns out to be tricky, also it's not easy testable which is something I should fix. Signed-off-by: Miek Gieben <miek@miek.nl>
dna2github
pushed a commit
to dna2fork/coredns
that referenced
this issue
Jul 19, 2019
Add OnRestartFailed to the ready plugin and some various cleanups. Document slightly better how things are supposed to work with multiple `ready`'s in the multiple Server Blocks. All manually tested with this Corefile: ~~~ . { log ready } example.org { log chaos ready } ~~~ And then `kill -SIGUSR1` and curling the ready endpoint. This works well, the FailedReload is triggered by adding a syntax error in the Corefile. See coredns#2659 Signed-off-by: Miek Gieben <miek@miek.nl>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
As @fturib wrote:
Thanks to the new "OnRestartFailed" event that was introduced few months ago in Caddy.
We can handle the failed restart (when the Corefile is invalid).
i guess we could have:
The text was updated successfully, but these errors were encountered: