Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

plugin/health: add OnRestartFailed #2812

Merged
merged 1 commit into from May 4, 2019
Merged

plugin/health: add OnRestartFailed #2812

merged 1 commit into from May 4, 2019

Conversation

miekg
Copy link
Member

@miekg miekg commented May 4, 2019

Add OnReStartFailed which makes the health plugin stay up if the
Corefile is corrupt and we revert to the previous version.

Also needs a fix for the channel handling

See #2659

Testing it will log the following when restarting with a corrupted
Corefile

2019-05-04T18:01:59.431Z [INFO] linux/amd64, go1.12.4,
CoreDNS-1.5.0
linux/amd64, go1.12.4,
[INFO] SIGUSR1: Reloading
[INFO] Reloading
[ERROR] Restart failed: Corefile:5 - Error during parsing: Unknown directive 'bdhfhdhj'
[ERROR] SIGUSR1: starting with listener file descriptors: Corefile:5 - Error during parsing: Unknown directive 'bdhfhdhj'

After which the curl still works.

This also needed a change to reset the channel used for the metrics
go-routine which gets closed on shutdown, otherwise you'll see:

^C[INFO] SIGINT: Shutting down
panic: close of closed channel

goroutine 90 [running]:
github.com/coredns/coredns/plugin/health.(*health).OnFinalShutdown(0xc000089bc0, 0xc000063d88, 0x4afe6d)

Add OnReStartFailed which makes the health plugin stay up if the
Corefile is corrupt and we revert to the previous version.

Also needs a fix for the channel handling

See #2659

Testing it will log the following when restarting with a corrupted
Corefile

~~~
2019-05-04T18:01:59.431Z [INFO] linux/amd64, go1.12.4,
CoreDNS-1.5.0
linux/amd64, go1.12.4,
[INFO] SIGUSR1: Reloading
[INFO] Reloading
[ERROR] Restart failed: Corefile:5 - Error during parsing: Unknown directive 'bdhfhdhj'
[ERROR] SIGUSR1: starting with listener file descriptors: Corefile:5 - Error during parsing: Unknown directive 'bdhfhdhj'
~~~

After which the curl still works.

This also needed a change to reset the channel used for the metrics
go-routine which gets closed on shutdown, otherwise you'll see:

~~~
^C[INFO] SIGINT: Shutting down
panic: close of closed channel

goroutine 90 [running]:
github.com/coredns/coredns/plugin/health.(*health).OnFinalShutdown(0xc000089bc0, 0xc000063d88, 0x4afe6d)
~~~

Signed-off-by: Miek Gieben <miek@miek.nl>
@corbot corbot bot requested a review from jameshartig May 4, 2019 18:20
@corbot
Copy link

corbot bot commented May 4, 2019

Thank you for your contribution. I've just checked the OWNERS files to find a suitable reviewer. This search was successful and I've asked fastest963 (via plugin/health/OWNERS) for a review.

If you have questions or suggestions for this bot, please file an issue against the miekg/dreck repository.

The bot understands the commands that are listed here.

@codecov-io
Copy link

Codecov Report

Merging #2812 into master will decrease coverage by 0.02%.
The diff coverage is 33.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2812      +/-   ##
==========================================
- Coverage   55.31%   55.29%   -0.03%     
==========================================
  Files         203      203              
  Lines       10228    10229       +1     
==========================================
- Hits         5658     5656       -2     
- Misses       4153     4155       +2     
- Partials      417      418       +1
Impacted Files Coverage Δ
plugin/health/setup.go 63.04% <0%> (-1.41%) ⬇️
plugin/health/health.go 80% <100%> (+3.33%) ⬆️
plugin/route53/route53.go 84.28% <0%> (-2.15%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e178291...7f084fc. Read the comment docs.

@yongtang yongtang merged commit 076b8d4 into master May 4, 2019
@corbot corbot bot deleted the restart-health branch May 4, 2019 20:06
dna2github pushed a commit to dna2fork/coredns that referenced this pull request Jul 19, 2019
Add OnReStartFailed which makes the health plugin stay up if the
Corefile is corrupt and we revert to the previous version.

Also needs a fix for the channel handling

See coredns#2659

Testing it will log the following when restarting with a corrupted
Corefile

~~~
2019-05-04T18:01:59.431Z [INFO] linux/amd64, go1.12.4,
CoreDNS-1.5.0
linux/amd64, go1.12.4,
[INFO] SIGUSR1: Reloading
[INFO] Reloading
[ERROR] Restart failed: Corefile:5 - Error during parsing: Unknown directive 'bdhfhdhj'
[ERROR] SIGUSR1: starting with listener file descriptors: Corefile:5 - Error during parsing: Unknown directive 'bdhfhdhj'
~~~

After which the curl still works.

This also needed a change to reset the channel used for the metrics
go-routine which gets closed on shutdown, otherwise you'll see:

~~~
^C[INFO] SIGINT: Shutting down
panic: close of closed channel

goroutine 90 [running]:
github.com/coredns/coredns/plugin/health.(*health).OnFinalShutdown(0xc000089bc0, 0xc000063d88, 0x4afe6d)
~~~

Signed-off-by: Miek Gieben <miek@miek.nl>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants