Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

register: clean-up fabio service entries in Consul on dirty exit #664

Merged
merged 5 commits into from
Jul 8, 2019

Conversation

pires
Copy link
Contributor

@pires pires commented Jun 13, 2019

This is achieved by:

  • Adding a TTL check to any services fabio may register into Consul, including its own;
  • Repurpose the goroutine (that exists per service that fabio registers) that periodically guarantees the service is registered in Consul to also reset the respective TTL check clock;
  • Set a short deadline in Consul to remove the service in case the TTL check is critical after said deadline is expired (time is non-deterministic as fabio doesn't control when the Consul reaper runs);

This PR also fixes a few nits, such as typos, and adds missing documentation for registry.consul.register.deregisterCriticalServiceAfter option.

Fixes #663

@CLAassistant
Copy link

CLAassistant commented Jun 13, 2019

CLA assistant check
All committers have signed the CLA.

@aaronhurt
Copy link
Member

@pires Thank you for putting this together. It looks quite clean and doesn't add any additional go-routines. I think my concerns from the original issue were overstated after looking at the code.

Thank you again. LGTM!

Copy link
Member

@pschultz pschultz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good on the surface apart from a few minor things. I haven't had the time tried it though.

Thanks for the contribution.

registry/consul/register.go Outdated Show resolved Hide resolved
registry/consul/register.go Outdated Show resolved Hide resolved
registry/consul/register.go Outdated Show resolved Hide resolved
@pires
Copy link
Contributor Author

pires commented Jun 14, 2019

@pschultz please, don't approve just yet. As I was adding changes to the CHANGELOG.md, I found a bug on which I'm working on right now. Nevermind, didn't sleep much. Everything works as expected :)

pires added a commit to pires/fabio that referenced this pull request Jun 14, 2019
Signed-off-by: Pires <pjpires@gmail.com>
@pires
Copy link
Contributor Author

pires commented Jun 14, 2019

@leprechau @pschultz added changes to CHANGELOG.md. Please, re-review.

@pires
Copy link
Contributor Author

pires commented Jun 18, 2019

@leprechau @pschultz sorry for disturbing but can I expect this to have a review soon and eventually be merged or are there any more concerns from you or simply the lack of availability?

@aaronhurt
Copy link
Member

@pires This LGTM and thank you for addressing the concerns and updating the change log. I'll defer to @pschultz for his approval on the changes he requested.

@pires
Copy link
Contributor Author

pires commented Jun 18, 2019

Thank you, @leprechau!

@pschultz
Copy link
Member

I would still like to try this out and see it in action. I can't say yet if I'll manage that today.

Copy link
Member

@pschultz pschultz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I've tested this out now and it seems to be working well apart from a small delay after first registering a service (see inline comments).

Unless I'm missing something, the deregisterCriticalServiceAfter option never did much; as long as fabio is running it'll re-register the services even after they have been removed by Consul. It did cause the service to be deregistered when fabio crashed, but with this change the TTL's DeregisterCriticalServiceAfter setting trumps the HTTP check.

I think we should deprecate the deregisterCriticalServiceAfter option now and remove it in 1.5.13 or 1.6.0. Thoughts, @leprechau, @magiconair?

CHANGELOG.md Outdated Show resolved Hide resolved
registry/consul/register.go Outdated Show resolved Hide resolved
registry/consul/register.go Outdated Show resolved Hide resolved
registry/consul/register.go Outdated Show resolved Hide resolved
pires added a commit to pires/fabio that referenced this pull request Jun 19, 2019
Signed-off-by: Pires <pjpires@gmail.com>
Signed-off-by: Pires <pjpires@gmail.com>
This is achieved by:
- Adding a TTL check to any services fabio may register into Consul, including
  its own;
- Repurpose the goroutine (that exists per service that fabio registers) that
  periodically guarantees the service is registered in Consul to also reset the
  respective TTL check clock;
- Set a short deadline in Consul to remove the service in case the TTL check is
  `critical` after said deadline is expired (time is non-deterministic as fabio
  doesn't control when the Consul reaper runs);

Signed-off-by: Pires <pjpires@gmail.com>
Signed-off-by: Pires <pjpires@gmail.com>
@pires
Copy link
Contributor Author

pires commented Jun 20, 2019

@pschultz all your feedback has been addressed and I have reworked the commit log to reduce the number of commits. I'll be awaiting your feedback.

Copy link
Member

@pschultz pschultz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now. Thanks a lot!

Just waiting for feedback on deprecating the deregisterCriticalServiceAfter option.

@pires
Copy link
Contributor Author

pires commented Jun 20, 2019

@pschultz I suggest you take it as a separate issue to be addressed in a separate PR. Here, I just wanted to add the missing documentation for something that already exists.

And if this wasn't clear before, the option exposed in the configuration has no impact in the TTL implementation.

@pschultz
Copy link
Member

The option exposed in the configuration has no impact in the TTL implementation.

But the TTL check affects the deregisterCriticalServiceAfter option. Before, the service would be deregistered 90 minutes after fabio exits at the latest. Now it takes at bit more than one minute because of the TTL check's configuration. Essentially, the deregisterCriticalServiceAfter option is pointless now. We either have to deprecate it (which is what I prefer), or apply it to the TTL check also.

@pires
Copy link
Contributor Author

pires commented Jun 20, 2019

Not necessarily. For instance, let's assume someone set-up a firewall rule that blocks the fabio port used by the HTTP check. In this case, Consul wouldn't be able to pass the HTTP check but the TTL check would still pass. Therefore, TTL deregisterCriticalServiceAfter would not kick-in but the HTTP deregisterCriticalServiceAfter would, eventually.

@pschultz
Copy link
Member

We already established that deregisterCriticalServiceAfter doesn't do anything as long as fabio is running, because it will (re-)register the service periodically. And deregisterCriticalServiceAfter cannot possibly be set to a value less than the TTL check's because that's the minium of one minute. So deregisterCriticalServiceAfter is effectively ignored.

@pires
Copy link
Contributor Author

pires commented Jun 20, 2019

I'm not arguing with that at all. I should have made my remarks clear and I'm sorry for failing to do so. I just meant that both checks deregisterCriticalServiceAfter are not related and have their own lifecycle within Consul.

With that said, yes, TTL as is could deprecate said option and that is my preferred take here. If you find quorum to have a minor release out (due to breaking change) I'll be glad to do it as part of this PR.

@pires
Copy link
Contributor Author

pires commented Jun 26, 2019

@pschultz given the lack of feedback, can the two of us find agreement on how to proceed?

@pschultz
Copy link
Member

I will not make a unilateral decision, sorry. Please be patient.

@pires
Copy link
Contributor Author

pires commented Jun 26, 2019

It's OK. Just trying to prevent a situation where this PR stays in the limbo until it is eventually closed (not merged) after all the work we (me and the reviewers) put so far. That would be a shame. Thank you!

@aaronhurt
Copy link
Member

Sorry folks, been swamped lately. I'll give this another thorough read through today.

@aaronhurt
Copy link
Member

@pschultz @pires Thank you again for the patience. I read through the code and the above conversations and agree with the decisions you have both made.

There is a small conflict on the CHANGELOG that should be resolvable with a simple rebase. Once that's done I say we get this merged.

@aaronhurt aaronhurt merged commit ea3a365 into fabiolb:master Jul 8, 2019
@aaronhurt
Copy link
Member

It was a small cleanup, I took care of it. Thanks again everyone and sorry for the long delay.

@pires pires deleted the pires/register_svc_ttl branch July 8, 2019 23:02
@pires
Copy link
Contributor Author

pires commented Jul 8, 2019

Thank you all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fabio service entries may stay in Consul on dirty exit
5 participants