-
-
Notifications
You must be signed in to change notification settings - Fork 5.8k
Speeding up consul catalog health change detection #1694
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speeding up consul catalog health change detection #1694
Conversation
keis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool stuff :)
provider/consul/consul_catalog.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than doing a merge make the channel a parameter of both watch* and send the same one to both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, good catch!
Corrected.
provider/consul/consul_catalog.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By only looking at the service name wouldn't that mean that health changes in 1 node in a bigger pool would not trigger a message. Or maybe I'm confused what ServiceName is here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just filtering only services with real names without consul agents. But now I adjusted the code to work only with Catalog Services response. Should be more cleaner approach now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order in Go (for stdlib testing) is got first, want second; both in the comparison expression and the error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the order. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order in Go (for stdlib testing) is got first, want second; both in the comparison expression and the error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order was changed.
provider/consul/consul_catalog.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log.Errorf("Failed to list services: %s", err)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
provider/consul/consul_catalog.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log.Errorf("Failed to list services: %s", err)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done it in appropriate way.
emilevauge
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great @vholovko !
I left few comments but overall, seems good :)
provider/consul/consul_catalog.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we add a else after this block ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing here any functionality that can be placed in the else block.
This branch statement was copied from original watchServices function and there wasn't any else block for it.
From the source it is clear that there are two places when nil for data is possible
https://github.com/hashicorp/consul/blob/master/api/catalog.go#L140
https://github.com/hashicorp/consul/blob/master/api/catalog.go#L150
it is when we get an error which is handled at earlier phase in the code above.
Thus it seems that this if statement is just a precaution in case json.NewDecoder can bring nil without notifying us with an error (the actual JSON null in the consul response).
I doubt consul response body will be null within 200 OK GET response.
Even in case there are not services at all the JSON response should include the empty map and this situation is handled already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for you detailed explanation 👍
provider/consul/consul_catalog.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment, shouldn't we add a else block ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think anything should be placed in the 'else' block. The explanation is given in the earlier comment.
emilevauge
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @vholovko
LGTM
|
@vholovko could you squash your commits? |
|
@ldez |
|
@vholovko No problem 😃 You are in a specific case, because you have merge In the following commands, I suppose your git local repository setup is: $ git remote -v
origin git@github.com:vholovko/traefik.git (fetch)
origin git@github.com:vholovko/traefik.git (push)
upstream git@github.com:containous/traefik.git (fetch)
upstream git@github.com:containous/traefik.git (push)The commands: # Update your repo
git fetch --all
# Delete your local branch
git branch -D consul_catalog_hot_detect_newly_added_containers
# Create a branch with the same name as the previous.
# This branch must be created on the HEAD of the containous/traefik master branch.
git checkout -b consul_catalog_hot_detect_newly_added_containers upstream/master
# Squash all your branch in this new branch
git merge --squash origin/consul_catalog_hot_detect_newly_added_containers
# Create a new commit
git commit -m "Speeding up health change detection by separating it from catalog services check."
# Force push the branch.
git push --force-with-lease origin consul_catalog_hot_detect_newly_added_containersNext time, when you want to sync your branch with the master, use |
336f2d4 to
8b909a9
Compare
|
thanks @ldez , your hints were very helpful |
8b909a9 to
2fe3ce8
Compare
ldez
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks 👍
LGTM
Implemented long polling for services and for the health in the separate co-routines and react on either trigger independently by using same channel for both.
Basically two things improved:
Detailed description: #1107 (comment)
Improvement for #1107
Relates to #1514