Add support for readiness checks. #1883
78a2c7f to
da238f9
Compare
provider/marathon/readiness.go
Outdated
Note to reviewers: Ideally, I'd like to support logging in "trace level" more generally (i.e., without having to pass the trace flag into a particular function that needs this information).
I'm open to suggestions.
Maybe rename logf to tracef
Indeed, much better. Done.
provider/marathon/readiness.go
Outdated
Note to reviewers: The readiness checker logic should be fairly clear except for this corner case (hence the lengthy comment). Internally, we went through several iterations to improve the code and iron out corner cases (though with Bamboo instead of Traefik, the logic is identical -- I basically copy & pasted the implementation). It's been running in our production system for several weeks now, so it should have gained a fair level of robustness.
Nevertheless, I'd be happy to hear suggestions for making it simpler/better.
da238f9 to
18e7e69
Compare
6dcc278 to
4546613
Compare
18e7e69 to
50489fd
Compare
@timoreimann wow, impressive job 👏
The Javaness in me is less of a love and more of a bad, hard-to-shake-off habit. 😬 I'm very open to shortening the option name. While an integration test would be nice, I can say from experience with this feature that covering all cases will be difficult: It depends a lot on how applications/tasks behave and the timing involved. It took us two or three internal patches to get this right with Bamboo. On the plus side, this means the algorithm is battle-tested to a certain degree already. And users can turn on/off readiness check filtering on demand. That said, if someone can help out on the upstream libcompose blocker, I'd be more than happy to contribute an integration test. 👼
50489fd to
beafbc7
Compare
LGTM
Do you know if the problem of "bad readiness check result" will be fixed in Marathon? This adds a lot of useless and complicated code to Traefik just to handle this "Marathon bug".
provider/marathon/readiness_test.go
Outdated
        },
    }

    for _, c := range cases {
I prefer test or case instead of c
I can't use case because that's a reserved keyword in Go, so I renamed everything to test.
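To make the naming point concrete, here is a minimal table-driven loop that uses `test` as the loop variable, since `case` is a Go keyword and cannot be used as an identifier. The function `allChecksPassed` and the table entries are hypothetical stand-ins invented for this sketch; they are not taken from `readiness_test.go`.

```go
package main

import "fmt"

// allChecksPassed is a hypothetical function under test: it reports whether
// every result in the slice is true (an empty slice counts as passed).
func allChecksPassed(results []bool) bool {
	for _, ok := range results {
		if !ok {
			return false
		}
	}
	return true
}

// runTests walks the table and returns the number of failed entries.
func runTests() int {
	tests := []struct {
		desc    string
		results []bool
		want    bool
	}{
		{desc: "no results", results: nil, want: true},
		{desc: "all passed", results: []bool{true, true}, want: true},
		{desc: "one failed", results: []bool{true, false}, want: false},
	}

	failures := 0
	// "for _, case := range tests" would not compile; "test" reads just as well.
	for _, test := range tests {
		if got := allChecksPassed(test.results); got != test.want {
			fmt.Printf("%s: got %t, want %t\n", test.desc, got, test.want)
			failures++
		}
	}
	return failures
}

func main() {
	fmt.Println("failures:", runTests())
}
```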
Last time I checked, I hadn't seen a bug report on the matter. I'll double-check, file a report if I find nothing, and post it here.
ldez
left a comment
LGTM
nmengin
left a comment
LGTM 👏 👏
This change adds support for Marathon readiness checks: If the option is set on Traefik and applications are found to come with configured readiness checks, the check results (as conducted by Marathon and exposed through its API) will be taken into account at deployment times in order to decide if a task should be taken into load-balancing rotation or not. Note that I had to extend the event filter that controls how often the Marathon provider polls the API so that the readiness checker is able to pick up all necessary readiness state transitions.
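The filtering described above can be sketched roughly as follows. This is a simplified illustration, not the implementation in this PR: the types `application` and `readinessResult`, the function `tasksInRotation`, and the `respectReadiness` parameter are hypothetical, and the sketch assumes that a task without a reported result is treated as not ready. It also does not model the corner case discussed in the review comments.

```go
package main

import "fmt"

// readinessResult is a hypothetical, simplified view of one readiness check
// result as it might be exposed by the Marathon API.
type readinessResult struct {
	taskID string
	ready  bool
}

// application is a hypothetical, simplified stand-in for a Marathon app.
type application struct {
	hasReadinessChecks bool // the app defines at least one readiness check
	deploying          bool // the app is part of a running deployment
	taskIDs            []string
	results            []readinessResult
}

// tasksInRotation returns the IDs of tasks that would be load-balanced.
// Readiness results only matter if the option is enabled, the app defines
// readiness checks, and a deployment is in progress.
func tasksInRotation(app application, respectReadiness bool) []string {
	if !respectReadiness || !app.hasReadinessChecks || !app.deploying {
		return append([]string(nil), app.taskIDs...)
	}

	ready := make(map[string]bool, len(app.results))
	for _, r := range app.results {
		ready[r.taskID] = r.ready
	}

	var inRotation []string
	for _, id := range app.taskIDs {
		// Assumption: a task with no reported result is kept out of rotation.
		if ready[id] {
			inRotation = append(inRotation, id)
		}
	}
	return inRotation
}

func main() {
	app := application{
		hasReadinessChecks: true,
		deploying:          true,
		taskIDs:            []string{"task-a", "task-b"},
		results: []readinessResult{
			{taskID: "task-a", ready: true},
			{taskID: "task-b", ready: false},
		},
	}
	fmt.Println(tasksInRotation(app, true))
	fmt.Println(tasksInRotation(app, false))
}
```

With the option disabled, or outside of a deployment, the sketch returns every task unchanged, which mirrors the opt-in nature of the feature.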
eb7c72f to
7fb888a
Compare
Fixes #1185, #1559
Based on #1871 for now; will be rebased onto master once the dependent branch gets merged.