Don't throw away valid configuration updates #5952
Thx for your PR! By fixing this bug like this, I see a side effect. With your version, if we receive a "same configuration", it will now trigger the WDYT if you put the configuration compare in the ThrottleGoroutine https://github.com/containous/traefik/blob/16288d171cedf4cc1ff429d9099152721b90e8fb/pkg/server/configurationwatcher.go#L216 and just compare with a save of the previous configuration that was sent in the
@juliens thank you for taking a look at this. I could take or leave the change you are suggesting. My main concern was correctness rather than the speed of config updates, as it seemed that the throttle design was already purposefully introducing delays. I chose the separation I did because it keeps two concepts apart: selecting a "most recent" update (the "throttle"), and idempotently applying the configuration updates that are selected. The code that I changed already existed; I simply moved it to the place that seemed to actually achieve the goal, given the design as I saw it. I didn't want to take on a different design, which is where I see the change you are suggesting leading: it turns the "throttle" into a comprehensive gating mechanism for configuration updates (de-duplication, rate limiting, validation). I think that is a valid design, but not something I wanted to take on in this fix.
I agree that it changes the design a little, but your change alters the behaviour too. Yes, the throttle adds some delay, but if you read the code carefully, you will see that the first received configuration is applied, and then we add the throttle duration.
@juliens, I'd like to be clear on what you are saying. Are you saying that you won't accept this fix unless I make the change you are stating? Or are you saying that, if I have time, could I please make this change? If you are asking whether I have time, then the answer is that I really don't. You are welcome to make the change yourself if you want to address the other issue that you are worried about. I'll make the change if you are demanding that, but unhappily. I have provided a fix that addresses the issue I've reported. I've already spent a large amount of time on this bug explaining it to one of your colleagues and have pretty much moved on by this point. The only thing holding me to this now is that I want to make sure we aren't running on a patched version of the code forever (or until we move off of traefik because of this bug). I am not relishing the idea of taking this on because, without some more thinking about it, I'm not convinced of its correctness. I'd rather not spend that time right now.
@zaphod42, to be clear, what I am saying is that I will not merge your PR as is because, although it fixes your bug, it adds a new one. Having said that, I can really understand that contributors don't always have enough time to respond quickly to comments, and that you don't have time for this. This is why, if you agree, since the fix I am offering does not take much time, I can modify your PR to handle this, and I would be really happy if you could take a look and confirm that it fixes your bug (even if I am already pretty sure that it does, thanks to the test that you added).
@zaphod42 I made the fix, could you confirm the fix works for you too?
Thanks 👍
LGTM
LGTM
The configuration reloading code takes various actions in order to limit the number of reloads that have to be performed. One of those actions is to check if the new configuration is the same as the current configuration. If the new configuration is the same, then there is no need to apply it and it can be safely ignored. However, the check was performed at the wrong location. The location assumed that the c.currentConfigurations variable is up to date. This assumption does not hold given the throttling and asynchronous handling of configuration updates. The effect of this incorrect assumption is that a configuration update can be thrown away and leave a prior configuration in place. This resolves the problem by moving the check to the loadMessage method. This is the point where the c.currentConfigurations variable is modified and so there is no chance that the new configuration will be compared against an incorrect version.
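To illustrate why the comparison belongs at the point where the stored configuration is modified, here is a minimal sketch. It is an assumption-laden model and not the code from this PR: Config, Watcher, NewWatcher and Apply are hypothetical names, and Apply only plays the role that the description above gives to loadMessage.

```go
package main

import (
	"fmt"
	"reflect"
)

// Config is a hypothetical stand-in for a provider's dynamic configuration.
type Config struct {
	Routers map[string]string
}

// Watcher is a simplified, hypothetical model of a configuration watcher.
type Watcher struct {
	current map[string]Config // last configuration applied, per provider
	applied int               // number of reloads actually performed
}

// NewWatcher returns a Watcher with no configuration applied yet.
func NewWatcher() *Watcher {
	return &Watcher{current: make(map[string]Config)}
}

// Apply compares cfg with the stored configuration and updates the store
// in the same step, so the comparison never sees a stale value.
// It reports whether a reload was performed.
func (w *Watcher) Apply(provider string, cfg Config) bool {
	if prev, ok := w.current[provider]; ok && reflect.DeepEqual(prev, cfg) {
		return false // identical to what is already in place: skip
	}
	w.current[provider] = cfg
	w.applied++
	return true
}

func main() {
	a := Config{Routers: map[string]string{"web": "backend-a"}}
	b := Config{Routers: map[string]string{"web": "backend-b"}}

	w := NewWatcher()
	fmt.Println(w.Apply("file", a)) // true: nothing applied before
	fmt.Println(w.Apply("file", a)) // false: same as current
	fmt.Println(w.Apply("file", b)) // true: differs from current
	fmt.Println(w.Apply("file", a)) // true: a differs from b, so it is kept
}
```

The last call is the case the description is about. If the comparison ran earlier, while b was still waiting to be applied and the stored value was still a, the final a would look like a duplicate and be thrown away, leaving b in place.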
Didn't run go fmt on the previous commit. This was pointed out by a failing test.
26f6a3d to c39614d
Hey guys, I wanted to come back and thank you for working on this. I'm sorry about snapping at you earlier. I let my stress get to me and it came out on you. You don't deserve that. I took a look at the change and it looks like it addresses the race issue as well as the timing issue that you raised, @juliens.
What does this PR do?
The configuration reloading code takes various actions in order to limit the number of reloads that have to be performed.
One of those actions is to check if the new configuration is the same as the current configuration.
If the new configuration is the same, then there is no need to apply it and it can be safely ignored.
However, the check was performed at the wrong location.
The location assumed that the c.currentConfigurations variable is up to date.
This assumption does not hold given the throttling and asynchronous handling of configuration updates.
The effect of this incorrect assumption is that a configuration update can be thrown away and leave a prior configuration in place.
This resolves the problem by moving the check to the loadMessage method.
This is the point where the c.currentConfigurations variable is modified and so there is no chance that the new configuration will be compared against an incorrect version.
Motivation
Fixes #5901
More