-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLOUDSTACK-9319: Use timeout when applying config to virtual router #1451
Conversation
@insom any chance you can supply some test-mech? |
@insom with this changes there is still one place where the applyConfigToVR is not provided with this timeout parameter (line 183, same class). This function is actually called only in those 2 places. If it comes to a state where you will always supply the timeout, remove the function which doesn't require you to supply it. Also, if you plan to always supply the timeout, a change into lines 378 and 379 would provide more coersion: if (timeout < 120) {
timeout = 120;
} into: if (timeout < VRScripts.DEFAULT_EXECUTEINVR_TIMEOUT) {
timeout = VRScripts.DEFAULT_EXECUTEINVR_TIMEOUT;
} |
@alexandrelimassantana Agreed, thanks for the review. I've added a commit removing the prototype with the default. That has allowed me to remove that if altogether and place it inside the @DaanHoogland Because the original commit is about the wrong signature for the method being used I'm not quite sure how to add tests for it. I hope that this second commit will help as it means the compiler can enforce things. Thanks for the review! (Also it's Friday and I don't have a Maven install on this laptop - should pass the CI tests, if not, I may need to come back to it on Tuesday). |
@@ -180,7 +179,7 @@ private Answer applyConfig(NetworkElementCommand cmd, List<ConfigItem> cfg) { | |||
boolean finalResult = false; | |||
for (ConfigItem configItem : cfg) { | |||
long startTimestamp = System.currentTimeMillis(); | |||
ExecutionResult result = applyConfigToVR(cmd.getRouterAccessIp(), configItem); | |||
ExecutionResult result = applyConfigToVR(cmd.getRouterAccessIp(), configItem, VRScripts.DEFAULT_EXECUTEINVR_TIMEOUT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@insom Since you're now checking the timeout's validity when calling applyConfigToVR
shouldn't the last parameter passed through this call be timeout
instead of VRScripts.DEFAULT_EXECUTEINVR_TIMEOUT
?
@cristofolini there is no timeout variable in the scope you commented. His change is valid because VRScripts.DEFAULT_EXECUTEINVR_TIMEOUT is mapped to the previous default timeout value. This is equivalent to the previous applyConfigToVR() call without passing the timout value. |
@alexandrelimassantana Ah, I see that now. Thanks for the clarification! :) |
@@ -180,7 +179,7 @@ private Answer applyConfig(NetworkElementCommand cmd, List<ConfigItem> cfg) { | |||
boolean finalResult = false; | |||
for (ConfigItem configItem : cfg) { | |||
long startTimestamp = System.currentTimeMillis(); | |||
ExecutionResult result = applyConfigToVR(cmd.getRouterAccessIp(), configItem); | |||
ExecutionResult result = applyConfigToVR(cmd.getRouterAccessIp(), configItem, VRScripts.DEFAULT_EXECUTEINVR_TIMEOUT); | |||
if (s_logger.isDebugEnabled()) { | |||
long elapsed = System.currentTimeMillis() - startTimestamp; | |||
s_logger.debug("Processing " + configItem + " took " + elapsed + "ms"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @insom .
I know that this isn't your code but could you use String.format to create the string that is used by logger?
It turns the code more readable than using multiple strings concatenation.
Ty.
@pedro-martins Thanks for the review but I don't want to make this change because (a) it'll mix two intents into on PR - fixing the bug and improving the logging - and (b) it would be less consistent with the rest of the CloudStack code base which, for better or worse, largely uses string concatenation for debug logging. |
ACS CI BVT RunSumarry: Link to logs Folder (search by build_no): https://www.dropbox.com/sh/yj3wnzbceo9uef2/AAB6u-Iap-xztdm6jHX9SjPja?dl=0 Failed tests:
Skipped tests: Passed test suits: |
@bvbharatk Hi, am I supposed to do something with this CI output? It looks like all of the errors are related to other parts of CloudStack that aren't touched by my commits. I came back to check on this bug because I got bitten by this again today (timeout when adding rules to a virtual router, but the timeout is actually not getting applied properly). |
I got contacted directly by someone else hit by this bug and looking for a fix. They saw I had commits relating to their problem. What needs to happen to get it merged? |
many people thought that changing router.aggregation.command.each.timeout in global setting will increase the timeout. Actually it is not, at least for KVM. for KVM hypervisors, we need to put router.aggregation.command.each.timeout=??? in agent.properties and restart cloudstack-agent. This PR increases the minimal timeout to 120 seconds which is helpful but does not really fix the issue. I think it is better to add host-level setting for each host in the future, like we did for domain/account, zone/cluster. |
@ustcweizhou A side effect of this PR is setting the minimal timeout to 120, but that's what the original code intended - except that because of not passing through the timeout to What this PR really does is make the code work as intended. I'm up for improvements but if this is blocking upgrades (and at least iWeb and one other CloudStack user were hit by it) seems like merging this and then moving it to a per-host setting would be a good idea? |
@insom do not worry, actually I am +1 with this PR. I agree with what you said. |
Hi, last week we experience some problems because of the timeouts which mentioned on this PR. Could we get this PR to upcoming versions asap please. Without this timeout configuration functionality working, its very hard to make upgrades on VR, timeout issue generates real headaches. |
Code LGTM. BVTs are good(iso failures are URL access issues and not related to this PR). merging this now |
CLOUDSTACK-9319: Use timeout when applying config to virtual routerFrom the [JIRA issue](https://issues.apache.org/jira/browse/CLOUDSTACK-9319): > The timeout parameter is not passed down to `applyConfigToVR` inside `VirtualRoutingResource` in all cases. > > This timeout is worked out as 3 seconds per command or 120 seconds (whichever is larger), but because it's not passed to the first invocation, the default (120 seconds, DEFAULT_EXECUTEINVR_TIMEOUT) is used. > > In a recent upgrade of our Virtual Routers, the timeout was being hit and increasing `router.aggregation.command.each.timeout` had no effect. I built a custom 4.8 agent with the timeout increased to allow the upgrade to continue. * pr/1451: Remove dangerous prototype of applyConfigToVR Use timeout when applying config to virtual router Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
ACS CI BVT RunSumarry: Link to logs Folder (search by build_no): https://www.dropbox.com/sh/yj3wnzbceo9uef2/AAB6u-Iap-xztdm6jHX9SjPja?dl=0 Failed tests:
Skipped tests: Passed test suits: |
From the JIRA issue: