Router gets killed on Start due to timeout before configuration has completed
STEPS TO REPRODUCE
Deploy a Virtual Router with ~600 DHCP entries
EXPECTED RESULTS
VR should deploy properly
ACTUAL RESULTS
Timeout was reached
During an upgrade from 4.10 to 4.11.1 we (PCextreme) found that Virtual Routers would not start.
During Start and configuration they ran into a timeout, which caused the VR to be killed.
For example we saw in the logs:
2018-10-29 06:38:07,041 DEBUG [resource.virtualnetwork.VirtualRoutingResource] (agentRequest-Handler-6:null) (logid:ded92662) Aggregate action timeout in seconds is 665
2018-10-29 06:38:07,041 DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-6:null) (logid:ded92662) Creating file in VR, with ip: 169.254.3.223, file: VR-d09aa357-27e3-4176-a283-9a7afedbae27.cfg
2018-10-29 06:38:07,464 DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-6:null) (logid:ded92662) Executing: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh vr_cfg.sh 169.254.3.223 -c /var/cache/cloud/VR-d09aa357-27e3-4176-a283-9a7afedbae27.cfg
2018-10-29 06:38:07,466 DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-6:null) (logid:ded92662) Executing while with timeout : 665700
So in this case the timeout was 665 seconds, about 11 minutes.
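To pull the same numbers out of a larger agent log, a small script along these lines could help. It is only a sketch: the two patterns are taken from the log messages quoted above, the function name `extract_timeouts` is made up for illustration, and treating the second value as milliseconds is an assumption based on 665700 lining up with 665 seconds.

```python
import re

# Patterns match the two DEBUG messages quoted in this issue.
AGGREGATE_RE = re.compile(r"Aggregate action timeout in seconds is (\d+)")
EXEC_RE = re.compile(r"Executing while with timeout : (\d+)")


def extract_timeouts(lines):
    """Return (aggregate_seconds, exec_timeout_raw); each is None if not found."""
    aggregate = None
    exec_raw = None
    for line in lines:
        match = AGGREGATE_RE.search(line)
        if match:
            aggregate = int(match.group(1))
        match = EXEC_RE.search(line)
        if match:
            exec_raw = int(match.group(1))
    return aggregate, exec_raw


# Two of the log lines from this issue, copied verbatim.
sample = [
    "2018-10-29 06:38:07,041 DEBUG [resource.virtualnetwork.VirtualRoutingResource] "
    "(agentRequest-Handler-6:null) (logid:ded92662) Aggregate action timeout in seconds is 665",
    "2018-10-29 06:38:07,466 DEBUG [kvm.resource.LibvirtComputingResource] "
    "(agentRequest-Handler-6:null) (logid:ded92662) Executing while with timeout : 665700",
]

aggregate_s, exec_raw = extract_timeouts(sample)
print(aggregate_s, exec_raw)
# Assumption: the second value is in milliseconds.
print(exec_raw / 1000)
```

In practice the `sample` list would be replaced by the lines of the agent log file.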
We tried to increase router.aggregation.command.each.timeout both on the Management Server side and in agent.properties, but that did not seem to make any difference.
For each DHCP entry a ~1 second timeout seems to be calculated. This VR has 609 DHCP entries:
10 minutes is a long time, and that is something that would need improving as well, but apart from that it just would not start.
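The "~1 second per DHCP entry" estimate can be checked with some quick arithmetic. This is only a sketch of the reasoning, not of how CloudStack actually computes the value: it assumes the aggregate timeout scales linearly with the number of DHCP entries and ignores any other commands in the aggregate. The function names are hypothetical.

```python
def per_entry_seconds(total_timeout_s, entries):
    """Average timeout budget per entry, assuming a purely linear relation."""
    return total_timeout_s / entries


def linear_timeout_seconds(entries, per_entry_s):
    """Aggregate timeout under the same linear assumption."""
    return entries * per_entry_s


# Values from this issue: 665 s aggregate timeout, 609 DHCP entries.
observed = per_entry_seconds(665, 609)
print(round(observed, 2))   # about 1.09 seconds per entry
print(round(665 / 60, 1))   # about 11.1 minutes in total
```

If the relation really is linear, the aggregate timeout keeps growing with every DHCP entry added, so a VR with more entries gets a longer budget but also needs longer to configure.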
My colleague created PR #2977, which fixed the issue for us. We still need to investigate whether his fix is the proper one or whether the (default) timeout should be increased.
ISSUE TYPE
COMPONENT NAME
CLOUDSTACK VERSION
CONFIGURATION
OS / ENVIRONMENT
Basic Networking