-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improvement: systemvm slow operations #1604
improvement: systemvm slow operations #1604
Conversation
@@ -228,7 +228,7 @@ def processCLItem(self, num, nw_type): | |||
if('localgw' in self.qFile.data['cmd_line']): | |||
dp['gateway'] = self.qFile.data['cmd_line']['localgw'] | |||
else: | |||
dp['gateway'] = 'None' | |||
dp['gateway'] = self.qFile.data['cmd_line'].get('gateway', 'None') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there could be a possibility of having an ipv6 gateway, in that case, we would be looking at ip6gateway
.
@romain-dartigues do we know why the |
Right, but IPv6 does not use ARP but NDP (RFC 4861). It seems that a
Apparently because the file "/etc/cloudstack/ips.json" is generated with a gateway at "None" even when the gateway is present in the input "cmd_line.json" (but my memory could fail me at this time of the day). One month ago we found the "problem" and wanted a fix; trying to understand why it did happen made me start #1575 (now #1603). I though it would be good (at least for me) to get a better understanding of the code; doing hundreds of arping everytime is probably inefficient as each of them is supposed to refresh the ARP table of all equipment on the way... |
Thanks @romain-dartigues based on your comments the code LGTM |
we do have to remember that arping is ipv4 only so we need to fix this for ipv6 too |
@blueorangutan package |
@rhtyd a Trillian-Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
Packaging result: ✔centos6 ✔centos7 ✖debian repo: http://packages.shapeblue.com/cloudstack/pr/1604 |
cmd = "arping -c 1 -I %s -A -U -s %s %s" % ( | ||
self.dev, self.address['public_ip'], self.address['gateway']) | ||
CsHelper.execute(cmd) | ||
CsHelper.execute(cmd, wait=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flag arguments are an anti-pattern. Please consider introducing and executeAndWait
function instead.
@romain-dartigues can you fix the merge conflict, squash the changes, open a JIRA issue referencing that in the commit message. |
fce10c5
to
9c47e97
Compare
@murali-reddy can you help review this one? |
21f4e6d
to
67bfc7c
Compare
@romain-dartigues @murali-reddy ping @blueorangutan package |
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-208 |
67bfc7c
to
ebe6537
Compare
@rhtyd pong (is it OK like that? please advise) |
Thanks @romain-dartigues that looks good. I'll kick a test round soon. |
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-227 |
@blueorangutan test |
@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
Trillian test result (tid-379)
|
I'll need help to understand what gone wrong. I just peeked at But |
@romain-dartigues I'll have a look at the failure, meanwhile can you see my comments against the changes https://github.com/apache/cloudstack/pull/1604/files |
ebe6537
to
a0d0c8d
Compare
@rhtyd < separate function done. |
okay @romain-dartigues I'll kick some tests |
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
Packaging result: ✖centos6 ✖centos7 ✖debian. JID-238 |
@blueorangutan package |
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-243 |
@murali-reddy @abhinandanprateek can you help review this, thanks |
@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
Trillian test result (tid-425)
|
@romain-dartigues we fixed issue where ips.json has wrong gateway due to error in parsing the cmd_line.json. its committed in 4.8 and above [1]. Do you still see (or can you please verify) performance hit due to arpping to wrong gateway on 4.8/4.9/master? [1] 93ac134 |
@murali-reddy I'm not in position to run tests anymore |
Several test failures, unless the PR author can fix, this is on hold. |
@romain-dartigues @dpassante any update on this? |
No comment, since it does not seems required anymore I close this request. |
On our setup we had issues with vrouters when a large number of VM were created at once.
The vrouters try to arping the gateway (to refresh the ARP cache I suppose), but in some case it take too much time and the vrouters got killed for being unresponsive.
The following patches attempt to solve this by:
This patch has been running for a month on our test environment and seems to solve the initial problem.
An operation (starting 220 VM on XenServer, shared storage) which took 46 mn, now take 4 mn (without errors).