CLOUDSTACK-9323: Fix cancel host maintenance can… #1454

Merged
merged 1 commit into from
Apr 28, 2016

Conversation

abhinandanprateek
Contributor

Fix cancel host maintenance so that, if maintenance is cancelled, the host comes back to its normal state gracefully.

Added Marvin tests for host maintenance.

@jburwell
Contributor

@abhinandanprateek I don't see the Marvin test case in the PR. Have you pushed the latest commit?

Also, most of the changes seem to be formatting changes in non-related parts of the class which hides the actual fix. Would it be possible to reverse these formatting changes to reduce the size of the patch to only the change in doCancelMaintenance?

@abhinandanprateek abhinandanprateek force-pushed the host-maint branch 2 times, most recently from 124ab4e to ee59d62 Compare March 28, 2016 03:46
@abhinandanprateek
Contributor Author

@jsb Added the Marvin file and reverted to the pre-commit formatting.

@abhinandanprateek
Contributor Author

Marvin test output:

root@ccp:~/cloudstack(host-maint)# ./host_maint.sh
++ date
+ echo Mon Mar 28 11:47:45 IST 2016
Mon Mar 28 11:47:45 IST 2016
+ TMP=/tmp
+ CLOUDDIR=/root/cloudstack
+ mkdir -p /tmp/simulator/smoke/misc
+ nosetests --with-xunit --xunit-file=/tmp/quagga/test_quagga.xml --with-marvin --marvin-config=/root/cloudstack/advanced.cfg /root/cloudstack/test/integration/component/test_host_maintenance.py -s -a tags=advanced,required_hardware=false --zone=Bootcamp --hypervisor=XenServer

==== Marvin Init Started ====

=== Marvin Parse Config Successful ===

=== Marvin Setting TestData Successful===

==== Log Folder Path: /tmp//MarvinLogs//Mar_28_2016_11_47_46_LB5B4I. All logs will be available here ====

=== Marvin Init Logging Successful===

==== Marvin Init Successful ====
test_01_cancel_host_maintenace (integration.component.test_host_maintenance.TestHostMaintenance)
Hypervisor = 34777316-62ae-4590-868a-71daa23dad3d
Hypervisor = 2e872579-efb9-4bb2-ae00-dfb21a994f08
Create VMs as there are not enough vms to check host maintenance
Creating vms = 5
Using template 7e374f74-b471-46bd-bbbe-f457d5376acc
Using service offering 94589b7c-ef32-484b-8ef7-59ad1ba6b69b
Using template 7e374f74-b471-46bd-bbbe-f457d5376acc
Using service offering 9c95b1bf-0ca3-4bba-a00b-1fe61eb078be
VM create = a0901dc0-a568-4d78-8d46-ddf976f6af95
VM create = ebba3c2b-d950-4969-ae34-2cc785023ef9
VM create = 0a9635be-5599-4259-a89b-ad9017e2d4e7
VM create = 9a885588-db3f-4aca-9885-c931fbd6ea71
Host with id 34777316-62ae-4590-868a-71daa23dad3d is in prepareHostForMaintenance
Host with id 34777316-62ae-4590-868a-71daa23dad3d is in cancelHostMaintenance
Host with id 2e872579-efb9-4bb2-ae00-dfb21a994f08 is in prepareHostForMaintenance
Host with id 2e872579-efb9-4bb2-ae00-dfb21a994f08 is in cancelHostMaintenance
=== TestName: test_01_cancel_host_maintenace | Status : SUCCESS ===

===final results are now copied to: /tmp//MarvinLogs/test_host_maintenance_CXM1EL===
++ date
+ echo Mon Mar 28 11:58:36 IST 2016
Mon Mar 28 11:58:36 IST 2016

@abhinandanprateek abhinandanprateek force-pushed the host-maint branch 2 times, most recently from abc0f7c to b4c427f Compare March 28, 2016 12:55
@@ -2112,11 +2112,13 @@ private boolean doCancelMaintenance(final long hostId) {

/* TODO: move to listener */
_haMgr.cancelScheduledMigrations(host);

boolean vms_migrating = false;
final List<VMInstanceVO> vms = _haMgr.findTakenMigrationWork();
for (final VMInstanceVO vm : vms) {
if (vm != null && vm.getHostId() != null && vm.getHostId() == hostId) {
s_logger.info("Unable to cancel migration because the vm is being migrated: " + vm);
Contributor

Can you turn the if on line 2119 into a method call like isVmMigrating(vm, hostId)? Or even (if vm can never be null) vm.isMigrating(hostId)?

I think it will improve the readability of the segment you are working on. Also, is there a need to check all VMs? Once you find one that is migrating, do you still need to keep checking the rest? If not, try changing the loop to a while, or issuing a break.

Contributor Author

@alexandrelimassantana
The reason for not breaking out of the for loop is to log info about every VM that is under migration. The log level should probably be raised to warn. These log messages would be valuable for troubleshooting.
On the readability front, yes, the code will be improved further.
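
For illustration, the control flow under discussion can be sketched as follows. The real method is Java (doCancelMaintenance); this Python version is only a sketch, and `Vm`, `is_vm_on_host`, and `any_vm_migrating` are hypothetical names, not CloudStack code. It pulls the condition into a helper, as the reviewer suggested, and keeps iterating after the first match so that each migrating VM is logged, as the author explained.

```python
import logging
from dataclasses import dataclass
from typing import Iterable, Optional

logger = logging.getLogger("host_maintenance_sketch")


@dataclass
class Vm:
    """Hypothetical stand-in for a VM record with an optional host id."""
    name: str
    host_id: Optional[int]


def is_vm_on_host(vm: Vm, host_id: int) -> bool:
    """Readable replacement for the inline null-and-equality check."""
    return vm.host_id is not None and vm.host_id == host_id


def any_vm_migrating(vms: Iterable[Vm], host_id: int) -> bool:
    """Return True if any VM in the migration work list sits on host_id.

    The loop deliberately does not break on the first hit, so every
    migrating VM gets its own log line.
    """
    migrating = False
    for vm in vms:
        if is_vm_on_host(vm, host_id):
            logger.warning("VM is being migrated: %s", vm.name)
            migrating = True
    return migrating
```

If only the boolean result mattered, `any(is_vm_on_host(vm, host_id) for vm in vms)` would stop at the first match, at the cost of losing the per-VM log lines.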

@abhinandanprateek
Contributor Author

Output from marvin test:

TMP=/tmp
CLOUDDIR=/root/cloudstack-apple
mkdir -p /tmp/simulator/smoke/misc
nosetests --with-xunit --xunit-file=/tmp/test_quagga.xml --with-marvin --marvin-config=/root/cloudstack-apple/advanced.cfg /root/cloudstack-apple/test/integration/component/test_host_maintenance.py -s -a tags=advanced,required_hardware=true --zone=Bootcamp --hypervisor=KVM
==== Marvin Init Started ====

=== Marvin Parse Config Successful ===

=== Marvin Setting TestData Successful===

==== Log Folder Path: /tmp//MarvinLogs//Mar_30_2016_10_02_53_F4MC43. All logs will be available here ====

=== Marvin Init Logging Successful===

==== Marvin Init Successful ====
1 Hypervisor = 3e270f2d-f054-4cbe-85e1-0fbc30e8a414
1 Hypervisor = 6a34ca0a-ac2f-44db-a8f4-96b7de01dcd4
Host with id 3e270f2d-f054-4cbe-85e1-0fbc30e8a414 is in prepareHostForMaintenance
Host with id 3e270f2d-f054-4cbe-85e1-0fbc30e8a414 is in cancelHostMaintenance
Host with id 6a34ca0a-ac2f-44db-a8f4-96b7de01dcd4 is in prepareHostForMaintenance
Host with id 6a34ca0a-ac2f-44db-a8f4-96b7de01dcd4 is in cancelHostMaintenance
=== TestName: test_01_cancel_host_maintenace_with_no_migration_jobs | Status : SUCCESS ===

2 Hypervisor = 3e270f2d-f054-4cbe-85e1-0fbc30e8a414
2 Hypervisor = 3e270f2d-f054-4cbe-85e1-0fbc30e8a414
2 Hypervisor = 6a34ca0a-ac2f-44db-a8f4-96b7de01dcd4
2 Hypervisor = 6a34ca0a-ac2f-44db-a8f4-96b7de01dcd4
Create VMs as there are not enough vms to check host maintenance
Create VMs as there are not enough vms to check host maintenance
Creating vms = 5
Creating vms = 5
Using template a18cc46d-d4e8-476f-bdf6-9f3e5818a200
Using template a18cc46d-d4e8-476f-bdf6-9f3e5818a200
Using service offering 5bee807e-602a-4203-b9c3-ed72797b1d93
Using service offering 5bee807e-602a-4203-b9c3-ed72797b1d93
VM create = b67468ff-8120-42d7-87fe-2c7361c4afd3
VM create = b67468ff-8120-42d7-87fe-2c7361c4afd3
VM create = 8309716e-03ac-442a-ab18-a80748d5988c
VM create = 8309716e-03ac-442a-ab18-a80748d5988c
VM create = af8d8991-0ee2-44ca-bee4-d706f883dc0a
VM create = af8d8991-0ee2-44ca-bee4-d706f883dc0a
VM create = 1bacebb1-9be7-40a1-99f3-7b7e9deb45b7
VM create = 1bacebb1-9be7-40a1-99f3-7b7e9deb45b7
VM create = eac8b28d-0bea-4912-b6e4-3402ac07e802
VM create = eac8b28d-0bea-4912-b6e4-3402ac07e802
Host with id 3e270f2d-f054-4cbe-85e1-0fbc30e8a414 is in prepareHostForMaintenance
Host with id 3e270f2d-f054-4cbe-85e1-0fbc30e8a414 is in prepareHostForMaintenance
Vms found = 2
Vms found = 2
VirtualMachine on Hyp id = 8309716e-03ac-442a-ab18-a80748d5988c is in Migrating
VirtualMachine on Hyp id = 8309716e-03ac-442a-ab18-a80748d5988c is in Migrating
Host with id 3e270f2d-f054-4cbe-85e1-0fbc30e8a414 is in cancelHostMaintenance
Host with id 3e270f2d-f054-4cbe-85e1-0fbc30e8a414 is in cancelHostMaintenance
Host with id 6a34ca0a-ac2f-44db-a8f4-96b7de01dcd4 is in prepareHostForMaintenance
Host with id 6a34ca0a-ac2f-44db-a8f4-96b7de01dcd4 is in prepareHostForMaintenance
Vms found = 4
Vms found = 4
VirtualMachine on Hyp id = 8309716e-03ac-442a-ab18-a80748d5988c is in Migrating
VirtualMachine on Hyp id = 8309716e-03ac-442a-ab18-a80748d5988c is in Migrating
Host with id 6a34ca0a-ac2f-44db-a8f4-96b7de01dcd4 is in cancelHostMaintenance
Host with id 6a34ca0a-ac2f-44db-a8f4-96b7de01dcd4 is in cancelHostMaintenance
=== TestName: test_02_cancel_host_maintenace_with_migration_jobs | Status : SUCCESS ===

===final results are now copied to: /tmp//MarvinLogs/test_host_maintenance_RF5DFR===
++ date
+ echo Wed Mar 30 10:06:16 IST 2016
Wed Mar 30 10:06:16 IST 2016

@abhinandanprateek abhinandanprateek force-pushed the host-maint branch 3 times, most recently from e1f763e to a461fa9 Compare March 30, 2016 11:45
@cristofolini
Contributor

@abhinandanprateek Looks like Jenkins found some problems with your index. I'd suggest rebasing against the current master, making all the needed merges, and pushing again.

@swill
Contributor

swill commented Apr 3, 2016

@abhinandanprateek would you mind rebasing to current master so I can test this in my CI? Thanks...

@abhinandanprateek
Contributor Author

@cristofolini @swill I rebased the PR; it is ready for CI now. Thank you.

@swill
Contributor

swill commented Apr 4, 2016

CI RESULTS

HAS FAILURES, NEEDS WORK!

Please address the following issue.

======================================================================
ERROR: test_02_cancel_host_maintenace_with_migration_jobs (integration.component.test_host_maintenance.TestHostMaintenance)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/git/cs1/cloudstack/test/integration/component/test_host_maintenance.py", line 277, in test_02_cancel_host_maintenace_with_migration_jobs
    self.vmlist = self.createVMs(listHost[0].id, no_vm_req)
  File "/data/git/cs1/cloudstack/test/integration/component/test_host_maintenance.py", line 108, in createVMs
    self.logger.debug("Using template %s " % self.template.id)
AttributeError: 'str' object has no attribute 'id'

Associated Uploads

test_host_maintenance_P9AKTO:

Uploads will be available until 2016-06-04 00:00:00 +0000 GMT

Comment created by upr comment.

@abhinandanprateek
Contributor Author

@swill In my environment I am using macchinina templates for testing. These are not there by default, but they are pretty handy for testing due to their small size. This is the reason for the error above. Adding them to the CI environment would speed up many such tests. If that is a lot of work, I will revert to the standard CentOS templates. Comments?

                     "template_name_xen" : "macchinina-xen",
                     "template_name_kvm" : "macchinina-kvm",

cc @jburwell
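
The traceback above shows `.id` being read from a string, which suggests the template lookup returned a string rather than a template object when the macchinina template was missing. A minimal sketch of a fail-fast guard, assuming that behavior; `require_template` is a hypothetical helper, not part of Marvin:

```python
from types import SimpleNamespace
from typing import Any


def require_template(lookup_result: Any) -> Any:
    """Fail fast if a template lookup did not return a template object.

    Assumption: a failed lookup yields a plain string (for example a
    status code) rather than an object with an `id` attribute.
    """
    if isinstance(lookup_result, str) or not hasattr(lookup_result, "id"):
        raise ValueError("template lookup failed: %r" % (lookup_result,))
    return lookup_result


# Usage with a stand-in object; a real test would pass the lookup result.
template = require_template(SimpleNamespace(id="example-template-id"))
```

With a guard like this, a missing template would surface as a clear error during setup instead of an AttributeError deep inside VM creation.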

@abhinandanprateek
Contributor Author

Comparing test execution times with the macchinina template vs. the CentOS built-in template, I did not find much of a difference, so I am modifying the test to use the standard built-in templates. @swill

@abhinandanprateek
Contributor Author

Updated the Marvin test to use the built-in template. @swill

@swill
Contributor

swill commented Apr 5, 2016

@abhinandanprateek: Thank you. I will run the whole set of tests against it again tonight. I have to let the current tests finish and run this on its own because I have to change the tests run specifically for this PR. I should have results in the morning.

@swill
Contributor

swill commented Apr 5, 2016

@abhinandanprateek slight delay. I am going to blow away my CI setup and reinstall it now to get it running on SSD drives so I can better parallelize my tests. This will delay this test till at least tomorrow evening.

@abhinandanprateek
Contributor Author

@swill If it is possible to run integration/component/test_host_maintenance.py first, then please do that, as it is a new test added for this fix, followed by the full Marvin suite.

@swill
Contributor

swill commented Apr 7, 2016

CI RESULTS

84/85 TESTS PASSED

The test that failed is a test that commonly fails in my environment and has been verified to be an environment issue.

Associated Uploads

test_host_maintenance_X027L5:

test_vpc_routers_093RH7:

Uploads will be available until 2016-06-07 02:00:00 +0200 CEST

Comment created by upr comment.

if (vm != null && vm.getHostId() != null && vm.getHostId() == hostId) {
s_logger.info("Unable to cancel migration because the vm is being migrated: " + vm);
return false;
if (vm.getHostId() != null && vm.getHostId() == hostId) {
Member

Is this conditional right?
vm.getHostId() != null && vm.getHostId() == hostId

It is looking a little weird to me.

Member

My bad, I was not thinking straight.
The conditional is ok

@swill
Contributor

swill commented Apr 7, 2016

@abhinandanprateek tests have passed, thanks for the updates. Can you do a push -f to kick off a Jenkins run so we can try to get this PR all green?

@jburwell Does this pass your code review now?

I am trying to get two independent LGTM code reviews before merging whenever possible, so if you have reviewed the code, please let me know.

@rafaelweingartner
Member

I have gone through the code, it LGTM


if callback is None:
    return INVALID_INPUT
Contributor

This function should always return an Array value. If the callback is None, why not raise a ValueError rather than return an incompatible value? Not only is it more idiomatic Python, but it preserves the type semantics of the function.

Contributor Author

For the following two reasons, I preferred returning an error code over raising an exception.

  1. Returning a pre-defined error code makes the method more flexible to use, because whether to raise an exception or continue is left to the caller.
  2. "INVALID_INPUT" is a Marvin error code used by other utility methods to signal bad inputs, and this maintains that pattern.

Contributor

@jburwell jburwell Apr 15, 2016

@abhinandanprateek While it is permissible Python, it is considered an anti-pattern for a Python function to return different types. In this case, it can return either a scalar value or a multi-value return. Raising a ValueError is the idiomatic Python approach to handling an invalid parameter value. In addition to ensuring that the function always returns the same type, raising an error in this manner will cause the test to fail fast without callers having to check the return value in order to fail properly.
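
The difference between the two approaches can be sketched as below. This is an illustrative polling helper, not the function under review: `poll_until` and its parameters are hypothetical, and the callback is assumed to return a `(succeeded, value)` pair.

```python
import time
from typing import Any, Callable, Optional, Tuple


def poll_until(
    callback: Optional[Callable[[], Tuple[bool, Any]]],
    retries: int = 3,
    interval_seconds: float = 0.0,
) -> Tuple[bool, Any]:
    """Call callback until it reports success or retries run out.

    Always returns a (succeeded, value) tuple. Invalid arguments raise
    ValueError instead of returning an error code of a different type.
    """
    if callback is None:
        raise ValueError("callback must be a callable, got None")
    if retries < 1:
        raise ValueError("retries must be at least 1")

    succeeded, value = False, None
    for _ in range(retries):
        succeeded, value = callback()
        if succeeded:
            break
        time.sleep(interval_seconds)
    return succeeded, value
```

Because the return type is always a two-element tuple, callers can unpack it unconditionally (`ok, result = poll_until(check)`), and a bad argument stops the test at the call site rather than at a later unpacking error.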

@jburwell
Contributor

jburwell commented Apr 8, 2016

@swill just reviewed.

@abhinandanprateek I have a few more comments to be addressed on the test cases.

@swill
Contributor

swill commented Apr 18, 2016

@abhinandanprateek, I believe you are away this week, but can you address @jburwell's comments when you have a chance? Thanks.

@abhinandanprateek abhinandanprateek force-pushed the host-maint branch 2 times, most recently from 838385a to 67f3527 Compare April 25, 2016 09:57
…celled the host come back to normal state gracefully.

Added marvin tests for host maintennace.
@abhinandanprateek
Contributor Author

@swill @jburwell The concerns noted above have been taken care of.

@swill
Contributor

swill commented Apr 25, 2016

@jburwell can I get your LGTM? I will run CI on this again today because some code has changed since the last run.

@jburwell
Contributor

LGTM for code

@swill
Contributor

swill commented Apr 28, 2016

Sorry for the delay on this one guys. I think this one is ready now...

@asfgit asfgit merged commit 182ab64 into apache:master Apr 28, 2016
asfgit pushed a commit that referenced this pull request Apr 28, 2016
CLOUDSTACK-9323: Fix cancel host maintenance can…

Fix cancel host maintenance so that if maintenance is cancelled the host come back to normal state gracefully.

Added marvin tests for host maintennace.

* pr/1454:
  CLOUDSTACK-9323: Fix Cancel maintenance so that if maintenance is cancelled the host come back to normal state gracefully. Added marvin tests for host maintennace.

Signed-off-by: Will Stevens <williamstevens@gmail.com>

7 participants