CLOUDSTACK-9691: Fixed unhandeled excetion in list snapshot command when a primary store is deleted related to it #1847

Merged (2 commits) on Feb 20, 2017

@anshul1886 anshul1886 commented Dec 21, 2016

@mike-tutkowski Since support for snapshots on SolidFire was added, there are many places that are prone to these NullPointerExceptions, resulting in various issues. The root cause is that we get the primary storage associated with a snapshot and then figure out how to handle it, but if that store has been deleted, this results in a NullPointerException. Without SolidFire we don't need to access primary storage.

Should we handle these as issues are found, or could there be another way to fix all of them?

@mike-tutkowski
Member

Can you tell me which line was throwing a NullPointerException in the code that you fixed? Thanks!

@anshul1886
Author

@mike-tutkowski It's in the bug description. Are you not seeing this issue on your end?

2016-08-26 14:34:36,709 DEBUG [c.c.a.ApiServlet] (catalina-exec-1:ctx-90c9ba3a) (logid:115e39ad) ===START=== 10.233.88.59 – GET command=listSnapshots&response=json&listAll=true&page=1&pagesize=20&_=1472202277072
2016-08-26 14:34:36,747 ERROR [c.c.a.ApiServer] (catalina-exec-1:ctx-90c9ba3a ctx-94284178) (logid:115e39ad) unhandled exception executing api command: [Ljava.lang.String;@77f27ce8
com.cloud.utils.exception.CloudRuntimeException: Unable to locate datastore with id 1
at org.apache.cloudstack.storage.datastore.manager.PrimaryDataStoreProviderManagerImpl.getPrimaryDataStore(PrimaryDataStoreProviderManagerImpl.java:61)
at org.apache.cloudstack.storage.datastore.DataStoreManagerImpl.getDataStore(DataStoreManagerImpl.java:48)
at com.cloud.api.ApiResponseHelper.getDataStoreRole(ApiResponseHelper.java:571)
at com.cloud.api.ApiResponseHelper.createSnapshotResponse(ApiResponseHelper.java:537)
at org.apache.cloudstack.api.command.user.snapshot.ListSnapshotsCmd.execute(ListSnapshotsCmd.java:117)
at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:132)
at com.cloud.api.ApiServer.queueCommand(ApiServer.java:707)
at com.cloud.api.ApiServer.handleRequest(ApiServer.java:538)
at com.cloud.api.ApiServlet.processRequestInContext(ApiServlet.java:297)
at com.cloud.api.ApiServlet$1.run(ApiServlet.java:129)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:126)
at com.cloud.api.ApiServlet.doGet(ApiServlet.java:86)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)

@serg38

serg38 commented Jan 3, 2017

@mike-tutkowski @anshul-gangwar @nvazquez I believe the same issue is addressed in PR #1735, but in a more consistent fashion.

@anshul1886
Author

@serg38 It seems so; both are addressing the same issue.

@nvazquez
Contributor

nvazquez commented Jan 3, 2017

Hi @anshul1886,
We addressed the same issue along with @serg38 in PR #1735, where we proposed a way to fix the problem.

@mike-tutkowski
Member

Let me provide a bit of background on this and then we can decide which way we want to correct this side effect.

Here is the PR that went in a while ago that enabled CloudStack to support volume snapshots that reside on primary storage:

#1403

The idea being these types of snapshots are faster than the back-up-to-secondary-storage approach CloudStack does by default and they can be a lot more space efficient, as well.

As part of this process, I went through and tried to identify all of the locations where we assumed a volume snapshot resided on secondary storage (and I put in code to see if it really resides there or, instead, if it's on primary storage).

As we have noted, a couple of places were missed, and this PR (as well as #1735) was opened to address those issues.

The way this particular PR's code is written should work fine. In the case where the original primary storage has been removed, an exception will be thrown, caught, logged, and then we will default to returning secondary storage as the location (which it should be).

Instead of the try/catch approach, though, it might be better to check whether dataStore is null.

DataStore dataStore = dataStoreMgr.getDataStore(storagePoolId, DataStoreRole.Primary);

If that comes back null, then we apparently have removed primary storage, which can only be done if your snapshots don't reside on it. If dataStore == null, return DataStoreRole.IMAGE.
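
For illustration only, here is a minimal Java sketch of the null-check approach described above. The types in it (`DataStoreManager`, `DataStore`, the `keepsSnapshots` flag) and the spelling of the enum constants are simplified stand-ins invented for this example, not the real CloudStack classes. It also assumes that looking up a removed pool returns null; the stack trace earlier in this thread shows the lookup throwing a CloudRuntimeException instead, so the real lookup path may need adjusting as well.

```java
import java.util.HashMap;
import java.util.Map;

// Self-contained sketch. Every type here is a hypothetical stand-in,
// not the real CloudStack class of the same name.
class SnapshotRoleSketch {

    enum DataStoreRole { Primary, Image }

    // Stand-in for a primary storage pool.
    static final class DataStore {
        final boolean keepsSnapshots; // true when snapshots stay on this pool

        DataStore(boolean keepsSnapshots) {
            this.keepsSnapshots = keepsSnapshots;
        }
    }

    // Stand-in manager. ASSUMPTION: a removed pool yields null, not an exception.
    static final class DataStoreManager {
        private final Map<Long, DataStore> pools = new HashMap<>();

        void add(long id, DataStore store) {
            pools.put(id, store);
        }

        void remove(long id) {
            pools.remove(id);
        }

        DataStore getDataStore(long id, DataStoreRole role) {
            return role == DataStoreRole.Primary ? pools.get(id) : null;
        }
    }

    // Decide where a snapshot lives, falling back to secondary (image)
    // storage when the original primary pool no longer exists.
    static DataStoreRole getDataStoreRole(DataStoreManager dataStoreMgr, long storagePoolId) {
        DataStore dataStore = dataStoreMgr.getDataStore(storagePoolId, DataStoreRole.Primary);
        if (dataStore == null) {
            // The pool is gone, so the snapshot cannot reside on it.
            return DataStoreRole.Image;
        }
        return dataStore.keepsSnapshots ? DataStoreRole.Primary : DataStoreRole.Image;
    }

    public static void main(String[] args) {
        DataStoreManager mgr = new DataStoreManager();
        mgr.add(1L, new DataStore(true));   // pool that keeps snapshots on primary
        mgr.add(2L, new DataStore(false));  // pool whose snapshots are backed up elsewhere
        System.out.println(getDataStoreRole(mgr, 1L));
        System.out.println(getDataStoreRole(mgr, 2L));
        mgr.remove(2L);                     // simulate deleting the primary storage
        System.out.println(getDataStoreRole(mgr, 2L));
    }
}
```

The point of the sketch is only the early return: once the lookup can yield null, the caller no longer needs a try/catch to reach the secondary-storage default.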

@anshul1886
Author

@mike-tutkowski I was in Python mode, so I used try/except instead of the null check that is recommended in Java. I have pushed the changes for the null check.

@mike-tutkowski
Member

To clarify this line:

"If that comes back null, then we apparently have removed primary storage, which can only be done if your snapshots don't reside on it. If dataStore == null, return DataStoreRole.IMAGE."

I meant this for managed storage. For unmanaged storage, the original primary storage from which the snapshot came can be deleted and the snapshot can remain.

@mike-tutkowski
Member

Code = LGTM

Member

@sateesh-chodapuneedi sateesh-chodapuneedi left a comment

LGTM for code changes.

@sateesh-chodapuneedi
Member

@karuturi It seems this is ready to merge, given the code LGTMs and the test results published by @cloudmonger?
The 4 test failures do not seem relevant to this patch, whose code changes are in the API layer.

@karuturi
Member

karuturi commented Feb 1, 2017

@sateesh-chodapuneedi @anshul1886 @mike-tutkowski
What's the relation with #1735? Is one of them required, or both?

@anshul1886
Author

@karuturi, one will be enough to fix the issue. #1735 added a test that can be used. Other than the test, that PR needs some work for the actual fix.

@karuturi
Member

@anshul1886 can you and @nvazquez work together and create a single PR?

@anshul1886
Author

@nvazquez What would you like to do?

@nvazquez
Contributor

Hi @anshul1886,
I've deployed and tested your PR by replicating the issue we had in our VMware environment, and it passed successfully! I think your solution is much cleaner and simpler than mine. Do you agree that I close mine and we go ahead with your PR? Would you mind adding the Marvin test written in mine to your PR (under test_snapshots.py)?

@anshul1886
Author

@nvazquez, yes, I am fine with it. I will add the test from your PR to mine.

@nvazquez
Contributor

@anshul1886 great, thanks!

@anshul1886
Author

@karuturi @nvazquez, I have added the Marvin test from #1735.

Member

@sateesh-chodapuneedi sateesh-chodapuneedi left a comment

LGTM for code.
Thanks @anshul1886 for the fix, and @nvazquez for the integration test.

@karuturi
Member

merging

@karuturi
Member

oops.. will wait for travis

@cloudmonger

ACS CI BVT Run

Summary:
Build Number 352
Hypervisor xenserver
NetworkType Advanced
Passed=104
Failed=1
Skipped=7

Link to logs Folder (search by build_no): https://www.dropbox.com/sh/yj3wnzbceo9uef2/AAB6u-Iap-xztdm6jHX9SjPja?dl=0

Failed tests:

  • test_routers_network_ops.py

  • test_03_RVR_Network_check_router_state Failed

Skipped tests:
test_01_test_vm_volume_snapshot
test_vm_nic_adapter_vmxnet3
test_static_role_account_acls
test_11_ss_nfs_version_on_ssvm
test_nested_virtualization_vmware
test_3d_gpu_support
test_deploy_vgpu_enabled_vm

Passed test suites:
test_deploy_vm_with_userdata.py
test_affinity_groups_projects.py
test_portable_publicip.py
test_over_provisioning.py
test_global_settings.py
test_scale_vm.py
test_service_offerings.py
test_routers_iptables_default_policy.py
test_loadbalance.py
test_routers.py
test_reset_vm_on_reboot.py
test_deploy_vms_with_varied_deploymentplanners.py
test_network.py
test_router_dns.py
test_non_contigiousvlan.py
test_login.py
test_deploy_vm_iso.py
test_list_ids_parameter.py
test_public_ip_range.py
test_multipleips_per_nic.py
test_regions.py
test_affinity_groups.py
test_network_acl.py
test_pvlan.py
test_volumes.py
test_nic.py
test_deploy_vm_root_resize.py
test_resource_detail.py
test_secondary_storage.py
test_vm_life_cycle.py
test_disk_offerings.py

@anshul1886
Author

@karuturi The failing test is unrelated to this PR.

=== TestName: test_add_user_to_project | Status : EXCEPTION ===
=== TestName: test_add_user_to_project | Status : EXCEPTION ===

The test failing in Travis is:
| ContextSuite context=TestResourceLimitsProject>:teardown | exceptions.Exception | 36.025 | test_project_limits |

@karuturi
Member

Hi @anshul1886, I just observed that the related PR #1735 was for 4.9. Since this bug also exists in 4.9, can you change the base branch of this PR to 4.9 and rebase?

@anshul1886
Author

Ok, Doing that.

@anshul1886 anshul1886 changed the base branch from master to 4.9 February 20, 2017 05:54
@anshul1886
Author

@karuturi, Done

@borisstoyanov
Contributor

@blueorangutan package

@blueorangutan

@borisstoyanov a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-506

@borisstoyanov
Contributor

@blueorangutan test

@blueorangutan

@borisstoyanov a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

asfgit pushed a commit that referenced this pull request Feb 20, 2017
CLOUDSTACK-9691: Fixed unhandeled excetion in list snapshot command when a primary store is deleted related to it

@mike-tutkowski Since support for snapshots on SolidFire was added, there are many places that are prone to these NullPointerExceptions, resulting in various issues. The root cause is that we get the primary storage associated with a snapshot and then figure out how to handle it, but if that store has been deleted, this results in a NullPointerException. Without SolidFire we don't need to access primary storage.

Should we handle these as issues are found, or could there be another way to fix all of them?

* pr/1847:
  CLOUDSTACK-9691: Added test list_snapshots_with_removed_data_store
  CLOUDSTACK-9691: Fixed unhandeled excetion in list snapshot command when a primary store is deleted related to it

Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>
@asfgit asfgit merged commit 3caedb9 into apache:4.9 Feb 20, 2017
@blueorangutan

Trillian test result (tid-854)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 26568 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr1847-t854-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_privategw_acl.py
Intermittent failure detected: /marvin/tests/smoke/test_snapshots.py
Test completed. 46 look ok, 2 have error(s)

Test Result Time (s) Test File
test_04_rvpc_privategw_static_routes Failure 315.29 test_privategw_acl.py
test_02_list_snapshots_with_removed_data_store Error 0.04 test_snapshots.py
test_01_vpc_site2site_vpn Success 140.99 test_vpc_vpn.py
test_01_vpc_remote_access_vpn Success 66.12 test_vpc_vpn.py
test_01_redundant_vpc_site2site_vpn Success 220.68 test_vpc_vpn.py
test_02_VPC_default_routes Success 284.23 test_vpc_router_nics.py
test_01_VPC_nics_after_destroy Success 513.11 test_vpc_router_nics.py
test_05_rvpc_multi_tiers Success 503.72 test_vpc_redundant.py
test_04_rvpc_network_garbage_collector_nics Success 1404.04 test_vpc_redundant.py
test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers Success 538.71 test_vpc_redundant.py
test_02_redundant_VPC_default_routes Success 741.04 test_vpc_redundant.py
test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL Success 1286.74 test_vpc_redundant.py
test_09_delete_detached_volume Success 151.44 test_volumes.py
test_08_resize_volume Success 156.61 test_volumes.py
test_07_resize_fail Success 156.51 test_volumes.py
test_06_download_detached_volume Success 157.60 test_volumes.py
test_05_detach_volume Success 151.18 test_volumes.py
test_04_delete_attached_volume Success 151.37 test_volumes.py
test_03_download_attached_volume Success 156.77 test_volumes.py
test_02_attach_volume Success 94.64 test_volumes.py
test_01_create_volume Success 621.09 test_volumes.py
test_deploy_vm_multiple Success 252.67 test_vm_life_cycle.py
test_deploy_vm Success 0.03 test_vm_life_cycle.py
test_advZoneVirtualRouter Success 0.03 test_vm_life_cycle.py
test_10_attachAndDetach_iso Success 26.80 test_vm_life_cycle.py
test_09_expunge_vm Success 125.25 test_vm_life_cycle.py
test_08_migrate_vm Success 40.93 test_vm_life_cycle.py
test_07_restore_vm Success 0.13 test_vm_life_cycle.py
test_06_destroy_vm Success 130.86 test_vm_life_cycle.py
test_03_reboot_vm Success 126.01 test_vm_life_cycle.py
test_02_start_vm Success 10.18 test_vm_life_cycle.py
test_01_stop_vm Success 35.33 test_vm_life_cycle.py
test_CreateTemplateWithDuplicateName Success 75.64 test_templates.py
test_08_list_system_templates Success 0.03 test_templates.py
test_07_list_public_templates Success 0.04 test_templates.py
test_05_template_permissions Success 0.06 test_templates.py
test_04_extract_template Success 5.16 test_templates.py
test_03_delete_template Success 5.13 test_templates.py
test_02_edit_template Success 90.18 test_templates.py
test_01_create_template Success 25.33 test_templates.py
test_10_destroy_cpvm Success 161.44 test_ssvm.py
test_09_destroy_ssvm Success 163.81 test_ssvm.py
test_08_reboot_cpvm Success 101.37 test_ssvm.py
test_07_reboot_ssvm Success 133.56 test_ssvm.py
test_06_stop_cpvm Success 131.60 test_ssvm.py
test_05_stop_ssvm Success 133.67 test_ssvm.py
test_04_cpvm_internals Success 1.02 test_ssvm.py
test_03_ssvm_internals Success 3.29 test_ssvm.py
test_02_list_cpvm_vm Success 0.15 test_ssvm.py
test_01_list_sec_storage_vm Success 0.13 test_ssvm.py
test_01_snapshot_root_disk Success 11.10 test_snapshots.py
test_04_change_offering_small Success 210.81 test_service_offerings.py
test_03_delete_service_offering Success 0.04 test_service_offerings.py
test_02_edit_service_offering Success 0.05 test_service_offerings.py
test_01_create_service_offering Success 0.11 test_service_offerings.py
test_02_sys_template_ready Success 0.13 test_secondary_storage.py
test_01_sys_vm_start Success 0.18 test_secondary_storage.py
test_09_reboot_router Success 40.33 test_routers.py
test_08_start_router Success 30.28 test_routers.py
test_07_stop_router Success 10.16 test_routers.py
test_06_router_advanced Success 0.06 test_routers.py
test_05_router_basic Success 0.04 test_routers.py
test_04_restart_network_wo_cleanup Success 5.68 test_routers.py
test_03_restart_network_cleanup Success 55.49 test_routers.py
test_02_router_internal_adv Success 0.85 test_routers.py
test_01_router_internal_basic Success 0.46 test_routers.py
test_router_dns_guestipquery Success 76.73 test_router_dns.py
test_router_dns_externalipquery Success 0.08 test_router_dns.py
test_router_dhcphosts Success 288.86 test_router_dhcphosts.py
test_router_dhcp_opts Success 21.62 test_router_dhcphosts.py
test_01_updatevolumedetail Success 0.08 test_resource_detail.py
test_01_reset_vm_on_reboot Success 130.88 test_reset_vm_on_reboot.py
test_createRegion Success 0.04 test_regions.py
test_create_pvlan_network Success 5.21 test_pvlan.py
test_dedicatePublicIpRange Success 0.44 test_public_ip_range.py
test_03_vpc_privategw_restart_vpc_cleanup Success 439.45 test_privategw_acl.py
test_02_vpc_privategw_static_routes Success 349.86 test_privategw_acl.py
test_01_vpc_privategw_acl Success 92.19 test_privategw_acl.py
test_01_primary_storage_nfs Success 35.80 test_primary_storage.py
test_createPortablePublicIPRange Success 15.19 test_portable_publicip.py
test_createPortablePublicIPAcquire Success 15.48 test_portable_publicip.py
test_isolate_network_password_server Success 89.22 test_password_server.py
test_UpdateStorageOverProvisioningFactor Success 0.14 test_over_provisioning.py
test_oobm_zchange_password Success 30.70 test_outofbandmanagement.py
test_oobm_multiple_mgmt_server_ownership Success 16.33 test_outofbandmanagement.py
test_oobm_issue_power_status Success 5.22 test_outofbandmanagement.py
test_oobm_issue_power_soft Success 15.31 test_outofbandmanagement.py
test_oobm_issue_power_reset Success 15.47 test_outofbandmanagement.py
test_oobm_issue_power_on Success 15.32 test_outofbandmanagement.py
test_oobm_issue_power_off Success 15.33 test_outofbandmanagement.py
test_oobm_issue_power_cycle Success 15.34 test_outofbandmanagement.py
test_oobm_enabledisable_across_clusterzones Success 92.56 test_outofbandmanagement.py
test_oobm_enable_feature_valid Success 5.15 test_outofbandmanagement.py
test_oobm_enable_feature_invalid Success 0.09 test_outofbandmanagement.py
test_oobm_disable_feature_valid Success 5.36 test_outofbandmanagement.py
test_oobm_disable_feature_invalid Success 0.10 test_outofbandmanagement.py
test_oobm_configure_invalid_driver Success 0.07 test_outofbandmanagement.py
test_oobm_configure_default_driver Success 0.07 test_outofbandmanagement.py
test_oobm_background_powerstate_sync Success 23.43 test_outofbandmanagement.py
test_extendPhysicalNetworkVlan Success 15.36 test_non_contigiousvlan.py
test_01_nic Success 419.17 test_nic.py
test_releaseIP Success 157.40 test_network.py
test_reboot_router Success 383.34 test_network.py
test_public_ip_user_account Success 10.25 test_network.py
test_public_ip_admin_account Success 40.30 test_network.py
test_network_rules_acquired_public_ip_3_Load_Balancer_Rule Success 67.10 test_network.py
test_network_rules_acquired_public_ip_2_nat_rule Success 61.71 test_network.py
test_network_rules_acquired_public_ip_1_static_nat_rule Success 121.06 test_network.py
test_delete_account Success 288.17 test_network.py
test_02_port_fwd_on_non_src_nat Success 55.65 test_network.py
test_01_port_fwd_on_src_nat Success 109.71 test_network.py
test_nic_secondaryip_add_remove Success 202.70 test_multipleips_per_nic.py
login_test_saml_user Success 19.32 test_login.py
test_assign_and_removal_lb Success 133.09 test_loadbalance.py
test_02_create_lb_rule_non_nat Success 187.10 test_loadbalance.py
test_01_create_lb_rule_src_nat Success 207.50 test_loadbalance.py
test_03_list_snapshots Success 0.05 test_list_ids_parameter.py
test_02_list_templates Success 0.04 test_list_ids_parameter.py
test_01_list_volumes Success 0.03 test_list_ids_parameter.py
test_07_list_default_iso Success 0.06 test_iso.py
test_05_iso_permissions Success 0.06 test_iso.py
test_04_extract_Iso Success 5.16 test_iso.py
test_03_delete_iso Success 95.21 test_iso.py
test_02_edit_iso Success 0.06 test_iso.py
test_01_create_iso Success 21.03 test_iso.py
test_04_rvpc_internallb_haproxy_stats_on_all_interfaces Success 193.60 test_internal_lb.py
test_03_vpc_internallb_haproxy_stats_on_all_interfaces Success 127.59 test_internal_lb.py
test_02_internallb_roundrobin_1RVPC_3VM_HTTP_port80 Success 515.71 test_internal_lb.py
test_01_internallb_roundrobin_1VPC_3VM_HTTP_port80 Success 426.01 test_internal_lb.py
test_dedicateGuestVlanRange Success 10.26 test_guest_vlan_range.py
test_UpdateConfigParamWithScope Success 0.14 test_global_settings.py
test_rolepermission_lifecycle_update Success 6.17 test_dynamicroles.py
test_rolepermission_lifecycle_list Success 5.99 test_dynamicroles.py
test_rolepermission_lifecycle_delete Success 5.89 test_dynamicroles.py
test_rolepermission_lifecycle_create Success 5.94 test_dynamicroles.py
test_rolepermission_lifecycle_concurrent_updates Success 6.01 test_dynamicroles.py
test_role_lifecycle_update_role_inuse Success 5.93 test_dynamicroles.py
test_role_lifecycle_update Success 10.99 test_dynamicroles.py
test_role_lifecycle_list Success 5.91 test_dynamicroles.py
test_role_lifecycle_delete Success 10.95 test_dynamicroles.py
test_role_lifecycle_create Success 5.97 test_dynamicroles.py
test_role_inuse_deletion Success 5.92 test_dynamicroles.py
test_role_account_acls_multiple_mgmt_servers Success 8.06 test_dynamicroles.py
test_role_account_acls Success 8.31 test_dynamicroles.py
test_default_role_deletion Success 5.99 test_dynamicroles.py
test_04_create_fat_type_disk_offering Success 0.07 test_disk_offerings.py
test_03_delete_disk_offering Success 0.04 test_disk_offerings.py
test_02_edit_disk_offering Success 0.05 test_disk_offerings.py
test_02_create_sparse_type_disk_offering Success 0.07 test_disk_offerings.py
test_01_create_disk_offering Success 0.11 test_disk_offerings.py
test_deployvm_userdispersing Success 20.59 test_deploy_vms_with_varied_deploymentplanners.py
test_deployvm_userconcentrated Success 20.61 test_deploy_vms_with_varied_deploymentplanners.py
test_deployvm_firstfit Success 60.77 test_deploy_vms_with_varied_deploymentplanners.py
test_deployvm_userdata_post Success 10.37 test_deploy_vm_with_userdata.py
test_deployvm_userdata Success 45.63 test_deploy_vm_with_userdata.py
test_02_deploy_vm_root_resize Success 5.99 test_deploy_vm_root_resize.py
test_01_deploy_vm_root_resize Success 6.01 test_deploy_vm_root_resize.py
test_00_deploy_vm_root_resize Success 212.48 test_deploy_vm_root_resize.py
test_deploy_vm_from_iso Success 202.41 test_deploy_vm_iso.py
test_DeployVmAntiAffinityGroup Success 65.97 test_affinity_groups.py
test_03_delete_vm_snapshots Skipped 0.00 test_vm_snapshots.py
test_02_revert_vm_snapshots Skipped 0.00 test_vm_snapshots.py
test_01_test_vm_volume_snapshot Skipped 0.00 test_vm_snapshots.py
test_01_create_vm_snapshots Skipped 0.00 test_vm_snapshots.py
test_06_copy_template Skipped 0.00 test_templates.py
test_static_role_account_acls Skipped 0.02 test_staticroles.py
test_01_scale_vm Skipped 0.00 test_scale_vm.py
test_01_primary_storage_iscsi Skipped 0.04 test_primary_storage.py
test_06_copy_iso Skipped 0.00 test_iso.py
test_deploy_vgpu_enabled_vm Skipped 0.01 test_deploy_vgpu_enabled_vm.py

@serg38

serg38 commented Feb 20, 2017

@nvazquez @rhtyd @anshul1886 We might need to tweak test_data.py to add an additional NFS mount, e.g. nfs1, and use it in this test, for example by adding:
"nfs1": {
"url": "nfs://nfs/export/automation/1/testprimary1",
"name": "Primary XEN1"
},
The BlueOrangutan environment will need to be adjusted as well. Otherwise the test fails with:
CloudstackAPIException: Execute cmd: createstoragepool failed, due to: errorCode: 530, errorText:Failed to add data store: Storage pool NFS://10.2.0.16/acs/primary/pr1847-t854-kvm-centos7/marvin_pri1 already in use by another pod (id=1)\n']

@rohityadavcloud
Member

@borisstoyanov can you have a look at ^^, thanks.

@serg38

serg38 commented Feb 22, 2017

@karuturi @anshul1886 Can we tweak line 278 of test_snapshots.py from
self.services["nfs"],
to
self.services["nfs2"],

and re-merge this PR so it doesn't conflict with other tests using the "nfs" test data?

@nvazquez
Contributor

@borisstoyanov @rhtyd I was checking BlueOrangutan logs:

In test_primary_storage_8NPG5G\runinfo.txt lines 27-30, there's PS creation:

2017-02-20 11:03:59,678 - DEBUG - Payload: {'apiKey': u'LIN6rqXuaJwMPfGYFh13qDwYz5VNNz1J2J6qIOWcd3oLQOq0WtD4CwRundBL6rzXToa3lQOC_vKjI3nkHtiD8Q', 'name': 'Marvin Primary Pool', 'url': 'NFS://10.2.0.16/acs/primary/pr1847-t854-kvm-centos7/marvin_pri1', 'podid': u'1c6f35e3-a31e-49cc-b0b0-801848dbb9bc', 'clusterid': u'8ac40777-344d-4f36-93a3-7b685361a523', 'zoneid': u'9c474c4e-0742-4fbb-b0ae-cb3d1e63cb70', 'command': 'createStoragePool', 'signature': 'zq6ff7P2n9iLmw3Cw9nDgMM1v90=', 'response': 'json'}
2017-02-20 11:03:59,678 - DEBUG - ========Sending GET Cmd : createStoragePool=======
2017-02-20 11:04:00,038 - DEBUG - Response : {podname : u'Pod1', name : u'Marvin Primary Pool', disksizeallocated : 0, created : u'2017-02-20T11:04:00+0000', clustername : u'p1-c1', ipaddress : u'10.2.0.16', podid : u'1c6f35e3-a31e-49cc-b0b0-801848dbb9bc', clusterid : u'8ac40777-344d-4f36-93a3-7b685361a523', zoneid : u'9c474c4e-0742-4fbb-b0ae-cb3d1e63cb70', state : u'Up', scope : u'CLUSTER', overprovisionfactor : u'2.0', path : u'/acs/primary/pr1847-t854-kvm-centos7/marvin_pri1', zonename : u'pr1847-t854-kvm-centos7', type : u'NetworkFilesystem', id : u'fa44baa2-267d-3f36-88c9-c5c545e0204b', disksizetotal : 7514055770112}
2017-02-20 11:04:00,038 - DEBUG - Created storage pool in cluster: 8ac40777-344d-4f36-93a3-7b685361a523

Then, it gets removed, lines 46-48:

2017-02-20 11:04:35,193 - DEBUG - Payload: {'apiKey': u'LIN6rqXuaJwMPfGYFh13qDwYz5VNNz1J2J6qIOWcd3oLQOq0WtD4CwRundBL6rzXToa3lQOC_vKjI3nkHtiD8Q', 'response': 'json', 'command': 'deleteStoragePool', 'signature': '73NDgRVRZDU4uHR/Vc8pRbVg2p4=', 'id': u'fa44baa2-267d-3f36-88c9-c5c545e0204b'}
2017-02-20 11:04:35,193 - DEBUG - ========Sending GET Cmd : deleteStoragePool=======
2017-02-20 11:04:35,396 - DEBUG - Response : {success : u'true'}

Then, on test_volumes_950C4W\runinfo.txt lines 474-476 listStoragePools command is sent and lists previously deleted PS:

2017-02-20 13:54:49,407 - DEBUG - Payload: {'apiKey': u'LIN6rqXuaJwMPfGYFh13qDwYz5VNNz1J2J6qIOWcd3oLQOq0WtD4CwRundBL6rzXToa3lQOC_vKjI3nkHtiD8Q', 'command': 'listStoragePools', 'signature': 'l3xL+RNaPCVAYEpCovbu1snTIRI=', 'response': 'json'}
2017-02-20 13:54:49,407 - DEBUG - ========Sending GET Cmd : listStoragePools=======
2017-02-20 13:54:49,432 - DEBUG - Response : [{podname : u'Pod1', storagecapabilities : {VOLUME_SNAPSHOT_QUIESCEVM : u'false'}, name : u'Marvin Primary Pool', disksizeallocated : 8590328832, podid : u'1c6f35e3-a31e-49cc-b0b0-801848dbb9bc', clustername : u'p1-c1', ipaddress : u'10.2.0.16', created : u'2017-02-20T12:51:45+0000', clusterid : u'8ac40777-344d-4f36-93a3-7b685361a523', zoneid : u'9c474c4e-0742-4fbb-b0ae-cb3d1e63cb70', state : u'Up', disksizeused : 174845853696, id : u'fa44baa2-267d-3f36-88c9-c5c545e0204b', overprovisionfactor : u'2.0', path : u'/acs/primary/pr1847-t854-kvm-centos7/marvin_pri1', zonename : u'pr1847-t854-kvm-centos7', type : u'NetworkFilesystem', scope : u'CLUSTER', disksizetotal : 7514055770112}, {podname : u'Pod1', storagecapabilities : {VOLUME_SNAPSHOT_QUIESCEVM : u'false'}, name : u'pr1847-t854-kvm-centos7-kvm-pri2', disksizeallocated : 977060864, podid : u'1c6f35e3-a31e-49cc-b0b0-801848dbb9bc', clustername : u'p1-c1', ipaddress : u'10.2.0.16', created : u'2017-02-20T08:52:51+0000', clusterid : u'8ac40777-344d-4f36-93a3-7b685361a523', zoneid : u'9c474c4e-0742-4fbb-b0ae-cb3d1e63cb70', state : u'Up', disksizeused : 174845853696, id : u'7a360219-f4a9-3edb-a1e3-241cbc2dee7f', overprovisionfactor : u'2.0', path : u'/acs/primary/pr1847-t854-kvm-centos7/pr1847-t854-kvm-centos7-kvm-pri2', zonename : u'pr1847-t854-kvm-centos7', type : u'NetworkFilesystem', scope : u'CLUSTER', disksizetotal : 7514055770112}, {podname : u'Pod1', storagecapabilities : {VOLUME_SNAPSHOT_QUIESCEVM : u'false'}, name : u'pr1847-t854-kvm-centos7-kvm-pri1', disksizeallocated : 3221422592, podid : u'1c6f35e3-a31e-49cc-b0b0-801848dbb9bc', clustername : u'p1-c1', ipaddress : u'10.2.0.16', created : u'2017-02-20T08:52:50+0000', clusterid : u'8ac40777-344d-4f36-93a3-7b685361a523', zoneid : u'9c474c4e-0742-4fbb-b0ae-cb3d1e63cb70', state : u'Up', disksizeused : 174845853696, id : u'4435931f-bb5a-3ce6-a2f5-6f7345d5eb0a', overprovisionfactor : u'2.0', path : 
u'/acs/primary/pr1847-t854-kvm-centos7/pr1847-t854-kvm-centos7-kvm-pri1', zonename : u'pr1847-t854-kvm-centos7', type : u'NetworkFilesystem', scope : u'CLUSTER', disksizetotal : 7514055770112}]

Please note that although there is almost a 3-hour difference between the logs, a PS with the same id ("fa44baa2-267d-3f36-88c9-c5c545e0204b") exists in both, but the creation times are different:
created: u'2017-02-20T11:04:00+0000' vs created : u'2017-02-20T12:51:45+0000'

So, when test_snapshots.py gets executed, it tries to create a PS and fails because there is already a PS with the same URL. I agree with @serg38 that adding a new entry to test_data.py would solve the problem, but I think it would require modifying test_data.py each time we want to create a PS using NFS. Since you know how BlueOrangutan works, what do you think would be the best approach?

@borisstoyanov
Contributor

Hi @nvazquez, let me summarize what B.O. does: when it reads 'test', it kicks off a Jenkins job that deploys the PR, deploys a zone, and executes all tests in the smoke tests directory.
Personally, I don't see anything wrong with appending an item to test_data.py and using it, as long as we make sure to do self.cleanup.append(new_object). In particular, this object can be reused by any test as long as it's not already added.

asfgit pushed a commit that referenced this pull request Mar 28, 2017
Fix for test_snapshots.py using nfs2 instead of nfs template

Fix for marvin test failure introduced in #1847

Cc: @borisstoyanov @rhtyd @karuturi

* pr/1961:
  Fix for test failure
  Fix for test_snapshots.py using nfs2 instead of nfs template

Signed-off-by: Rajani Karuturi <rajani.karuturi@accelerite.com>