Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLOUDSTACK-9200: Fixed failed to delete snapshot if snapshot is stuck in Allocated state without any job associated with it #1282

Closed
wants to merge 1 commit into from

Conversation

anshul1886
Copy link

https://issues.apache.org/jira/browse/CLOUDSTACK-9200

This issue is hard to reproduce but if occurs then it may lead to account resources cleanup failures.

@pdube
Copy link
Contributor

pdube commented Dec 23, 2015

How do you know if there are no jobs associated with it?

@anshul1886
Copy link
Author

@pdube Snapshot can be in Allocated state for a split second only in normal scenarios. If it is in Allocated state for more than that time, then it simply means we are either restarting or have to restart the management server. Now if that's the case then it should have been cleaned up by storage cleanup thread during management server start if it has job associated with it. But that is not happening so it is falling in this scenario.

@Slair1
Copy link
Contributor

Slair1 commented Apr 26, 2016

Thanks! I had the same issue, retention settings for snapshots were failing because it couldn't delete two snapshots that were stuck in "allocated' state.

@koushik-das
Copy link
Contributor

@anshul1886 The fix is only for XS snapshots. What about other HVs?

@anshul1886
Copy link
Author

@koushik-das All hypervisor uses XenServerSnapshotStrategy so it should be applicable on all hypervisors.

@swill
Copy link
Contributor

swill commented Apr 28, 2016

@anshul1886 I see in some other places in that file when snapshotDao.remove(snapshotId); is called there is a return true right after it. Can you help me understand why we would or would not do that in this case? Thanks...

@rohityadavcloud
Copy link
Member

@anshul1886 rebase against latest master, thanks

Do you think a cleanup action of some sort be performed, as this will only delete the snapshot in database

@bvbharatk
Copy link
Contributor

ACS CI BVT Run

Sumarry:
Build Number 67
Hypervisor xenserver
NetworkType Advanced
Passed=68
Failed=4
Skipped=3

Link to logs Folder (search by build_no): https://www.dropbox.com/sh/yj3wnzbceo9uef2/AAB6u-Iap-xztdm6jHX9SjPja?dl=0

Failed tests:

  • test_vpc_vpn.py
    • ContextSuite context=TestRVPCSite2SiteVpn>:setup Failed
    • ContextSuite context=TestVpcRemoteAccessVpn>:setup Failing since 2 runs
    • ContextSuite context=TestVpcSite2SiteVpn>:setup Failing since 2 runs
  • test_vm_life_cycle.py
    • test_10_attachAndDetach_iso Failed

Skipped tests:
test_vm_nic_adapter_vmxnet3
test_static_role_account_acls
test_deploy_vgpu_enabled_vm

Passed test suits:
test_deploy_vm_with_userdata.py
test_affinity_groups_projects.py
test_portable_publicip.py
test_over_provisioning.py
test_global_settings.py
test_scale_vm.py
test_service_offerings.py
test_routers_iptables_default_policy.py
test_routers.py
test_reset_vm_on_reboot.py
test_snapshots.py
test_deploy_vms_with_varied_deploymentplanners.py
test_list_ids_parameter.py
test_public_ip_range.py
test_multipleips_per_nic.py
test_regions.py
test_affinity_groups.py
test_network_acl.py
test_pvlan.py
test_volumes.py
test_nic.py
test_deploy_vm_root_resize.py
test_resource_detail.py
test_secondary_storage.py
test_disk_offerings.py

@Slair1
Copy link
Contributor

Slair1 commented Feb 9, 2017

@anshul1886 any idea why this hasn't been merged yet? are there concerns/issues?

@karuturi
Copy link
Member

karuturi commented Feb 9, 2017

@Slair1 Its missing required reviews and tests.(We have a huge backlog. any help in code review/test run is welcome). please refer to https://cwiki.apache.org/confluence/display/CLOUDSTACK/Release+principles+for+Apache+CloudStack+4.6+and+up

… in Allocated state without any job associated with it
@anshul1886
Copy link
Author

@swill Added the return true statement. @rhtyd Snapshot in Allocated state means nothing is done for that snapshot so only deletion from DB is needed.

@cloudmonger
Copy link

ACS CI BVT Run

Sumarry:
Build Number 446
Hypervisor xenserver
NetworkType Advanced
Passed=105
Failed=0
Skipped=7

Link to logs Folder (search by build_no): https://www.dropbox.com/sh/yj3wnzbceo9uef2/AAB6u-Iap-xztdm6jHX9SjPja?dl=0

Failed tests:

Skipped tests:
test_01_test_vm_volume_snapshot
test_vm_nic_adapter_vmxnet3
test_static_role_account_acls
test_11_ss_nfs_version_on_ssvm
test_nested_virtualization_vmware
test_3d_gpu_support
test_deploy_vgpu_enabled_vm

Passed test suits:
test_deploy_vm_with_userdata.py
test_affinity_groups_projects.py
test_portable_publicip.py
test_over_provisioning.py
test_global_settings.py
test_scale_vm.py
test_service_offerings.py
test_routers_iptables_default_policy.py
test_loadbalance.py
test_routers.py
test_reset_vm_on_reboot.py
test_deploy_vms_with_varied_deploymentplanners.py
test_network.py
test_router_dns.py
test_non_contigiousvlan.py
test_login.py
test_deploy_vm_iso.py
test_list_ids_parameter.py
test_public_ip_range.py
test_multipleips_per_nic.py
test_regions.py
test_affinity_groups.py
test_network_acl.py
test_pvlan.py
test_volumes.py
test_nic.py
test_deploy_vm_root_resize.py
test_resource_detail.py
test_secondary_storage.py
test_vm_life_cycle.py
test_routers_network_ops.py
test_disk_offerings.py

@karuturi
Copy link
Member

karuturi commented Apr 6, 2017

@swill @rhtyd can you review?

@@ -245,6 +245,12 @@ public boolean deleteSnapshot(Long snapshotId) {
return true;
}

if(snapshotVO.getState() == Snapshot.State.Allocated) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anshul1886 the code to check and remove snapshot when they are stuck in Allocated state is already in line 223 above, in the same method. Why this redundancy, is this a synchronization issue?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhtyd , you're right, this change has since already been merged by PR #977.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhtyd Missed that change. Closing PR now.

@rohityadavcloud
Copy link
Member

@anshul1886 I've left a comment, the code to check and remove snapshot on allocated state is already in that method?

@anshul1886 anshul1886 closed this Apr 14, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants