Continuous Replication 5.9 fail to delete VDI from VM XO_DELTA_EXPORT #2227

Closed
fulvio61 opened this issue Jun 21, 2017 · 28 comments

@fulvio61 fulvio61 commented Jun 21, 2017

Context

  • xo-server 5.9.4
  • xo-web 5.9.1

Current behavior

At the end of a Continuous Replication job, XO fails to delete the intermediate snapshot VDI in the storage repository of the source host of the replica.

Nothing is written in Settings > Logs.
The log table in the backup section reports the job as Finished (green).

In /var/log/syslog, xo-server reports a problem:

Jun 21 12:40:52 ubuntu xo-server[1574]: cannot delete VDI AML ERP (from VM XO_DELTA_EXPORT: Local storage (352aa767-575f-cf7f-3757-b596edfa47ea))

In the storage repository of the replica's source host, we found some zombie VDIs named after the VM.

Here is the link to the discussion on the XO forum:
https://xen-orchestra.com/forum/topic/470/continuous-replication-5-9-fail-to-delete-vdi-from-vm-xo_delta_export
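
Since the job is reported as Finished and nothing shows up in Settings > Logs, the syslog line is the only trace of the failure. Below is a minimal sketch for spotting such lines in a log file. The pattern is inferred from the single sample line quoted in this report, so it is an assumption rather than a documented xo-server format, and the function name is hypothetical.

```python
import re

# Pattern inferred from the one error line quoted in this issue.
# Assumption: the VM name ends with a UUID in parentheses (presumably the SR's).
CANNOT_DELETE = re.compile(
    r"cannot delete VDI (?P<vdi>.+?) "
    r"\(from VM (?P<vm>.+) \((?P<uuid>[0-9a-f-]{36})\)\)\s*$"
)


def find_failed_vdi_deletions(lines):
    """Return one dict (vdi, vm, uuid) per 'cannot delete VDI' line."""
    failures = []
    for line in lines:
        match = CANNOT_DELETE.search(line)
        if match:
            failures.append(match.groupdict())
    return failures


# The sample is the syslog line from the report, split for readability.
sample = [
    "Jun 21 12:40:52 ubuntu xo-server[1574]: cannot delete VDI AML ERP "
    "(from VM XO_DELTA_EXPORT: Local storage "
    "(352aa767-575f-cf7f-3757-b596edfa47ea))",
    "Jun 21 12:40:53 ubuntu xo-server[1574]: some unrelated line",
]

for failure in find_failed_vdi_deletions(sample):
    print(failure["vdi"], "|", failure["vm"], "|", failure["uuid"])
```

To scan a real log, pass an open file object (for example `open("/var/log/syslog")`) instead of `sample`.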

Member

@olivierlambert olivierlambert commented Jun 26, 2017

Hi @fulvio61

Can you try switching your xo-server to the branch called fix-export-import? Don't forget to run yarn after switching branches.

Keep us posted, it should fix your issue!

Author

@fulvio61 fulvio61 commented Jun 28, 2017

I ran some tests, and the situation got worse.

Now even the first run of the replica does not complete successfully.

In the Task section, the job is completed (100%) and disappears.
In the Logs table in the backup section, the job is not marked as finished (it remains marked "Started" in yellow).
In the Settings > Logs section: "No logs so far".

On the machine receiving the replica, the VM appears with the suffix [Importing ...]

image

Member

@julien-f julien-f commented Jun 28, 2017

Yes, I've seen this too, I'm investigating.

Member

@olivierlambert olivierlambert commented Jun 29, 2017

@fulvio61 Check the latest bits of this branch; we made modifications 👍

Author

@fulvio61 fulvio61 commented Jun 29, 2017

I ran some tests again after applying the latest fix.
Now the job runs to the end and is marked as "finished" (green) in the backup log.
The replicated VMs on the target Xen host (replica) are renamed correctly.

But... the initial problem remains.

XO cannot delete the snapshots created for replication, which become zombies.

Basically... we're back to the beginning.

image

image

Contributor

@Danp2 Danp2 commented Jun 29, 2017

Could this be the "base copy" showing in XC?

FWIW, I see some odd looking (possibly unrelated) issues on my end as well. Here's an example from XO --

image

and the corresponding info from XC --

image

Member

@olivierlambert olivierlambert commented Jun 29, 2017

Looks like a lot of orphaned VDIs. Can you find the details for them (xe vdi-param-list uuid=<VDI UUID>) or, even better, display the SR content using xapi-explore-sr?

Contributor

@Danp2 Danp2 commented Jun 29, 2017

@olivierlambert Was that directed at @fulvio61, me, or both? ;-)

Shouldn't any orphaned VDIs be listed under Dashboard > Health? Mine is completely blank.

Member

@olivierlambert olivierlambert commented Jun 29, 2017

Both ;)

Well, it's a vocabulary issue. An "orphan" could be:

  1. a VDI snapshot without a parent (that's not normal at all, and it should be deleted anyway because it is useless)
  2. a VDI without any VBD (that could be normal if you chose to keep a VDI that's not attached to any VM for a specific reason, but if it wasn't your decision, then it's OK to remove it because you don't use it at all)

Health view only shows VDIs that are in scenario 1.
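
The two meanings of "orphan" described above can be sketched as a small classifier. The `Vdi` record and its field names are hypothetical simplifications made up for this illustration; real XAPI VDI objects carry many more fields, and this is not how XO implements its Health view.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vdi:
    # Hypothetical, simplified record used only for this illustration.
    name: str
    is_snapshot: bool
    snapshot_of: Optional[str]  # UUID of the parent VDI, None if it is gone
    vbd_count: int              # number of VBDs attaching this VDI to a VM


def classify_orphan(vdi: Vdi) -> Optional[str]:
    """Return which kind of orphan a VDI is, or None if it is not one."""
    if vdi.is_snapshot and vdi.snapshot_of is None:
        return "snapshot-without-parent"  # scenario 1: always safe to delete
    if vdi.vbd_count == 0:
        return "no-vbd"                   # scenario 2: may be intentional
    return None


def shown_in_health_view(vdi: Vdi) -> bool:
    """Per the comment above, the Health view lists scenario 1 only."""
    return classify_orphan(vdi) == "snapshot-without-parent"


examples = [
    Vdi("lost snapshot", is_snapshot=True, snapshot_of=None, vbd_count=0),
    Vdi("detached disk", is_snapshot=False, snapshot_of=None, vbd_count=0),
    Vdi("disk in use", is_snapshot=False, snapshot_of=None, vbd_count=1),
]
for vdi in examples:
    print(vdi.name, classify_orphan(vdi), shown_in_health_view(vdi))
```

Under this reading, a VDI with no VBD never appears in Dashboard > Health, which would explain a blank Health view even when unattached VDIs exist.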

Contributor

@Danp2 Danp2 commented Jun 29, 2017

@olivierlambert Sent to your attention via email

Member

@olivierlambert olivierlambert commented Jun 29, 2017

Can you use the full format? Because I don't have color in the text.

@julien-f julien-f closed this in 9c34e64 Jun 30, 2017

Contributor

@Danp2 Danp2 commented Jun 30, 2017

That was the full format. It lost the coloring when I copied it from the console.

Member

@julien-f julien-f commented Jun 30, 2017

Yes, it's not perfect. You can force a monochrome render by piping into cat (xapi-explore-sr ... | cat) or by redirecting directly to a file (xapi-explore-sr ... > output.txt).
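
Piping or redirecting gives monochrome output presumably because the tool only emits color when its output is a terminal; that is a common CLI convention, not something confirmed here for xapi-explore-sr. A minimal sketch of that convention, with hypothetical helper names, which also shows how to strip color codes from text that was already captured:

```python
import io
import re
import sys

# Matches ANSI SGR (color/style) escape sequences such as "\x1b[32m".
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_colors(text: str) -> str:
    """Remove ANSI color/style sequences from text."""
    return ANSI_SGR.sub("", text)


def render(text: str, stream=sys.stdout) -> str:
    """Keep colors only when the target stream is a terminal."""
    return text if stream.isatty() else strip_colors(text)


colored = "\x1b[32mvdi\x1b[0m base copy"

# An in-memory buffer is not a terminal, just like a pipe or a file,
# so the colors are dropped.
print(render(colored, io.StringIO()))
```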

Contributor

@Danp2 Danp2 commented Jun 30, 2017

Resent another copy

Author

@fulvio61 fulvio61 commented Jun 30, 2017

image

Member

@olivierlambert olivierlambert commented Jun 30, 2017

Thanks @Danp2 @fulvio61

Author

@fulvio61 fulvio61 commented Jul 5, 2017

@olivierlambert Hi Olivier... I can't tell whether this problem is fixed or not.

Member

@julien-f julien-f commented Jul 5, 2017

@fulvio61 A lot of the issues are resolved but there are indeed some problems left.

We'll continue to investigate.

@julien-f julien-f reopened this Jul 5, 2017

Author

@fulvio61 fulvio61 commented Jul 5, 2017

@julien-f thanks

Member

@olivierlambert olivierlambert commented Jul 20, 2017

@fulvio61 do you still get the "cannot delete VDI" error message?

Author

@fulvio61 fulvio61 commented Jul 20, 2017

@olivierlambert
I have to run a test on the latest version to see if anything has changed...
I'm currently still using XO version 5.6.4...

Author

@fulvio61 fulvio61 commented Jul 20, 2017

@olivierlambert I just tried running a replica job one more time and yes, the problem occurs again:

xo-server 5.10.0
xo-web 5.10.4

Jul 20 15:26:19 ubuntu xo-server[1555]: Thu, 20 Jul 2017 13:26:19 GMT xo:xapi Snapshotting VM AML ERP - 192.168.1.15
Jul 20 15:26:23 ubuntu xo-server[1555]: Thu, 20 Jul 2017 13:26:23 GMT xo:xapi exporting VDI AML ERP (from base AML ERP)
Jul 20 15:26:26 ubuntu xo-server[1555]: Thu, 20 Jul 2017 13:26:26 GMT xo:xapi Creating VM AML ERP - 192.168.1.15 (2017-07-20)
Jul 20 15:26:26 ubuntu xo-server[1555]: Thu, 20 Jul 2017 13:26:26 GMT xo:xapi Cloning VDI AML ERP
Jul 20 15:26:29 ubuntu xo-server[1555]: Thu, 20 Jul 2017 13:26:29 GMT xo:xapi Creating VBD for VDI AML ERP on VM AML ERP - 192.168.1.15 (2017-07-20)
Jul 20 15:26:29 ubuntu xo-server[1555]: Thu, 20 Jul 2017 13:26:29 GMT xo:xapi Creating VIF for VM AML ERP - 192.168.1.15 (2017-07-20) on network Pool-wide network associated with eth0
Jul 20 15:26:34 ubuntu xo-server[1555]: Thu, 20 Jul 2017 13:26:34 GMT xo:xapi Deleting VM AML ERP - 192.168.1.15 (2017-07-20)
Jul 20 15:26:34 ubuntu xo-server[1555]: Thu, 20 Jul 2017 13:26:34 GMT xo:xapi Deleting VDI AML ERP
Jul 20 15:26:34 ubuntu xo-server[1555]: Thu, 20 Jul 2017 13:26:34 GMT xo:xapi Deleting VM XO_DELTA_EXPORT: Local storage (352aa767-575f-cf7f-3757-b596edfa47ea)
Jul 20 15:26:34 ubuntu xo-server[1555]: cannot delete VDI AML ERP (from VM XO_DELTA_EXPORT: Local storage (352aa767-575f-cf7f-3757-b596edfa47ea))

image

Member

@olivierlambert olivierlambert commented Jul 20, 2017

Thanks for your feedback, @fulvio61!

Member

@olivierlambert olivierlambert commented Aug 2, 2017

@fulvio61 this time it should work. Can you try again?

Author

@fulvio61 fulvio61 commented Aug 7, 2017

Hi Olivier, this is today's replica test report, done with the latest version.
I ran the test on the "AML ERP" VM only.
Runs 1 and 2 were OK.
On run 3 the problem occurred.
As you can see, the problem still persists.

(xo-server 5.10.0 - xo-web 5.10.4)

image

image

image

image

@fulvio61 fulvio61 closed this Aug 7, 2017

Member

@olivierlambert olivierlambert commented Aug 7, 2017

You are not on the latest version, please upgrade (5.11.1 for xo-server)

@olivierlambert olivierlambert reopened this Aug 7, 2017

Author

@fulvio61 fulvio61 commented Aug 8, 2017

Hello Olivier, last night I rebuilt everything and now I'm on the latest version (xo-server 5.11.1 - xo-web 5.11.0).
After several backups and a lot of replication sessions I can confirm that the problem is fixed.
Everything seems to work properly.
Thanks for the support.
👍

Member

@olivierlambert olivierlambert commented Aug 8, 2017

Great! Please double-check that you are on 5.11.2, which was released a few hours ago. It also avoids another potential race condition when two or more VM deletions happen at the same time.

If you still have any other problem regarding cont. replication, please open a new issue. Your feedback is always welcome!
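
The thread does not say how 5.11.2 avoids the race between simultaneous VM deletions. One generic way to rule out that class of problem is to serialize the deletions behind a lock, sketched below. The `VmDeleter` class is entirely hypothetical and is not the xo-server fix; the `asyncio.sleep(0)` call stands in for the real API request.

```python
import asyncio


class VmDeleter:
    """Hypothetical helper that lets only one deletion run at a time."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self.active = 0       # deletions currently in progress
        self.max_active = 0   # highest concurrency ever observed
        self.deleted = []

    async def delete(self, vm_name: str) -> None:
        async with self._lock:  # later callers wait here until it is free
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            await asyncio.sleep(0)  # stand-in for the real deletion call
            self.deleted.append(vm_name)
            self.active -= 1


async def main() -> VmDeleter:
    deleter = VmDeleter()
    # Three deletions requested at the same time, as in the described race.
    await asyncio.gather(*(deleter.delete(f"vm-{i}") for i in range(3)))
    return deleter


deleter = asyncio.run(main())
print("max concurrent deletions:", deleter.max_active)
print("deleted:", sorted(deleter.deleted))
```

The trade-off is throughput: deletions that could safely run in parallel now wait for each other, which is usually acceptable for cleanup steps at the end of a job.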
