Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

CA-112324: Check destination SR's connectivity during VM migration. #1556

Merged
merged 1 commit into from Jul 31, 2014

Conversation

Projects
None yet
7 participants
Contributor

ravippandey commented Nov 25, 2013

Issue: During VM migration if the destination SR is not attached , then the migration fails with a bad error message with an ungraceful view

Fix: Defined an error message in api_errors.ml namely "sr_not_attached". Catching the exception in storage_migrate.ml before mirroring starts and displaying it in a graceful manner.

Signed-off-by: Ravi Pandey ravi.pandey@citrix.com

Owner

xen-git commented Nov 25, 2013

Can one of the admins verify this patch for testing?

Owner

jonludlam commented Nov 26, 2013

@xen-git ok to test.

Owner

jonludlam commented Nov 26, 2013

Did we really not have an SR not attached API error?

Contributor

ravippandey commented Nov 26, 2013

I searched and didn't find it. So, I think its not there.

Owner

robhoes commented May 9, 2014

@xen-git re-test this please

Owner

robhoes commented May 9, 2014

Shouldn't the sender try to plug the SR's PBD on the receiver (e.g. in Xapi_vm_migrate.migrate_send')? I believe this is done for networks.

Owner

robhoes commented May 9, 2014

@jonludlam What do you think?

Member

euanh commented Jul 16, 2014

@jonludlam: Poke!

Owner

jonludlam commented Jul 16, 2014

It's a good point, and I like the symmetry with networks, so I agree we should make the suggested change. We should make sure the destination isn't disabled before plugging though.

Contributor

ravippandey commented Jul 21, 2014

I have put the modifications. I think it should handle all the cases now, possibly. It can be reviewed now.
Note: I have used an API call to get the master host on destination pool. (https://github.com/ravippandey/xen-api/blob/CA-112324/ocaml/xapi/xapi_vm_migrate.ml#L375).
I think we can avoid this API call if we add master host ref too in token given by migrate_receive, which seems more logical to me.(https://github.com/ravippandey/xen-api/blob/CA-112324/ocaml/xapi/xapi_host.ml#L1479-1484) Any thoughts?

Owner

robhoes commented Jul 21, 2014

Do you still need the new SR_NOT_ATTACHED API error? Now that you are automatically plugging the PBD, is it still possible for the error to occur?

Owner

robhoes commented Jul 21, 2014

So you are plugging the PBDs of the receiving host as well as the receiving pool's master. I don't know why the latter would be necessary, but perhaps @jonludlam could verify this.

Member

thomassa commented Jul 21, 2014

When Ravi and I were discussing this yesterday, Ravi said that if PBD is plugged only on the receiving host (and that host is a slave) then the Sr_not_attached(sr_uuid) exception happens.

(I.e. it does seem to be necessary to plug the PBD on the master as well, but I'm not sure why. Maybe it shouldn't be. If it has to be, then I feel it would be better for this to be handled by the receiving pool if practical.)

In the current code it looks to me as if the automatic plugging (of both recipient and master) happens on intra-pool as well as cross-pool migration. I don't know whether or not the master needs it when the migration is between two of its slaves in the same pool.

Contributor

ravippandey commented Jul 22, 2014

@robhoes Even though we are plugging the SR, i guess it is safe to keep the SR_NOT_ATTACHED error. It might prove helpful to us in future just like it did in knowing the above issue where master needs to be SR-attached even though it is not recipient.
I investigated and observed that if master is not SR-attached then it fails at https://github.com/ravippandey/xen-api/blob/CA-112324/ocaml/xapi/storage_migrate.ml#L333 with SR_NOT_ATTACHED error.

@thomassa yes, migration will fail in case of two slaves in same pool with the same error. Also in the case where we migrate a VM from LS to NFS on same host (slave) and NFS is not master attached, it will fail.

Owner

robhoes commented Jul 22, 2014

@ravippandey I think it looks good now, thanks. I would still like @jonludlam to do a final sanity check before putting this in though.

Contributor

ravippandey commented Jul 22, 2014

Member

thomassa commented Jul 22, 2014

There might be a problem: I think the current code will attempt to plug the destination storage into the pool master even if the storage is local storage on a slave.

@jonludlam pointed out this morning that the storage has to be plugged on the storage master for that SR. (This isn't always the same as the pool-master: local-storage is the obvious example.)

Also, we should check that the SR-master is active; at present we do this for the destination host only in assert_can_migrate which is called before migrate_send:
https://github.com/ravippandey/xen-api/blob/CA-112324/ocaml/xapi/message_forwarding.ml#L1663
https://github.com/ravippandey/xen-api/blob/CA-112324/ocaml/xapi/xapi_vm_migrate.ml#L748

@ravippandey ravippandey reopened this Jul 23, 2014

Contributor

ravippandey commented Jul 23, 2014

@thomassa i guess the local storage getting plugged into pool master will never happen because the line https://github.com/ravippandey/xen-api/blob/CA-112324/ocaml/xapi/xapi_vm_migrate.ml#L378 will return only 1 PBD for local storage. It will be only that PBD which will finally trickle down to https://github.com/ravippandey/xen-api/blob/CA-112324/ocaml/xapi/xapi_vm_migrate.ml#L380 to be plugged. That PBD.plug will result into plugging into the destination host and not pool master.
Please correct if i am still missing something in my logic.

I have put the check whether master is enabled.
thanks

Member

thomassa commented Jul 23, 2014

Ah yes, I think you're right.

Contributor

akshayramani commented Jul 24, 2014

@xen-git retest this please.

@jonludlam jonludlam commented on an outdated diff Jul 24, 2014

ocaml/xapi/xapi_vm_migrate.ml
@@ -367,6 +367,23 @@ let migrate_send' ~__context ~vm ~dest ~live ~vdi_map ~vif_map ~options =
failwith ("No SR specified in VDI map for VDI " ^ vdi_uuid)
in
+ (* Plug the destination shared SR into destination host and pool master if unplugged.
+ Plug the local SR into destination host only if unplugged *)
+ let master_host =
+ let pool = List.hd (XenAPI.Pool.get_all remote_rpc session_id) in
+ XenAPI.Pool.get_master remote_rpc session_id pool in
+ let check_master_enabled = XenAPI.Host.get_enabled remote_rpc session_id master_host in
+ if not check_master_enabled then raise (Api_errors.Server_error (Api_errors.host_disabled,[Ref.string_of master_host]));
@jonludlam

jonludlam Jul 24, 2014

Owner

You're only checking whether the master is enabled or not. I suggest moving this check into the List.filter below.

@ravippandey ravippandey commented on an outdated diff Jul 24, 2014

ocaml/xapi/xapi_vm_migrate.ml
@@ -367,6 +367,23 @@ let migrate_send' ~__context ~vm ~dest ~live ~vdi_map ~vif_map ~options =
failwith ("No SR specified in VDI map for VDI " ^ vdi_uuid)
in
+ (* Plug the destination shared SR into destination host and pool master if unplugged.
+ Plug the local SR into destination host only if unplugged *)
+ let master_host =
+ let pool = List.hd (XenAPI.Pool.get_all remote_rpc session_id) in
+ XenAPI.Pool.get_master remote_rpc session_id pool in
+ let check_master_enabled = XenAPI.Host.get_enabled remote_rpc session_id master_host in
+ if not check_master_enabled then raise (Api_errors.Server_error (Api_errors.host_disabled,[Ref.string_of master_host]));
+
+ let pbds = XenAPI.SR.get_PBDs remote_rpc session_id dest_sr_ref in
+ let pbd_host_pair = List.map (fun pbd -> (pbd, XenAPI.PBD.get_host remote_rpc session_id pbd)) pbds in
+ let pbds_to_be_plugged = List.filter (fun (pbd, host) ->
+ XenAPI.PBD.get_host remote_rpc session_id pbd = Ref.of_string dest_host ||
@ravippandey

ravippandey Jul 24, 2014

Contributor

I just realized a mistake. I am calling XenAPI.PBD.get_host twice which is not required I think.

@jonludlam jonludlam added the master label Jul 28, 2014

CA-112324: Check destination SR's connectivity during VM migration.
Signed-off-by: Ravi Pandey <ravi.pandey@citrix.com>
Owner

jonludlam commented Jul 31, 2014

Looks much nicer now!

jonludlam added a commit that referenced this pull request Jul 31, 2014

Merge pull request #1556 from ravippandey/CA-112324
CA-112324: Check destination SR's connectivity during VM migration.

@jonludlam jonludlam merged commit 4d23d5e into xapi-project:master Jul 31, 2014

1 check passed

default Merged build finished.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment