-
Notifications
You must be signed in to change notification settings - Fork 292
CA-290237, CA-290473: fix pool-join on bonded networks and VLANs #3607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CA-290237, CA-290473: fix pool-join on bonded networks and VLANs #3607
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good from a clustering point of view, however I'd like a second opinion on whether it is safe to resync VM state before the storage.
(And we can't resync VM state after storage, due to the comment about CA-175353)
Could you create a Jenkins branch and run a BST on it?
Submitted a Jenkins branch with the same name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed with @robhoes that changing the startup sequence is risky. The log_and_ignore part is good (the NIC may not exist at all), we have to avoid getting into a restart loop, however we shouldn't move around the initializing storage if we can avoid it.
Instead we should look up the primary slave of a bond if we can't find the PIF, and try to use that for clustering.
Yeah, the startup ordering stuff is really really fragile. Don't change it unless you must, and if you do, be prepared to spend a while debugging nasty corner cases :-( |
It might be better to fix this in another way that should also fix another bug with clustering not working on startup when the bond gets an IP too late:
Could also consider adding a 2nd step after wait for management IP that uses Startup.OnThread, and does the |
2fafad8
to
e89cb1f
Compare
Will run a BVT and BST. |
ocaml/xapi/xapi.ml
Outdated
"Synchronising VLANs on slave with master", [Startup.OnlySlave; Startup.NoExnRaising], Sync_networking.copy_vlans_from_master ~__context; | ||
"Synchronising tunnels on slave with master", [Startup.OnlySlave; Startup.NoExnRaising], Sync_networking.copy_tunnels_from_master ~__context; | ||
(* CA-290237: Create cluster objects after network objects so PIFs are available *) | ||
"Create any necessary cluster_host objects", [ Startup.NoExnRaising ], (fun () -> Xapi_cluster_host.create_as_necessary __context (Helpers.get_localhost ~__context)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use log_and_ignore_exn
here, so if something goes wrong XAPI doesn't go into a restart loop.
ocaml/xapi/xapi.ml
Outdated
(fun () -> Pool_db_backup.fetch_database_backup ~master_address:(Pool_role.get_master_address()) | ||
~pool_secret:!Xapi_globs.pool_secret ~force:None); | ||
"wait management interface to come up", [ Startup.NoExnRaising ], wait_management_interface; | ||
"wait management interface to come up, resync any cluster_hosts created", [ Startup.NoExnRaising ], wait_management_interface; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and plug any unplugged PBDs.
ocaml/xapi/xapi_pbd.ml
Outdated
if List.mem (Db.Cluster.get_cluster_stack ~__context ~self) required_cluster_stacks | ||
then Xapi_cluster_host.resync_host ~__context ~host | ||
then begin | ||
Xapi_cluster_host.create_as_necessary ~__context ~host; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that the object can be missing here, it got created during the startup sequence, or during join, this one probably doesn't need to be changed, and calling just resync_host is fine.
ocaml/xapi/xapi.ml
Outdated
begin match Xapi_clustering.find_cluster_host ~__context ~host with | ||
| Some self -> ignore | ||
Xapi_mgmt_iface.(wait_for_clustering_ip ~__context ~self); | ||
Xapi_cluster_host.resync_host ~__context ~host |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plugging the PBD already calls resync_host, so the only thing we need here is to wait for an IP.
Rename this to maybe_wait_for_clustering_ip
.
ocaml/xapi/xapi.ml
Outdated
(fun () -> Pool_db_backup.fetch_database_backup ~master_address:(Pool_role.get_master_address()) | ||
~pool_secret:!Xapi_globs.pool_secret ~force:None); | ||
"wait management interface to come up", [ Startup.NoExnRaising ], wait_management_interface; | ||
"wait management interface to come up, resync any cluster_hosts created", [ Startup.NoExnRaising ], wait_management_interface; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a blocking action, if something goes wrong and the networking interface of clustering doesn't come up, it would be nice if we didn't block the whole of XAPI.
So perhaps introduce a second entry here below 'wait for management interface', that says "wait for clustering IP if any, and plug unplugged PBDs" that calls your maybe_wait_for_clustering_ip
function above, and uses Startup.OnThread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So move PBD plugging to the wait_for_clustering_ip
function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, keep it both where it is now, and add it inside the new step as well, we should not change existing behaviour. Other PBDs may successfully plug even if we don't have clustering
5cb789a
to
8754ea3
Compare
I think the logic should work now, the commit message should probably say this about the order
|
4463739
to
058ddad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets wait for tests to finish running before merging.
ocaml/xapi/xapi_mgmt_iface.ml
Outdated
(fun () -> while !iP="" || not !attached do | ||
Condition.wait ip_cond ip_mutex; | ||
iP := Db.PIF.get_IP ~__context ~self:pIF; | ||
attached := Db.PIF.get_currently_attached ~__context ~self:pIF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does do the same as get_management_iface_is_connected
, which is checking the PIF carrier (link state). You can use PIF_metrics.carrier
for that.
ocaml/xapi/xapi.ml
Outdated
"Synchronising VLANs on slave with master", [Startup.OnlySlave; Startup.NoExnRaising], Sync_networking.copy_vlans_from_master ~__context; | ||
"Synchronising tunnels on slave with master", [Startup.OnlySlave; Startup.NoExnRaising], Sync_networking.copy_tunnels_from_master ~__context; | ||
(* CA-290237: Create cluster objects after network objects so PIFs are available *) | ||
"Create any necessary cluster_host objects", [ Startup.NoExnRaising ], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@minishrink we'll need to move this after we got an IP. Creating a cluster host object checks for the presence of an IP, and you get a NO_COMPATIBLE_CLUSTER_HOST error, caused by the create failing due to PIF_HAS_NO_NETWORK_CONFIGURATION
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating the object not only creates the object in the Db but also tries to enable it, and creates the object in the db only if enabling succeeded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add CA-290473 in the commit message when this is fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it might be better to have a function here just create the database object, and have the actual join/enable done elsewhere (in resync_host).
For that to work:
- add a new field to cluster_host called joined, default to true for backwards compat
- refactor
Cluster_host.create
into a create_db and create (where create calls create_db and then resync_host) resync_host
will look at the joined field: if true then it calls daemon.enable+join, and sets joined to true if successful, otherwise it calls enable if enabled- in the startup sequence here only call create_db
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more problems found when testing with VLANs with a new xenrt test.
058ddad
to
47b03ab
Compare
ocaml/xapi/xapi_cluster_host.ml
Outdated
warn "Error occurred during Cluster_host.create"; | ||
handle_error error | ||
end; | ||
if not (Db.Cluster_host.get_joined ~__context ~self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might look odd, but I think its better for ordering to keep it like this. The logical sequence is: join then later enable if not already enabled.
The cleanup create_storage seems to be a no-op and not really needed for this PR, can we keep it for a future PR? |
fe0bd2f
to
dfbbb4c
Compare
ocaml/xapi/xapi_cluster_host.mli
Outdated
Cluster_host object exists for the host. If one does not exist, the call | ||
will return the reference to the Cluster object wrapped in 'Some' *) | ||
|
||
val create_internal : __context:Context.t -> cluster:API.ref_Cluster -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you expose internal functions?
ocaml/xapi/xapi_cluster_host.mli
Outdated
host:API.ref_host -> pIF:API.ref_PIF -> API.ref_Cluster_host | ||
(** [create_internal ~__context ~cluster ~host ~pif] creates cluster_host in DB *) | ||
|
||
val join_internal : __context:Context.t -> self:API.ref_Cluster_host -> unit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question.
ocaml/xapi/xapi_cluster_host.mli
Outdated
(** [resync_host ~__context ~host] checks for the existence of a cluster_host. | ||
If one is found and enabled in the db, it is enabled in the Client layer | ||
too. Otherwise, nothing happens. *) | ||
If one exists but hasn't joined the cluster, [join_internal] is called, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can probably move the docs of join_internal inline here.
ocaml/xapi/xapi_cluster_host.mli
Outdated
(** [create ~__context ~cluster ~host] is implementation of the XenAPI call | ||
'Cluster_host.create'. It is the Cluster_host object constructor *) | ||
'Cluster_host.create'. It is the Cluster_host object constructor, and works by | ||
calling `create_internal`, then `resync_host`, which either joins the host to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create_internal and resync_host are implementation details, you should probably copy/move their descriptions here.
b4a2cbb
to
cfc5364
Compare
| _ -> () | ||
|
||
let assert_operation_valid ~__context ~self ~op = | ||
assert_joined_cluster ~__context ~self op; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be part of get_operation_error
, because that is also used during update of allowed operations below.
I know currently the allowed operations are not updated properly, but lets avoid introducing more bugs :)
cfc5364
to
169158d
Compare
ocaml/xapi/xapi_cluster_helpers.ml
Outdated
then raise Api_errors.(Server_error (cluster_host_not_joined, [ Ref.string_of self ])) | ||
| None -> () end | ||
end; | ||
assert_allowed_during_rpu __context op; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as below, now that you refactored it, its obvious that this is in the wrong place, please move it inside get_operation_error
.
6e8a5af
to
2f30ce8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found why join fails: one of the asserts is wrong now that we've already created the cluster
ocaml/xapi/xapi_cluster_host.ml
Outdated
with_clustering_lock (fun () -> | ||
assert_operation_host_target_is_localhost ~__context ~host; | ||
let host = Db.Cluster_host.get_host ~__context ~self in | ||
assert_cluster_host_can_be_created ~__context ~host; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please delete this line, it will always fail, because we've created it already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll move it to create_internal
ocaml/xapi/xapi_cluster_host.ml
Outdated
|
||
let host = Db.Cluster_host.get_host ~__context ~self in | ||
assert_operation_host_target_is_localhost ~__context ~host; | ||
assert_cluster_host_can_be_created ~__context ~host; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete this one too, also this whole sequence of asserts seem to be duplicate of some asserts that already existed above, are they redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but some are in the wrong place.
Helpers.call_api_functions ~__context (fun rpc session_id -> | ||
try | ||
api_func rpc session_id | ||
with err -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add an Backtrace.is_important err
right after the with
to capture the backtrace as soon as possible.
If you do anything else before the backtrace will get overwritten.
ocaml/xapi/xapi_cluster_host.ml
Outdated
assert_cluster_host_can_be_created ~__context ~host; | ||
assert_pif_attached_to ~host ~pif ~__context; | ||
let ref = Ref.make () in | ||
assert_pif_attached_to ~host ~pIF ~__context; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to query the pif again, otherwise the fix PIF prerequisites function before has done its job of setting the PIF to disallow unplug, but we still got the old PIF.
Would be better if we moved querying the PIF inside fix_pif_prerequisites, and just give fix_pif_prerequisites the PIF ref, not the PIF record (which can be out of date)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, should all the PIF helpers use the ref and not the record?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code that modifies the PIF should be before the one that fetches the PIF record, so any code that modifies a PIF must not take the record as input, that way we can move the code that fetches the PIF record after all the code that modifies it. Currently the only code that modifies it is fix_pif_prerequisites
b903770
to
e94655e
Compare
ocaml/xapi/xapi_cluster_host.ml
Outdated
assert_cluster_host_can_be_created ~__context ~host; | ||
assert_pif_attached_to ~host ~pIF ~__context; | ||
assert_pif_prerequisites (pIF,pifrec); | ||
(* Re-acquire PIF record after it has updated *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What kind of delay do we need here in order to refresh the record?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or should we just use pif_of_host
here again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No delay is needed, just fetch it again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good if fix_pif_prerequisites fetched the record itself, to avoid accidentally reusing a stale record in the caller.
7bb6da6
to
050fe0d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the overall diff is good now.
Some minor comments on structuring of individual commits
- don't call Cluster_host.enable before cluster join finishes - attempt to plug PBDs after resyncing host Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
- separate Cluster_host.create into internal db and client calls - refactor assertions, helpers, tests, and locks - block Cluster_host operations if not joined, except leave Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
050fe0d
to
b6372f2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, this looks good now.
This PR cleans up some of the breakages exposed by the recent bonding enabling (CP-26912).
Changes made to:
startup sequence
in order:
cluster_host operations
joined
field to represent cluster membership of hostConsequences:
pool-join
should now work for bonded networks as well as VLANs