[477 merge] #3624
Merged
Conversation
Unlike sr-probe, this one calls the XenAPI call, not the SMAPIv2 call. It requires an SR that supports this call. Signed-off-by: Gabor Igloi <gabor.igloi@citrix.com>
Add lima_release, bump schema minor: 142 -> 200
Add comment advising to leave an interval for hotfixes
Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
(Not strictly necessary for ticket, but this change precedes the relevant fix) Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
CP-26912: Add API error pif_not_attached_to_host for cluster_host creation Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
CP-26912: Cluster.pool_create still takes a network parameter
CP-26912: Refactor Cluster.pool_create
CP-26912: Refactor Cluster_host.create_as_necessary
CP-29612: Cluster.create takes PIF parameter for local cluster_host
CP-26912: Add internal Cluster.get_network call for XenCenter
Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
…etwork changes CP-26912: Update uses of Cluster.(pool_)create in tests Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
Bond.create: if the cluster_host.PIF is a bond member, the cluster_host will be moved to the bond master PIF.
Bond.destroy: if the cluster_host.PIF is in the bond, it will be moved to the primary slave PIF.
No PIF prerequisites need to be toggled, as bond creation already enforces the following:
- if disallow_unplug is set for one bond member, it will be set for all of them
- only one PIF (the primary slave) may have a valid (!= None) IP configuration mode, which is then copied to the bond master.
This means that on bond creation the bond master already meets the clustering PIF prerequisites, as does the primary slave on bond destruction.
Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
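The PIF selection described above amounts to a small decision rule. A minimal sketch of that rule, assuming hypothetical names (`new_cluster_pif`, string PIF identifiers — not the actual xapi record types):

```python
def new_cluster_pif(current_pif, bond_members, bond_master, primary_slave, creating):
    """Pick the PIF a cluster_host should sit on across a bond operation.

    On Bond.create: if the cluster_host is on a bond member, move it to
    the bond master. On Bond.destroy: if it is on the bond master, move
    it to the primary slave. Otherwise leave it untouched.
    """
    if creating:
        return bond_master if current_pif in bond_members else current_pif
    return primary_slave if current_pif == bond_master else current_pif

# On bond creation, a cluster_host on "eth0" follows its NIC into the bond:
assert new_cluster_pif("eth0", {"eth0", "eth1"}, "bond0", "eth0", True) == "bond0"
# On bond destruction, it falls back to the primary slave:
assert new_cluster_pif("bond0", {"eth0", "eth1"}, "bond0", "eth0", False) == "eth0"
```

The rule relies on the invariants the commit lists: the bond master inherits the primary slave's IP configuration and disallow_unplug, so the target PIF always already satisfies the clustering prerequisites.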
This is in order to add a unit test for it CP-28213: Modify PIF assertion error Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
concurrently
CP-25121: forbid clustering operations during rolling pool upgrade
- Cluster calls such as create, destroy, add, and leave should not occur during an RPU, as this may lead to mismatched versions or errors
- Cluster_host calls such as enable and disable are fine, as they simply toggle clustering
- Pool clustering operations don't need a lock, as they simply iterate over other (locked) cluster calls
CP-26147: prevent parallel Cluster.create and Cluster.destroy calls
- wrapping clustering calls in `with_clustering_operation` currently blocks any and all concurrent clustering operations
Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
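The two guards described above — refuse cluster-level calls during RPU, and serialize them against each other — can be sketched as a single wrapper. This is a hypothetical model in Python (the real `with_clustering_operation` is OCaml inside xapi, and the error name here is illustrative):

```python
import threading

# One global lock: any clustering operation blocks all concurrent ones,
# mirroring how with_clustering_operation is described in the commit.
cluster_op_lock = threading.Lock()

def with_clustering_operation(rolling_pool_upgrade, op):
    """Run a Cluster.{create,destroy,add,leave}-style operation.

    Refused outright during a rolling pool upgrade; otherwise serialized
    under a single pool-wide lock. Cluster_host.{enable,disable} would
    bypass this wrapper, as the commit notes they are safe.
    """
    if rolling_pool_upgrade:
        raise RuntimeError("NOT_SUPPORTED_DURING_UPGRADE")
    with cluster_op_lock:
        return op()

assert with_clustering_operation(False, lambda: "created") == "created"
try:
    with_clustering_operation(True, lambda: "created")
    raise AssertionError("expected refusal during RPU")
except RuntimeError:
    pass
```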
…ng RPU Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
clustering
`PIF.disallow_unplug:true` is necessary for the entire time a cluster host is enabled, and is also a prerequisite for the API calls `Cluster_host.{enable, create}`. If `disallow_unplug` is set to false during either of these two clustering operations, it could create a race condition where both operations are allowed, but we are then left with a cluster_host that doesn't meet the networking requirements and will therefore fail later on.

This PR introduces a condition wherein you cannot allow PIF.unplug if the pool's cluster is currently creating or enabling a cluster_host on *any* PIF, regardless of which PIF you are allowing unplug for. This should rarely cause an issue, as `PIF.set_disallow_unplug` is rarely called, although an operation like `Cluster.pool_{create, resync}` could block it for a while, depending on the pool size.

Additionally, this commit removes the obsolete `CLUSTERING_ENABLED_ON_NETWORK` API error in favour of `CLUSTERING_ENABLED`, as cluster_hosts correspond more closely to PIFs.
Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
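The pool-wide check this commit adds can be sketched as a guard in front of the setter. A minimal model, assuming hypothetical names (`pif` as a plain dict, `cluster_host_op_in_flight` as a stand-in for "any Cluster_host.{create,enable} in progress"):

```python
def set_disallow_unplug(pif, value, cluster_host_op_in_flight):
    """Guard PIF.set_disallow_unplug against the race described above.

    Allowing unplug (value=False) is refused while *any* cluster_host in
    the pool is being created or enabled, regardless of which PIF the
    caller is changing, because the in-flight operation may be about to
    depend on that PIF staying unpluggable-by-policy.
    """
    if not value and cluster_host_op_in_flight:
        raise RuntimeError("CLUSTERING_ENABLED")
    pif["disallow_unplug"] = value
    return pif

pif = {"disallow_unplug": True}
# Forbidding unplug is always safe, even mid-operation:
assert set_disallow_unplug(pif, True, cluster_host_op_in_flight=True)["disallow_unplug"]
# Allowing unplug mid-Cluster_host.create would open the race, so it fails:
try:
    set_disallow_unplug(pif, False, cluster_host_op_in_flight=True)
    raise AssertionError("expected CLUSTERING_ENABLED")
except RuntimeError:
    pass
```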
Cluster_host.resync_host is iterated over all hosts in a pool by Cluster.pool_resync, and is also called at startup as part of the task "resync cluster host state" Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
We'll make a best-effort attempt at declaring hosts as dead; however, this can fail if multiple hosts are offline. We do not want to keep broken cluster host objects around, because they would prevent disabling clustering too. Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Declare dead requires the rest of the hosts in the cluster to be alive (otherwise data corruption could occur, as we are about to forcibly kick hosts from the cluster). The current Xen API provides a way to kick hosts out of a pool one by one, or to mark them as dead. We could introduce a batched API, but that would be cumbersome for the user; it is better if we can declare hosts dead one at a time and, once we have declared them all dead, actually modify the cluster.

The semantics of "declare dead" differ between XenAPI and the clustering daemon. For XenAPI it means: the host is dead now, you can reset VM power state, etc. In fact HA automatically detects dead hosts. Then there is Host.destroy (or `host-forget`) on the CLI, which considers the host permanently dead. This is the semantics the clustering daemon wants: if a host that has been declared dead to the clustering daemon comes back alive, then data corruption could occur. To avoid confusion, call the new method `Cluster_host.forget`, and call it automatically from `Host.destroy`. `Host.declare_dead` will have no effect on clustering.

For this to work we need to remember all the IPs of the hosts we have attempted to declare dead in the past but failed, and retry declaring them as dead on the next API call. Declaring a host as dead is permanent: the host can never be part of the cluster again, so we do not need an undo API here.

TODO: things could still go wrong if we try to join a host with the same IP as one that we are trying to declare as dead (which can happen due to DHCP).
Signed-off-by: Edwin Török <edvin.torok@citrix.com>
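The retry bookkeeping described above — remember the IPs we failed to declare dead, retry them on the next call — can be sketched as follows. This is a toy model under hypothetical names (`Cluster`, `forget`, `daemon_reachable`); the real daemon protocol is more involved:

```python
class Cluster:
    """Toy model of Cluster_host.forget's pending-dead bookkeeping."""

    def __init__(self, live_ips):
        self.live_ips = set(live_ips)
        self.pending_dead = set()  # IPs we tried but failed to declare dead

    def forget(self, ip, daemon_reachable):
        # Always retry any previously failed declarations together with
        # the new one, as the commit message requires.
        to_declare = self.pending_dead | {ip}
        if not daemon_reachable:
            # Best effort failed: remember everything for the next call.
            self.pending_dead = to_declare
            raise RuntimeError("could not declare dead: %s" % sorted(to_declare))
        # Success is permanent: a forgotten host can never rejoin.
        self.live_ips -= to_declare
        self.pending_dead = set()

c = Cluster({"10.0.0.1", "10.0.0.2", "10.0.0.3"})
try:
    c.forget("10.0.0.2", daemon_reachable=False)  # fails, but is remembered
except RuntimeError:
    pass
assert c.pending_dead == {"10.0.0.2"}
# The next successful call also flushes the earlier failed declaration:
c.forget("10.0.0.3", daemon_reachable=True)
assert c.live_ips == {"10.0.0.1"} and c.pending_dead == set()
```

There is deliberately no undo path in the model, matching the "permanent" semantics the commit describes.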
* CA-290471: fix wrong token timeout in XenCenter

Use the same units in the object and in the API. The timeout set by XenCenter was 20000s, which is wrong; it should've been 20s (which is what you get from the CLI). XenCenter creates an object and then reads the field to find out the default, because the SDK doesn't expose the defaults for parameters. So we should really use the same units for API parameters and internal object fields, especially since this field is not used by XAPI at all; it is only there so that the admin can inspect what values were used to create the cluster. Changing this field will not change the actual timeouts used by the cluster, because the timeout values cannot be changed at runtime.
Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Split `Xapi_cluster_host.create` into two functions: - `create_internal` creates cluster_host object in xapi db - `resync_host` now creates and enables cluster_host in client layer via clusterd Create will work as before, but these two functions will be called separately within the xapi startup sequence Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
- don't call Cluster_host.enable before cluster join finishes - attempt to plug PBDs after resyncing host Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
- separate Cluster_host.create into internal db and client calls - refactor assertions, helpers, tests, and locks - block Cluster_host operations if not joined, except leave Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Locking_helpers provides a nice list of active locks as part of `xe host-get-thread-diagnostics`, which helps debugging. Signed-off-by: Edwin Török <edvin.torok@citrix.com>
This is quite useful for tracing what Xapi_cluster_host methods are called, and together with the previous commit for tracking deadlocks. Signed-off-by: Edwin Török <edvin.torok@citrix.com>
…ustering daemon is down
Cluster.destroy did not work if you destroyed the last cluster host (as some of the unit tests actually did). Cluster_host.destroy on the last node is special: there is no cluster after you leave (leave would fail in reality), so just destroy the cluster. Refactor all 3 destroy operations into one, choosing automatically based on the number of hosts. We could've introduced a new API error to forbid destroying the last cluster host, but it is better if XAPI can automatically do the right thing than to tell the user to call some other API instead.

Also, with Cluster_host.force_destroy you could already have ended up in a situation where you have no cluster hosts and want to destroy the cluster, which would hang indefinitely because the daemon was stopped. We always try to enable the daemon on startup, so keep track of whether we think it should be running, and if we know we stopped it, just raise an error when trying to do an RPC. This is also useful in debugging situations where we try to send RPCs too early (e.g. before we started the daemon).
Signed-off-by: Edwin Török <edvin.torok@citrix.com>
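The two behaviors in this commit — dispatch the destroy path on the remaining host count, and fail RPCs fast when we know the daemon is stopped — can be sketched together. A hypothetical model (names like `Clusterd` and `destroy_path` are illustrative, not the xapi API):

```python
class Clusterd:
    """Toy model of the unified destroy dispatch and daemon-state flag."""

    def __init__(self):
        # We always try to enable the daemon on startup, so assume running
        # until we deliberately stop it.
        self.expected_running = True

    def rpc(self, call):
        # Fail fast instead of hanging indefinitely on a stopped daemon.
        if not self.expected_running:
            raise RuntimeError("clustering daemon is not running")
        return "sent:%s" % call

    def destroy_path(self, n_cluster_hosts):
        # Leaving is meaningless on the last node: no cluster remains
        # afterwards, so destroy the whole cluster instead.
        return "Cluster.destroy" if n_cluster_hosts == 1 else "Cluster_host.leave"

d = Clusterd()
assert d.destroy_path(3) == "Cluster_host.leave"
assert d.destroy_path(1) == "Cluster.destroy"
d.expected_running = False  # we stopped the daemon: RPCs now error out
try:
    d.rpc("leave")
    raise AssertionError("expected fast failure")
except RuntimeError:
    pass
```

The point of the dispatch is exactly what the commit argues: XAPI picks the right teardown automatically rather than surfacing a new API error for the last-host case.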
Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
Signed-off-by: Akanksha Mathur <akanksha.mathur@citrix.com>
Define constants for checking cluster token values
lindig approved these changes on Jun 8, 2018
This is a lot of code that has been reviewed before. I had a cursory look and I think it looks good.
edwintorok pushed a commit to edwintorok/xen-api that referenced this pull request on Jun 13, 2018: …/master [477 merge]
The changes here have been reviewed on the feature branch, ticket numbers are in commits.