
Conversation

edwintorok
Contributor

This was useful in debugging problems with @minishrink's PR.

@minishrink left a comment (Contributor)


Good PR, this cleanup is necessary. Unless Gabor has any quibbles, I think this can be merged.

Xapi_clustering.Daemon.disable ~__context
| Result.Error error ->
D.warn "Error occurred during Cluster.destroy";
handle_error error
Contributor


This is good, we were unnecessarily duplicating code.

Xapi_clustering.Daemon.disable ~__context
| Result.Error error ->
warn "Error occurred during Cluster_host.force_destroy";
handle_error error
Contributor


Also nicely done!

) srs
then raise Api_errors.(Server_error (cluster_stack_in_use, [ cluster_stack ]))

let daemon_enabled = ref false
Contributor


Do we want a helper to query this?

Contributor


Actually ignore that, the code is good to merge

@coveralls

coveralls commented May 31, 2018

Coverage Status

Coverage decreased (-0.002%) to 20.779% when pulling 171bbd3 on edwintorok:debugging into 1f7a322 on xapi-project:feature/REQ477/master.

@edwintorok
Contributor Author

Just realized that we do need to prevent Clusters without cluster hosts, otherwise pool autojoin won't work, and in fact a Cluster object without cluster hosts is unusable.

@edwintorok
Contributor Author

We should merge #3607 first; then this one will need to be rebased.

Locking_helpers provides a nice list of active locks as part of
`xe host-get-thread-diagnostics`, which helps debugging.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
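
For illustration only, a minimal self-contained sketch of the idea behind listing active locks for diagnostics (this is not the actual Locking_helpers API; the names and structure here are made up):

```ocaml
(* Sketch: record which named locks are currently held so that a
   thread-diagnostics call can list them. *)
let registry_m = Mutex.create ()
let held : string list ref = ref []

(* Run [f] while [name] is recorded as held; acquiring the lock itself is
   elided in this sketch. *)
let with_lock_recorded name f =
  Mutex.lock registry_m;
  held := name :: !held;
  Mutex.unlock registry_m;
  Fun.protect f ~finally:(fun () ->
      Mutex.lock registry_m;
      held := List.filter (( <> ) name) !held;
      Mutex.unlock registry_m)

(* What a thread-diagnostics query could then report. *)
let active_locks () =
  Mutex.lock registry_m;
  let names = !held in
  Mutex.unlock registry_m;
  names
```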
This is quite useful for tracing which Xapi_cluster_host methods are
called and, together with the previous commit, for tracking down deadlocks.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
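
As a hedged illustration of the kind of tracing the commit message describes (the logger and the method below are stand-ins, not the commit's actual diff):

```ocaml
(* Sketch: log each clustering entry point on entry so the call sequence is
   visible in the log when chasing deadlocks.  [debug] stands in for xapi's
   logger; [enable] is a made-up method. *)
let debug fmt =
  Printf.ksprintf
    (fun s -> Printf.eprintf "[xapi_cluster_host] %s\n%!" s)
    fmt

let enable ~cluster_host =
  debug "Cluster_host.enable cluster_host=%s" cluster_host;
  (* ... the actual work would happen here ... *)
  ()
```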
…ustering daemon is down

Cluster.destroy did not work if you destroyed the last cluster host
(like some of the unit tests actually did).
Cluster_host.destroy on the last node is special: there is no cluster
after you leave (leave would fail in reality), so just destroy the cluster.
Refactor all three destroy operations into one, choosing automatically
based on the number of hosts.

We could have introduced a new API error to forbid destroying the last
cluster host, but it is better for XAPI to automatically do the right
thing than to tell the user to call some other API instead.

Also, with Cluster_host.force_destroy you could already end up in a
situation where you have no cluster hosts and want to destroy the cluster,
which would hang indefinitely because the daemon was stopped.

We always try to enable the daemon on startup, so keep track of whether
we think it should be running, and if we know we stopped it then just
raise an error when trying to do an RPC.

This is also useful for debugging situations where we try to send RPCs too
early (e.g. before we have started the daemon).

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
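
For reference, a self-contained sketch of the two mechanisms described above: one shared destroy path that picks its behaviour from the number of remaining cluster hosts, and the daemon_enabled flag (matching the variable shown in the diff above) that turns an RPC against a stopped daemon into an immediate error instead of a hang. Apart from the daemon_enabled name, this is illustrative and not the actual xapi code:

```ocaml
(* Do we believe the clustering daemon is (or should be) running?
   Set when we enable the daemon, cleared when we disable it. *)
let daemon_enabled = ref false

exception Daemon_not_running

(* Guard every RPC to the daemon: if we know we stopped it (or have not yet
   started it), fail fast with a clear error rather than hanging on a socket
   nobody is listening on. *)
let rpc_if_enabled do_rpc =
  if !daemon_enabled then do_rpc () else raise Daemon_not_running

(* One destroy path shared by Cluster.destroy, Cluster_host.destroy and
   Cluster_host.force_destroy: on the last cluster host there is nothing to
   leave, so tear down the whole cluster instead. *)
let destroy ~remaining_cluster_hosts ~leave ~destroy_cluster =
  if remaining_cluster_hosts <= 1 then destroy_cluster () else leave ()
```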
@edwintorok
Contributor Author

Rebased on latest branch.

@edwintorok edwintorok changed the title [WiP] CP-28406: improve logging and error checking in clustering functions CP-28406: improve logging and error checking in clustering functions Jun 6, 2018
@minishrink minishrink merged commit e049f31 into xapi-project:feature/REQ477/master Jun 7, 2018
