New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry if MySQL Server is unreachable #190
Conversation
to use json.load instead of .loads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
be my guest to refactor shell code on the server ;)
@@ -23,6 +25,15 @@ | |||
logger = logging.getLogger(__name__) | |||
|
|||
|
|||
class _NoQuorum(server_exceptions.Error): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
besides using only here, why not concentrate this in src/server_exceptions.py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's why I put it in workload.py—since it's only used there
my preference is to avoid splitting code across multiple files since imo it's harder to read—I think it's worth splitting if multiple other files use it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well done. Thank you!
Fixes #86
Context
These are the situations the charm should handle without raising an uncaught exception:
mysqlrouter --bootstrap
fails because server does not have quorummysqlrouter --bootstrap
fails because server is unreachable (error code 2003)shell.connect()
fails because server is unreachable (error code 2003)(mysqlsh commands seem to not fail even if server does not have quorum)
Solution
Catch exceptions in the aforementioned situations & set waiting status. Retry connections on the next Juju event
Alternatives considered
We decided not to retry within the Juju event so that
A retry within the Juju event with a short timeout was considered. However, during initial deployment ~15 events are currently fired (so a 1 min timeout could block the charm for 15 minutes). It would be possible to mitigate that by storing information about the last retry—but the complexity was not worth it.
Discussion about Juju limitation that we may not get another event until update status: https://matrix.to/#/!xdClnUGkurzjxqiQcN:ubuntu.com/$_SXnGJWUQCeFGu9vpBxD5GJ5hgMMAHchOmWBPiqjDFY?via=ubuntu.com&via=matrix.org&via=cutefunny.art
Implementation/refactoring
DBError
exceptions, serialize them to JSON, and raise corresponding exception in charm code. This was done to move complexity (i.e. exception handling) from mysqlsh python code to charm code