[DPE-2897] Cross-region async replication #447
Signed-off-by: Marcelo Henrique Neppel <marcelo.neppel@canonical.com>
I have finished the initial testing phase; let's handle the remaining UX improvements as follow-ups.
filename = f"{POSTGRESQL_DATA_PATH}-{str(datetime.now()).replace(' ', '-').replace(':', '-')}.tar.gz"
self.container.exec(
    f"tar -zcf {filename} {POSTGRESQL_DATA_PATH}".split()
).wait_output()
Please create follow-up tickets:
- pack the archive in the background (otherwise promotion to standby will take a LOT of time if the local DB is huge)
- warn users about available backups that can be cleaned up (free disk space topic); goss is a good match here
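For reference, a minimal sketch of how the archive name and tar command from the snippet above could be built more defensively. It is not the code in this PR: the helper name build_backup_command and the data path value are assumptions made for illustration, and it does not yet address packing in the background.

```python
from datetime import datetime
from typing import List, Tuple

# Assumed value for illustration only; the charm defines its own POSTGRESQL_DATA_PATH.
POSTGRESQL_DATA_PATH = "/var/lib/postgresql/data/pgdata"


def build_backup_command(data_path: str, now: datetime) -> Tuple[str, List[str]]:
    """Return the archive filename and the tar argv for backing up data_path.

    Hypothetical helper, not part of the charm.
    """
    # strftime gives a fixed-width timestamp without spaces, colons or the
    # microseconds dot that str(datetime.now()) would leave in the name.
    timestamp = now.strftime("%Y-%m-%d-%H-%M-%S")
    filename = f"{data_path}-{timestamp}.tar.gz"
    # Building the argv as a list avoids str.split(), which would break
    # the command apart if the path ever contained a space.
    return filename, ["tar", "-zcf", filename, data_path]


filename, command = build_backup_command(POSTGRESQL_DATA_PATH, datetime.now())
print(filename)
print(command)
```

The returned list could then be passed to self.container.exec(...) as before, since the call in the snippet above already hands it a list of strings.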
poetry.lock
Outdated
@@ -1645,7 +1645,6 @@ files = [
{file = "PyYAML-6.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:bf07ee2fef7014951eeb99f56f39c9bb4af143d8aa3c21b1677805985307da34"},
{file = "PyYAML-6.0.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:855fb52b0dc35af121542a76b9a84f8d1cd886ea97c84703eaa6d88e37a2ad28"},
{file = "PyYAML-6.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:40df9b996c2b73138957fe23a16a4f0ba614f4c0efce1e9406a184b6d07fa3a9"},
{file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a08c6f0fe150303c1c6b71ebcd7213c2858041a7e01975da3a99aed1e7a378ef"},
I think you can avoid those flyby changes by running poetry cache clear PyPI --all and then poetry lock --no-update.
Thanks. It did the trick.
* Add async replication implementation
* Backup standby pgdata folder
* Improve comments and logs
* Remove unused constant
* Remove warning log call and add optional type hint
* Revert poetry.lock
* Revert poetry.lock
---------
Signed-off-by: Marcelo Henrique Neppel <marcelo.neppel@canonical.com>
Issue
It's not possible to replicate data between regions.
Solution
Implement cross-region async replication. This PR is a rebranded and more stable version of #368.
With this PR, it's no longer necessary to remove the relation and relate again when a switchover is needed.
Also, the names of the relations can easily be changed to others, like cluster-one and cluster-two, for example, to avoid confusing users.
Important changes:
- src/relations/async_replication.py contains the logic to make one cluster the primary and the other the standby. To make the standby cluster follow the primary cluster, the candidate for the standby cluster needs to be restarted.
- _on_async_relation_changed takes care of restarting the databases of the standby cluster units so that they replicate data from the primary cluster.
- The 127.0.0.6/32 address added to the Patroni configuration file is needed to allow Envoy to let the different clusters communicate when using Istio.
Password updates will be implemented in another PR, as this one is already huge.
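To illustrate the 127.0.0.6/32 entry from the list above: a /32 network matches exactly one address, so the rule only admits connections whose source is 127.0.0.6. The sketch below assumes the address ends up in a pg_hba-style host rule. The rule layout, the auth method and both helper names are assumptions for illustration, not the actual Patroni template from this PR.

```python
import ipaddress

# Address taken from the PR description; everything else here is illustrative.
ENVOY_SOURCE_CIDR = "127.0.0.6/32"


def pg_hba_rule(database: str, user: str, cidr: str, method: str = "md5") -> str:
    """Format a pg_hba.conf-style 'host' rule (hypothetical helper)."""
    # ip_network raises ValueError on a malformed CIDR, so typos fail early.
    network = ipaddress.ip_network(cidr)
    return f"host {database} {user} {network} {method}"


def is_allowed(client_ip: str, cidr: str) -> bool:
    """Return True if client_ip falls inside the given network."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(cidr)


print(pg_hba_rule("all", "all", ENVOY_SOURCE_CIDR))
print(is_allowed("127.0.0.6", ENVOY_SOURCE_CIDR))
print(is_allowed("127.0.0.1", ENVOY_SOURCE_CIDR))
```

Because the mask is /32, neighbouring loopback addresses such as 127.0.0.1 are not covered by this entry and would need their own rule.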
If the standby cluster has its relation removed, it goes into read-only mode and can be promoted later to a normal cluster through the promote-cluster action.
How to deploy: https://discourse.charmhub.io/t/charmed-postgresql-k8s-deploy-async-replication/13895
How to trigger a switchover: https://discourse.charmhub.io/t/charmed-postgresql-k8s-deploy-async-replication/13895
Additional instructions:
Integration tests: #448