What is the problem?
When initializing a repo, Crunchy ignores pgbackrest stanza-create errors and initializes a new database system-id
inside an S3 bucket that already has a repo initialized.
The problem
pgbackrest stanza-create --stanza="${stanza}" || pgbackrest stanza-upgrade --stanza="${stanza}"
- code reference
- the error from the first command should have been raised to Crunchy and cluster initialization should have been stopped with the following error:
backup and archive info files exist but do not match the database
- most likely the error was introduced in 61b9728 (cc @cbandy)
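To illustrate why the `||` chain hides the failure, here is a minimal sketch. The two functions are hypothetical stand-ins, not real pgbackrest calls: they only mimic the outcomes seen in the logs. The exit status 28 is an assumption taken from the `[028]` code in the log.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the two pgbackrest commands. They only mimic the
# outcomes in the logs: stanza-create aborts (status 28 is assumed from the
# [028] error code), stanza-upgrade completes successfully.
stanza_create()  { return 28; }
stanza_upgrade() { return 0; }

# Pattern from the issue: the status of the whole list is the status of the
# last command that ran, so the stanza-create failure is lost.
masked_status() {
  stanza_create || stanza_upgrade
  echo "$?"
}

# Sketch of a variant that keeps the first failure visible to the caller
# instead of letting the fallback overwrite it.
preserved_status() {
  rc=0
  stanza_create || rc=$?
  echo "$rc"
}
```

With these stand-ins, `masked_status` prints `0` and `preserved_status` prints `28`. Whether and when the operator should still fall back to stanza-upgrade is a separate design decision that this sketch does not try to answer.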
How to reproduce?
- Create one PG cluster and configure repo2 to point to S3
- Initiate a full backup to repo2
- Delete the first PG cluster and then create a new one, having the same S3 configured as repo2
- The cluster will be set up without errors, but it should instead raise an error because the S3 bucket already contains a stanza
Below I've attached some files to illustrate the error:
backup.info
[db]
db-catalog-version=202307071
db-control-version=1300
db-id=3
db-system-id=7449396375125397577
db-version="16"
[db:history]
1={"db-catalog-version":202307071,"db-control-version":1300,"db-system-id":7449035064254382164,"db-version":"16"}
2={"db-catalog-version":202307071,"db-control-version":1300,"db-system-id":7449383438968672363,"db-version":"16"}
3={"db-catalog-version":202307071,"db-control-version":1300,"db-system-id":7449396375125397577,"db-version":"16"}
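As a quick check, the number of distinct system-ids recorded in the history section can be counted with standard tools. This is only a sketch: `count_system_ids` is a hypothetical helper name, and it assumes the file has the same layout as the excerpt above.

```shell
#!/usr/bin/env bash
# Count the distinct db-system-id values recorded in the [db:history]
# section of a backup.info file read from stdin.
# Assumes the layout shown in the excerpt: one JSON object per history line.
count_system_ids() {
  sed -n '/^\[db:history\]/,/^\[/p' \
    | grep -o '"db-system-id":[0-9]*' \
    | sort -u \
    | wc -l \
    | tr -d ' '
}
```

Running `count_system_ids < backup.info` against the file above should report more than one system-id, which matches a stanza that has been upgraded onto a different cluster.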
/pgdata/pgbackrest/log/db-stanza-create.log
-------------------PROCESS START-------------------
2024-12-17 15:02:37.685 P00 INFO: stanza-create command begin 2.53.1: --exec-id=146-d2552cd2 --log-level-console=info --log-level-file=info --log-path=/pgdata/pgbackrest/log --pg1-path=/pgdata/pg16 --pg1-port=5432 --pg1-socket-path=/tmp/postgres --repo1-host=main-db-pg-repo-host-0.main-db-pg-pods.ns.svc.cluster.local. --repo1-host-ca-file=/etc/pgbackrest/conf.d/~postgres-operator/tls-ca.crt --repo1-host-cert-file=/etc/pgbackrest/conf.d/~postgres-operator/client-tls.crt --repo1-host-key-file=/etc/pgbackrest/conf.d/~postgres-operator/client-tls.key --repo1-host-type=tls --repo1-host-user=postgres --repo1-path=/pgbackrest/repo1 --repo2-path=/pgbackrest/repo2 --repo2-s3-bucket=kubernetes-automatic-deploy-db-backups --repo2-s3-endpoint=https://10.XX.129.50:8544/ --repo2-s3-key=<redacted> --repo2-s3-key-secret=<redacted> --repo2-s3-region=us-east-1 --repo2-s3-uri-style=path --repo2-storage-ca-file=/etc/pgbackrest/conf.d/repo2-server-ca.crt --repo2-type=s3 --stanza=db
2024-12-17 15:02:38.330 P00 INFO: stanza-create for stanza 'db' on repo1
2024-12-17 15:02:38.624 P00 INFO: stanza-create for stanza 'db' on repo2
2024-12-17 15:02:38.646 P00 ERROR: [028]: backup and archive info files exist but do not match the database
HINT: is this the correct stanza?
HINT: did an error occur during stanza-upgrade?
2024-12-17 15:02:38.658 P00 INFO: stanza-create command end: aborted with exception [028]
/pgdata/pgbackrest/log/db-stanza-upgrade.log
-------------------PROCESS START-------------------
2024-12-17 15:02:38.667 P00 INFO: stanza-upgrade command begin 2.53.1: --exec-id=161-c77a07f7 --log-level-console=info --log-level-file=info --log-path=/pgdata/pgbackrest/log --pg1-path=/pgdata/pg16 --pg1-port=5432 --pg1-socket-path=/tmp/postgres --repo1-host=main-db-pg-repo-host-0.main-db-pg-pods.ns.svc.cluster.local. --repo1-host-ca-file=/etc/pgbackrest/conf.d/~postgres-operator/tls-ca.crt --repo1-host-cert-file=/etc/pgbackrest/conf.d/~postgres-operator/client-tls.crt --repo1-host-key-file=/etc/pgbackrest/conf.d/~postgres-operator/client-tls.key --repo1-host-type=tls --repo1-host-user=postgres --repo1-path=/pgbackrest/repo1 --repo2-path=/pgbackrest/repo2 --repo2-s3-bucket=kubernetes-automatic-deploy-db-backups --repo2-s3-endpoint=https://10.XX.129.50:8544/ --repo2-s3-key=<redacted> --repo2-s3-key-secret=<redacted> --repo2-s3-region=us-east-1 --repo2-s3-uri-style=path --repo2-storage-ca-file=/etc/pgbackrest/conf.d/repo2-server-ca.crt --repo2-type=s3 --stanza=db
2024-12-17 15:02:39.272 P00 INFO: stanza-upgrade for stanza 'db' on repo1
2024-12-17 15:02:39.296 P00 INFO: stanza 'db' on repo1 is already up to date
2024-12-17 15:02:39.297 P00 INFO: stanza-upgrade for stanza 'db' on repo2
2024-12-17 15:02:39.398 P00 INFO: stanza-upgrade command end: completed successfully (732ms)
The second problem
- when trying to bootstrap a new cluster from repo2, which has been corrupted by the above error, we're getting:
ERROR: [075]: the latest backup set found '20241217-135215F' is from a prior version of PostgreSQL
2024-12-17T15:34:11.002221975Z HINT: was a backup created after the stanza-upgrade?
2024-12-17T15:34:11.002226189Z HINT: specify --set or --type=time/lsn to restore from a prior version of PostgreSQL.
- Question: is there any way to remove the second database system-id and fix repo2 so we can continue the bootstrapping process?
Thank you very much!