/pgdata/pgbackrest/log/db-archive-get-async.log unbounded growth on Standby Cluster #4142

Open
wmuldergov opened this issue Mar 20, 2025 · 0 comments

Overview

If you set up a CrunchyDB cluster using an external repo with S3, the Standby Cluster writes to /pgdata/pgbackrest/log/db-archive-get-async.log every time it syncs from S3, which appears to be every 5-10 seconds. Since logrotate does not seem to be enabled by default for this folder (as it is for /pgdata/pg17/log), the log grows without bound and eventually exhausts the space in the pgdata PVC.

Environment

Please provide the following details:

  • Platform: OpenShift
  • Platform Version: 4.16
  • PGO Image Tag: ubi8-17.0-3.4-0
  • Postgres Version: 17
  • Storage: s3

Steps to Reproduce

REPRO

  1. Set up a Primary and Standby Cluster for Crunchy using the External S3 Repo method.
  2. On the Standby Cluster, monitor the size of /pgdata/pgbackrest/log/db-archive-get-async.log and watch it grow.
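
To put a number on step 2, the growth can be sampled over a short window. The following is only a rough sketch, meant to be run wherever the pgdata volume is readable and Python is available (an assumption about the environment); the helper names `sample_sizes` and `growth_bytes_per_day` are hypothetical and not part of PGO or pgBackRest.

```python
import os
import time

# Path taken from this issue.
LOG_PATH = "/pgdata/pgbackrest/log/db-archive-get-async.log"


def sample_sizes(path, count=6, interval=10.0):
    """Return a list of (elapsed_seconds, size_in_bytes) samples for path."""
    start = time.monotonic()
    samples = []
    for i in range(count):
        samples.append((time.monotonic() - start, os.path.getsize(path)))
        if i < count - 1:
            time.sleep(interval)
    return samples


def growth_bytes_per_day(samples):
    """Extrapolate daily growth from the first and last sample."""
    (t0, s0), (t1, s1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples taken at different times")
    return (s1 - s0) / (t1 - t0) * 86400
```

Calling `growth_bytes_per_day(sample_sizes(LOG_PATH))` gives a bytes-per-day estimate that can be compared against the free space left on the PVC.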

EXPECTED

The log file is rotated like the logs in /pgdata/pg17/log.

ACTUAL

The log file keeps growing until the space on the PVC is exhausted.

Logs

-------------------PROCESS START-------------------
2025-03-20 19:02:14.552 P00   INFO: archive-get:async command begin 2.53.1: [00000003000000C500000091, 00000003000000C500000092, 00000003000000C500000093, 00000003000000C500000094, 00000003000000C500000095, 00000003000000C500000096, 00000003000000C500000097, 00000003000000C500000098] --archive-async --exec-id=675601-15d23e31 --log-level-console=off --log-level-stderr=off --log-path=/pgdata/pgbackrest/log --pg1-path=/pgdata/pg17 --repo=2 --repo1-host=<REDACTED> --repo1-host-ca-file=/etc/pgbackrest/conf.d/~postgres-operator/tls-ca.crt --repo1-host-cert-file=/etc/pgbackrest/conf.d/~postgres-operator/client-tls.crt --repo1-host-key-file=/etc/pgbackrest/conf.d/~postgres-operator/client-tls.key --repo1-host-type=tls --repo1-host-user=postgres --repo1-path=/pgbackrest/repo1 --repo2-path=/db/dbbackup --repo2-s3-bucket=<REDACTED> --repo2-s3-endpoint=<REDACTED> --repo2-s3-key=<redacted> --repo2-s3-key-secret=<redacted> --repo2-s3-region=ca-central-1 --repo2-s3-uri-style=path --repo2-type=s3 --spool-path=/pgdata/pgbackrest-spool --stanza=db
2025-03-20 19:02:14.552 P00   INFO: get 8 WAL file(s) from archive: 00000003000000C500000091...00000003000000C500000098
2025-03-20 19:02:14.623 P00   INFO: archive-get:async command end: completed successfully (71ms)

Additional Information

I see that #4108 was merged recently; however, it looks like the logrotate function was put behind a feature flag for OpenTelemetry. This issue affects anyone who has a cluster set up in Standby mode, so I would suggest not putting it behind the OpenTelemetry feature flag.
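
As a stopgap until rotation covers this directory without the feature flag, a size-based rotation could look something like the sketch below. This is not the mechanism from #4108; `rotate_if_large` is a hypothetical helper, the thresholds are arbitrary, and it assumes that each archive-get:async invocation opens the log file itself (the PROCESS START marker in the log above suggests this, but I have not verified it), so that a renamed file is simply recreated on the next run.

```python
import os


def rotate_if_large(path, max_bytes=50 * 1024 * 1024, keep=3):
    """Rename path to path.1 (shifting older copies) once it exceeds max_bytes.

    Keeps at most `keep` rotated copies. Returns True if a rotation happened.
    """
    if not os.path.exists(path) or os.path.getsize(path) <= max_bytes:
        return False

    # Drop the oldest copy so the shift below never keeps more than `keep` files.
    oldest = f"{path}.{keep}"
    if os.path.exists(oldest):
        os.remove(oldest)

    # Shift path.N -> path.N+1, starting from the highest remaining number.
    for n in range(keep - 1, 0, -1):
        src = f"{path}.{n}"
        if os.path.exists(src):
            os.replace(src, f"{path}.{n + 1}")

    # Assumption: the next archive-get:async run creates a fresh log file.
    os.replace(path, f"{path}.1")
    return True
```

Running `rotate_if_large("/pgdata/pgbackrest/log/db-archive-get-async.log")` on a schedule would cap the space used at roughly `(keep + 1) * max_bytes`, at the cost of discarding the oldest entries.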
