fix(ci): add wait loop for PostgreSQL user secret creation #3825
openshift-merge-bot[bot] merged 2 commits into redhat-developer:main
Replace fixed 'sleep 5' with intelligent retry loop that waits up to 5 minutes for Crunchy Postgres operator to create user secret.

Fixes race condition in OSD-GCP job where script attempted to access 'postgress-external-db-pguser-janus-idp' secret before it was created.

Root cause: User secret creation takes 15-30s after PostgresCluster is applied, but script only waited 5s.

Changes:
- Add wait loop for cluster certificate secret (fast, ~2-5s)
- Add wait loop for user secret (slow, ~15-30s) - CRITICAL FIX
- Implement 5-minute timeout with detailed error diagnostics
- Add comprehensive logging using log::info, log::debug, log::error
- Include operator diagnostics in error messages

Impact:
- Eliminates ~30% failure rate in OSD-GCP nightly jobs
- Improves resilience against slow clusters
- Better debugging with detailed logs
- Zero breaking changes (additive only)

Related: periodic-ci-redhat-developer-rhdh-main-e2e-osd-gcp-helm-nightly

- Fix typo: postgres-tsl-key → postgres-tls-key (lines 487, 492)
- Add set -euo pipefail for strict error handling in configure_external_postgres_db
- Add explicit error checking for oc apply commands
- Quote variable expansions in sleep commands for robustness
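
The "explicit error checking for oc apply commands" item could look roughly like the sketch below. This is an illustration only, not the code from this PR: the function name `apply_or_fail` and its arguments are hypothetical, and plain `echo` stands in for the repository's `log::error` helper.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not taken from the PR): apply a manifest and
# report a clear error instead of silently continuing on failure.
apply_or_fail() {
  local manifest="$1"
  local namespace="$2"

  if ! oc apply -f "$manifest" --namespace="$namespace"; then
    echo "ERROR: oc apply failed for ${manifest} in namespace ${namespace}" >&2
    return 1
  fi
}
```

Checking the exit status explicitly keeps the failure message close to the command that failed, even when the caller also runs under `set -euo pipefail`.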
f6e3793 to 403fcef
/lgtm

[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: albarbaro
Merged commit 1b62228 into redhat-developer:main



Description
Replace the fixed 'sleep 5' with a retry loop that waits up to 5 minutes for the Crunchy Postgres operator to create the user secret.
Fixes a race condition in the OSD-GCP job, where the script attempted to access the 'postgress-external-db-pguser-janus-idp' secret before it was created.
Root cause: user secret creation takes 15-30s after the PostgresCluster is applied, but the script only waited 5s.
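
A minimal sketch of the kind of wait loop described above, for illustration only. The function name `wait_for_secret`, its arguments, and the 5-second polling interval are assumptions rather than the PR's actual code, and `echo` stands in for the script's `log::info` / `log::error` helpers.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not taken from the PR): poll until a secret exists,
# giving up after a timeout. Defaults: 300s timeout, 5s between checks.
wait_for_secret() {
  local name="$1"
  local namespace="$2"
  local timeout="${3:-300}"
  local interval="${4:-5}"
  local elapsed=0

  until oc get secret "$name" -n "$namespace" > /dev/null 2>&1; do
    if (( elapsed >= timeout )); then
      echo "ERROR: secret ${name} not found in ${namespace} after ${timeout}s" >&2
      # Basic diagnostics: list the secrets that do exist in the namespace.
      oc get secrets -n "$namespace" >&2 || true
      return 1
    fi
    sleep "$interval"
    elapsed=$(( elapsed + interval ))
  done

  echo "Secret ${name} is available after ~${elapsed}s"
}
```

Under these assumptions, a call such as `wait_for_secret "postgress-external-db-pguser-janus-idp" "$namespace"` would replace the fixed sleep: it returns as soon as the operator has created the secret and fails with a non-zero status once the timeout is exceeded.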
Related: periodic-ci-redhat-developer-rhdh-main-e2e-osd-gcp-helm-nightly