fix: instance manager stuck if PostgreSQL doesn't start #4434

Merged (4 commits) on May 9, 2024


leonardoce
Contributor

If a primary PostgreSQL instance fails to start after being stopped, the instance manager waits indefinitely for it to come up so that it can run the configuration queries.

This patch creates a separate context for each postmaster process and runs the configuration queries in a parallel goroutine, canceling that goroutine as soon as the postmaster exits.

Fixes: #4433

github-actions bot added the backport-requested ◀️, release-1.21, release-1.22, and release-1.23 labels on May 3, 2024

github-actions bot commented May 3, 2024

❗ By default, the pull request is configured to backport to all release branches.

  • To stop backporting this PR, remove the label: backport-requested ◀️ or add the label 'do not backport'
  • To stop backporting this PR to a certain release branch, remove the specific branch label: release-x.y

@leonardoce
Contributor Author

/test limit=local


github-actions bot commented May 3, 2024

@leonardoce, here's the link to the E2E on CNPG workflow run: https://github.com/cloudnative-pg/cloudnative-pg/actions/runs/8941773728

github-actions bot added the ok to merge 👌 label on May 3, 2024
@leonardoce
Contributor Author

E2E tests are all green!

leonardoce requested a review from a team as a code owner on May 8, 2024 09:14
@mnencia
Member

mnencia commented May 9, 2024

I think the concurrency in this code is difficult to understand correctly. I would like some comments added to make it easier to follow.

Leonardo Cecchi and others added 4 commits May 9, 2024 11:39
If a primary PostgreSQL instance fails to start after being stopped, the
instance manager is indefinitely waiting for it to start running the
configuration queries.

This patch creates a separate context for each postmaster process and
runs the configuration queries in a parallel goroutine, canceling it as
soon as the postmaster finishes.

Fixes: cloudnative-pg#4433

Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enteprisedb.com>
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enteprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
mnencia merged commit 473e5e7 into cloudnative-pg:main on May 9, 2024
27 checks passed
cnpg-bot pushed a commit that referenced this pull request May 9, 2024
If a primary PostgreSQL instance fails to start after being stopped, the
instance manager is indefinitely waiting for it to start running the
configuration queries.

This patch creates a separate context for each postmaster process and
runs the configuration queries in a parallel goroutine, canceling it as
soon as the postmaster finishes.

Fixes: #4433

Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enteprisedb.com>
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Co-authored-by: Leonardo Cecchi <leonardo.cecchi@enteprisedb.com>
Co-authored-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Co-authored-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
(cherry picked from commit 473e5e7)
cnpg-bot pushed a commit that referenced this pull request May 9, 2024
(cherry picked from commit 473e5e7)
cnpg-bot pushed a commit that referenced this pull request May 9, 2024
(cherry picked from commit 473e5e7)
dougkirkley pushed a commit to dougkirkley/cloudnative-pg that referenced this pull request Jun 11, 2024
…pg#4434)

Signed-off-by: Douglass Kirkley <dkirkley@eitccorp.com>
Successfully merging this pull request may close these issues.

[Bug]: The instance manager is not able to unfence PG after a previous unfence operation failed