fix: postgres health check reporting any db without checksums as unhealthy #10178

Merged
mertalev merged 1 commit into main from fix/pg-health-check on Jun 12, 2024

Conversation

mertalev
Contributor

Description

This is a small change to make the health check report healthy when checksums are disabled and Postgres responds successfully. Currently it incorrectly reports existing databases, which have no checksums, as unhealthy.

Fixes #10134
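
For readers unfamiliar with the SQL involved, here is a minimal sketch of why a fallback value matters. It assumes the health check sums a per-database checksum failure counter that is NULL when checksums are disabled (the COALESCE mentioned later in this thread suggests as much). SQLite stands in for Postgres here, and the table name `stat_database` is hypothetical, not the real Postgres view.

```python
import sqlite3

# SQLite stands in for Postgres; the table and its contents are hypothetical
# and only mimic a per-database checksum failure counter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stat_database (datname TEXT, checksum_failures INTEGER)")

# Assumption: with checksums disabled, the counter is NULL for every database.
conn.executemany(
    "INSERT INTO stat_database VALUES (?, ?)",
    [("postgres", None), ("immich", None)],
)

# Summing only NULL values yields NULL, which a client prints as empty output.
plain = conn.execute(
    "SELECT SUM(checksum_failures) FROM stat_database"
).fetchone()[0]

# COALESCE replaces the NULL sum with a fallback of 0.
coalesced = conn.execute(
    "SELECT COALESCE(SUM(checksum_failures), 0) FROM stat_database"
).fetchone()[0]

print(plain)      # None
print(coalesced)  # 0
```

A check that compares the result against 0 would fail on the first query and pass on the second, even though the database responded successfully in both cases.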

@mertalev mertalev requested a review from bo0tzz as a code owner June 12, 2024 00:05
@mertalev mertalev added the deployment Deployment related tasks label Jun 12, 2024
@mertalev mertalev enabled auto-merge (squash) June 12, 2024 00:18
@mertalev mertalev merged commit 2e0c6f6 into main Jun 12, 2024
37 of 42 checks passed
@mertalev mertalev deleted the fix/pg-health-check branch June 12, 2024 00:18
@mmomjian
Contributor

Ty. I should have known everyone would just copy paste the new changes haha

@reini12345

I still have the issue with v1.106.4

image

@zackpollard
Contributor

That is actually reporting checksum errors it seems

@reini12345

did not have this before the 1.106 update

@zackpollard
Contributor

zackpollard commented Jun 14, 2024

We didn't have health checks before the 1.106 update, so these issues wouldn't have been visible before then. This is not a bug; the recently added checksums are reporting an inconsistency in your database data.

@mmomjian
Contributor

mmomjian commented Jun 14, 2024

> I still have the issue with v1.106.4

This looks fine to me. Please use the updated healthcheck command found in this PR that contains the COALESCE command

The failure_count is a docker internal counter that increments with successive failed health checks. If it was a checksum error, it should appear in the message itself. In this case the result is NULL which appears as empty (checksums disabled).
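
To make the NULL case concrete, here is a hedged sketch of how a check could interpret the text a query prints. The function names are hypothetical and this is not the actual healthcheck command from this PR; it only assumes the check compares the printed sum with `0`.

```python
def strict_is_healthy(query_output: str) -> bool:
    """Hypothetical strict check: only a literal "0" counts as healthy."""
    return query_output.strip() == "0"


def lenient_is_healthy(query_output: str) -> bool:
    """Hypothetical lenient check: empty output (a NULL sum, meaning
    checksums are disabled) is treated the same as zero failures."""
    value = query_output.strip()
    return value == "" or value == "0"


# Empty output stands for a NULL result with checksums disabled.
print(strict_is_healthy(""))    # False
print(lenient_is_healthy(""))   # True
# A non-zero count is a real checksum failure under either check.
print(lenient_is_healthy("3"))  # False
```

Using COALESCE in the query achieves the same effect as the lenient variant, because the NULL sum becomes `0` before any comparison happens.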

@zackpollard
Contributor

Ah, well that's not confusing at all 🤣 It did make me think though, do we have docs to help people recover if they do end up with a legitimate checksum error? Or at least somewhere to point them at in the postgres docs?

@mmomjian
Contributor

Practically speaking, the answer is to restore from backup. The checksums are only sufficient to detect corruption, not to repair it. I can add that to the FAQ entry about it though.

@reini12345

thx. With the updated healthcheck command the container is now "healthy"

image

For people like me who have no idea about postgres, more documentation would not be bad. I had already started looking for a way to repair the database.

Labels
deployment Deployment related tasks 🗄️server
Projects
None yet
Development

Successfully merging this pull request may close these issues.

immich_postgres healthcheck not working
4 participants