uptime: not updated after a crash #180
Comments
Plan:
Before v10:
|
See PR #182 for 10+. For 9.1 to 9.6: I'm wondering if Rule: If the oldest non-NULL of these dates is after the |
Hi, You can not rely on Moreover, terminating all backends to reset the shared_buffers is not a real restart on its own. What you seem to be looking for is a way to detect that a backend crashed and when it did (and I have no idea how to do that right now). |
Right, that is only because I cannot rely on backend start times before PG10 (remotely at least). If the stats reset time changes for all databases (even template1?), you usually know about it.
From the user's point of view, it looks like this: connections dropped, transactions canceled... Such a thing is usually worth an investigation, and I know of no automated way to detect it with check_pga. Such a restart is not obvious on weekly charts. |
If it is worth investigating (and it is), investigating implies you can read the logs, which are packed with WARNING/ERROR messages in such a situation :) But I agree an alert from the supervision might be useful... if possible.
I suppose the cache hit/miss ratio should drop after the shared buffers are reset. |
Not so obvious if you do not really search for it. Especially on a weekly OPM graph. |
Indeed. However, if you set an alert on the cache hit/miss ratio, you should catch a crash through a very low ratio. I agree this is neither the best nor the most direct solution for this issue, but I have no other idea right now :/ |
Note that even for 10+, your solution relies on an indirect side effect as well :/ A much better one, but still not direct... |
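To make the hit ratio idea a bit more concrete, here is a rough sketch in Python. It assumes you sample `blks_hit` and `blks_read` from `pg_stat_database` at two points in time and compute the ratio over that interval only. The function names and the 0.5 threshold are made up for illustration and are not part of check_pgactivity.

```python
# Rough sketch: hit ratio between two samples of pg_stat_database counters.
# Each sample is a (blks_hit, blks_read) tuple, e.g. summed over all databases.
# Names and the default threshold are hypothetical, chosen for illustration only.

def interval_hit_ratio(prev, curr):
    """Return the cache hit ratio over the interval, or None if it cannot be computed."""
    hit = curr[0] - prev[0]
    read = curr[1] - prev[1]
    if hit < 0 or read < 0:
        # Counters went backwards: statistics were reset between the samples.
        return None
    total = hit + read
    if total == 0:
        # No block access during the interval, nothing to measure.
        return None
    return hit / total


def low_ratio_alert(ratio, threshold=0.5):
    """True when a ratio could be computed and falls below the assumed threshold."""
    return ratio is not None and ratio < threshold
```

A sample pair such as `(1000, 100)` then `(1900, 200)` gives a ratio of 0.9 over the interval, so no alert. Right after the shared buffers are emptied, most accesses would be reads and the ratio should fall well below the threshold. A `None` result is worth a look too, since counters going backwards means the statistics were reset.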
#182 merged (thanks ioguix). I do not see a way to detect the crash and restart before v10, so I am closing this issue. |
pg_conf_load_time() and pg_postmaster_start_time() are not updated when a crash occurs and the postmaster restarts all its children. The uptime service does not raise an alert.
Idea: check pg_stat_activity.backend_start for some vital process like checkpointer? (10+)
I have no idea how to track unexpected restarts before 10, though.
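A minimal sketch of the 10+ idea, in Python. It assumes the checkpointer is visible in `pg_stat_activity` with `backend_type = 'checkpointer'` (that column exists from v10 on) and that, after a normal start, its `backend_start` is very close to `pg_postmaster_start_time()`. The function name and the one-minute tolerance are hypothetical, and running the query is left to whatever client is in use.

```python
from datetime import datetime, timedelta, timezone

# Query to run on PostgreSQL 10+; executing it is out of scope for this sketch.
QUERY = """
SELECT pg_postmaster_start_time() AS postmaster_start,
       backend_start              AS checkpointer_start
FROM pg_stat_activity
WHERE backend_type = 'checkpointer'
"""


def crash_restart_suspected(postmaster_start, checkpointer_start,
                            tolerance=timedelta(minutes=1)):
    """True when the checkpointer started noticeably later than the postmaster.

    The tolerance is an assumed value that absorbs the normal delay between
    the postmaster start and the start of its children.
    """
    return checkpointer_start - postmaster_start > tolerance


# Example with made-up timestamps: the postmaster started at 08:00 and the
# checkpointer was (re)started at 14:30, which hints at a crash restart.
pm = datetime(2018, 5, 1, 8, 0, 0, tzinfo=timezone.utc)
ckpt = datetime(2018, 5, 1, 14, 30, 0, tzinfo=timezone.utc)
suspected = crash_restart_suspected(pm, ckpt)
```

With the made-up timestamps above, `suspected` is `True`. If the checkpointer started a few seconds after the postmaster, the function returns `False`.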