Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a new oldest_xmin check. #248

Merged
merged 9 commits into from Nov 4, 2020
Merged

Add a new oldest_xmin check. #248

merged 9 commits into from Nov 4, 2020

Conversation

rjuju
Copy link
Member

@rjuju rjuju commented Dec 23, 2019

This implements the check mentioned in #201.

I'm not extremely sure of the pg_stat_activity part. The problems I can see are:

  • as soon as eg. a transaction retains some xid, active queries started after will also look like they're retaining the same amount of transactions. However, if t hose queries continue to run after the initial transaction is finished, they'll be the source of retention. So it's quite hard to really identify what actually retains the xmin horizons from an aggregated view. The query is already complex enough, so I'm not in favor of adding extra steps to try to be smarter about that.

  • For pg_stat_activity, I'm not exactly sure if both backend_xmin and backend_xid can both be earlier than the other. Minimal testing showed than coalesce(backend_xmin, backend_xid) was enough, but I certainly didn't try a lot of scenario.

check_pgactivity Outdated Show resolved Hide resolved
check_pgactivity Outdated Show resolved Hide resolved
check_pgactivity Show resolved Hide resolved
check_pgactivity Outdated Show resolved Hide resolved
check_pgactivity Show resolved Hide resolved
check_pgactivity Outdated Show resolved Hide resolved
@ioguix ioguix self-requested a review December 30, 2019 16:37
@rjuju
Copy link
Member Author

rjuju commented Dec 31, 2019

As discussed off thread, I realized that I forgot to also handle the impacted databases. I just pushed an additional commit (that should be squashed, after a rebase I'll take care of that once everything is ok) to fix both that and @ioguix comments.

check_pgactivity Outdated Show resolved Hide resolved
check_pgactivity Outdated Show resolved Hide resolved
check_pgactivity Outdated Show resolved Hide resolved
check_pgactivity Show resolved Hide resolved
check_pgactivity Outdated Show resolved Hide resolved
@ioguix
Copy link
Member

ioguix commented Oct 29, 2020

Hey @rjuju,

I've been reviewing and testing. I closed all our active conversations as we finished discussing all of them.

I fixed a bug in the query were some NaN perfdata (idle_xact or active_xact or query) were not shown for database when they were set in other databases.

Please, do some test/review and fixup/merge when you are ok.

@ioguix
Copy link
Member

ioguix commented Oct 29, 2020

BTW, with only 4 connectable db (including template1 and postgres), I gathered 48 perfdatas. 12 metrics per database.

Maybe we can add a --summary arg or equivalent to only provide in perfdata the oldest xmin/age per database, and report in long messages what is the oldest horizon for each database?

@frost242
Copy link
Member

+1 for a --summary option.

@rjuju
Copy link
Member Author

rjuju commented Oct 29, 2020

Thanks @ioguix
I'm fine with a --summary option. However, just to be clear I don't expect anyone using that check for all databases nor for a long time. It's really a diagnosis check to help track down bloat issue happening at some random interval, but the rest of the time you don't really need to monitor it.

I'll review the changes soon.

@ioguix ioguix self-requested a review October 30, 2020 16:26
@ioguix
Copy link
Member

ioguix commented Oct 30, 2020

As discussed on IRC, I added a few commits:

  • thresholds are now optionnal
  • the check is now terse by default: one perfdata (oldest age) per database plus one info per database (what is the kind of the oldest xmin) in messages if there's some data to show.
  • new arg --detailed throw detailed perfdatas (12) for each database

Please, review.

++

@rjuju
Copy link
Member Author

rjuju commented Nov 4, 2020

I'm fine with the new behavior. About the patch, I think we should document the new --detailed option along with the other ones.

I also see that you don't check for compatibility with this option. Is that intended? I'm not sure if anyone will blindly add a --detailed everywhere just in case so that's may not be a huge problem, but until now our current approach is to explicitly check for every single option compatibility.

@ioguix
Copy link
Member

ioguix commented Nov 4, 2020

About the patch, I think we should document the new --detailed option along with the other ones.

Well, a lots of specifics argument are not documented with other, normal, ones. Specifics ones are only documented in their related service. Eg.: --query, --path, --type, --suffix, --uid, GUCs, --unarchiver, etc.

I also see that you don't check for compatibility with this option. Is that intended?

Fixed in last commit.

@ioguix ioguix merged commit d3199fa into master Nov 4, 2020
@ioguix ioguix deleted the oldest_xmin branch November 4, 2020 14:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants