
Don't generate monitoring snapshots for statements which only reference MON$DATABASE #7567

Open · mrotteveel opened this issue May 5, 2023 · 18 comments

@mrotteveel (Member)

Currently, referencing any monitoring table will generate a monitoring snapshot. This is a relatively costly operation, and it should be unnecessary for queries which only reference the MON$DATABASE monitoring table.

Excluding MON$DATABASE from triggering a monitoring snapshot should make it cheaper to use that table.
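For context, the kind of statement this is about is a plain lookup against MON$DATABASE, which today still materializes a full monitoring snapshot for the transaction. A minimal sketch (standard MON$DATABASE columns):

```sql
-- Reads only MON$DATABASE, but currently still triggers a full
-- monitoring snapshot covering all MON$ tables.
SELECT MON$DATABASE_NAME, MON$ODS_MAJOR, MON$NEXT_TRANSACTION
FROM MON$DATABASE;
```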

@AlexPeshkoff (Member)

That's not as good as it seems at first glance. First of all, the snapshot is created not per statement but per transaction, and nobody can guarantee that such a transaction does not reference other monitoring tables later. I.e. we could end up with an inconsistency between MON$DATABASE and the other monitoring tables.
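A sketch of the scenario being described, assuming MON$DATABASE were excluded from snapshot creation (standard MON$ columns):

```sql
-- Same transaction:
SELECT MON$NEXT_TRANSACTION FROM MON$DATABASE;   -- would no longer create a snapshot
-- ... some time passes, new transactions start ...
SELECT MON$TRANSACTION_ID FROM MON$TRANSACTIONS; -- the snapshot is created only now, so it
                                                 -- may already list transactions newer than
                                                 -- the value read above
```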

@hvlad (Member) commented May 5, 2023

Agree with @AlexPeshkoff.

Instead of creating "short paths" for some very special cases, we need to rethink the whole monitoring architecture, IMHO.

@mrotteveel (Member, Author)

> That's not as good as it seems at first glance. First of all, the snapshot is created not per statement but per transaction, and nobody can guarantee that such a transaction does not reference other monitoring tables later. I.e. we could end up with an inconsistency between MON$DATABASE and the other monitoring tables.

What inconsistencies are you thinking of? I don't see which data in MON$DATABASE could become inconsistent with other monitoring tables.

@AlexPeshkoff (Member) commented May 5, 2023 via email

@aafemt (Contributor) commented May 5, 2023

Transaction and attachment numbers.

+1 for rethinking the whole system. IMHO, monitoring tables should use dirty reads (maybe with cursor stability), i.e. be formed at the moment the request accesses them.

@mrotteveel (Member, Author)

> For example, a transaction with the next transaction number may arrive in mon$transactions.

I can't come up with a scenario where that would result in an inconsistency or a problem. I think it would be an acceptable risk.

@livius2 commented May 5, 2023

Maybe it would be better to add an alias instead: e.g. a TMP$DATABASE table that does not take the whole snapshot but otherwise exposes the same data as MON$DATABASE.

@AlexPeshkoff (Member) commented May 5, 2023 via email

@dyemanov (Member) commented May 5, 2023

I don't mind rethinking the original idea. I just want to mention two points that caused the snapshot to be transaction-level.

  1. Different calls to the MON$ tables inside one transaction should return consistent results. Yes, usually only statement-level consistency is enough. But sometimes you may want to query MON$ATTACHMENTS and only later query e.g. MON$IO_STATS for some particular attachment. And you expect IDs to be consistent between these two queries.

  2. Different calls to the MON$ tables inside one transaction should be fast. Monitoring was never lightning fast, and while there have been improvements in the performance area in v3 and also recently in v5, caching the snapshot could still be a good idea.

That said, I don't mind having both consistency options (transaction-level and statement-level) available, let's just define how it should be controlled by users.
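A sketch of the two-step usage from point 1, where both statements run in the same transaction and rely on seeing the same snapshot (standard MON$ columns; the parameter is a placeholder):

```sql
-- Step 1: list attachments and pick one of interest.
SELECT MON$ATTACHMENT_ID, MON$USER, MON$STAT_ID
FROM MON$ATTACHMENTS;

-- Step 2, later in the same transaction: drill down into its I/O counters.
SELECT MON$PAGE_READS, MON$PAGE_WRITES, MON$PAGE_FETCHES
FROM MON$IO_STATS
WHERE MON$STAT_ID = :chosen_stat_id;  -- placeholder for the value picked in step 1
```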

@hvlad (Member) commented May 5, 2023

I also consider new session control statements to be the way to go.

For a start, things that could be managed (a hypothetical sketch follows the list):

  • snapshot scope: all tables, or just those required by the query;
  • snapshot lifetime: per query, until transaction end, or explicitly defined by the user.
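Purely to illustrate the idea, a hypothetical sketch of what such statements might look like; this syntax does not exist in Firebird:

```sql
-- Hypothetical syntax, for illustration only:
SET MONITORING SNAPSHOT SCOPE QUERY;          -- or ALL: load only the tables the query needs
SET MONITORING SNAPSHOT LIFETIME STATEMENT;   -- or TRANSACTION, or USER (explicit release)
```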

@aafemt (Contributor) commented May 5, 2023

> But sometimes you may want to query MON$ATTACHMENTS and only later query e.g. MON$IO_STATS for some particular attachment. And you expect IDs to be consistent between these two queries.

And aren't they? I was sure that the ID is the attachment_id and that it is stable during the attachment's lifetime. The same for transactions and statements.

Also, I cannot speak for everybody, but if the attachment has disappeared by the moment I query IO_STATS, it is fine for me to get nothing.

@livius2 commented May 5, 2023

> And aren't they?

You misunderstood the ticket. It is about changing MON$DATABASE so that it does not create the whole monitoring snapshot.
The comment above warns that if this is changed, it may produce such problems.
Currently it is consistent throughout the whole transaction.

@aafemt (Contributor) commented May 5, 2023

Yes, and I would like to hear how this inconsistency in IDs can appear if these IDs are stable during the whole object's lifetime, not just for the monitoring snapshot.

@dyemanov (Member) commented May 6, 2023

> But sometimes you may want to query MON$ATTACHMENTS and only later query e.g. MON$IO_STATS for some particular attachment. And you expect IDs to be consistent between these two queries.
>
> And aren't they? I was sure that the ID is the attachment_id and that it is stable during the attachment's lifetime. The same for transactions and statements.

Attachment/transaction/statement IDs are stable. But the MON$*_STATS tables have an artificial primary key which is globally unique in the shared memory but remapped to snapshot-level artificial IDs when the snapshot is created, so two different snapshots may have two different IDs for the same object.
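One way to avoid relying on cross-statement stability of MON$STAT_ID is to resolve it and read the statistics within a single statement, i.e. within a single snapshot; a small sketch using the standard MON$ columns:

```sql
-- Join within one statement, so both tables come from the same snapshot.
SELECT a.MON$ATTACHMENT_ID, io.MON$PAGE_READS, io.MON$PAGE_WRITES
FROM MON$ATTACHMENTS a
JOIN MON$IO_STATS io ON io.MON$STAT_ID = a.MON$STAT_ID
WHERE a.MON$ATTACHMENT_ID = CURRENT_CONNECTION;
```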

@dyemanov (Member) commented May 6, 2023

> I also consider new session control statements to be the way to go.
>
> For a start, things that could be managed:
>
> * snapshot scope: all tables, or just those required by the query;
> * snapshot lifetime: per query, until transaction end, or explicitly defined by the user.

Snapshot scope is meaningless for statement-level snapshots, as we already know all the tables accessed by the statement. And while I agree that we could control that scope for transaction-level snapshots, I'm not really sure this is needed. If we accept that a dynamically extended snapshot (with tables loaded on demand) may be inconsistent between its tables, then the user may just as well use a statement-level snapshot with the same side effects. The only useful use case that comes to mind is when the user wants to get a small snapshot ASAP (without loading the huge mon$compiled_statements, for example) and will query this snapshot later in the same transaction, which also must be fast. I dunno how common this is in practice.

@sim1984 commented May 6, 2023

Why not simply link the scope of the snapshot to the isolation level of the transaction? For READ COMMITTED, take the snapshot at the statement level; for SNAPSHOT, at the transaction level.
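Under that proposal the snapshot behaviour would follow the existing transaction options; the SET TRANSACTION syntax below is real, but the snapshot behaviour described in the comments is the proposed one, not the current one:

```sql
SET TRANSACTION ISOLATION LEVEL READ COMMITTED; -- proposed: MON$ snapshot per statement
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;       -- proposed: MON$ snapshot per transaction
```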

@dyemanov (Member) commented May 6, 2023

It's also an option, but what if someone needs to use a different snapshot level than their transaction's (which is generally unknown inside a procedure)? Using an autonomous transaction could help, but only if it allowed overriding the parent transaction's options, which AFAIK is currently impossible (although IMO it should be supported).
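For reference, the autonomous transaction construct in question; a minimal PSQL sketch, noting that the autonomous transaction currently inherits the parent transaction's isolation level and other options:

```sql
EXECUTE BLOCK RETURNS (page_reads BIGINT)
AS
BEGIN
  -- Starts a separate transaction (and thus presumably its own monitoring
  -- snapshot), but with the same isolation and options as the parent transaction.
  IN AUTONOMOUS TRANSACTION DO
    SELECT io.MON$PAGE_READS
      FROM MON$ATTACHMENTS a
      JOIN MON$IO_STATS io ON io.MON$STAT_ID = a.MON$STAT_ID
     WHERE a.MON$ATTACHMENT_ID = CURRENT_CONNECTION
      INTO :page_reads;
  SUSPEND;
END
```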

@AlexPeshkoff (Member) commented May 7, 2023 via email
