Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adapter: correct initial timestamp determination in transactions #20267

Merged
merged 1 commit into from Jul 5, 2023

Conversation

maddyblue
Copy link
Contributor

@maddyblue maddyblue commented Jun 30, 2023

In #19026 various timestamp functions were refactored to allow EXPLAIN TIMESTAMP to reflect (and effect) the current transaction. This is tricky code and the refactor incorrectly changed the behavior of transactions. In a transaction our intention is to find a timestamp during the first statement that will be valid for any other object a user might query later in the transaction, then aquire read holds on all those referenced objects. In that change, the first statement only used the referenced IDs when determining a timestamp, not all nearby (objects in the same schema) IDs. Read holds were still correctly aquired on all nearby objects.

This could cause the transaction's timestamp to not be in advance of the since of other nearby objects. For example, take two objects in the table. At the first query, object A has a since of 10, object B has a since of 20. A first query in the transaction SELECT * FROM A would choose 10 as its timestamp, then aquire read holds on A and B at 10. A later query in the transaction SELECT * FROM B would choose 10 as its timestamp (which it fetched from the in-progress transaction state), and panic because B's since is in advance of 10.

One of the reasons this happened is the transaction metadata only held on to the TimestampContext. EXPLAIN TIMESTAMP needs a larger struct, a TimestampDetermination. Teach the session metadata to track this outer struct so EXPLAIN can use it. This allows us to now only call determine_timestamp once during transaction start, instead of during each query in a transaction. Refactor some of the other read handling code to look more like it did pre-19026.

After this code, the test repro from 19743 no longer causes crashes.

Fixes #19743
Fixes #19897
Fixes #19398

Motivation

  • This PR fixes a recognized bug.

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • This PR includes the following user-facing behavior changes:
    • Fix timestamp generation for transactions with multiple statements that could lead to crashes.

@maddyblue maddyblue requested a review from a team as a code owner June 30, 2023 10:31
@maddyblue
Copy link
Contributor Author

There's no tests for this yet because there weren't before and writing a test scenario for exactly this is not currently within the realm of what our test code can do deterministically. We should add tests for this (both randomized and deterministic), but those may take some days to write and I think it may be better to get this into the release before those because we're already in a known broken state. The by-hand tests I did worked, so some tests have been done, just none automated.

Copy link
Contributor

@def- def- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on Philip's local repro the issue seems fixed. A deterministic test would be even better of course.

@jkosh44
Copy link
Contributor

jkosh44 commented Jun 30, 2023

I understand how the problem described leads to the errors in #19743, but I'm having trouble figuring out how they lead to the identifier missing errors in the other issues. Would you be able to include a description of that in your explanation?

src/adapter/src/coord/sequencer/inner.rs Outdated Show resolved Hide resolved
Comment on lines +2119 to 2121
source_bundle: &CollectionIdBundle,
source_ids: &BTreeSet<GlobalId>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an existing issue, but I'm finding it very difficult to tell the difference between source_ids and source_bundle. source_ids seems to be IDs directly referenced in the query, while source_bundle seems to be the result of a call to sufficient_collections on an index_oracle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at all important for this PR, I just wanted to call it out.

@benesch
Copy link
Member

benesch commented Jul 2, 2023

You had marked this as closing #20197 but I'm not seeing how that issue is connected to this PR, so I removed that from the PR description. Let me know if I've erred.

In MaterializeInc#19026 various timestamp functions were refactored to allow `EXPLAIN
TIMESTAMP` to reflect (and effect) the current transaction. This is
tricky code and the refactor incorrectly changed the behavior of
transactions. In a transaction our intention is to find a timestamp
during the first statement that will be valid for any other object a
user might query later in the transaction, then aquire read holds on all
those referenced objects. In that change, the first statement only used
the referenced IDs when determining a timestamp, not all nearby (objects
in the same schema) IDs. Read holds were still correctly aquired on all
nearby objects.

This could cause the transaction's timestamp to not be in advance of the
since of other nearby objects. For example, take two objects in the
table. At the first query, object A has a since of 10, object B has a
since of 20. A first query in the transaction `SELECT * FROM A` would
choose 10 as its timestamp, then aquire read holds on A and B at 10. A
later query in the transaction `SELECT * FROM B` would choose 10 as its
timestamp (which it fetched from the in-progress transaction state), and
panic because B's since is in advance of 10.

One of the reasons this happened is the transaction metadata only held
on to the TimestampContext. `EXPLAIN TIMESTAMP` needs a larger struct, a
TimestampDetermination. Teach the session metadata to track this outer
struct so EXPLAIN can use it. This allows us to now only call
determine_timestamp once during transaction start, instead of during
each query in a transaction. Refactor some of the other read handling
code to look more like it did pre-19026.
@jkosh44
Copy link
Contributor

jkosh44 commented Jul 5, 2023

I understand how the problem described leads to the errors in #19743, but I'm having trouble figuring out how they lead to the identifier missing errors in the other issues. Would you be able to include a description of that in your explanation?

Just wanted to bump this comment. I'm still a bit confused about this.

Copy link
Contributor

@jkosh44 jkosh44 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@maddyblue
Copy link
Contributor Author

I understand how the problem described leads to the errors in #19743, but I'm having trouble figuring out how they lead to the identifier missing errors in the other issues. Would you be able to include a description of that in your explanation?

I don't have an explanation about why this error happened or why this PR no longer is producing that error. My best guess is that those objects were dropped in other connections, and because the read hold was made for a time in the future, the controller thought it was safe to remove them. Then later when a query comes along for them, they don't exist.

@maddyblue maddyblue merged commit e628185 into MaterializeInc:main Jul 5, 2023
62 checks passed
@maddyblue maddyblue deleted the fix-sequence-timestamp branch July 5, 2023 22:33
@def- def- mentioned this pull request Jul 5, 2023
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants