-
Notifications
You must be signed in to change notification settings - Fork 246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue where Watch in Postgres was looping endlessly #1278
Conversation
018edc5
to
20e3dd6
Compare
internal/datastore/postgres/watch.go
Outdated
// In order to make progress, we need to ensure that any seen transactions here are | ||
// marked as done in the revision given back to Postgres on the next iteration. We pick | ||
// the *last* transaction to start, as it should encompass all completed transactions | ||
// except those running concurrenctly, which is handled by calling markComplete on the other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// except those running concurrenctly, which is handled by calling markComplete on the other | |
// except those running concurrently, which is handled by calling markComplete on the other |
internal/datastore/postgres/watch.go
Outdated
// the *last* transaction to start, as it should encompass all completed transactions | ||
// except those running concurrenctly, which is handled by calling markComplete on the other | ||
// transactions. | ||
currentTxn = newTxns[len(newTxns)-1].postgresRevision |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a big difference, but you could move this to the end of the block and avoid the computation if there are errors in the watch.
} | ||
|
||
_, err = tx.Exec(ctx, fmt.Sprintf( | ||
`INSERT INTO %s ("xid", "snapshot") VALUES ('%d', '%d:%d:')`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same exact query above templated the column names, would be nice to be consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As its a test, I think its fine
colCaveatContext, | ||
colCreatedXid, | ||
nextx+1, | ||
)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the fix works by bumping the xid in the query forward, wouldn't it make sense to include a third write here that adds a _non_overlapping row at nextx+2
to ensure it doesn't get skipped (It should be fine because the query uses >=
, but just to be defensive)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted the test to exactly reproduce what the user reported; I could add another test if you like?
require.True(pds.watchEnabled) | ||
|
||
prev := rev.(postgresRevision) | ||
nextx := prev.snapshot.xmax + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nit, feel free to ignore:
nextx := prev.snapshot.xmax + 1 | |
nexttx := prev.snapshot.xmax + 1 |
20e3dd6
to
4a4f02b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This occurred if there were two exactly overlapping transactions, which caused the watch code to repeatedly "move backwards" when looking up the next revision The fix is to ensure that all found transactions are accounted in the overall transaction revision passed to the next iteration of the watch loop Fixes authzed#1272 Thanks to @aberkovsky for the bug report and detailed repro information!
4a4f02b
to
756bf15
Compare
This occurred if there were two exactly overlapping transactions, which caused the watch code to repeatedly "move backwards" when looking up the next revision
The fix is to ensure that all found transactions are accounted in the overall transaction revision passed to the next iteration of the watch loop
Fixes #1272
Thanks to @aberkovsky for the bug report and detailed repro information!