New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
builtins,workload: add setseed function; use it to make RSW more deterministic #119042
Conversation
06dc7bc
to
13758cf
Compare
This implements the function that appears in PostgreSQL. Release note (sql change): Added the setseed() builtin function. It sets the seed for the random generator used by the random() builtin function.
13758cf
to
837a8e4
Compare
The recently added setseed builtin lets us make the ORDER BY random() clauses used in this workload deterministic. Each worker goroutine of the RSW now runs a determistic sequence of operations given the same COCKROACH_RANDOM_SEED. However, since the workload is concurrent, the overall RSW is not deterministic, since the interleaving of goroutine execution is random. This also makes the "deck" used to decide which operation to execute be per-worker rather than global. Release note: None
837a8e4
to
a5d7d67
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work and this vastly improves things!
Reviewed 12 of 12 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @annrpom, @rafiss, @renatolabs, and @srosenberg)
pkg/workload/schemachange/operation_generator.go
line 3259 at r2 (raw file):
ctx context.Context, tx pgx.Tx, tableName tree.TableName, pctExisting int, ) (string, error) { if err := og.setSeedInDB(ctx, tx); err != nil {
Would it make sense for the worker to some how just do this once? I think that doesn't work because we can have expressions with random I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @annrpom, @fqazi, @renatolabs, and @srosenberg)
pkg/workload/schemachange/operation_generator.go
line 3259 at r2 (raw file):
Previously, fqazi (Faizan Qazi) wrote…
Would it make sense for the worker to some how just do this once? I think that doesn't work because we can have expressions with random I think.
i think it needs to be per-op, due to interleaving execution of goroutines fighting against determinism. here's an example to see why:
scenario 1:
goroutine 1: CREATE TABLE t;
goroutine 1: DROP TABLE t;
goroutine 2: CREATE VIEW AS ... -- requires picking a table by sorting by random()
scenario 2:
goroutine 1: CREATE TABLE t;
goroutine 2: CREATE VIEW AS ... -- requires picking a table by sorting by random()
goroutine 1: DROP TABLE t;
in the first scenario, goroutine 2 will have invoked random()
0 times, since there were no tables to sort. in the second scenario, goroutine 2 will have invoked random()
once, since it found a table to use.
however, the next time goroutine 2 uses random()
we want it to generate the same number in both scenarios.
tftr! bors r+ |
Build succeeded: |
This implements the setseed function that appears in PostgreSQL.
We then use it to make the ORDER BY random()
clauses used in the random schema workload (RSW) deterministic.
Each worker goroutine of the RSW now runs a determistic sequence of
operations given the same COCKROACH_RANDOM_SEED. However, since the
workload is concurrent, the overall RSW is not deterministic, since the
interleaving of goroutine execution is random.
informs #117422
Release note (sql change): Added the setseed() builtin function. It sets
the seed for the random generator used by the random() builtin function.