Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

builtins,workload: add setseed function; use it to make RSW more deterministic #119042

Merged
merged 2 commits into from Feb 12, 2024

Conversation

rafiss
Copy link
Collaborator

@rafiss rafiss commented Feb 10, 2024

This implements the setseed function that appears in PostgreSQL.

We then use it to make the ORDER BY random()
clauses used in the random schema workload (RSW) deterministic.

Each worker goroutine of the RSW now runs a determistic sequence of
operations given the same COCKROACH_RANDOM_SEED. However, since the
workload is concurrent, the overall RSW is not deterministic, since the
interleaving of goroutine execution is random.

informs #117422

Release note (sql change): Added the setseed() builtin function. It sets
the seed for the random generator used by the random() builtin function.

@rafiss rafiss requested review from a team as code owners February 10, 2024 04:03
@rafiss rafiss requested review from srosenberg and removed request for a team February 10, 2024 04:03
@cockroach-teamcity
Copy link
Member

This change is Reviewable

This implements the function that appears in PostgreSQL.

Release note (sql change): Added the setseed() builtin function. It sets
the seed for the random generator used by the random() builtin function.
The recently added setseed builtin lets us make the ORDER BY random()
clauses used in this workload deterministic.

Each worker goroutine of the RSW now runs a determistic sequence of
operations given the same COCKROACH_RANDOM_SEED. However, since the
workload is concurrent, the overall RSW is not deterministic, since the
interleaving of goroutine execution is random.

This also makes the "deck" used to decide which operation to execute be
per-worker rather than global.

Release note: None
Copy link
Collaborator

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice work and this vastly improves things!

Reviewed 12 of 12 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @annrpom, @rafiss, @renatolabs, and @srosenberg)


pkg/workload/schemachange/operation_generator.go line 3259 at r2 (raw file):

	ctx context.Context, tx pgx.Tx, tableName tree.TableName, pctExisting int,
) (string, error) {
	if err := og.setSeedInDB(ctx, tx); err != nil {

Would it make sense for the worker to some how just do this once? I think that doesn't work because we can have expressions with random I think.

Copy link
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @annrpom, @fqazi, @renatolabs, and @srosenberg)


pkg/workload/schemachange/operation_generator.go line 3259 at r2 (raw file):

Previously, fqazi (Faizan Qazi) wrote…

Would it make sense for the worker to some how just do this once? I think that doesn't work because we can have expressions with random I think.

i think it needs to be per-op, due to interleaving execution of goroutines fighting against determinism. here's an example to see why:

scenario 1:

goroutine 1: CREATE TABLE t;
goroutine 1: DROP TABLE t;
goroutine 2: CREATE VIEW AS ... -- requires picking a table by sorting by random()

scenario 2:

goroutine 1: CREATE TABLE t;
goroutine 2: CREATE VIEW AS ... -- requires picking a table by sorting by random()
goroutine 1: DROP TABLE t;

in the first scenario, goroutine 2 will have invoked random() 0 times, since there were no tables to sort. in the second scenario, goroutine 2 will have invoked random() once, since it found a table to use.

however, the next time goroutine 2 uses random() we want it to generate the same number in both scenarios.

@rafiss
Copy link
Collaborator Author

rafiss commented Feb 12, 2024

tftr!

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 12, 2024

Build succeeded:

@craig craig bot merged commit dce042d into cockroachdb:master Feb 12, 2024
11 of 12 checks passed
@rafiss rafiss deleted the deterministic-rsw branch February 13, 2024 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants