asim: add randomness to range generation #106311
Labels
A-kv-simulation
Relating to allocation simulation.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Projects
Comments
7 tasks
wenyihu6
added a commit
to wenyihu6/cockroach
that referenced
this issue
Jul 18, 2023
This patch lays the backbone of the randomized testing framework. Currently, it only supports default configuration for all options, implying that there is no randomization yet. Additionally, it refactors some of the existing structure in data_driven_test. Note that this should not change any existing behavior, and the main purpose is to make future commits cleaner.

Part of: cockroachdb#106311
Release note: None
wenyihu6
added a commit
to wenyihu6/cockroach
that referenced
this issue
Jul 18, 2023
This patch takes the first step towards a randomized framework by enabling asim testing to randomly select a cluster information configuration from a set of predefined choices. These choices are hardcoded and represent common cluster configurations.

Part of: cockroachdb#106311
Release note: None
wenyihu6
added a commit
to wenyihu6/cockroach
that referenced
this issue
Aug 21, 2023
Previously, the randomized testing framework depended on default settings hardcoded in the tests, so users had to edit code-configured parameters to change the settings. This patch converts the framework to a data-driven approach, enabling more dynamic user inputs, more testing examples, and greater visibility into what each iteration is testing.

TestRandomized is a randomized data-driven testing framework that validates allocators by creating randomized configurations. It is designed for regression and exploratory testing.

**There are three modes for every aspect of randomized generation.**
- Static Mode:
  1. If randomization options are disabled (e.g. no rand_ranges command is used), the system uses the default configurations (defined in default_settings.go) with no randomization.
- Randomized: two scenarios occur:
  2. Use default settings for randomized generation (e.g. rand_ranges)
  3. Use settings specified with commands (e.g. rand_ranges range_gen_type=zipf)

**The following commands are provided:**
```
1. "rand_cluster" [cluster_gen_type=(single_region|multi_region|any_region)]
e.g. rand_cluster cluster_gen_type=(multi_region)
- rand_cluster: randomly picks a predefined cluster configuration according to the specified type.
- cluster_gen_type (default value is multi_region) is cluster configuration type. On the next eval, the cluster is generated as the initial state of the simulation.

2. "rand_ranges" [placement_type=(even|skewed|random|weighted_rand)] [replication_factor=<int>] [range_gen_type=(uniform|zipf)] [keyspace_gen_type=(uniform|zipf)] [weighted_rand=(<[]float64>)]
e.g. rand_ranges placement_type=weighted_rand weighted_rand=(0.1,0.2,0.7)
e.g. rand_ranges placement_type=skewed replication_factor=1 range_gen_type=zipf keyspace_gen_type=uniform
- rand_ranges: randomly generates a distribution of ranges across stores based on the specified parameters. On the next call to eval, ranges and their replica placement are generated and loaded to initial state.
- placement_type(default value is even): defines the type of range placement distribution across stores. Once set, it remains constant across iterations with no randomization involved.
- replication_factor(default value is 3): represents the replication factor of each range. Once set, it remains constant across iterations with no randomization involved.
- range_gen_type(default value is uniform): represents the type of distribution used to yield the range parameter as ranges are generated across iterations (range ∈[1, 1000]).
- keyspace_gen_type: represents the type of distribution used to yield the keyspace parameter as ranges are generated across iterations (keyspace ∈[1000,200000]).
- weighted_rand: specifies the weighted random distribution among stores. Requirements (will panic otherwise):
  1. weighted_rand should only be used with placement_type=weighted_rand and vice versa.
  2. Must specify a weight between [0.0, 1.0] for each element in the array, with each element corresponding to a store
  3. len(weighted_rand) cannot be greater than number of stores
  4. sum of weights in the array should be equal to 1

3. "eval" [seed=<int64>] [num_iterations=<int>] [duration=<time.Duration>] [verbose=<bool>]
e.g. eval seed=20 duration=30m2s verbose=true
- eval: generates a simulation based on the configuration set with the given commands.
- seed(default value is int64(42)): used to create a new random number generator which will then be used to create a new seed for each iteration.
- num_iterations(default value is 3): specifies the number of simulations to run.
- duration(default value is 10m): defines duration of each iteration.
- verbose(default value is false): if set to true, plots all stat(as specified by defaultStat) history.
```

RandTestingFramework is initialized with specified testSetting and maintains its state across all iterations. It repeats the test with different random configurations. Each iteration in RandTestingFramework executes the following steps:
1. Generates a random configuration: based on whether randOption is on and the specific settings for randomized generation.
2. Executes the simulation and checks the assertions on the final state.
3. Stores any outputs and assertion failures in a buffer.

Release note: None
Part Of: cockroachdb#106311
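The seed handling described for `eval` (one generator created from the given seed, which then produces a fresh seed for each iteration) can be sketched in Go as follows. This is an illustration under assumptions, not the framework's code: `iterationSeeds` is a hypothetical helper, and the use of `math/rand` with `Int63` is an assumption about how such seeds might be drawn.

```go
package main

import (
	"fmt"
	"math/rand"
)

// iterationSeeds derives one seed per iteration from a single master seed,
// so an entire run can be reproduced from the master seed alone.
func iterationSeeds(masterSeed int64, numIterations int) []int64 {
	seedGen := rand.New(rand.NewSource(masterSeed))
	seeds := make([]int64, numIterations)
	for i := range seeds {
		seeds[i] = seedGen.Int63()
	}
	return seeds
}

func main() {
	// 42 and 3 mirror the documented defaults for seed and num_iterations.
	for i, s := range iterationSeeds(42, 3) {
		// Each iteration would build its own generator from its own seed.
		iterRNG := rand.New(rand.NewSource(s))
		fmt.Printf("iteration %d: seed=%d first draw=%d\n", i+1, s, iterRNG.Intn(1000))
	}
}
```

Giving every iteration its own seed means a single failing iteration can be rerun in isolation without replaying the iterations before it.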
wenyihu6
added a commit
to wenyihu6/cockroach
that referenced
this issue
Aug 21, 2023
Previously, the randomized testing framework's output only indicated whether each iteration passed. This lack of detail made checking the randomized framework and debugging challenging. This patch adds more info to the output, including the selected configurations and the initial state of each simulation. Additionally, this patch removes the verbosity flag for printing history plots, as it does not seem to have any practical use cases.

New verbosity flags for eval are now supported.
```
"eval" [verbose=(<[]("result_only","test_settings","initial_state","config_gen","topology","all")>)]
- verbose(default value is OutputResultOnly): used to set flags on what to show in the test output messages. By default, all details are displayed upon assertion failure.
  - result_only: only shows whether the test passed or failed, along with any failure messages
  - test_settings: displays settings used for the repeated tests
  - initial_state: displays the initial state of each test iteration
  - config_gen: displays the input configurations generated for each test iteration
  - topology: displays the topology of cluster configurations
  - all: displays everything above
```

Part Of: [cockroachdb#106311](https://github.com/kvoli/cockroach/issues/106311)
Release Note: none
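One common way to model a list of output options like this is a bit set, where each option is one bit and `all` is the union. The Go sketch below shows that idea only; the type `outputFlags`, the constants, and `parseVerbose` are hypothetical, and the source does not say the patch is implemented this way.

```go
package main

import "fmt"

// outputFlags is a hypothetical bit set of output sections to print.
type outputFlags int

const (
	outputTestSettings outputFlags = 1 << iota
	outputInitialState
	outputConfigGen
	outputTopology
)

const (
	// outputResultOnly sets no extra bits: only pass/fail is shown.
	outputResultOnly outputFlags = 0
	outputAll                    = outputTestSettings | outputInitialState | outputConfigGen | outputTopology
)

// flagByName maps the option names from the eval command to bits.
var flagByName = map[string]outputFlags{
	"result_only":   outputResultOnly,
	"test_settings": outputTestSettings,
	"initial_state": outputInitialState,
	"config_gen":    outputConfigGen,
	"topology":      outputTopology,
	"all":           outputAll,
}

// parseVerbose combines the named options into one flag set and
// rejects names that are not recognized.
func parseVerbose(names []string) (outputFlags, error) {
	var flags outputFlags
	for _, n := range names {
		f, ok := flagByName[n]
		if !ok {
			return 0, fmt.Errorf("unknown verbose option %q", n)
		}
		flags |= f
	}
	return flags, nil
}

// has reports whether any bit of g is set in f.
func (f outputFlags) has(g outputFlags) bool { return f&g != 0 }

func main() {
	flags, err := parseVerbose([]string{"initial_state", "topology"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("show topology:", flags.has(outputTopology))
	fmt.Println("show config_gen:", flags.has(outputConfigGen))
}
```

With this shape, an empty option list naturally falls back to result-only output, which matches the documented default.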
wenyihu6
added a commit
to wenyihu6/cockroach
that referenced
this issue
Aug 21, 2023
This patch allows users to modify the settings for the static mode within the randomized testing framework.

The following command is now supported:
```
4. "change_static_option" [nodes=<int>] [stores_per_node=<int>] [rw_ratio=<float64>] [rate=<float64>] [min_block=<int>] [max_block=<int>] [min_key=<int64>] [max_key=<int64>] [skewed_access=<bool>] [ranges=<int>] [placement_type=<gen.PlacementType>] [key_space=<int>] [replication_factor=<int>] [bytes=<int64>] [stat=<string>] [height=<int>] [width=<int>]
e.g. change_static_option nodes=2 stores_per_node=3 placement_type=skewed
- change_static_option: modifies the settings for the static mode where no randomization is involved. Note that this does not change the default settings for any randomized generation.
- nodes (default value is 3): number of nodes in the generated cluster
- storesPerNode (default value is 1): number of stores per node in the generated cluster
- rwRatio (default value is 0.0): read-write ratio of the generated load
- rate (default value is 0.0): rate at which the load is generated
- minBlock (default value is 1): min size of each load event
- maxBlock (default value is 1): max size of each load event
- minKey (default value is int64(1)): min key of the generated load
- maxKey (default value is int64(200000)): max key of the generated load
- skewedAccess (default value is false): if true, workload key generator is skewed (zipf)
- ranges (default value is 1): number of generated ranges
- keySpace (default value is 200000): keyspace for the generated range
- placementType (default value is gen.Even): type of distribution for how ranges are distributed across stores
- replicationFactor (default value is 3): number of replicas for each range
- bytes (default value is int64(0)): size of each range in bytes
- stat (default value is "replicas"): specifies the output to be plotted for the verbose option
- height (default value is 15): height of the plot
- width (default value is 80): width of the plot
```

In addition, verbose=(static_settings) can now be used to display settings used for static options where no randomization is involved.

Part of: cockroachdb#106311
Release note: None
wenyihu6
added a commit
to wenyihu6/cockroach
that referenced
this issue
Aug 21, 2023
Previously, we enforced that the length of a given `weighted_rand` could not exceed the number of stores. This was challenging for users, as they might not know the cluster configuration that would be generated and thus would not know the number of stores. In addition, if the length of `weighted_rand` was less than the total number of stores, any stores outside of the `weighted_rand` range would simply have zero replicas. This could lead to confusion.

To improve user control, this patch disables the use of weighted_rand with randomized cluster generation.

Requirements to use weighted_rand:
1. use static option for cluster generation
2. specify nodes(default:3) and stores_per_node(default:1) through change_static_option
3. ensure len(weighted_rand) == number of stores == nodes * stores_per_node

In addition to these new rules, the following existing requirements remain in place:
1. weighted_rand should only be used with placement_type=weighted_rand and vice versa.
2. must specify a weight between [0.0, 1.0] for each element in the array, with each element corresponding to a store
3. sum of weights in the array should be equal to 1

Resolves: cockroachdb#106311
Release note: None
craig bot
pushed a commit
that referenced
this issue
Aug 21, 2023
…109142 #109152 #109156 #109157 #109161 #109165 #109166 #109172

107957: asim: convert randomized testing to data-driven r=kvoli a=wenyihu6

**asim: remove extra parsing for []float64, float64, time.Duration**

In cockroachdb/datadriven#45, we upstreamed the scanning implementation in the `datadriven` library. We can now handle parsing of []float64, float64, and time.Duration without additional handling.

Release Note: none
Epic: none

---

**asim: enable user-defined repliFactor, placement in rand range_gen**

This patch introduces two additional options for randomized range generation, letting users define the replication factor and placement type. Although some aspects of range configs are randomly generated (ranges and keyspace), these two configurations are not randomized. Once set by the user, the configuration persists across iterations.

Release Note: none
Part Of: #106311

---

**asim: convert randomized testing to data-driven**

Previously, the randomized testing framework depended on default settings hardcoded in the tests, so users had to edit code-configured parameters to change the settings. This patch converts the framework to a data-driven approach, enabling more dynamic user inputs, more testing examples, and greater visibility into what each iteration is testing.

TestRandomized is a randomized data-driven testing framework that validates allocators by creating randomized configurations. It is designed for regression and exploratory testing.

**There are three modes for every aspect of randomized generation.**
- Static Mode:
  1. If randomization options are disabled (e.g. no rand_ranges command is used), the system uses the default configurations (defined in default_settings.go) with no randomization.
- Randomized: two scenarios occur:
  2. Use default settings for randomized generation (e.g. rand_ranges)
  3. Use settings specified with commands (e.g. rand_ranges range_gen_type=zipf)

**The following commands are provided:**
```
1. "rand_cluster" [cluster_gen_type=(single_region|multi_region|any_region)]
e.g. rand_cluster cluster_gen_type=(multi_region)
- rand_cluster: randomly picks a predefined cluster configuration according to the specified type.
- cluster_gen_type (default value is multi_region) is cluster configuration type. On the next eval, the cluster is generated as the initial state of the simulation.

2. "rand_ranges" [placement_type=(even|skewed|random|weighted_rand)] [replication_factor=<int>] [range_gen_type=(uniform|zipf)] [keyspace_gen_type=(uniform|zipf)] [weighted_rand=(<[]float64>)]
e.g. rand_ranges placement_type=weighted_rand weighted_rand=(0.1,0.2,0.7)
e.g. rand_ranges placement_type=skewed replication_factor=1 range_gen_type=zipf keyspace_gen_type=uniform
- rand_ranges: randomly generates a distribution of ranges across stores based on the specified parameters. On the next call to eval, ranges and their replica placement are generated and loaded to initial state.
- placement_type(default value is even): defines the type of range placement distribution across stores. Once set, it remains constant across iterations with no randomization involved.
- replication_factor(default value is 3): represents the replication factor of each range. Once set, it remains constant across iterations with no randomization involved.
- range_gen_type(default value is uniform): represents the type of distribution used to yield the range parameter as ranges are generated across iterations (range ∈[1, 1000]).
- keyspace_gen_type: represents the type of distribution used to yield the keyspace parameter as ranges are generated across iterations (keyspace ∈[1000,200000]).
- weighted_rand: specifies the weighted random distribution among stores. Requirements (will panic otherwise):
  1. weighted_rand should only be used with placement_type=weighted_rand and vice versa.
  2. Must specify a weight between [0.0, 1.0] for each element in the array, with each element corresponding to a store
  3. len(weighted_rand) cannot be greater than number of stores
  4. sum of weights in the array should be equal to 1

3. "eval" [seed=<int64>] [num_iterations=<int>] [duration=<time.Duration>] [verbose=<bool>]
e.g. eval seed=20 duration=30m2s verbose=true
- eval: generates a simulation based on the configuration set with the given commands.
- seed(default value is int64(42)): used to create a new random number generator which will then be used to create a new seed for each iteration.
- num_iterations(default value is 3): specifies the number of simulations to run.
- duration(default value is 10m): defines duration of each iteration.
- verbose(default value is false): if set to true, plots all stat(as specified by defaultStat) history.
```

RandTestingFramework is initialized with specified testSetting and maintains its state across all iterations. It repeats the test with different random configurations. Each iteration in RandTestingFramework executes the following steps:
1. Generates a random configuration: based on whether randOption is on and the specific settings for randomized generation.
2. Executes the simulation and checks the assertions on the final state.
3. Stores any outputs and assertion failures in a buffer.

Release note: None
Part Of: #106311

108185: server: remove support for sticky engines r=itsbilal a=jbowens

Remove support for reusing engines from the StickyVFSRegistry. Tests should not depend on ephemeral, in-memory engine state between server restarts, or read closed Engine state.

Close #108119.

108467: sql: implement oidvectortypes builtin r=fqazi a=fqazi

Previously, the oidvectortypes builtin wasn't implemented, causing a compatibility gap for tools that need to format oidvectors. To address this, this patch adds the oidvectortypes builtin.

Fixes: #107942
Release note (sql change): The oidvectortypes built-in has been implemented, which can format oidvector.
108678: closedts: make settings TenantReadOnly and public r=erikgrinaker a=erikgrinaker

It doesn't make sense for these to be `TenantWritable`, since the side transport runs below KV. Furthermore, these settings are referenced throughout our documentation, so make them public.

These should really be set only for the system tenant, and secondary tenants could simply read the system tenant's setting. This functionality runs in the host cluster below KV and it doesn't make any sense to set individual settings for tenants here. Unfortunately, this isn't currently possible with the existing settings classes: there is no way for secondary tenants to access the host's settings.

Touches #108677.
Epic: none
Release note (ops change): The following closed timestamp side-transport settings can no longer be set from secondary tenants (they did not have an effect in secondary tenants): kv.closed_timestamp.target_duration, kv.closed_timestamp.side_transport_interval, and kv.closed_timestamp.lead_for_global_reads_override.

108845: sql: add last_updated column to crdb_internal.kv_protected_ts_records r=jayshrivastava a=jayshrivastava

This change adds a `last_updated` column to the protected timestamps virtual table. This column contains the mvcc timestamp of the row. Having this column present in this table, which is included in debug zips, improves observability when debugging issues.

Informs: #104161
Release note: None
Epic: None

109029: sql: fix TestCreateStatisticsCanBeCancelled txn retry hang r=fqazi a=fqazi

Previously, this test could hang if an automatic stats collection came in concurrently with a manual stats collection, where the request filter would end up hanging and being called twice. To address this, this patch disables automatic stats collection on the table.

Fixes: #109007
Release note: None

109049: concurrency: allow multiple transactions to hold locks on a single key r=nvanbenschoten a=arulajmani

Locks on a single key are stored in the `lockState` struct. Prior to this patch, the lock table only expected a single transaction to hold a lock on a given key at any point in time. This restriction needs to be lifted for shared locks, whose semantics allow multiple transactions to hold locks on a single key.

This patch changes the `lockState` data structure so that it can be generalized in the future. We don't actually allow multiple transactions to acquire locks on a single key just yet; that will come in a subsequent patch.

Informs #91545
Release note: None

109087: storage: defer putBuffer release in all cases r=nvanbenschoten a=nvanbenschoten

Minor cleanup. This commit switches the remainder of the calls to putBuffer.release to be deferred, instead of being manually called at the end of their function. The comments mentioning that the defer was "measurably slower" were introduced in 4444618, which was before Go 1.14 optimized the performance of defer. Most of these, including the more performance-sensitive calls, were already switched over to use defer in fbe8852.

Epic: None
Release note: None

109142: roachtest: Cast snapshot-recd bytes to int in disagg-rebalance r=jbowens a=itsbilal

Previously we were reading a float value as an int, which would trip up the Scan() method if the float value was large enough to be sent over the wire in scientific notation, e.g. `2.3456E7`. This change ensures that Cockroach prints out the value as an integer to avoid the scan-time error in the roachtest.

Fixes #109114.
Epic: none
Release note: None

109152: build: update some configurations for remote build execution r=rail a=rickystewart

1. Use the `large` pool of executors for `enormous` test targets
2. Add (temporary) network access to the following tests: `amazon_test`, `base_test`, `cloudprivilege_test`, `externalconn_test`, and `cockroach-go-testserver-upgrade-to-master` logictests. These erroneously have a dependency on network assets; bugs have been filed for each of these.

Epic: CRDB-8308
Release note: None

109156: sql: version gate UNIQUE constraint with json column r=rafiss a=rafiss

This prevents the usage of a json column in a unique constraint until after the upgrade is finalized.

fixes #108978
Release note: None

109157: ci,ui: don't lint `e2e-tests` r=sjbarag a=rickystewart

This workspace has a huge download of `cypress` which was causing CI to flake.

Epic: none
Release note: None

109161: workload: add background qos to kv workload r=bananabrick a=bananabrick

A --background-qos flag can be used in the kv workload to ensure that the generated work is treated as low priority by admission control.

Epic: none
Release note: None

109165: Revert "rangefeed/changefeed: Enable mux rangefeeds by default." r=erikgrinaker a=erikgrinaker

This reverts commit de65c54. We decided to keep these disabled for another release, to get more real-world experience with them first.

Touches #95781.
Touches #105270.
Release note (performance improvement): The following release note no longer applies: "mux range feeds reuse connection and workers across multiple range feeds. This mode is now enabled by default."

109166: build: more resources for building AWS dependency r=rail a=rickystewart

This is a huge package with apparently a lot of auto-generated code that was causing OOMs on EngFlow RBE. This fixes it.

Epic: none
Release note: CRDB-8308

109172: storage: Fix panic in MVCCHistories test r=jbowens a=itsbilal

storage_test.intentPrintingReadWriter previously did not support ReaderWithMustIterators.

Epic: none
Release note: None

Co-authored-by: wenyihu6 <wenyi.hu@cockroachlabs.com>
Co-authored-by: Jackson Owens <jackson@cockroachlabs.com>
Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com>
Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com>
Co-authored-by: Jayant Shrivastava <jayants@cockroachlabs.com>
Co-authored-by: Arul Ajmani <arulajmani@gmail.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
Co-authored-by: Arjun Nair <nair@cockroachlabs.com>
wenyihu6
added a commit
to wenyihu6/cockroach
that referenced
this issue
Aug 24, 2023
Previously, the randomized testing framework's output only indicated whether each iteration passed. This lack of detail made checking the randomized framework and debugging challenging. This patch adds more info to the output, including the selected configurations and the initial state of each simulation. Additionally, this patch removes the verbosity flag for printing history plots, as it does not seem to have any practical use cases.

New verbosity flags for eval are now supported.
```
"eval" [verbose=(<[]("result_only","test_settings","initial_state","config_gen","topology","all")>)]
- verbose(default value is OutputResultOnly): used to set flags on what to show in the test output messages. By default, all details are displayed upon assertion failure.
  - result_only: only shows whether the test passed or failed, along with any failure messages
  - test_settings: displays settings used for the repeated tests
  - initial_state: displays the initial state of each test iteration
  - config_gen: displays the input configurations generated for each test iteration
  - topology: displays the topology of cluster configurations
  - all: displays everything above
```

Part Of: [cockroachdb#106311](https://github.com/kvoli/cockroach/issues/106311)
Release Note: none
(cherry picked from commit 27e9bd438a6ef1f6f0ed7205d91f28c9b54fd314)
craig bot
pushed a commit
that referenced
this issue
Aug 25, 2023
108059: asim: better outputs for data-driven tests r=kvoli a=wenyihu6

**asim: sort before iterating over maps when printing**

Previously, the simulator iterated over an unordered map when formatting and printing states, stores, and ranges, resulting in non-deterministic output. This patch addresses the issue by sorting the maps by key before printing and formatting.

Release note: none
Epic: none

---

**asim: better outputs for data-driven tests**

Previously, the randomized testing framework's output only indicated whether each iteration passed. This lack of detail made checking the randomized framework and debugging challenging. This patch adds more info to the output, including the selected configurations and the initial state of each simulation. Additionally, this patch removes the verbosity flag for printing history plots, as it does not seem to have any practical use cases.

New verbosity flags for eval are now supported.
```
"eval" [verbose=(<[]("result_only","test_settings","initial_state","config_gen","topology","all")>)]
- verbose(default value is OutputResultOnly): used to set flags on what to show in the test output messages. By default, all details are displayed upon assertion failure.
  - result_only: only shows whether the test passed or failed, along with any failure messages
  - test_settings: displays settings used for the repeated tests
  - initial_state: displays the initial state of each test iteration
  - config_gen: displays the input configurations generated for each test iteration
  - topology: displays the topology of cluster configurations
  - all: displays everything above
```

Part Of: #106311
Release Note: none

Co-authored-by: wenyihu6 <wenyi.hu@cockroachlabs.com>
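The first commit in this merge relies on a general Go property: map iteration order is unspecified, so printing directly from a `range` over a map can differ between runs. A minimal sketch of the sort-then-print pattern follows; `formatStores` and the `s<ID>=<count>` output format are hypothetical and not taken from the simulator.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatStores renders a store-ID-to-replica-count map in ascending
// store-ID order, so the output is identical on every run.
func formatStores(replicas map[int]int) string {
	ids := make([]int, 0, len(replicas))
	for id := range replicas {
		ids = append(ids, id)
	}
	sort.Ints(ids)

	var sb strings.Builder
	for i, id := range ids {
		if i > 0 {
			sb.WriteString(", ")
		}
		fmt.Fprintf(&sb, "s%d=%d", id, replicas[id])
	}
	return sb.String()
}

func main() {
	fmt.Println(formatStores(map[int]int{3: 7, 1: 5, 2: 9}))
	// prints "s1=5, s2=9, s3=7"
}
```

Deterministic output matters for data-driven tests in particular, because the expected output is stored as text and compared verbatim.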
wenyihu6
added a commit
to wenyihu6/cockroach
that referenced
this issue
Aug 25, 2023
This patch allows users to modify the settings for the static mode within the randomized testing framework. The following command is now supported:

```
4. “change_static_option” [nodes=<int>] [stores_per_node=<int>]
[rw_ratio=<float64>] [rate=<float64>] [min_block=<int>] [max_block=<int>]
[min_key=<int64>] [max_key=<int64>] [skewed_access=<bool>] [ranges=<int>]
[placement_type=<gen.PlacementType>] [key_space=<int>]
[replication_factor=<int>] [bytes=<int64>] [stat=<string>] [height=<int>]
[width=<int>]
e.g. change_static_option nodes=2 stores_per_node=3 placement_type=skewed
- Change_static_option: modifies the settings for the static mode where no
randomization is involved. Note that this does not change the default
settings for any randomized generation.
    - nodes (default value is 3): number of nodes in the generated cluster
    - storesPerNode (default value is 1): number of store per nodes in the
    generated cluster
    - rwRatio (default value is 0.0): read-write ratio of the generated load
    - rate (default value is 0.0): rate at which the load is generated
    - minBlock (default value is 1): min size of each load event
    - maxBlock (default value is 1): max size of each load event
    - minKey (default value is int64(1)): min key of the generated load
    - maxKey (default value is int64(200000)): max key of the generated load
    - skewedAccess (default value is false): is true, workload key generator
    is skewed (zipf)
    - ranges (default value is 1): number of generated ranges
    - keySpace (default value is 200000): keyspace for the generated range
    - placementType (default value is gen.Even): type of distribution for how
    ranges are distributed across stores
    - replicationFactor (default value is 3): number of replica for each range
    - bytes (default value is int64(0)): size of each range in bytes
    - stat (default value is “replicas”): specifies the output to be plotted
    for the verbose option
    - height (default value is 15): height of the plot
    - width (default value is 80): width of the plot

In addition, verbose=(static_settings) can now be used to display settings
used for static options where no randomization is involved.
```

Part of: cockroachdb#106311

Release note: None
wenyihu6
added a commit
to wenyihu6/cockroach
that referenced
this issue
Aug 25, 2023
Previously, we enforce that the length of a given `weighted_rand` cannot exceed the number of stores. This was challenging for users as they might not know the cluster configuration that would be generated and thus do not know the number of stores. In addition, if the length of `weighted_rand` was less than total number of stores, any stores outside of the `weighted_rand` range would simply have zero replicas. This could lead to confusion.

To improve user control, this patch disables the use of weighted_rand with randomized cluster generation.

Requirements to use weighted_rand:
1. use static option for cluster generation
2. specify nodes(default:3) and stores_per_node(default:1) through change_static_option
3. ensure len(weighted_rand) == number of stores == nodes * stores_per_node

In addition to these new rules, the following existing requirements remain in place:
1. weighted_rand should only be used with placement_type=weighted_rand and vice versa.
2. must specify a weight between [0.0, 1.0] for each element in the array, with each element corresponding to a store
3. sum of weights in the array should be equal to 1

Resolves: cockroachdb#106311

Release note: None
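The numeric requirements above can be sketched as a single check. This is an illustrative Go sketch, not the code from the patch: `validateWeightedRand` is a hypothetical helper, and the floating-point tolerance used for the sum is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// validateWeightedRand is a hypothetical helper that checks the rules listed
// in the commit message: one weight per store, each weight in [0.0, 1.0],
// and weights summing to 1 (within an assumed tolerance).
func validateWeightedRand(weights []float64, nodes, storesPerNode int) error {
	stores := nodes * storesPerNode
	if len(weights) != stores {
		return fmt.Errorf("len(weighted_rand)=%d, want %d (nodes=%d * stores_per_node=%d)",
			len(weights), stores, nodes, storesPerNode)
	}
	sum := 0.0
	for i, w := range weights {
		if math.IsNaN(w) || w < 0 || w > 1 {
			return fmt.Errorf("weight %v for store index %d is outside [0.0, 1.0]", w, i)
		}
		sum += w
	}
	const tolerance = 1e-9 // assumed; guards against float rounding
	if math.Abs(sum-1) > tolerance {
		return fmt.Errorf("weights sum to %v, want 1", sum)
	}
	return nil
}

func main() {
	// Three nodes with one store each: three weights are required.
	fmt.Println(validateWeightedRand([]float64{0.5, 0.3, 0.2}, 3, 1))
	// Too few weights for the same cluster is rejected.
	fmt.Println(validateWeightedRand([]float64{0.5, 0.5}, 3, 1))
}
```

Requiring an exact length match removes the earlier ambiguity in which stores beyond the end of the array silently received zero replicas.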
craig bot
pushed a commit
that referenced
this issue
Aug 25, 2023
108099: asim: enforce len(weighted_rand) == number of stores r=kvoli a=wenyihu6

**asim: enable option to change static option settings**

This patch allows users to modify the settings for the static mode within the randomized testing framework. The following command is now supported:

```
4. “change_static_option” [nodes=<int>] [stores_per_node=<int>]
[rw_ratio=<float64>] [rate=<float64>] [min_block=<int>] [max_block=<int>]
[min_key=<int64>] [max_key=<int64>] [skewed_access=<bool>] [ranges=<int>]
[placement_type=<gen.PlacementType>] [key_space=<int>]
[replication_factor=<int>] [bytes=<int64>] [stat=<string>] [height=<int>]
[width=<int>]
e.g. change_static_option nodes=2 stores_per_node=3 placement_type=skewed
- Change_static_option: modifies the settings for the static mode where no
randomization is involved. Note that this does not change the default
settings for any randomized generation.
    - nodes (default value is 3): number of nodes in the generated cluster
    - storesPerNode (default value is 1): number of store per nodes in the
    generated cluster
    - rwRatio (default value is 0.0): read-write ratio of the generated load
    - rate (default value is 0.0): rate at which the load is generated
    - minBlock (default value is 1): min size of each load event
    - maxBlock (default value is 1): max size of each load event
    - minKey (default value is int64(1)): min key of the generated load
    - maxKey (default value is int64(200000)): max key of the generated load
    - skewedAccess (default value is false): is true, workload key generator
    is skewed (zipf)
    - ranges (default value is 1): number of generated ranges
    - keySpace (default value is 200000): keyspace for the generated range
    - placementType (default value is gen.Even): type of distribution for how
    ranges are distributed across stores
    - replicationFactor (default value is 3): number of replica for each range
    - bytes (default value is int64(0)): size of each range in bytes
    - stat (default value is “replicas”): specifies the output to be plotted
    for the verbose option
    - height (default value is 15): height of the plot
    - width (default value is 80): width of the plot

In addition, verbose=(static_settings) can now be used to display settings
used for static options where no randomization is involved.
```

Part of: #106311

Release note: None

----

**asim: enforce len(weighted_rand) == number of stores**

Previously, we enforce that the length of a given `weighted_rand` cannot exceed the number of stores. This was challenging for users as they might not know the cluster configuration that would be generated and thus do not know the number of stores. In addition, if the length of `weighted_rand` was less than total number of stores, any stores outside of the `weighted_rand` range would simply have zero replicas. This could lead to confusion.

To improve user control, this patch disables the use of weighted_rand with randomized cluster generation.

Requirements to use weighted_rand:
1. use static option for cluster generation
2. specify nodes(default:3) and stores_per_node(default:1) through change_static_option
3. ensure len(weighted_rand) == number of stores == nodes * stores_per_node

In addition to these new rules, the following existing requirements remain in place:
1. weighted_rand should only be used with placement_type=weighted_rand and vice versa.
2. must specify a weight between [0.0, 1.0] for each element in the array, with each element corresponding to a store
3. sum of weights in the array should be equal to 1

Resolves: #106311

Release note: None

109461: sql: add error and reporting when unable to fix malformed quantile r=rharding6373 a=rharding6373

This PR captures quantile information for future reproduction when we encounter a malformed quantile that we're unable to fix, instead of inducing an inactionable panic.

Epic: None

Fixes: #109060

Release note: None

Co-authored-by: wenyihu6 <wenyi.hu@cockroachlabs.com>
Co-authored-by: rharding6373 <rharding6373@users.noreply.github.com>
This issue tracks work for adding randomness to range generation.
Related: #106192
This initial phase of the project focuses on creating a small-scale testing
framework. This task involves introducing randomness to the range generation
process while using a constant initial cluster setup for node / store placement,
localities, and zone configurations. These configurations will be based on
widely-used default configurations which are already satisfiable and valid,
eliminating the need for additional validation.
The randomness will stem primarily from varying range factors such as
replication factor, key space, number of bytes, and leaseholder placement. We
may also add extra nodes with random localities in subsequent stages, but these
should not influence the test outcome.
The test’s pass or fail criterion will be based on a conformance assertion that
ensures no replicas are over-replicated, under-replicated, or violating
constraints.
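The pass or fail criterion can be sketched as a check over the final state of each range. The Go code below is a simplified illustration rather than the asim assertion itself: `rangeState`, `report`, and `checkConformance` are hypothetical, and constraint evaluation is reduced to a precomputed boolean.

```go
package main

import "fmt"

// rangeState is a hypothetical summary of one range at the end of a run.
type rangeState struct {
	ID             int
	Replicas       int
	ConstraintsMet bool // assumed to be computed elsewhere
}

// report lists the IDs of ranges that fail each conformance check.
type report struct {
	Under, Over, Violating []int
}

// passed reports whether every range conformed.
func (r report) passed() bool {
	return len(r.Under) == 0 && len(r.Over) == 0 && len(r.Violating) == 0
}

// checkConformance compares each range against the target replication
// factor and records any constraint violations.
func checkConformance(ranges []rangeState, replicationFactor int) report {
	var r report
	for _, rs := range ranges {
		switch {
		case rs.Replicas < replicationFactor:
			r.Under = append(r.Under, rs.ID)
		case rs.Replicas > replicationFactor:
			r.Over = append(r.Over, rs.ID)
		}
		if !rs.ConstraintsMet {
			r.Violating = append(r.Violating, rs.ID)
		}
	}
	return r
}

func main() {
	ranges := []rangeState{
		{ID: 1, Replicas: 3, ConstraintsMet: true},
		{ID: 2, Replicas: 2, ConstraintsMet: true},
		{ID: 3, Replicas: 4, ConstraintsMet: false},
	}
	r := checkConformance(ranges, 3)
	fmt.Printf("passed=%v report=%+v\n", r.passed(), r)
}
```

A run passes only when all three lists are empty.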
Jira issue: CRDB-29498