
New workload classes #333

Merged
harriscr merged 10 commits into ceph:main from harriscr:ch_wip_workload_refactor
Jul 8, 2025

Conversation

@harriscr
Contributor

@harriscr harriscr commented May 19, 2025

This is the first part of the work to allow the workloads feature added in PR 306 to be used with any benchmark type.

To achieve this, a number of new classes have been added to CBT. UML class diagrams for these have been generated using pyreverse and are given in each section below.

To integrate Workloads into any particular benchmark type, a Workloads object would need to be added to the class variables and instantiated by passing in the configuration object used to create the benchmark.
A new command class for the benchmark type would also have to be created so that the workload knows how to convert the yaml options into a CLI invocation that can be used to run the I/O exerciser in question.

The easiest way would be to add a self._workloads to the Benchmark base class and instantiate it there with the config object and archive directory.

self._workloads: Workloads = Workloads(config, self.run_dir)

Then the specific benchmark can run the workload using:

if self._workloads.exist():
    self._workloads.set_benchmark_type(<string_representation_of_benchmark_type>)
    self._workloads.set_executable(<full_path_to_executable_for_benchmark>)
    self._workloads.run()

Workload Classes

Class diagram

workload_classes

Workloads

This class is a container for the actual workload classes themselves. It is instantiated with the workloads section of the benchmark yaml file, and is used to run the individual workloads.

Workload

The Workload class contains the details for each individual workload. It is designed to be created and called via the Workloads class, so should never be called directly.
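A minimal sketch of how the two classes could divide responsibility (method names follow the usage snippet above; the internals here are my own illustration, not the actual CBT code):

```python
# Illustrative sketch of the Workloads/Workload split, not the real CBT
# implementation. Method names mirror the usage snippet in the PR description.
from typing import Any


class Workload:
    """Details for one individual workload; created only via Workloads."""

    def __init__(self, name: str, options: dict[str, Any]) -> None:
        self._name = name
        self._options = options

    def run(self, benchmark_type: str, executable: str) -> str:
        # The real class would build and execute a CLI command here.
        return f"{executable} --workload {self._name} ({benchmark_type})"


class Workloads:
    """Container built from the 'workloads' section of the benchmark yaml."""

    def __init__(self, config: dict[str, Any], archive_dir: str) -> None:
        self._archive_dir = archive_dir
        self._benchmark_type = ""
        self._executable = ""
        self._workloads = [
            Workload(name, options)
            for name, options in config.get("workloads", {}).items()
        ]

    def exist(self) -> bool:
        return bool(self._workloads)

    def set_benchmark_type(self, benchmark_type: str) -> None:
        self._benchmark_type = benchmark_type

    def set_executable(self, executable: str) -> None:
        self._executable = executable

    def run(self) -> list[str]:
        # Delegate to each contained Workload in turn.
        return [w.run(self._benchmark_type, self._executable) for w in self._workloads]
```

With this shape, a benchmark only ever touches the Workloads container, matching the intent that Workload is never called directly.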

Command classes

These classes encapsulate the CLI command that will eventually be run on the system to run the I/O exerciser. The eventual aim is to add a command class for each individual I/O exerciser.

Class diagram

command_classes

Command

An abstract base class for any command type.

FioCommand

The concrete class for an fio command. It parses all the options passed via a CBT yaml file for a single run of the fio I/O exerciser. This may need to be split further into rbd and non-rbd versions in the future, but for this initial code a single class is sufficient.

CliOptions

A class based on the standard Python dictionary that holds key/value pairs equating to an option passed on the CLI and the corresponding value for that option. Unlike a regular dictionary it does not allow values to be updated, and it returns None instead of raising a KeyError when the value for an unknown key is requested.
Note to reviewers: I'm not sure if this class is needed or not, but it seems to be a neat way to cope with sets of options and corresponding values for a CLI invocation.
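As a sketch of how such a container could behave (illustrative only; the real class may differ), a dict subclass gives both properties in a few lines:

```python
# Hypothetical sketch of the CliOptions behaviour described above:
# write-once values, and None (not KeyError) for unknown keys.
from typing import Optional


class CliOptions(dict):
    """Key/value pairs for a CLI invocation."""

    def __setitem__(self, key: str, value: str) -> None:
        # Silently ignore attempts to overwrite an existing option.
        if key not in self:
            super().__setitem__(key, value)

    def __missing__(self, key: str) -> Optional[str]:
        # dict calls __missing__ on lookup failure; return None
        # rather than letting a KeyError propagate.
        return None


options = CliOptions()
options["iodepth"] = "16"
options["iodepth"] = "32"  # ignored: values cannot be updated
print(options["iodepth"])  # 16
print(options["unknown"])  # None
```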

Testing

============================= slowest 5 durations ==============================
0.02s setup    tests/test_bm_cephtestrados.py::TestBenchmarkcephtestrados::test_valid_archive_dir
0.01s setup    tests/test_bm_rbdfio.py::TestBenchmarkrbdfio::test_valid_archive_dir
0.01s setup    tests/test_bm_kvmrbdfio.py::TestBenchmarkkvmrbdfio::test_valid_archive_dir
0.01s setup    tests/test_bm_librbdfio.py::TestBenchmarklibrbdfio::test_valid_archive_dir
0.01s setup    tests/test_bm_rawfio.py::TestBenchmarkrawfio::test_valid_archive_dir
======================== 326 passed, 3 skipped in 0.44s ========================
Finished running tests!

Black and ruff show no errors.

I'll update teuthology logs once they have run

Chris Harris and others added 7 commits April 24, 2025 11:59
Signed-off-by: Chris Harris <harriscr@uk.ibm.com>
Signed-off-by: Chris Harris <harriscr@uk.ibm.com>
Signed-off-by: Chris Harris <harriscr@uk.ibm.com>
Signed-off-by: Chris Harris <harriscr@uk.ibm.com>
Signed-off-by: Chris Harris <harriscr@uk.ibm.com>
Signed-off-by: Chris Harris <harriscr@uk.ibm.com>
@harriscr harriscr self-assigned this May 19, 2025
@harriscr harriscr marked this pull request as draft May 19, 2025 14:42

def __init__(self, options: dict[str, str], workload_output_directory: str) -> None:
self._volume_number: int = int(options["volume_number"])
self._total_iodepth: Optional[str] = options.get("total_iodepth", None)
Contributor

It would probably be more generic to encapsulate any fio options as a class. Unit tests could then be generated for a set of valid options, which definitely protects the code for the future.

from command.command import Command
from command.fio_command import FioCommand

WORKLOAD_TYPE = dict[str, Union[str, list[str]]]  # pylint: disable=invalid-name
Contributor

We probably need WORKLOAD_TYPE to be a class itself, because it encapsulates useful info, like the number of OSDs in the cluster, whether it's Classic or Crimson, whether we need a number of reactors, alien threads, etc.
Notice also that this can be part of the info you need for post-processing.

Contributor Author

The WORKLOAD_TYPE here is a type hint to MyPy for checking the code. As we collect more of these I think we should look into creating a cbt_types.py file to contain them all.

I agree that storing things like the number of OSDs etc. that you mentioned would be useful. To me that sort of information is a property of the cluster under test, not a property of an I/O workload, though; it would belong to the Cluster object.

I am wondering if eventually we want to have something like a results object that stores the configuration and options/results for a single benchmark run. I know we output the cluster configuration at the start of a run (in the benchmark .yaml file),
but that leaves a step of matching the config to the benchmark run. For a 1:1 mapping this isn't too difficult, but if CBT ran with multiple cluster definitions then it makes things harder.
There is also already a Result object in the code (see benchmark.py), but that currently doesn't store any information about the configuration.

Contributor

The Result object class already exists, albeit it looks like an initial try and needs TLC; have a look at benchmark.py:

class Result:
    def __init__(self, run, alias, result, baseline, stmt, accepted):
        self.run = run
        self.alias = alias
        self.result = result
        self.baseline = baseline
        self.stmt = stmt
        self.accepted = accepted

    def __str__(self):
        fmt = '{run}: {alias}: {stmt}:: {result}/{baseline}  => {status}'
        return fmt.format(run=self.run, alias=self.alias, stmt=self.stmt,
                          result=self.result, baseline=self.baseline,
                          status="accepted" if self.accepted else "rejected")

I think it would be very useful to have a relation between a 'run' and its corresponding (Cluster) configuration.

(Sorry, I jumped the gun; I've just read the rest of the post and it says the same thing.)
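That run-to-configuration relation could be sketched by extending the existing Result fields with a config attribute (the extra field is an assumption for illustration, not something in benchmark.py today):

```python
# Hedged sketch: the Result class from benchmark.py with an assumed
# "config" field linking a run to its cluster configuration.
class Result:
    def __init__(self, run, alias, result, baseline, stmt, accepted, config=None):
        self.run = run
        self.alias = alias
        self.result = result
        self.baseline = baseline
        self.stmt = stmt
        self.accepted = accepted
        # Hypothetical addition: the cluster configuration for this run.
        self.config = config or {}

    def __str__(self):
        status = "accepted" if self.accepted else "rejected"
        return (f"{self.run}: {self.alias}: {self.stmt}:: "
                f"{self.result}/{self.baseline}  => {status}")
```

A post-processing step could then read the configuration straight off each Result instead of matching runs to yaml files by hand.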

Contributor

@perezjosibm perezjosibm May 22, 2025

For a 1:1 mapping this isn't too difficult, but if CBT ran with multiple cluster definitions then it makes things harder.

I see your point; I noticed a different convention for EC test runs.

For example, for Crimson I normally need a range over the number of OSDs (implicitly backend drives), number of reactors, number of alien threads, etc., so I diverge completely from the current test plan schema that CBT expects. That's why I capture succinct details of the configuration parameters in the test run name itself, so each test run ends up with its own .json linking the configuration details.
I might not lobby for such a convention to be supported in CBT (since recreating clusters etc. might conflict with the expected behaviour wrt teuthology, which I think runs CBT with a flag indicating to use the existing cluster, iirc). I might keep a prototype in my own checkout and test it in anger.

self._all_options: WORKLOAD_TYPE = options.copy()
self._executable_path: str
self._script: str = f"{options.get('pre_workload_script', '')}"

Contributor

As a minimum, a workload can be specified (regardless of benchmark) by the following parameters:

  • IO type,
  • Block size,
  • IO depth,
  • IO size (normally the full target),
  • target (device or volume).

Of course the free dict _all_options supports that, but having this in place already would be very useful for consistency across the code base, including post-processing. This can be serialised into a .json object and loaded when post-processing.
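That minimal parameter set could be captured as a serialisable dataclass along these lines (an illustrative sketch; the class and field names are my assumptions, not part of this PR):

```python
# Hypothetical serialisable description of one workload, covering the
# minimal parameters listed above. Field names are illustrative.
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class WorkloadSpec:
    io_type: str      # e.g. "randwrite"
    block_size: str   # e.g. "4k"
    io_depth: int
    target: str       # device or volume
    io_size: Optional[str] = None  # None means the full target

    def to_json(self) -> str:
        # Serialise for later post-processing alongside the results.
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, data: str) -> "WorkloadSpec":
        return cls(**json.loads(data))
```

The round trip through JSON makes it straightforward to dump the spec next to each run's output and reload it at post-processing time.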

Contributor

Yes, I mean as instance object attributes in the Workload class (self._x variables) 👍

Contributor

@perezjosibm perezjosibm left a comment

Leaving review comments as per the requested review, even though the state of the PR is still Draft.

elif isinstance(value, list):
global_options[option_name] = value
else:
global_options[option_name] = f"{value}"
Contributor

If you convert single/scalar values to strings, why is it not required for lists of items?

Contributor Author

@harriscr harriscr May 20, 2025

We do not want to convert the entire list to a string as that doesn't quite do what's expected,
e.g. (using Python 3.9):

>>> l: list[int] = [1, 2, 3, 6, 89]
>>> print(f"{l}")
[1, 2, 3, 6, 89]
>>> b = f"{l}"
>>> print(b[:-3])
[1, 2, 3, 6,

So the whole list, including the brackets, would be converted to a single string, which would then have to be parsed at a later date and the brackets stripped off.
If we leave the list as a list, then we can iterate over it in later code without having to first strip the brackets and then split the string into its component parts.
There is an argument to say we should do the same with dictionaries, but I haven't come across one yet - maybe when I'm doing more testing something will show up.

We could iterate over the contents of the list and also convert those to a string, but with the random nested structure present in the test plan yaml files we will never be able to convert everything correctly.
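A hedged sketch of the scalar-vs-list handling under discussion (my own illustration, not code from this PR; converting list elements to strings is one possible choice, not necessarily what the PR does):

```python
# Illustrative helper: scalar yaml values become strings, lists keep
# their list structure (here with each element stringified) so later
# code can iterate them instead of re-parsing a stringified "[1, 2, 3]".
from typing import Union


def normalise_option(value: Union[str, int, float, list]) -> Union[str, list[str]]:
    if isinstance(value, list):
        # Preserve the list; stringify elements individually.
        return [f"{item}" for item in value]
    return f"{value}"
```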

Contributor

I think the point you want to make is this:

  • options is a list: keep it as a list -- even if it's a singleton, e.g. a list with a single element,
  • options is a scalar: convert it to a string.

For consistency, to avoid the ambiguous situation of a single-element list, I'd prefer these to always be lists, and to convert them all as appropriate at the same time (rather than at different times), which makes the code clearer to maintain.

but with the random nested structure present in the test plan yaml files we will never be able to convert everything correctly.

Not sure I understand; iirc there were lots of discussions last year (some of which might still be around in the slack channel ceph-uk-cbt) wrt changing the format of the cbt .yaml, etc. I guess that did not progress in the end 😞

Chris Harris added 2 commits May 29, 2025 15:47
Signed-off-by: Chris Harris <harriscr@uk.ibm.com>
Signed-off-by: Chris Harris <harriscr@uk.ibm.com>
Signed-off-by: Chris Harris <harriscr@uk.ibm.com>
@harriscr
Contributor Author

harriscr commented Jul 1, 2025

Teuthology runs:

perf-basic
https://pulpito.ceph.com/harriscr-2025-06-24_10:54:23-perf-basic-main-distro-default-smithi/

rados/perf
https://pulpito.ceph.com/harriscr-2025-06-24_12:31:18-rados:perf-main-distro-default-smithi/
2 failed with "Command failed on smithi053 with status 100: 'sudo apt-get -y --force-yes install python3-pip librbd-dev collectl linux-tools-generic'"

@harriscr harriscr marked this pull request as ready for review July 1, 2025 15:39
Member

@lee-j-sanders lee-j-sanders left a comment

LGTM

@harriscr harriscr requested a review from perezjosibm July 8, 2025 08:08
@harriscr harriscr merged commit 89f7cea into ceph:main Jul 8, 2025
@harriscr harriscr deleted the ch_wip_workload_refactor branch August 28, 2025 09:14