Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add small batches nightly regression test #13532

Merged
merged 1 commit into from
Jan 7, 2024

Conversation

ballard26
Copy link
Contributor

Adds a nightly regression test with many clients and partitions test where producers send small batches to Redpanda. This sort of workload will help us track performance improvements of a few future projects.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.2.x
  • v23.1.x
  • v22.3.x

Release Notes

  • none

from rptest.services.cluster import cluster


class SmallBatchesTest(RedpandaTest):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do these tests work with different hardware sizes? E.g., they require 6 nodes, but is 6 nodes enough if the instance sizes are small? Probably we require some minimum size of instance, and then also if the instances are much larger the test passes "easily".

This is basically out of scope for this change but I'm curious how this works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These nightly regression tests are solely targeted at a 3x i3en.6xlarge cluster. So the short answer is they don't work for different hardware sizes.

Copy link
Contributor Author

@ballard26 ballard26 Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately we don't have a quick way of specifying hardware requirements in DT, only instance count.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this clarification.

extra_rp_conf={"aggregate_metrics": True})

@cluster(num_nodes=6)
def omb_test(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When do we decide to add a new test class (as here) vs a new test within an existing class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mainly how our results aggregation script works. When inserting the results into the perf-db they use the test class name as the label for the results. Hence a new test class instead of a new method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose I should change it to label the results as TestClassName.MethodName though on second thought. I'll get a PR up in vtools for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah very nice, because just using using the class name seems like a bug.

}

benchmark = OpenMessagingBenchmark(
self._ctx, self.redpanda, "ACK_ALL_GROUP_LINGER_1MS",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we just set batch.size 1 as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes seems like a good idea to reduce the variability of the load under backpressure, otherwise it may start batching up anyway (though I did not do the rate/producer calculation to see how feasible that is).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it has to start batching then the e2e will be too high for the test to pass. However, I'll add batch.size=1 to limit the scope of what we're testing here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#13655 will need to be merged to make setting the batch size possible.

@piyushredpanda piyushredpanda merged commit 28fe73a into redpanda-data:dev Jan 7, 2024
23 of 26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants