
tests: Add produce microbenchmark #16386

Merged
merged 1 commit into redpanda-data:dev on Feb 22, 2024

Conversation

mfleming
Contributor

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x
  • v23.1.x

Release Notes

  • none

@mfleming
Contributor Author

Here's a flamegraph from a run of the benchmark.

[attachment: out (flamegraph image)]

@mfleming
Contributor Author

Also worth noting:

  • Writes to a single topic, single partition
  • It only uses RF=1
  • Doesn't use transactional / idempotent batches

@vbotbuildovich
Collaborator

vbotbuildovich commented Jan 31, 2024

std::vector<ss::future<>> dispatched;
std::vector<ss::future<kafka::produce_response::partition>> produced;

perf_tests::start_measuring_time();
Member

I am confused, this is now being run 10k times in different fibers for a single benchmark loop?

Contributor Author

Yeah good point, this isn't going to work correctly.

Since my original motivation for cranking up the concurrency was to get a useful flamegraph, we can just remove the concurrency and simplify things.
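For illustration, a minimal sketch of what the simplified, single-fiber benchmark body could look like once the concurrency is removed; make_produce_request() and produce_partition() are hypothetical stand-ins for the fixture helpers, not the merged code:

// One produce request per measured iteration, awaited to completion.
ss::future<> produce_one_batch() {
    auto request = make_produce_request();               // hypothetical helper
    perf_tests::start_measuring_time();
    auto stages = produce_partition(std::move(request)); // hypothetical helper
    co_await std::move(stages.dispatched);               // request accepted
    co_await std::move(stages.produced);                 // batch produced
    perf_tests::stop_measuring_time();
}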

@mfleming force-pushed the pr/15363 branch 2 times, most recently from f5c97ee to 1719c49, on February 5, 2024 12:13
@vbotbuildovich
Collaborator

vbotbuildovich commented Feb 5, 2024

new failures in https://buildkite.com/redpanda/redpanda/builds/44691#018d795d-0172-4938-bcef-8d5ec0e678bf:

"rptest.tests.offset_for_leader_epoch_test.OffsetForLeaderEpochTest.test_offset_for_leader_epoch"

new failures in https://buildkite.com/redpanda/redpanda/builds/45193#018dcdd0-f5c3-4ceb-8c88-3000864ecbb1:

"rptest.tests.data_transforms_test.DataTransformsLoggingTest.test_logs_volume"

@StephanDollberg (Member) left a comment

Could you share a screenshot of the flamegraph with the produce bit zoomed in? Or share the whole file?

builder.add_raw_kv(iobuf{}, iobuf{});
}

auto batch = std::move(builder).build();
Member

So will this be 100 batches or one? I thought one request can only contain a single batch per partition?

Contributor Author

Sorry, this variable is poorly named -- it's one batch with 100 records.

@mfleming
Contributor Author

mfleming commented Feb 6, 2024

Flamegraph without concurrency.

[attachment: out.svg.gz (flamegraph)]

produce_partition_fixture() {
BOOST_TEST_CHECKPOINT("before leadership");

wait_for_controller_leadership().get0();
Member

nit: prefer get()
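i.e., the suggested form would simply be:

wait_for_controller_leadership().get();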


constexpr size_t num_records = 100;
for (size_t i = 0; i < num_records; ++i) {
builder.add_raw_kv(iobuf{}, iobuf{});
Member

Can we put some data in the messages? Seems like something easy to make the test parameterizable on: the size of the payload.
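For illustration, a hedged sketch of how the payload size could become a parameter; record_size and the payload buffer are assumptions for this sketch, not the merged code:

constexpr size_t num_records = 100;
constexpr size_t record_size = 1024; // hypothetical payload-size knob
const std::vector<char> payload(record_size, 'x');
for (size_t i = 0; i < num_records; ++i) {
    iobuf value;
    value.append(payload.data(), payload.size()); // copy the payload into the record value
    builder.add_raw_kv(iobuf{}, std::move(value));
}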

dispatched.push_back(std::move(stages.dispatched));
produced.push_back(std::move(stages.produced));

co_await ss::when_all_succeed(dispatched.begin(), dispatched.end())
Member

I don't understand the purpose of the dispatched vector: it only ever has one element in it? Removing it would simplify this a lot and would probably also allow a pure coroutine style.

Contributor Author

Good point. Same for produced. Removing.
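For illustration, the simplified shape once the single-element vectors go away could look roughly like this (the call that yields stages is a placeholder for whatever the fixture actually exposes):

auto stages = produce_partition(std::move(request)); // placeholder for the real call
co_await std::move(stages.dispatched);               // wait for the dispatch stage
auto resp = co_await std::move(stages.produced);     // wait for the partition response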

@@ -789,7 +789,7 @@ class redpanda_thread_fixture {
{ r.encode(writer, version) } -> std::same_as<void>;
}
kafka::request_context make_request_context(
-    RequestType request, kafka::request_header& header, conn_ptr conn = {}) {
+    RequestType& request, kafka::request_header& header, conn_ptr conn = {}) {
Member

Can you give some detail here? This doesn't strike me as a safe change without some additional information, since callers may now see the request they are passing in mutated, as opposed to having the copy mutated (leaving the caller none the wiser).

Contributor Author

Yeah, you're right that this is a risk. I checked the code paths and didn't see any modifications to the request, so I thought this would be safe (though admittedly this doesn't guard against future changes).

Changing to a reference was necessary because partition_produce_data has a deleted default copy constructor due to its records field, e.g. `std::optional<kafka::produce_request_record_data> records{}`, and unlike `kafka::fetch_request` it's not possible to copy a `kafka::produce_request` as-is. Would you prefer to see a user-defined copy constructor for `partition_produce_data`?

Member

If it is not modified, we could just make the parameter type const RequestType&?

However, I think the second part of your reply is suggesting it is modified: do we move out of request in order to avoid copying? "Moving out of" is a modification.

So I think if it is truly not modified, make it const. If it is modified, we should try to leave the signature alone and move it at the call site.

Contributor Author

Of course you're right, it is modified :)

Switched to doing a move at the call site.
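For illustration, a rough sketch of the agreed approach: keep the by-value signature and make the ownership transfer explicit at the call site (produce_req and header here are hypothetical locals, not code from the PR):

// Signature unchanged: still takes the request by value.
kafka::request_context make_request_context(
  RequestType request, kafka::request_header& header, conn_ptr conn = {});

// Caller moves the non-copyable produce request in explicitly:
auto ctx = make_request_context(std::move(produce_req), header);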

@travisdowns (Member) left a comment

See comments.

@travisdowns (Member) left a comment

Continued the thread about the parameter type change. It's the only outstanding issue as far as I'm aware.

This one is similar to the fetch plan benchmark.

Right now it only handles a single partition with RF=1 and doesn't use
transactional / idempotent batches.
@mfleming
Contributor Author

I'm getting a timeout for gtest_raft_rpunit, which @mmaslankaprv is looking at. But should I just merge this anyway, since it only adds a new benchmark?

@StephanDollberg
Member

Sounds fine to me. Will need actioning from @piyushredpanda.

@piyushredpanda merged commit 0b5bb0b into redpanda-data:dev on Feb 22, 2024
13 of 16 checks passed