rptest: Create multi-node test with 1k topics and >10M events #16375
fcdab33
to
6988e17
Compare
Why not 10k or 100k? Is the test too slow? Curious.
It is possible, but the test becomes very complex, and there are too many moving parts to create it all from scratch.
On the other hand, in my humble opinion, E2E tests should simulate realistic traffic as closely as possible. That means not a swarm-like approach, but steady random events with variations in size and delay. For example, 10k topics onboard, but at most about 30% of them active, with a random set of active topics in each timeframe.
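The traffic shape described above (many topics provisioned, only a fraction active in any timeframe, with varying sizes and delays) could be sketched roughly as follows. This is a minimal illustration and not code from this PR: the function names, the topic naming, and the size and delay ranges are all assumptions.

```python
import random


def pick_active_topics(topics, max_active_fraction=0.3, rng=random):
    """Choose a random subset of topics to be active for one timeframe.

    Hypothetical helper: at most max_active_fraction of all topics are picked.
    """
    if not topics:
        return []
    limit = int(len(topics) * max_active_fraction)
    count = rng.randint(1, max(1, limit))
    return rng.sample(topics, count)


def plan_timeframe(topics, rng=random):
    """Build a list of (topic, payload_bytes, delay_seconds) events.

    The size and delay ranges are illustrative assumptions only.
    """
    events = []
    for topic in pick_active_topics(topics, rng=rng):
        payload_bytes = rng.randint(128, 4096)
        delay_seconds = rng.uniform(0.01, 0.5)
        events.append((topic, payload_bytes, delay_seconds))
    return events


if __name__ == "__main__":
    all_topics = [f"topic-{i:05d}" for i in range(10_000)]
    rng = random.Random(42)  # seeded so runs are repeatable
    for frame in range(3):
        plan = plan_timeframe(all_topics, rng=rng)
        print(f"timeframe {frame}: {len(plan)} active topics")
```

Passing a seeded `random.Random` instance keeps a run reproducible while still varying the active set from one timeframe to the next.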
0484be9
to
1c50027
Compare
1c50027
to
0973e45
Compare
new failures in https://buildkite.com/redpanda/redpanda/builds/44703#018d7a36-570f-436a-86a9-64d878b2cc42:
new failures in https://buildkite.com/redpanda/redpanda/builds/45171#018dc95f-3d66-472d-9127-5b654fade6df:
new failures in https://buildkite.com/redpanda/redpanda/builds/45171#018dc95f-3d67-4773-b236-ae9728911652:
new failures in https://buildkite.com/redpanda/redpanda/builds/45171#018dc95f-3d63-478e-880c-7bb336aabdb9:
new failures in https://buildkite.com/redpanda/redpanda/builds/45171#018dc95f-3d64-4031-9543-1b64e0756e06:
new failures in https://buildkite.com/redpanda/redpanda/builds/45306#018dd73f-27c6-4aaf-9759-796aae8b8a1d:
new failures in https://buildkite.com/redpanda/redpanda/builds/45306#018dd73f-27c6-4b4c-be77-1a0384655e33:
new failures in https://buildkite.com/redpanda/redpanda/builds/45306#018dd73f-27c1-4a07-94eb-453a926d5cb1:
new failures in https://buildkite.com/redpanda/redpanda/builds/45306#018dd73f-27c2-4039-bfee-444271e5478d:
new failures in https://buildkite.com/redpanda/redpanda/builds/45831#018e1b11-1c41-4dcf-967b-1d0d74c99cca:
new failures in https://buildkite.com/redpanda/redpanda/builds/45831#018e1b11-1c45-44a6-b37f-77aa1d0047b0:
new failures in https://buildkite.com/redpanda/redpanda/builds/46141#018e39d6-c0e3-4840-bf1a-b93e3ad78139:
new failures in https://buildkite.com/redpanda/redpanda/builds/46141#018e39d6-c0e6-46ee-9037-afcafa33c669:
new failures in https://buildkite.com/redpanda/redpanda/builds/46141#018e39e4-7484-4623-86b2-d7459adb6e2a:
new failures in https://buildkite.com/redpanda/redpanda/builds/46141#018e39e4-88f6-450e-8c19-544a464e52b1:
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/44703#018d7a36-570f-436a-86a9-64d878b2cc42
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/45254#018dd2ae-3954-47d6-a761-711e2d744ebd
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/46196#018e3dbb-e273-491d-97b9-b5ce346c792e
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/46646#018e68b0-56f5-4059-b746-9c2225ee3f70
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/46646#018e68b0-56ec-47a2-aa6b-e7a5bc4f23d9
@emaxerrno, we reached 65k here when using 6xlarge: #16463
/ci-repeat 2
@savex can you check the failures please? Is |
This is part of the e2e activity that simulates realistic client behavior. I would not want to mix it up with our internal scale tests.
Not sure I follow. The test is timing out (not finishing in 30 minutes or less), so either it is not a good fit for e2e or there is a bug.
0baaf0b
to
0118ef5
Compare
/ci-repeat 2
/ci-repeat 1
4de69a8
to
977f9ab
Compare
/ci-repeat 1
977f9ab
to
123e512
Compare
Updated with skips for the debug version of RP, since it would not offer the required performance.
/ci-repeat 2
123e512
to
3f5abbe
Compare
/ci-repeat 2
target_total_events = 500 * 1024
else:
# Prepare topics for EC2
# Total number of workloads would be 2 x 5 = 10
Is this comment accurate?
total flink nodes = 5
workloads_per_node = 4
total = 20?
Yes, this should be updated. Thanks
total workloads 4 * 5 = 20
total topics 4 * 5 * 25 = 500
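The corrected arithmetic from this thread can be checked in a few lines. The variable names are illustrative, borrowed from the review comment rather than from the test source, and the figure of 25 topics per workload is inferred from the totals quoted above.

```python
# Values quoted in the review thread; the names are illustrative only.
flink_nodes = 5
workloads_per_node = 4
topics_per_workload = 25  # inferred from "4 * 5 * 25 = 500"

total_workloads = flink_nodes * workloads_per_node
total_topics = total_workloads * topics_per_workload

print(f"total workloads: {total_workloads}")
print(f"total topics: {total_topics}")
```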
To test multi-node transactions at scale (>10M events), a multi-node test is created. The single-node test is modified to reuse similar code. The transaction rate validation formulas are also updated to give clearer validation than before, with an explanation of what is going on.
3f5abbe
to
a386410
Compare
Treat CREATED and SCHEDULED as active statuses. In Docker, the job manager is slower, and this change lets flink.wait() function properly. Also, flink_scale tests would not work on the debug version of RP.
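The idea of counting CREATED and SCHEDULED as active can be sketched as a polling loop. This is a hypothetical illustration and not the actual flink.wait() implementation: `get_job_statuses` is an assumed callable that returns the current status string of each job, and the exact status set, timeout, and poll interval are assumptions.

```python
import time

# Statuses treated as "still in progress". CREATED and SCHEDULED are included
# so that jobs a slow job manager has not started yet are not seen as done.
ACTIVE_STATUSES = {"CREATED", "SCHEDULED", "RUNNING"}


def wait_for_jobs(get_job_statuses, timeout_sec=60.0, poll_sec=1.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until no job is in an active status or the timeout expires.

    get_job_statuses is a hypothetical callable returning a list of status
    strings. Returns the final statuses, or raises TimeoutError.
    """
    deadline = clock() + timeout_sec
    while True:
        statuses = list(get_job_statuses())
        if not any(status in ACTIVE_STATUSES for status in statuses):
            return statuses
        if clock() >= deadline:
            raise TimeoutError(f"jobs still active: {statuses}")
        sleep(poll_sec)
```

Without CREATED and SCHEDULED in the active set, a loop like this could return immediately on a slow job manager, before any job has actually started running.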
a386410
to
743fc35
Compare
/ci-repeat 2
The goal is to generate as many topics as possible until RP starts to fail.
A Flink workload is used with auto-sized task managers and flexible failure rate settings.
Workloads should generate low to low-moderate data volumes to random topics. The data rate should be controllable.
Milestones
Fixes: redpanda-data/devprod#1013
Backports Required
Release Notes