Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI Failure (consumer timed out waiting for offsets) in ControllerLogLimitMirrorMakerTests.test_mirror_maker_with_limits #10502

Closed
jcsp opened this issue May 2, 2023 · 13 comments
Assignees
Labels

Comments

@jcsp
Copy link
Contributor

jcsp commented May 2, 2023

This test is doing something peculiar. It's running start_workload and then immediately run_validation, so the producer only runs for a few seconds.

In that context, it is surprising that the consumer is timing out, even it is relatively brief 120 second timeout. I can see some fetches happening in the redpanda log right up to the point it times out, so I don't think this is totally wedged.

https://buildkite.com/redpanda/redpanda/builds/28388#0187dbdc-560a-4943-9d2e-5e86b0cb0b3d

Module: rptest.tests.controller_log_limiting_test
Class:  ControllerLogLimitMirrorMakerTests
Method: test_mirror_maker_with_limits
====================================================================================================
test_id:    rptest.tests.controller_log_limiting_test.ControllerLogLimitMirrorMakerTests.test_mirror_maker_with_limits
status:     FAIL
run time:   5 minutes 2.855 seconds


    TimeoutError("Consumer failed to consume up to offsets {TopicPartition(topic='topic-ddocalmndc', partition=4): 250, TopicPartition(topic='topic-ddocalmndc', partition=5): 424, TopicPartition(topic='topic-ddocalmndc', partition=8): 506, TopicPartition(topic='topic-ddocalmndc', partition=9): 1849, TopicPartition(topic='topic-ddocalmndc', partition=3): 692, TopicPartition(topic='topic-ddocalmndc', partition=0): 212, TopicPartition(topic='topic-ddocalmndc', partition=6): 423, TopicPartition(topic='topic-ddocalmndc', partition=1): 102, TopicPartition(topic='topic-ddocalmndc', partition=2): 873, TopicPartition(topic='topic-ddocalmndc', partition=7): 408} after waiting 120s, last committed offsets: {1: {TopicPartition(topic='topic-ddocalmndc', partition=5): 1, TopicPartition(topic='topic-ddocalmndc', partition=7): 68, TopicPartition(topic='topic-ddocalmndc', partition=8): 507, TopicPartition(topic='topic-ddocalmndc', partition=3): 693}}.")
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/root/tests/rptest/services/cluster.py", line 49, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/controller_log_limiting_test.py", line 374, in test_mirror_maker_with_limits
    self.run_validation(consumer_timeout_sec=120)
  File "/root/tests/rptest/tests/end_to_end.py", line 267, in run_validation
    self.run_consumer_validation(
  File "/root/tests/rptest/tests/end_to_end.py", line 288, in run_consumer_validation
    self.await_consumed_offsets(last_acked_offsets,
  File "/root/tests/rptest/tests/end_to_end.py", line 219, in await_consumed_offsets
    wait_until(
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 57, in wait_until
    raise TimeoutError(err_msg() if callable(err_msg) else err_msg) from last_exception
ducktape.errors.TimeoutError: Consumer failed to consume up to offsets {TopicPartition(topic='topic-ddocalmndc', partition=4): 250, TopicPartition(topic='topic-ddocalmndc', partition=5): 424, TopicPartition(topic='topic-ddocalmndc', partition=8): 506, TopicPartition(topic='topic-ddocalmndc', partition=9): 1849, TopicPartition(topic='topic-ddocalmndc', partition=3): 692, TopicPartition(topic='topic-ddocalmndc', partition=0): 212, TopicPartition(topic='topic-ddocalmndc', partition=6): 423, TopicPartition(topic='topic-ddocalmndc', partition=1): 102, TopicPartition(topic='topic-ddocalmndc', partition=2): 873, TopicPartition(topic='topic-ddocalmndc', partition=7): 408} after waiting 120s, last committed offsets: {1: {TopicPartition(topic='topic-ddocalmndc', partition=5): 1, TopicPartition(topic='topic-ddocalmndc', partition=7): 68, TopicPartition(topic='topic-ddocalmndc', partition=8): 507, TopicPartition(topic='topic-ddocalmndc', partition=3): 693}}.
@dlex
Copy link
Contributor

dlex commented May 22, 2023

@VladLazar
Copy link
Contributor

FAIL test: ControllerLogLimitMirrorMakerTests.test_mirror_maker_with_limits (1/69 runs)
  failure at 2023-05-22T14:44:28.482Z: TimeoutError("Consumer failed to consume up to offsets {TopicPartition(topic='topic-pwcnqtpsnb', partition=7): 1175, TopicPartition(topic='topic-pwcnqtpsnb', partition=0): 344, TopicPartition(topic='topic-pwcnqtpsnb', partition=8): 519, TopicPartition(topic='topic-pwcnqtpsnb', partition=2): 1230, TopicPartition(topic='topic-pwcnqtpsnb', partition=6): 509, TopicPartition(topic='topic-pwcnqtpsnb', partition=4): 35, TopicPartition(topic='topic-pwcnqtpsnb', partition=3): 834, TopicPartition(topic='topic-pwcnqtpsnb', partition=9): 188, TopicPartition(topic='topic-pwcnqtpsnb', partition=5): 141, TopicPartition(topic='topic-pwcnqtpsnb', partition=1): 781} after waiting 120s, last committed offsets: {1: {TopicPartition(topic='topic-pwcnqtpsnb', partition=9): 189}}.")
      on (amd64, container) in job https://buildkite.com/redpanda/redpanda/builds/29583#0188439d-c75d-4048-94a9-94b65ca6b252

@graphcareful
Copy link
Contributor

@michael-redpanda
Copy link
Contributor

@NyaliaLui
Copy link
Contributor

@NyaliaLui
Copy link
Contributor

FAIL test: ControllerLogLimitMirrorMakerTests.test_mirror_maker_with_limits (10/31 runs)
  failure at 2023-05-25T23:28:19.897Z: TimeoutError("Consumer failed to consume up to offsets {TopicPartition(topic='topic-vudwwwqkoy', partition=7): 382, TopicPartition(topic='topic-vudwwwqkoy', partition=0): 218, TopicPartition(topic='topic-vudwwwqkoy', partition=2): 29, TopicPartition(topic='topic-vudwwwqkoy', partition=3): 669, TopicPartition(topic='topic-vudwwwqkoy', partition=5): 544, TopicPartition(topic='topic-vudwwwqkoy', partition=1): 694, TopicPartition(topic='topic-vudwwwqkoy', partition=8): 950, TopicPartition(topic='topic-vudwwwqkoy', partition=6): 1128, TopicPartition(topic='topic-vudwwwqkoy', partition=9): 39, TopicPartition(topic='topic-vudwwwqkoy', partition=4): 1604} after waiting 120s, last committed offsets: {1: {}}.")
      on (amd64, container) in job https://buildkite.com/redpanda/redpanda/builds/29928#01885502-1e1d-4952-af58-5440720806b5

@abhijat
Copy link
Contributor

abhijat commented May 30, 2023

https://buildkite.com/redpanda/redpanda/builds/30108#018868e9-8cfa-4239-a449-fa4c1bc3d5c4

  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/root/tests/rptest/services/cluster.py", line 49, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/controller_log_limiting_test.py", line 374, in test_mirror_maker_with_limits
    self.run_validation(consumer_timeout_sec=120)
  File "/root/tests/rptest/tests/end_to_end.py", line 270, in run_validation
    self.run_consumer_validation(
  File "/root/tests/rptest/tests/end_to_end.py", line 291, in run_consumer_validation
    self.await_consumed_offsets(last_acked_offsets,
  File "/root/tests/rptest/tests/end_to_end.py", line 222, in await_consumed_offsets
    wait_until(
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 57, in wait_until
    raise TimeoutError(err_msg() if callable(err_msg) else err_msg) from last_exception
ducktape.errors.TimeoutError: Consumer failed to consume up to offsets {TopicPartition(topic='topic-dcisvluetn', partition=5): 341, TopicPartition(topic='topic-dcisvluetn', partition=4): 209, TopicPartition(topic='topic-dcisvluetn', partition=3): 273, TopicPartition(topic='topic-dcisvluetn', partition=6): 619, TopicPartition(topic='topic-dcisvluetn', partition=2): 1347, TopicPartition(topic='topic-dcisvluetn', partition=9): 1175, TopicPartition(topic='topic-dcisvluetn', partition=0): 738, TopicPartition(topic='topic-dcisvluetn', partition=1): 125, TopicPartition(topic='topic-dcisvluetn', partition=8): 323, TopicPartition(topic='topic-dcisvluetn', partition=7): 590} after waiting 120s, last committed offsets: {1: {TopicPartition(topic='topic-dcisvluetn', partition=4): 210, TopicPartition(topic='topic-dcisvluetn', partition=5): 1, TopicPartition(topic='topic-dcisvluetn', partition=6): 620, TopicPartition(topic='topic-dcisvluetn', partition=0): 606, TopicPartition(topic='topic-dcisvluetn', partition=3): 274}}.

@twmb
Copy link
Contributor

twmb commented May 30, 2023

@michael-redpanda
Copy link
Contributor

@mmaslankaprv
Copy link
Member

This issue was specific to debug mode being slow, haven't seen it since the test was disabled in debug mode. Closing this issue.

@andijcr
Copy link
Contributor

andijcr commented Sep 11, 2023

        {
            "title": "TimeoutError(\"Consumer failed to consume up to offsets {TopicPartition(topic='topic-iitmhewojh', partition=3): 1285, TopicPartition(topic='topic-iitmhewojh', partition=8): 1420, TopicPartition(topic='topic-iitmhewojh', partition=2): 1262, TopicPartition(topic='topic-iitmhewojh', partition=7): 1279} after waiting 120s, last committed offsets: {1: {}}.\")",
            "id": 12960,
            "ts": 1694187797.481844,
            "type": "pr-merged",
            "build": "release",
            "arch": "amd64",
            "link": "https://buildkite.com/redpanda/redpanda/builds/36671"
        }

@andijcr andijcr reopened this Sep 11, 2023
@piyushredpanda
Copy link
Contributor

I don't think this ever gets time allocation. Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests