CI Failure (AssertionError for len(topics_info)) in EndToEndShadowIndexingTest.test_recover_after_delete_records #11998

Closed
rockwotj opened this issue Jul 10, 2023 · 4 comments · Fixed by #14188
Labels: area/cloud-storage (Shadow indexing subsystem), ci-failure, kind/bug (Something isn't working), sev/low (Bugs which are non-functional paper cuts, e.g. typos, issues in log messages)

Comments

@rockwotj
Contributor

https://buildkite.com/redpanda/redpanda/builds/32688#01892f2e-4d70-4992-8a74-a4d5330808ff

Module: rptest.tests.e2e_shadow_indexing_test
Class:  EndToEndShadowIndexingTest
Method: test_recover_after_delete_records
test_id:    rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTest.test_recover_after_delete_records
status:     FAIL
run time:   1 minute 5.215 seconds


    AssertionError()
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/root/tests/rptest/utils/mode_checks.py", line 63, in f
    return func(*args, **kwargs)
  File "/root/tests/rptest/services/cluster.py", line 82, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/e2e_shadow_indexing_test.py", line 400, in test_recover_after_delete_records
    assert len(topics_info) == 1
AssertionError
rockwotj added the kind/bug and ci-failure labels Jul 10, 2023
rockwotj changed the title from "CI Failure (key symptom) in EndToEndShadowIndexingTest.test_recover_after_delete_records" to "CI Failure (AssertionError for len(topics_info)) in EndToEndShadowIndexingTest.test_recover_after_delete_records" Jul 10, 2023
@NyaliaLui
Contributor

Looks like this came back in https://buildkite.com/redpanda/redpanda/builds/34284#0189b4f1-5f3b-41ca-924d-b5df6fdff096

test_id:    rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTest.test_recover_after_delete_records
status:     FAIL
run time:   53.934 seconds


    AssertionError()
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/root/tests/rptest/utils/mode_checks.py", line 63, in f
    return func(*args, **kwargs)
  File "/root/tests/rptest/services/cluster.py", line 82, in wrapped
    r = f(self, *args, **kwargs)
  File "/root/tests/rptest/tests/e2e_shadow_indexing_test.py", line 589, in test_recover_after_delete_records
    assert len(topics_info) == 1
AssertionError

NyaliaLui reopened this Aug 3, 2023
abhijat added the area/cloud-storage label Sep 6, 2023
@abhijat
Contributor

abhijat commented Sep 6, 2023

Looking at broker logs:

TRACE 2023-08-02 07:07:51,395 [shard 1] kafka - requests.cc:94 - [172.16.16.28:51366] processing name:list_offsets, key:2, version:4 for rpk, mem_units: 8110, ctx_size: 42
TRACE 2023-08-02 07:07:51,395 [shard 1] kafka - handler.h:69 - [client_id: {rpk}] handling list_offsets v4 request {replica_id=-1 isolation_level=0 topics={{name={panda-topic} partitions={{partition_index=0 current_leader_epoch=-1 timestamp={timestamp: -2} max_num_offsets=0}}}}}
TRACE 2023-08-02 07:07:51,395 [shard 1] kafka - request_context.h:189 - [172.16.16.28:51366] sending 2:list_offsets for {rpk}, response {throttle_time_ms=0 topics={{name={panda-topic} partitions={{partition_index=0 error_code={ error_code: unknown_topic_or_partition [3] } old_style_offsets={} timestamp={timestamp: missing} offset=-1 leader_epoch=-1}}}}}
TRACE 2023-08-02 07:07:51,395 [shard 1] kafka - requests.cc:94 - [172.16.16.28:51366] processing name:list_offsets, key:2, version:4 for rpk, mem_units: 8110, ctx_size: 42
TRACE 2023-08-02 07:07:51,395 [shard 1] kafka - handler.h:69 - [client_id: {rpk}] handling list_offsets v4 request {replica_id=-1 isolation_level=0 topics={{name={panda-topic} partitions={{partition_index=0 current_leader_epoch=-1 timestamp={timestamp: missing} max_num_offsets=0}}}}}
TRACE 2023-08-02 07:07:51,395 [shard 1] kafka - request_context.h:189 - [172.16.16.28:51366] sending 2:list_offsets for {rpk}, response {throttle_time_ms=0 topics={{name={panda-topic} partitions={{partition_index=0 error_code={ error_code: unknown_topic_or_partition [3] } old_style_offsets={} timestamp={timestamp: missing} offset=-1 leader_epoch=-1}}}}}

and soon after:

INFO  2023-08-02 07:07:51,507 [shard 1] raft - [group_id:1, {kafka/panda-topic/0}] consensus.cc:1481 - started raft, log offsets: {start_offset:32079, committed_offset:32078, committed_offset_term:-9223372036854775808, dirty_offset:32078, dirty_offset_term:-9223372036854775808}, term: 0, configuration: {current: {voters: {{id: {1}, revision: {9}}, {id: {0}, revision: {9}}, {id: {2}, revision: {9}}}, learners: {}}, old:{nullopt}, revision: 9, update: {nullopt}, version: 4}, brokers: {{id: 1, kafka_advertised_listeners: {{dnslistener:{host: docker-rp-5, port: 9092}}, {iplistener:{host: 172.16.16.10, port: 9093}}, {kerberoslistener:{host: docker-rp-5.redpanda-test, port: 9094}}}, rpc_address: {host: docker-rp-5, port: 33145}, rack: {nullopt}, properties: {cores 2, mem_available 2, disk_available 61}}, {id: 0, kafka_advertised_listeners: {{dnslistener:{host: docker-rp-4, port: 9092}}, {iplistener:{host: 172.16.16.22, port: 9093}}, {kerberoslistener:{host: docker-rp-4.redpanda-test, port: 9094}}}, rpc_address: {host: docker-rp-4, port: 33145}, rack: {nullopt}, properties: {cores 2, mem_available 2, disk_available 61}}, {id: 2, kafka_advertised_listeners: {{dnslistener:{host: docker-rp-16, port: 9092}}, {iplistener:{host: 172.16.16.18, port: 9093}}, {kerberoslistener:{host: docker-rp-16.redpanda-test, port: 9094}}}, rpc_address: {host: docker-rp-16, port: 33145}, rack: {nullopt}, properties: {cores 2, mem_available 2, disk_available 61}}}}

Maybe the describe_topics call in the test should be wrapped in a wait. Right now it can run before the partition's raft group has started on the broker, in which case it returns unknown topic/partition, as in this case:

[DEBUG - 2023-08-02 07:07:51,361 - rpk - _execute - lineno:896]: Executing command: ['/var/lib/buildkite-agent/builds/buildkite-amd64-xfs-builders-i-0554e0cfbe8a1ed0b-1/redpanda/redpanda/vbuild/redpanda_installs/ci/bin/rpk', 'topic', '-X', 'brokers=docker-rp-5:9092,docker-rp-16:9092,docker-rp-4:9092', 'describe', 'panda-topic', '-p', '-v']
[DEBUG - 2023-08-02 07:07:51,397 - rpk - _execute - lineno:916]: 
PARTITION  LEADER  EPOCH  REPLICAS  LOG-START-OFFSET            HIGH-WATERMARK
0          1       1      [0 1 2]   UNKNOWN_TOPIC_OR_PARTITION  UNKNOWN_TOPIC_OR_PARTITION
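
A minimal sketch of what wrapping the describe call in a wait could look like. This is not the actual fix from the linked PR. `wait_for_result` and `make_fake_describe` are hypothetical names, and the fake describe call assumes the test's helper yields no parsed partitions while the broker still answers UNKNOWN_TOPIC_OR_PARTITION; the offsets in the final reply are placeholders.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_for_result(fetch: Callable[[], T],
                    is_ready: Callable[[T], bool],
                    timeout_sec: float = 30.0,
                    backoff_sec: float = 1.0) -> T:
    """Call fetch() repeatedly until is_ready(result) holds, then return it.

    Raises TimeoutError if the condition is still false once the timeout
    has elapsed.
    """
    deadline = time.monotonic() + timeout_sec
    while True:
        result = fetch()
        if is_ready(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"condition not met within {timeout_sec}s; last result: {result!r}")
        time.sleep(backoff_sec)


def make_fake_describe():
    """Hypothetical stand-in for the test's describe call.

    The first two replies are empty (assumed behaviour while the partition
    is still unknown to the broker); the third contains one partition with
    placeholder offsets.
    """
    replies = [[], [], [{"id": 0, "start_offset": 10, "high_watermark": 20}]]
    calls = {"n": 0}

    def describe():
        index = min(calls["n"], len(replies) - 1)
        calls["n"] += 1
        return replies[index]

    return describe, calls


describe, calls = make_fake_describe()
# Retry until exactly one partition is reported instead of asserting on
# the very first reply.
topics_info = wait_for_result(describe,
                              lambda info: len(info) == 1,
                              timeout_sec=5.0,
                              backoff_sec=0.01)
assert len(topics_info) == 1
```

In the real test, ducktape's `wait_until` would probably be the more natural fit, with the describe call and the length check moved into the condition function.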

abhijat added the sev/low label Sep 6, 2023
@rockwotj
Contributor Author

@andrwng
Contributor

andrwng commented Sep 15, 2023


4 participants