
storage - Cannot continue parsing. recived size:0 bytes, expected:582646 bytes. context:parser::skip_batch #3305

Closed
VadimPlh opened this issue Dec 17, 2021 · 1 comment
Labels: area/cloud-storage (Shadow indexing subsystem), kind/bug (Something isn't working)

@VadimPlh (Contributor):

Yesterday I tested truncation of a segment in gc:

  • Produce data
  • Truncate segment
  • Try to consume

When I opened the redpanda log today, I saw a lot of errors like:
storage - Cannot continue parsing. recived size:0 bytes, expected:582646 bytes. context:parser::skip_batch
After deleting the topic and all its segments from gc, it still produces this error (the fetch session count for the current segment is 0).

2021-12-17T10:29:27.254608897Z stderr F DEBUG 2021-12-17 10:29:27,252 [shard 13] cloud_storage - [fiber481 kafka/delete_first_segment/0] - remote_partition.cc:215 - maybe_reset_reader called
2021-12-17T10:29:27.254600393Z stderr F ERROR 2021-12-17 10:29:27,252 [shard 13] storage - Cannot continue parsing. recived size:0 bytes, expected:582646 bytes. context:parser::skip_batch
2021-12-17T10:29:27.254596679Z stderr F DEBUG 2021-12-17 10:29:27,252 [shard 13] cloud_storage - [fiber448~1~12 kafka/delete_first_segment/0] - remote_segment.cc:404 - skip_batch_start called for 35353
2021-12-17T10:29:27.254592821Z stderr F DEBUG 2021-12-17 10:29:27,252 [shard 13] cloud_storage - [fiber448~1~12 kafka/delete_first_segment/0] - remote_segment.cc:357 - accept_batch_start skip because last_kafka_offset 36333 (last_rp_offset: 36334) < config.start_offset: 39000
2021-12-17T10:29:27.254588814Z stderr F DEBUG 2021-12-17 10:29:27,252 [shard 13] cloud_storage - [fiber481 kafka/delete_first_segment/0] - remote_partition.cc:122 - Invoking 'read_some' on current log reader {start_offset:{39000}, max_offset:{9223372036854775807}, min_bytes:0, max_bytes:1048576, type_filter:batch_type::raft_data, first_timestamp:nullopt}
2021-12-17T10:29:27.254584542Z stderr F DEBUG 2021-12-17 10:29:27,252 [shard 13] cloud_storage - [fiber481 kafka/delete_first_segment/0] - remote_partition.cc:268 - maybe_reset_stream completed true false
2021-12-17T10:29:27.254579206Z stderr F DEBUG 2021-12-17 10:29:27,252 [shard 13] cloud_storage - [fiber481 kafka/delete_first_segment/0] - remote_partition.cc:236 - maybe_reset_reader, config start_offset: 39000, reader max_offset: 46155
2021-12-17T10:29:27.254574958Z stderr F DEBUG 2021-12-17 10:29:27,252 [shard 13] cloud_storage - [fiber481 kafka/delete_first_segment/0] - remote_partition.cc:215 - maybe_reset_reader called
2021-12-17T10:29:27.254570619Z stderr F ERROR 2021-12-17 10:29:27,252 [shard 13] storage - Cannot continue parsing. recived size:0 bytes, expected:582646 bytes. context:parser::skip_batch

It still tries to parse the segment inside cloud_storage.
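
For context, the ERROR line means the parser asked the underlying stream to skip over a batch body of 582646 bytes but got 0 bytes back, i.e. the stream ended early, which is consistent with a truncated segment. The following is a hypothetical, self-contained sketch of that condition only; all names and types below are illustrative and are not redpanda's actual parser code:

```cpp
// Hypothetical sketch of the condition behind the error
// "Cannot continue parsing. recived size:0 bytes, expected:N bytes".
// All names here are illustrative; this is not redpanda's parser code.
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

struct input_stream {
    std::vector<char> data;
    std::size_t pos = 0;
    // Skips up to n bytes; returns fewer (possibly 0) at end of stream.
    std::size_t skip(std::size_t n) {
        std::size_t available = data.size() - pos;
        std::size_t skipped = n < available ? n : available;
        pos += skipped;
        return skipped;
    }
};

// Skip over a batch body of `expected` bytes. If the stream ends early
// (e.g. because the segment was truncated), report the mismatch instead
// of pretending the batch was consumed.
void skip_batch(input_stream& in, std::size_t expected) {
    std::size_t received = in.skip(expected);
    if (received < expected) {
        throw std::runtime_error(
          "Cannot continue parsing. received size:" + std::to_string(received)
          + " bytes, expected:" + std::to_string(expected)
          + " bytes. context:parser::skip_batch");
    }
}

int main() {
    input_stream truncated{}; // empty stream: the batch body is gone
    try {
        skip_batch(truncated, 582646);
    } catch (const std::exception& e) {
        // Prints the same shape of message as the ERROR lines above.
        std::fprintf(stderr, "%s\n", e.what());
    }
}
```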

@VadimPlh VadimPlh added kind/bug Something isn't working area/cloud-storage Shadow indexing subsystem labels Dec 17, 2021
Lazin added a commit to Lazin/redpanda that referenced this issue Dec 17, 2021
The partition_record_batch_reader_impl component is not stopping when
the underlying remote_partition is stopped. This manifested in the
following failure scenario: the reader got stuck in an infinite loop;
then the remote_partition was stopped, but the loop did not exit. It
continued to consume CPU even after the entire topic was deleted.

This commit fixes this by checking the abort_source inside the
remote_partition. Fixes redpanda-data#3305
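
For readers following along, here is a minimal, self-contained sketch of the pattern the commit message describes: the reader checks an abort flag before pulling more data, so stopping the partition also stops the reader's loop. The abort_source, reader, and batch types below are simplified stand-ins (the real code uses seastar's abort_source, futures, and redpanda's cloud_storage types), not the actual implementation:

```cpp
// Minimal sketch of "check the abort_source inside the read path".
// All types here are simplified stand-ins for illustration only.
#include <atomic>
#include <cstdio>
#include <optional>

struct abort_source {
    std::atomic<bool> aborted{false};
    void request_abort() { aborted.store(true); }
    bool abort_requested() const { return aborted.load(); }
};

struct batch {
    int offset;
};

// Stand-in for partition_record_batch_reader_impl: keeps producing
// batches until the source is exhausted or the owning partition stops.
struct reader {
    abort_source& as;
    int next_offset = 0;

    std::optional<batch> read_some() {
        // Without this check the loop could keep retrying forever after
        // the remote_partition was stopped, which is the behaviour
        // reported in this issue (CPU burned even after topic deletion).
        if (as.abort_requested()) {
            return std::nullopt; // partition stopped; bail out cleanly
        }
        return batch{next_offset++};
    }
};

int main() {
    abort_source as;
    reader r{as};
    for (int i = 0; i < 5; ++i) {
        if (i == 3) {
            as.request_abort(); // simulates remote_partition::stop()
        }
        auto b = r.read_some();
        if (!b) {
            std::puts("reader observed abort, stopping");
            break;
        }
        std::printf("read batch at offset %d\n", b->offset);
    }
}
```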
@dswang dswang unassigned ztlpn and LenaAn Dec 21, 2021
@Lazin (Contributor) commented Dec 22, 2021:

fixed by #3280 and #3293

@Lazin Lazin closed this as completed Dec 22, 2021
Lazin added a commit to Lazin/redpanda that referenced this issue Dec 24, 2021
The partition_record_batch_reader_impl component is not stopping when
the underlying remote_partition is stopped. This manifested in the
following failure scenario: the reader got stuck in an infinite loop;
then the remote_partition was stopped, but the loop did not exit. It
continued to consume CPU even after the entire topic was deleted.

This commit fixes this by checking the abort_source inside the
remote_partition. Fixes redpanda-data#3305

(cherry picked from commit c6ad84d)