Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid crash in recovery by pessimistically looking up the earliest key #289

Merged

Conversation

cbrand
Copy link
Contributor

@cbrand cbrand commented Mar 24, 2022

Description

The earliest_offset function in the consumer seems to not always return all tps which have been passed in but presumabely only the ones it has data for.

After longer runtime it seems that the response MAY return a dict not filled with all topic partitions which have been added. To bypass this and not lead faust to crash, use the .get function on the returned mapping and default to None on return.

I think this error occurs if the request happens during a rebalance looking at the code base of aiokafka, but am not totally sure. This change should however not crash the overall recovery process if this occurs. If a rebalance happens on the next call the recovery will restart but will not hard crash as it does right now.

There also seems to be an old issue which I address in this PR:
(Fixes #73)

@cbrand
Copy link
Contributor Author

cbrand commented Mar 24, 2022

Cherry picked the flake8 fixes from #288 to allow CI to pass.

@codecov-commenter
Copy link

codecov-commenter commented Mar 24, 2022

Codecov Report

Merging #289 (b93ee88) into master (e14a550) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #289   +/-   ##
=======================================
  Coverage   94.70%   94.70%           
=======================================
  Files         100      100           
  Lines       10852    10852           
  Branches     1518     1518           
=======================================
  Hits        10277    10277           
  Misses        501      501           
  Partials       74       74           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e14a550...b93ee88. Read the comment docs.

@cbrand cbrand force-pushed the fix/build-offsets-key-error branch from 92d415f to 221689f Compare March 24, 2022 10:11
The earliest_offset function in the consumer seems to not always return
all tps which have been passed in but presumabely only the ones it has
data for.

After longer runtime it seems that the response MAY return a dict
not filled with all topic partitions which have been added. To bypass
this and not lead faust to crash, use the .get function on the returned
mapping and default to None on return.
@cbrand cbrand force-pushed the fix/build-offsets-key-error branch from e2ffde0 to f26ece8 Compare March 24, 2022 10:33
@patkivikram patkivikram merged commit db6a3ae into faust-streaming:master Jun 7, 2022
@cbrand cbrand deleted the fix/build-offsets-key-error branch May 2, 2023 12:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RocksDB recovery crashes with KeyError
3 participants