[transaction] Clear old pids from rm_stm state in overflow #7057
2e116dc
to
43dde1b
Compare
tx housekeeping looks good, while the idempotency has issues. Also, please add tests for the idempotency, something like:
- create a producer, produce
- create n producers to exhaust the limit and kick out the first producer
- produce via the first producer
- observe error
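The scenario above can be sketched against a toy model. This is only an illustration: toy_partition, produce_result and first_producer_after_overflow are hypothetical names, not rm_stm APIs, and the model assumes that an unknown pid may only start a session with sequence 0, so an evicted producer that continues its old sequence is rejected. It evicts in first-seen order, which matches the scenario because every producer produces once before the limit is hit.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <unordered_map>

// Hypothetical stand-in for a partition's idempotency state (not rm_stm).
enum class produce_result { ok, out_of_order_sequence };

class toy_partition {
public:
    explicit toy_partition(std::size_t max_producers)
      : _max(max_producers) {}

    produce_result produce(std::int64_t pid, std::int32_t seq) {
        auto it = _last_seq.find(pid);
        if (it == _last_seq.end()) {
            // Assumption: an unknown pid may only start a session at seq 0.
            if (seq != 0) {
                return produce_result::out_of_order_sequence;
            }
            _last_seq.emplace(pid, seq);
            _first_seen.push_back(pid);
            // Over the limit: forget the oldest producers.
            while (_last_seq.size() > _max) {
                _last_seq.erase(_first_seen.front());
                _first_seen.pop_front();
            }
            return produce_result::ok;
        }
        if (seq != it->second + 1) {
            return produce_result::out_of_order_sequence;
        }
        it->second = seq;
        return produce_result::ok;
    }

private:
    std::size_t _max;
    std::unordered_map<std::int64_t, std::int32_t> _last_seq;
    std::deque<std::int64_t> _first_seen; // pids in first-seen order
};

// Runs the four steps and returns what the first producer observes.
inline produce_result first_producer_after_overflow(std::size_t limit) {
    toy_partition p(limit);
    p.produce(1, 0); // create a producer, produce
    for (std::size_t i = 0; i < limit; ++i) {
        // create n producers to exhaust the limit
        p.produce(static_cast<std::int64_t>(2 + i), 0);
    }
    return p.produce(1, 1); // produce via the first producer
}
```

A real test would drive an actual producer client against the broker; the model only shows which step is expected to fail.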
I am wondering if we want to remove the oldest PIDs from all of them? Similar approach is used in
080dd5e
to
86aac9c
Compare
5e9115e
to
d89123e
Compare
f993eed
to
3c5baf7
Compare
I think we need @dotnwat's feedback on the name of the property but otherwise it looks good!
@@ -2232,6 +2234,8 @@ void rm_stm::apply_data(model::batch_identity bid, model::offset last_offset) {
_log_state.lru_idempotent_pids.iterator_to(seq_it->second));
}
_log_state.lru_idempotent_pids.push_back(seq_it->second);
spawn_background_clean_for_pids(
nit: drop rm_stm::? Sometimes you use clear_type, sometimes not.
We need to prevent overflow of some state maps, so we need to delete info for pids that do not have an ongoing tx. For this we just check whether the pid is in the tx_seqs map. If it is not, the tx was committed or aborted.
To track old sessions for idempotent requests, we need to store their replication order so that old sessions are deleted in the right order.
So the idea is to clear the internal state for the oldest session on overflow: we just need to delete the info for the oldest pid in our state.
The new option controls the max size of the internal maps in rm_stm. Users can use transactions/idempotency in different ways, and some of them cause problems. For example, a user can create a producer per request. Idempotency in the Kafka protocol works per session, and the session is scoped by a producer, so if a user creates a producer per request we store info for every producer. This gives the user no benefit but creates big maps in the internal state of rm_stm. The same applies to transactions.
Run cleaning of old sessions in replicate_seq and apply_data
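The replication-order bookkeeping described in these commits can be sketched roughly as follows. The PR keeps seq_entry in an intrusive list; this sketch swaps in std::list plus std::unordered_map to stay self-contained, and idempotent_sessions, seq_entry_sketch and clear_old_pids are hypothetical names rather than the actual rm_stm members.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <list>
#include <unordered_map>
#include <vector>

using producer_id = std::int64_t;

// Hypothetical session entry: remembers its position in the order list.
struct seq_entry_sketch {
    std::int32_t last_seq;
    std::list<producer_id>::iterator lru_pos;
};

class idempotent_sessions {
public:
    explicit idempotent_sessions(std::size_t max_ids)
      : _max(max_ids) {}

    // Called for every applied batch: the pid becomes the most recent one.
    void apply_data(producer_id pid, std::int32_t last_seq) {
        auto it = _seq_table.find(pid);
        if (it == _seq_table.end()) {
            _lru.push_back(pid);
            _seq_table.emplace(
              pid, seq_entry_sketch{last_seq, std::prev(_lru.end())});
        } else {
            it->second.last_seq = last_seq;
            // Move the existing node to the back; iterators stay valid.
            _lru.splice(_lru.end(), _lru, it->second.lru_pos);
        }
    }

    // Drops the least recently replicated pids until the limit holds.
    std::vector<producer_id> clear_old_pids() {
        std::vector<producer_id> evicted;
        while (_seq_table.size() > _max) {
            producer_id oldest = _lru.front();
            _lru.pop_front();
            _seq_table.erase(oldest);
            evicted.push_back(oldest);
        }
        return evicted;
    }

    bool contains(producer_id pid) const {
        return _seq_table.find(pid) != _seq_table.end();
    }

private:
    std::size_t _max;
    std::unordered_map<producer_id, seq_entry_sketch> _seq_table;
    std::list<producer_id> _lru; // front = oldest, back = newest
};
```

In this sketch the cleanup is an explicit call; in the PR it is spawned in the background from replicate_seq and apply_data.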
/backport v22.3.x
Cover letter
Problem
Users can use transactions/idempotency in different ways, and some of them cause problems. For example, a user can create a producer per request. Idempotency in the Kafka protocol works per session, and the session is scoped by a producer, so if a user creates a producer per request we store info for every producer. This gives the user no benefit but creates big maps in the internal state of rm_stm. The same applies to transactions.
We need to prevent this kind of overflow. So the idea is to add a new setting (max_concurrent_producer_ids). When rm_stm reaches this limit, we spawn a cleanup process for the log and mem state.
Fix
For idempotency we use _log_state.seq_table, where we store session info. The simplest way is to track the order of the last apply_data per pid: we store seq_entry inside an intrusive list to keep the replication order.
For transactions it is a little harder. We cannot delete any info for an ongoing transaction. But we can tell whether the user has finished a transaction by checking _log_state.tx_seqs, because on abort or commit the pid is cleared from this map. So for all committed/aborted txs we can delete all info from the internal maps.
Fixes #5321
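The transaction-side rule from the Fix section can be sketched as below. This is a minimal illustration under assumptions: tx_state_sketch, per_pid_info, can_forget and clear_finished_pids are hypothetical names, and the single per_pid_info map stands in for the several internal maps that rm_stm actually keeps.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

using producer_id = std::int64_t;

// Hypothetical, simplified view of the transactional state.
struct tx_state_sketch {
    // pid -> tx sequence; present only while a transaction is ongoing.
    std::unordered_map<producer_id, std::int64_t> tx_seqs;
    // Stand-in for the per-pid maps that can grow without bound.
    std::unordered_map<producer_id, std::int64_t> per_pid_info;
};

// A pid may be forgotten only if it has no ongoing transaction.
inline bool can_forget(const tx_state_sketch& st, producer_id pid) {
    return st.tx_seqs.find(pid) == st.tx_seqs.end();
}

// If the state is over the limit, drop every pid whose transaction was
// committed or aborted. Returns the number of pids removed.
inline std::size_t
clear_finished_pids(tx_state_sketch& st, std::size_t max_ids) {
    if (st.per_pid_info.size() <= max_ids) {
        return 0;
    }
    std::size_t removed = 0;
    for (auto it = st.per_pid_info.begin(); it != st.per_pid_info.end();) {
        if (can_forget(st, it->first)) {
            it = st.per_pid_info.erase(it);
            ++removed;
        } else {
            ++it; // ongoing transaction: never touched
        }
    }
    return removed;
}
```

Note that with this rule the state can stay above the limit when every remaining pid still has an ongoing transaction.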
Backport Required
UX changes
Release notes
Features
New setting (max_concurrent_producer_ids) to control how many transaction/idempotency sessions are kept