Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

c/rm_stm: cleanup tx_locks #15796

Merged
merged 1 commit into from
Dec 20, 2023
Merged

Conversation

rockwotj
Copy link
Contributor

We never cleanup tx_locks, clear_old_tx_pids seems like the right place
to be doing this. Two things:

  1. We should be ensuring the tx_lock isn't held when expiring, this
    mirrors the logic in is_producer_expired.
  2. We actually need to erase the pid from the tx_lock. Right now it
    grows unboundedly.

So when we collect our pids to delete we break the mutex so that if that
pid is ever used again before deleting it from the map we will not allow
progress. I am not sure if we need error handling someplace special for
this potentially broken semaphore.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

Bug Fixes

  • Fix a memory leak when transactions are used with many different producer IDs.

We never cleanup tx_locks, clear_old_tx_pids seems like the right place
to be doing this. Two things:

1. We should be ensuring the tx_lock isn't held when expiring, this
   mirrors the logic in `is_producer_expired`.
2. We actually need to erase the pid from the tx_lock. Right now it
   grows unboundedly.

It seems possible that a producer of the same ID but a higher epoch
could be holding/using the lock, so in that case we don't need to do
anything, otherwise we can GC the lock when cleaning up the PID.

Signed-off-by: Tyler Rockwood <rockwood@redpanda.com>
@rockwotj rockwotj merged commit f20a25b into redpanda-data:v23.2.x Dec 20, 2023
23 checks passed
@rockwotj rockwotj deleted the stop-the-leak branch December 20, 2023 07:31
@rockwotj
Copy link
Contributor Author

/backport v23.1.x

@vbotbuildovich
Copy link
Collaborator

The pull request's base branch is not the default one. Cancelling backport...

Workflow run logs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants