Bookie GC doesn't clean up overreplicated ledgers from entry logs #4632

Closed
rdhabalia opened this issue Jun 28, 2019 · 2 comments
Labels
type/bug The PR fixed a bug or issue reported a bug

Comments

@rdhabalia
Contributor

Motivation

  • Sometimes, because of over-replication, a bookie holds ledgers that it no longer owns, meaning the bookie is not part of the ensemble list of those ledgers. In this case, GC detects those over-replicated ledgers and:
  • deletes their index from dbStorage (RocksDB), and
  • tries to delete them from the entry log files.

However, the bookie does not actually delete them from the entry log files, because of a change made in #870: the bookie avoids deleting a ledger if the znode of that ledger still exists.

As a result, the bookie ends up storing a large number of entry log files containing ledgers that are owned by other bookies. This also causes OOM when GC has to process a large number of entry log files.
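
The over-replication condition described above can be sketched as a simple membership check. This is only an illustration, not BookKeeper's actual GC code: the class name, the method name, and the representation of ledger metadata (a map from the first entry id of each ensemble segment to a list of bookie addresses) are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; names and data shapes are assumptions, not BookKeeper APIs. */
final class OverReplicationCheck {

    /**
     * Returns true if the bookie appears in none of the ledger's ensembles,
     * i.e. the bookie still stores data for a ledger it no longer owns.
     *
     * @param bookieId  address of the local bookie (assumed string form)
     * @param ensembles first entry id of each segment mapped to its bookie list
     */
    static boolean isOverReplicated(String bookieId, Map<Long, List<String>> ensembles) {
        for (List<String> ensemble : ensembles.values()) {
            if (ensemble.contains(bookieId)) {
                return false; // the bookie is still responsible for at least one segment
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<Long, List<String>> ensembles = new TreeMap<>();
        ensembles.put(0L, Arrays.asList("bookie-1:3181", "bookie-2:3181"));
        ensembles.put(100L, Arrays.asList("bookie-2:3181", "bookie-3:3181"));

        // bookie-1 is in the first ensemble; bookie-4 is in none of them.
        System.out.println(isOverReplicated("bookie-1:3181", ensembles));
        System.out.println(isOverReplicated("bookie-4:3181", ensembles));
    }
}
```

The check walks every ensemble segment, since a bookie that served only an early segment of the ledger is still an owner of that data.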

Fix

  1. The OOM should be addressed by 1949 or 1938.
  2. The cleanup of over-replicated ledgers that are owned by other bookies should be fixed by this commit.
    I will create a PR with this fix.
@rdhabalia
Contributor Author

@sijie I addressed all comments under: 1949

I also created a PR to fix this issue: apache/bookkeeper#2119

eolivelli pushed a commit to apache/bookkeeper that referenced this issue Jul 12, 2019
### Motivation
As described at: apache/pulsar#4632
Sometimes, because of over-replication, a bookie holds ledgers that it no longer owns, meaning the bookie is not part of the ensemble list of those ledgers. In this case, GC detects those over-replicated ledgers and:
- deletes their index from dbStorage (RocksDB), and
- tries to delete them from the entry log files.

However, the bookie does not actually delete them from the entry log files, because of a change made in #870: the bookie avoids deleting a ledger if the znode of that ledger still exists.

As a result, the bookie ends up storing a large number of entry log files containing ledgers that are owned by other bookies. This also causes OOM when GC has to process a large number of entry log files.

### Modification

Delete over-replicated ledgers if the bookie is not part of their ensemble list.
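
A minimal sketch of the deletion decision this modification describes, under stated assumptions: ledger metadata is modeled as a plain map from ledger id to the bookies in its ensembles, and a missing key stands for a deleted znode. The class `EntryLogGcDecision` and the method `ledgersToDelete` are hypothetical names for illustration and do not correspond to the code in apache/bookkeeper#2119.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of the GC decision; not the actual BookKeeper implementation. */
final class EntryLogGcDecision {

    /**
     * Picks the locally stored ledgers that can be dropped from entry logs.
     *
     * @param bookieId address of the local bookie (assumed string form)
     * @param local    ledger ids found in the local entry log files
     * @param metadata ledger id mapped to all bookies in its ensembles;
     *                 an absent key models a ledger whose znode was deleted
     */
    static Set<Long> ledgersToDelete(String bookieId, Set<Long> local,
                                     Map<Long, List<String>> metadata) {
        Set<Long> toDelete = new TreeSet<>();
        for (long ledgerId : local) {
            List<String> bookies = metadata.get(ledgerId);
            // Case kept from the earlier behavior: the znode no longer exists.
            boolean znodeGone = bookies == null;
            // Case added by the modification: the znode exists, but this
            // bookie is not in the ledger's ensemble list.
            boolean notInEnsemble = bookies != null && !bookies.contains(bookieId);
            if (znodeGone || notInEnsemble) {
                toDelete.add(ledgerId);
            }
        }
        return toDelete;
    }

    public static void main(String[] args) {
        Map<Long, List<String>> metadata = new HashMap<>();
        metadata.put(1L, Arrays.asList("bookie-1:3181", "bookie-2:3181"));
        metadata.put(2L, Arrays.asList("bookie-2:3181", "bookie-3:3181"));
        Set<Long> local = new TreeSet<>(Arrays.asList(1L, 2L, 3L));

        // Ledger 1 is owned locally, ledger 2 is over-replicated, ledger 3 has no znode.
        System.out.println(ledgersToDelete("bookie-1:3181", local, metadata));
    }
}
```

With only the znode check in place, ledger 2 in this example would stay in the entry logs indefinitely, which is the accumulation the issue reports.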

Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Sijie Guo <sijie@apache.org>

This closes #2119 from rdhabalia/overRepl
@sijie
Member

sijie commented Jul 6, 2020

The feature has been implemented in BookKeeper: apache/bookkeeper#2119

@sijie sijie closed this as completed Jul 6, 2020