[BUG] BookKeeper client constructor leaks resources (threads / connections) when initialization fails partway

**BUG REPORT**

***Describe the bug***

The `BookKeeper` client constructor in
`bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java`
allocates several heavy resources sequentially (worker pools, schedulers,
event loop group, request timer, bookie client, metadata driver, ledger
manager, bookie info reader, etc.). If any step after the early allocations
throws — for example `MetadataClientDriver` initialization fails, or
`LedgerManagerFactory#newLedgerManager()` throws — the already-created
resources are NOT released, because the constructor never returns an
instance and `close()` is therefore unreachable to the caller.

The leaked resources include:
- `scheduler` (`OrderedScheduler`) — 1 leaked thread
- `highPriorityTaskExecutor` (`OrderedScheduler`) — 1 leaked thread
- `mainWorkerPool` (`OrderedExecutor`) — `numWorkerThreads` leaked threads
- `bookieClient` — leaked Netty channels / sockets to bookies
- `metadataDriver` — leaked ZK / metadata-store session
- `requestTimer` (when `ownTimer == true`) — 1 leaked `HashedWheelTimer` thread
- `eventLoopGroup` (when `ownEventLoopGroup == true`) — leaked Netty I/O threads
- `bookieInfoScheduler` (when disk-weight placement enabled) — 1 leaked thread

For long-running processes that retry client creation on transient metadata
failures (e.g. ZK timeouts at startup), every failed `build()` accumulates
threads / sockets / file descriptors until OOM or FD exhaustion.

***To Reproduce***

Steps to reproduce the behavior:

1. Start a BookKeeper cluster normally.
2. Configure a client with a metadata service URI that will reject the
   connection AFTER the worker pools and bookie client have been constructed
   (e.g. point `metadataServiceUri` at an unreachable ZK address such as
   `zk+null://127.0.0.1:1/ledgers`, or at a ZK ensemble that accepts TCP
   but rejects `getLedgerManagerFactory`).
3. In a loop, call:
`try { BookKeeper.forConfig(conf).build(); } catch (Exception ignore) { /* expected */ }`
100 iterations.
5. Run `jstack <pid>` and observe that   `BookKeeperClientScheduler`, `BookKeeperClientWorker-*`,
   `BookKeeperHighPriorityThread` (and on disk-weight setups
   `BKClientMetaDataPollScheduler-*`) thread counts — and bookie / metadata
   sockets — grow monotonically with every failed `build()`.

***Expected behavior***

If the constructor throws, all resources that were already created inside it
must be released before the exception propagates to the caller. After a
failed `build()`, the JVM should retain zero client-owned threads and zero
client-owned sockets attributable to that attempt.

**Proposed fix.**

1. Wrap the constructor body in `try { ... } finally { ... }`, using a
   local `boolean initialized = false` set to `true` only after the last
   initialization step succeeds. In `finally`, call `close()` when
   `!initialized`.
2. Make `close()` safe on a partially initialized instance by null-guarding
   every field dereference. The existing `closed` flag plus `closeLock`
   already make `close()` idempotent, so this also hardens the double-close
   path.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] BookKeeper client constructor leaks resources (threads / connections) when initialization fails partway #4797

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[BUG] BookKeeper client constructor leaks resources (threads / connections) when initialization fails partway #4797

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions