Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-16089: Fix memory leak in RocksDBStore #15174

Merged
merged 1 commit into from
Jan 11, 2024

Conversation

nicktelford
Copy link
Contributor

ColumnFamilyDescriptor is not a RocksObject, which in theory means it's not backed by any native memory allocated by RocksDB.

However, in practice, ColumnFamilyHandle#getDescriptor(), which returns a ColumnFamilyDescriptor, allocates an internal rocksdb::db::ColumnFamilyDescriptor, copying the name and options of the column family into it.

Since the Java ColumnFamilyDescriptor is not a RocksObject, it's not possible to track this allocation and free it from Java.

Fortunately, in our case, we can simply avoid calling ColumnFamilyHandle#getDescriptor(), since we're only interested in the column family name, which is already available on
ColumnFamilyHandle#getName(), which does not leak memory.

We can also optimize away the temporary Options, which was previously a source of memory leaks, because userSpecifiedOptions is an instance of Options.

`ColumnFamilyDescriptor` is _not_ a `RocksObject`, which in theory means
it's not backed by any native memory allocated by RocksDB.

However, in practice, `ColumnFamilyHandle#getDescriptor()`, which
returns a `ColumnFamilyDescriptor`, allocates an internal
`rocksdb::db::ColumnFamilyDescriptor`, copying the name and handle of
the column family into it.

Since the Java `ColumnFamilyDescriptor` is not a `RocksObject`, it's not
possible to track this allocation and free it from Java.

Fortunately, in our case, we can simply avoid calling
`ColumnFamilyHandle#getDescriptor()`, since we're only interested in the
column family name, which is already available on
`ColumnFamilyHandle#getName()`, which does not leak memory.

We can also optimize away the temporary `Options`, which was previously a
source of memory leaks, because `userSpecifiedOptions` is an instance of
`Options`.
@nicktelford
Copy link
Contributor Author

@lucasbru I'm just writing up a more detailed explanation, including graphs, in the ticket. I'm also going to raise a bug with RocksDB, because I believe this to be a bug in RocksJNI.

Copy link
Member

@lucasbru lucasbru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@lucasbru lucasbru merged commit 8904317 into apache:trunk Jan 11, 2024
1 check failed
@nicktelford nicktelford deleted the KAFKA-16089 branch January 11, 2024 17:16
nicktelford added a commit to nicktelford/kafka that referenced this pull request Jan 12, 2024
`ColumnFamilyDescriptor` is _not_ a `RocksObject`, which in theory means
it's not backed by any native memory allocated by RocksDB.

However, in practice, `ColumnFamilyHandle#getDescriptor()`, which
returns a `ColumnFamilyDescriptor`, allocates an internal
`rocksdb::db::ColumnFamilyDescriptor`, copying the name and handle of
the column family into it.

Since the Java `ColumnFamilyDescriptor` is not a `RocksObject`, it's not
possible to track this allocation and free it from Java.

Fortunately, in our case, we can simply avoid calling
`ColumnFamilyHandle#getDescriptor()`, since we're only interested in the
column family name, which is already available on
`ColumnFamilyHandle#getName()`, which does not leak memory.

We can also optimize away the temporary `Options`, which was previously a
source of memory leaks, because `userSpecifiedOptions` is an instance of
`Options`.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
@mjsax mjsax added the streams label Jan 17, 2024
showuon pushed a commit to showuon/kafka that referenced this pull request Jan 22, 2024
`ColumnFamilyDescriptor` is _not_ a `RocksObject`, which in theory means
it's not backed by any native memory allocated by RocksDB.

However, in practice, `ColumnFamilyHandle#getDescriptor()`, which
returns a `ColumnFamilyDescriptor`, allocates an internal
`rocksdb::db::ColumnFamilyDescriptor`, copying the name and handle of
the column family into it.

Since the Java `ColumnFamilyDescriptor` is not a `RocksObject`, it's not
possible to track this allocation and free it from Java.

Fortunately, in our case, we can simply avoid calling
`ColumnFamilyHandle#getDescriptor()`, since we're only interested in the
column family name, which is already available on
`ColumnFamilyHandle#getName()`, which does not leak memory.

We can also optimize away the temporary `Options`, which was previously a
source of memory leaks, because `userSpecifiedOptions` is an instance of
`Options`.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
nicktelford added a commit to nicktelford/kafka that referenced this pull request Jan 24, 2024
`ColumnFamilyDescriptor` is _not_ a `RocksObject`, which in theory means
it's not backed by any native memory allocated by RocksDB.

However, in practice, `ColumnFamilyHandle#getDescriptor()`, which
returns a `ColumnFamilyDescriptor`, allocates an internal
`rocksdb::db::ColumnFamilyDescriptor`, copying the name and handle of
the column family into it.

Since the Java `ColumnFamilyDescriptor` is not a `RocksObject`, it's not
possible to track this allocation and free it from Java.

Fortunately, in our case, we can simply avoid calling
`ColumnFamilyHandle#getDescriptor()`, since we're only interested in the
column family name, which is already available on
`ColumnFamilyHandle#getName()`, which does not leak memory.

We can also optimize away the temporary `Options`, which was previously a
source of memory leaks, because `userSpecifiedOptions` is an instance of
`Options`.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
yyu1993 pushed a commit to yyu1993/kafka that referenced this pull request Feb 15, 2024
`ColumnFamilyDescriptor` is _not_ a `RocksObject`, which in theory means
it's not backed by any native memory allocated by RocksDB.

However, in practice, `ColumnFamilyHandle#getDescriptor()`, which
returns a `ColumnFamilyDescriptor`, allocates an internal
`rocksdb::db::ColumnFamilyDescriptor`, copying the name and handle of
the column family into it.

Since the Java `ColumnFamilyDescriptor` is not a `RocksObject`, it's not
possible to track this allocation and free it from Java.

Fortunately, in our case, we can simply avoid calling
`ColumnFamilyHandle#getDescriptor()`, since we're only interested in the
column family name, which is already available on
`ColumnFamilyHandle#getName()`, which does not leak memory.

We can also optimize away the temporary `Options`, which was previously a
source of memory leaks, because `userSpecifiedOptions` is an instance of
`Options`.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
clolov pushed a commit to clolov/kafka that referenced this pull request Apr 5, 2024
`ColumnFamilyDescriptor` is _not_ a `RocksObject`, which in theory means
it's not backed by any native memory allocated by RocksDB.

However, in practice, `ColumnFamilyHandle#getDescriptor()`, which
returns a `ColumnFamilyDescriptor`, allocates an internal
`rocksdb::db::ColumnFamilyDescriptor`, copying the name and handle of
the column family into it.

Since the Java `ColumnFamilyDescriptor` is not a `RocksObject`, it's not
possible to track this allocation and free it from Java.

Fortunately, in our case, we can simply avoid calling
`ColumnFamilyHandle#getDescriptor()`, since we're only interested in the
column family name, which is already available on
`ColumnFamilyHandle#getName()`, which does not leak memory.

We can also optimize away the temporary `Options`, which was previously a
source of memory leaks, because `userSpecifiedOptions` is an instance of
`Options`.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
Phuc-Hong-Tran pushed a commit to Phuc-Hong-Tran/kafka that referenced this pull request Jun 6, 2024
`ColumnFamilyDescriptor` is _not_ a `RocksObject`, which in theory means
it's not backed by any native memory allocated by RocksDB.

However, in practice, `ColumnFamilyHandle#getDescriptor()`, which
returns a `ColumnFamilyDescriptor`, allocates an internal
`rocksdb::db::ColumnFamilyDescriptor`, copying the name and handle of
the column family into it.

Since the Java `ColumnFamilyDescriptor` is not a `RocksObject`, it's not
possible to track this allocation and free it from Java.

Fortunately, in our case, we can simply avoid calling
`ColumnFamilyHandle#getDescriptor()`, since we're only interested in the
column family name, which is already available on
`ColumnFamilyHandle#getName()`, which does not leak memory.

We can also optimize away the temporary `Options`, which was previously a
source of memory leaks, because `userSpecifiedOptions` is an instance of
`Options`.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants