Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-38178][SS] Correct the logic to measure the memory usage of RocksDB #35480

Closed
wants to merge 1 commit into from

Conversation

Myasuka
Copy link
Member

@Myasuka Myasuka commented Feb 10, 2022

What changes were proposed in this pull request?

Correct the logic to measure the memory usage of RocksDB to include the memory used by block cache.
As "block-cache-pinned-usage" is included in "block-cache-usage", we don't need to sum the pinned usage separately.

Why are the changes needed?

Current reported metrics of RocksDB memory usage is not correct.

Does this PR introduce any user-facing change?

NO

How was this patch tested?

The information could refer to https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB

@Myasuka
Copy link
Member Author

Myasuka commented Feb 10, 2022

cc @zsxwing @viirya, please take a look.

@@ -403,7 +404,7 @@ class RocksDB(
RocksDBMetrics(
numKeysOnLoadedVersion,
numKeysOnWritingVersion,
readerMemUsage + memTableMemUsage,
readerMemUsage + memTableMemUsage + blockCacheUsage,
Copy link
Member

@viirya viirya Feb 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

based on the doc, do we also need rocksdb.block-cache-pinned-usage?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for @viirya 's comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

based on the doc, do we also need rocksdb.block-cache-pinned-usage?

No, I don't think so, and I actually write my conclusion in original ticket SPARK-38178:
BTW, as the "block-cache-pinned-usage" is included in "block-cache-usage", we don't need to include the pinned usage.

Let's take a look at the RocksDB implementation:

  {DB::Properties::kBlockCacheUsage,
   {false, nullptr, &InternalStats::HandleBlockCacheUsage, nullptr,
    nullptr}},
  {DB::Properties::kBlockCachePinnedUsage,
   {false, nullptr, &InternalStats::HandleBlockCachePinnedUsage, nullptr,
    nullptr}},
bool InternalStats::HandleBlockCacheUsage(uint64_t* value, DBImpl* /*db*/,
                                          Version* /*version*/) {
  Cache* block_cache;
  bool ok = GetBlockCacheForStats(&block_cache);
  if (!ok) {
    return false;
  }
  *value = static_cast<uint64_t>(block_cache->GetUsage());
  return true;
}

bool InternalStats::HandleBlockCachePinnedUsage(uint64_t* value, DBImpl* /*db*/,
                                                Version* /*version*/) {
  Cache* block_cache;
  bool ok = GetBlockCacheForStats(&block_cache);
  if (!ok) {
    return false;
  }
  *value = static_cast<uint64_t>(block_cache->GetPinnedUsage());
  return true;
}
size_t LRUCacheShard::GetUsage() const {
  MutexLock l(&mutex_);
  return usage_;
}

size_t LRUCacheShard::GetPinnedUsage() const {
  MutexLock l(&mutex_);
  assert(usage_ >= lru_usage_);
  return usage_ - lru_usage_;
}

As you can see, the pinned usage is included in the block cache usage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the detail, @Myasuka .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. Maybe it is good to put a sentence in the PR description to mention the fact that "block-cache-pinned-usage" is included in "block-cache-usage", so we don't need to include it separately.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viirya Thanks for your suggestions, I have already updated the PR description.

Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like that is something we need to add to RocksDB memory usage.

@HyukjinKwon
Copy link
Member

cc @xuanyuanking too FYI

dongjoon-hyun pushed a commit that referenced this pull request Feb 11, 2022
…cksDB

### What changes were proposed in this pull request?

Correct the logic to measure the memory usage of RocksDB to include the memory used by block cache.
As "block-cache-pinned-usage" is included in "block-cache-usage", we don't need to sum the pinned usage separately.

### Why are the changes needed?

Current reported metrics of RocksDB memory usage is not correct.

### Does this PR introduce _any_ user-facing change?

NO

### How was this patch tested?

The information could refer to https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB

Closes #35480 from Myasuka/rocksdb-mem-usage.

Authored-by: Yun Tang <myasuka@live.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 93a9464)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun
Copy link
Member

Merged to master/3.2. Thank you, @Myasuka and all.

@dongjoon-hyun
Copy link
Member

Welcome to the Apache Spark community, @Myasuka .
You are added to the Apache Spark contributor group too and SPARK-38178 is assigned to you.

kazuyukitanimura pushed a commit to kazuyukitanimura/spark that referenced this pull request Aug 10, 2022
…cksDB

### What changes were proposed in this pull request?

Correct the logic to measure the memory usage of RocksDB to include the memory used by block cache.
As "block-cache-pinned-usage" is included in "block-cache-usage", we don't need to sum the pinned usage separately.

### Why are the changes needed?

Current reported metrics of RocksDB memory usage is not correct.

### Does this PR introduce _any_ user-facing change?

NO

### How was this patch tested?

The information could refer to https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB

Closes apache#35480 from Myasuka/rocksdb-mem-usage.

Authored-by: Yun Tang <myasuka@live.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 93a9464)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 6646212)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants