
Fix regression issue of too large score #10518

Closed
wants to merge 4 commits

Conversation

Contributor

@siying siying commented Aug 10, 2022

Summary:
#10057 caused a regression bug: since the base level size is no longer adjusted based on the L0 size, the L0 score can become very large. This makes compaction heavily favor L0->L1 compaction over L1->L2 compaction, and in some cases causes data to get stuck in L1 without being moved down. We fix this by calculating the L0 score as size(L0)/size(L1) in the case where L0 is large.

Test Plan: run db_bench against data on tmpfs and confirm that the behavior of data getting stuck in L1 goes away.

Summary:
facebook#10057 caused a regression bug: since the base level size is no longer adjusted based on the L0 size, the L0 score can become very large. This makes compaction heavily favor L0->L1 compaction over L1->L2 compaction, and in some cases causes data to get stuck in L1 without being moved down. We fix this by calculating the L0 score in the same way as L1, so that L0->L1 is favored if the size of L0 is larger than that of L1.

Test Plan: run db_bench against data on tmpfs and confirm that the behavior of data getting stuck in L1 goes away.
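
To illustrate the idea in the summary, here is a minimal sketch (hypothetical function and variable names, not the actual db/version_set.cc diff): once the base level has grown past max_bytes_for_level_base, the L0 score is effectively size(L0)/size(Lbase) rather than size(L0)/max_bytes_for_level_base.

  #include <algorithm>
  #include <cstdint>

  // Hypothetical sketch: score L0 against the larger of
  // max_bytes_for_level_base and the actual base level size, so a huge L0
  // over a small base still triggers L0->Lbase, while a large Lbase keeps
  // the L0 score from exploding and starving Lbase->Lnext compactions.
  double L0SizeScoreSketch(uint64_t l0_size, uint64_t base_level_size,
                           uint64_t max_bytes_for_level_base) {
    uint64_t denominator =
        std::max(base_level_size, max_bytes_for_level_base);
    return static_cast<double>(l0_size) / denominator;
  }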
@facebook-github-bot
Contributor

@siying has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Contributor

@siying has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Contributor

@ajkr ajkr left a comment


LGTM!

@facebook-github-bot
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

Contributor

@ajkr ajkr left a comment


The change makes sense to me. Score(L0) looks lower than pre-#10057 but higher than post-#10057 in the case of a growing L0 and Lbase. In any case, I am fine with trying this.

@facebook-github-bot
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Contributor

@siying has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

// It might be guaranteed by logic below anyway, but we are
// explicit here to make sure we don't stop writes with no
// compaction scheduled.
score = std::max(score, 1.01);
Contributor


Why not use std::max(score, std::min(level_multiplier_, total_size / max_bytes_for_level_base)) so an increased Size(L0) would actually cause the score to increase even when it's smaller than Size(Lbase)? In the current PR, for example, if Score(Lbase) = 3, then Size(L0)/Size(Lbase) must be larger than 3 to prioritize L0->Lbase. That seems like too large an L0 and also an hourglass shape. With total_size / max_bytes_for_level_base we will at least prefer L0 whenever it's larger than Lbase.
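
For comparison, a rough sketch of this suggested alternative (hypothetical function name; score, total_size, level_multiplier, and max_bytes_for_level_base mirror the snippet and comment above):

  #include <algorithm>
  #include <cstdint>

  // Hypothetical sketch of the suggestion: let a growing L0 raise the
  // score via total_size / max_bytes_for_level_base, but cap it at the
  // level multiplier so a huge L0 cannot produce an arbitrarily large score.
  double SuggestedL0Score(double score, uint64_t total_size,
                          double level_multiplier,
                          uint64_t max_bytes_for_level_base) {
    double size_based =
        std::min(level_multiplier,
                 static_cast<double>(total_size) / max_bytes_for_level_base);
    return std::max(score, size_based);
  }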

Contributor Author


I tried that. I faced a problem that total_size / max_bytes_for_level_base can be very large. In the current algorithm, L1 will actually have a much smaller score than L0 (because we de-prioritize a level when there is a lot of pending data in the levels above it), and this causes data to get stuck in L1. That's why the score is always divided by Size(Lbase) when Lbase is relatively large; this makes it possible for the L0 score to be smaller than the Lbase score when Lbase is large.

Contributor


I faced a problem that total_size / max_bytes_for_level_base can be very large

The std::min(level_multiplier_, total_size / max_bytes_for_level_base) was intended to prevent that case.

That's why the score is always divided by Size(Lbase) when Lbase is relatively large; this makes it possible for the L0 score to be smaller than the Lbase score when Lbase is large.

I think Score(L0) < Score(Lbase) when Size(L0) < Size(Lbase) is good enough. The current formula makes Score(L0) < Score(Lbase) when Size(L0) < Score(Lbase) * Size(Lbase), which just seems like L0 may have to be much much bigger than Lbase in order for L0 to be prioritized.
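
To put rough numbers on that (hypothetical sizes, reusing the Score(Lbase) = 3 example from the inline comment above): with Size(Lbase) = 300 MB and Score(Lbase) = 3, L0->Lbase would only be prioritized once Size(L0) exceeds 3 * 300 MB = 900 MB, i.e. three times the base level.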

Anyway, I don't mind trying it if you want, since an hourglass LSM might work fine; I haven't measured it. So my accept still stands.

Contributor Author

@siying siying Aug 12, 2022


I think Score(L0) < Score(Lbase) when Size(L0) < Size(Lbase) is good enough. The current formula makes Score(L0) < Score(Lbase) when Size(L0) < Score(Lbase) * Size(Lbase), which just seems like L0 may have to be much much bigger than Lbase in order for L0 to be prioritized.

The benchmark I ran (the one Mark used) actually shows that L0->L1 still feels prioritized, perhaps too much, against L1->L2. If L0 is larger than L1, L0->L1 is always prioritized against L1->L2, because the score math makes the L1 score < 0.5 (5 after scaling). Similarly, if the number of L0 files is more than the L0 file trigger, we almost always pick from L0. L1->L2 usually gets picked when L1 is significantly larger than L0.

Here are some snapshots of the benchmark runs (level multiplier is 8 and base level size is 64MB, the exact parameters used by Mark):

  L0     12/10  321.94 MB 
  L4     23/23  368.26 MB 
  L5     11/0   176.77 MB 
  L6    112/4    1.50 GB  
  L7    798/23  11.42 GB  

  L0      8/6   218.41 MB
  L4     24/24  374.77 MB
  L5     20/0   317.81 MB
  L6    118/5    1.53 GB 
  L7    814/25  11.69 GB 

  L0     10/6   278.14 MB
  L4     20/20  312.10 MB
  L5     11/0   175.13 MB
  L6    119/4    1.55 GB 
  L7    830/18  11.96 GB 

  L0      9/6   307.93 MB
  L4     20/20  321.85 MB
  L5     12/0   193.10 MB
  L6    113/0    1.52 GB 
  L7    853/0   12.21 GB 

  L0      5/0   131.31 MB 
  L4     26/0   402.99 MB 
  L5     18/0   282.12 MB 
  L6    123/3    1.60 GB  
  L7    871/15  12.42 GB  

Contributor


If L0 is larger than L1, L0->L1 is always prioritized against L1->L2, because the score math makes the L1 score < 0.5 (5 after scaling).

Got it, I did the math and agree that Score(L0) > Score(Lbase) when Size(L0) > Size(Lbase). I had forgotten to incorporate Size(L0) in the denominator of Score(Lbase) earlier (courtesy of total_downcompact_bytes). This is challenging to reason about, but I am glad it works.

Similarly, if the number of L0 files is more than the L0 file trigger, we almost always pick from L0.

Understood this part already. The size-based component of Score(L0) is not used for this case.

Contributor

@ajkr ajkr Aug 15, 2022


the score math makes the L1 score < 0.5 (5 after scaling).

My calculation for Score(Lbase) is 10 * S'(Lbase) / (max_bytes_for_level_base + S(L0)), where S'(Li) is the sum of the compensated sizes of files not being compacted on level i, and S(Li) is the sum of all actual file sizes on level i.

I don't see why S'(L0) > S'(Lbase) (assuming that's what you mean by L0 being larger than L1) means Score(Lbase) is < 5. It looks to be typically less than 10, unless Lbase files have high size compensation.
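
As a rough sanity check of that formula against the first level snapshot above (approximating the compensated, non-compacting sizes with the reported level sizes): Score(Lbase) ≈ 10 * 368 MB / (64 MB + 322 MB) ≈ 9.5, which is indeed below 10 but well above 5.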

My attempt to prove Score(L0) > Score(Lbase) in this scenario (S'(L0) > S'(Lbase)) required the following two small tweaks. I think they fix a case where Lbase has a lot of deletions and wonder if you think they make sense (a small sketch of the S'(Li) computation follows the list).

  • base_level_size uses S'(Lbase) instead of S*(Lbase), where S*(Li) is the sum of all compensated file sizes on level i. This change removes files undergoing compaction from base_level_size.
  • total_downcompact_bytes uses S'(Li) instead of S(Li). This changes actual file size to compensated file size in total_downcompact_bytes, and removes files undergoing compaction.
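
As mentioned above, a minimal sketch of the S'(Li) computation under these definitions (hypothetical FileMeta struct and function name, not RocksDB's actual types):

  #include <cstdint>
  #include <vector>

  // Hypothetical per-file metadata; the real code keeps analogous fields.
  struct FileMeta {
    uint64_t file_size;              // actual size: contributes to S(Li)
    uint64_t compensated_file_size;  // size plus deletion compensation
    bool being_compacted;            // already part of a running compaction
  };

  // S'(Li): sum of compensated sizes of files NOT being compacted on level i.
  uint64_t CompensatedSizeExcludingCompacting(
      const std::vector<FileMeta>& level_files) {
    uint64_t total = 0;
    for (const auto& f : level_files) {
      if (!f.being_compacted) {
        total += f.compensated_file_size;
      }
    }
    return total;
  }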

siying added a commit that referenced this pull request Aug 12, 2022
Summary:
#10057 caused a regression bug: since the base level size is no longer adjusted based on the L0 size, the L0 score can become very large. This makes compaction heavily favor L0->L1 compaction over L1->L2 compaction, and in some cases causes data to get stuck in L1 without being moved down. We fix this by calculating the L0 score as size(L0)/size(L1) in the case where L0 is large.

Pull Request resolved: #10518

Test Plan: run db_bench against data on tmpfs and confirm that the behavior of data getting stuck in L1 goes away.

Reviewed By: ajkr

Differential Revision: D38603145

fbshipit-source-id: 4949e52dc28b54aacfe08417c6e6cc7e40a27225
tabokie pushed a commit to tabokie/rocksdb that referenced this pull request Aug 14, 2023
Summary:
facebook#10057 caused a regression bug: since the base level size is no longer adjusted based on the L0 size, the L0 score can become very large. This makes compaction heavily favor L0->L1 compaction over L1->L2 compaction, and in some cases causes data to get stuck in L1 without being moved down. We fix this by calculating the L0 score as size(L0)/size(L1) in the case where L0 is large.

Pull Request resolved: facebook#10518

Test Plan: run db_bench against data on tmpfs and confirm that the behavior of data getting stuck in L1 goes away.

Reviewed By: ajkr

Differential Revision: D38603145

fbshipit-source-id: 4949e52dc28b54aacfe08417c6e6cc7e40a27225
Signed-off-by: tabokie <xy.tao@outlook.com>