Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(compact): make get_compact_task easy to read #2122

Merged
merged 10 commits into from
Apr 28, 2022

Conversation

Little-Wallace
Copy link
Contributor

@Little-Wallace Little-Wallace commented Apr 25, 2022

What's changed and what's your intention?

close #2121
This is also part of leveled-compaction

Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests

Refer to a related PR or issue link (optional)

commit 72f62ba537d3e4093ff2f1f2388d46e9e698b67d
Author: Little-Wallace <bupt2013211450@gmail.com>
Date:   Tue Apr 26 09:49:24 2022 +0800

    address comment

    Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>

commit 745787cf0d24281d072a7602568dc3eec4bd151e
Author: Little-Wallace <bupt2013211450@gmail.com>
Date:   Mon Apr 25 18:31:06 2022 +0800

    refactor pick up files

    Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>

commit 2d95e44fc1da705ee7e228b594563605fba13576
Author: Little-Wallace <bupt2013211450@gmail.com>
Date:   Mon Apr 25 18:08:18 2022 +0800

    refactor compaction

    Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>

commit ab7a06fca9663a2c0817da8a05c4c1aa35bfb581
Author: Little-Wallace <bupt2013211450@gmail.com>
Date:   Mon Apr 25 17:48:45 2022 +0800

    fix conflict

    Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>

commit 1db91fc9abbe101187c78ed754588cbee9832615
Author: Little-Wallace <bupt2013211450@gmail.com>
Date:   Mon Apr 25 17:06:20 2022 +0800

    support concurrent compaction

    Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>

commit 1c152f324534ea0dca2fcd0f5548d74d1fee199f
Author: Little-Wallace <bupt2013211450@gmail.com>
Date:   Mon Apr 25 14:56:34 2022 +0800

    fix sst order

    Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>

commit 3a5f839d43730d00afc67c7950b519189b6d5338
Author: Little-Wallace <bupt2013211450@gmail.com>
Date:   Sun Apr 24 23:51:00 2022 +0800

    refactor compaction

    Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>

commit 481485d33c9b730f08cf75f564149ba8576cc4d1
Author: Little-Wallace <bupt2013211450@gmail.com>
Date:   Sun Apr 24 15:05:53 2022 +0800

    refactor compact

    Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>

Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
Copy link
Contributor

@soundOfDestiny soundOfDestiny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will come back immediately

@soundOfDestiny
Copy link
Contributor

Does we support parallel L0 to L1 here?

@soundOfDestiny
Copy link
Contributor

@Little-Wallace tells me that these code are expected to support parallel.
but if we have [7,8) [7,8) [0,1) [5,6) [9,10) in epoch order in L0 and first [7,8) is now being compacted, not only the second [7,8], but other SSTs cannot be selected to compact as well.

Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
@codecov
Copy link

codecov bot commented Apr 26, 2022

Codecov Report

Merging #2122 (eb9c925) into main (ade4fea) will increase coverage by 0.03%.
The diff coverage is 91.82%.

@@            Coverage Diff             @@
##             main    #2122      +/-   ##
==========================================
+ Coverage   70.90%   70.94%   +0.03%     
==========================================
  Files         652      654       +2     
  Lines       82790    82858      +68     
==========================================
+ Hits        58704    58785      +81     
+ Misses      24086    24073      -13     
Flag Coverage Δ
rust 70.94% <91.82%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/meta/src/hummock/mod.rs 31.13% <0.00%> (+2.18%) ⬆️
src/meta/src/hummock/vacuum.rs 84.23% <ø> (ø)
src/storage/hummock_sdk/src/compact.rs 100.00% <ø> (ø)
src/meta/src/hummock/hummock_manager.rs 91.56% <70.58%> (-0.07%) ⬇️
src/meta/src/hummock/compaction/mod.rs 83.53% <83.53%> (ø)
src/meta/src/hummock/level_handler.rs 73.80% <91.37%> (-26.20%) ⬇️
...a/src/hummock/compaction/tier_compaction_picker.rs 97.84% <97.84%> (ø)
...rc/meta/src/hummock/compaction/overlap_strategy.rs 100.00% <100.00%> (ø)
src/meta/src/hummock/metrics_utils.rs 96.24% <100.00%> (-0.47%) ⬇️
src/meta/src/hummock/test_utils.rs 95.41% <100.00%> (+0.03%) ⬆️
... and 10 more

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@soundOfDestiny
Copy link
Contributor

@Little-Wallace tells me that these code are expected to support parallel.
but if we have [7,8) [7,8) [0,1) [5,6) [9,10) in epoch order in L0 and first [7,8) is now being compacted, not only the second [7,8], but other SSTs cannot be selected to compact as well.

and it cannot be simply postponed to next PR, because if we simply enable this situation in code of this PR, it will cause overlapping SSTs generated in L1. Therefore I guess code here perhaps needs to be refactored (for example, key range shall be retained to maintain this situation) to implement points above.

@Little-Wallace
Copy link
Contributor Author

Before we support vertical group, the file compact from shared buffer is very likely to overlap with other files.
After we support vertical group, we will split files with different prefix to different group, such sene is not need.

@soundOfDestiny
Copy link
Contributor

but I worry that we base our LSM on an assumption that 'each L0 SST is in same key range'

@soundOfDestiny
Copy link
Contributor

Actually, if we make the assumption that 'each L0 SST is in same key range', code can be much more cleaner because all we need is just compaction. We do not need to care key ranges of SST any more.
and if not, codes here haven't supported very well.

@Little-Wallace
Copy link
Contributor Author

If they are not in the same key range, we can also compact them in parallel.
The only thing that is not allowed is that there are three sst files: sst1, sst2, sst3. When the sst1 is doing compaction job, we can not start a compact job include sst3 but exclude sst2.

@soundOfDestiny
Copy link
Contributor

If they are not in the same key range, we can also compact them in parallel.
The only thing that is not allowed is that there are three sst files: sst1, sst2, sst3. When the sst1 is doing compaction job, we can not start a compact job include sst3 but exclude sst2.

but if sst2 is from L0 intra compaction, this situation is very likely to happen then

@soundOfDestiny
Copy link
Contributor

soundOfDestiny commented Apr 26, 2022

If current code is not OOP enough, I can refactor them. However, I personally don't want to drop some existing feature without reasonable consideration because I guess those features will not impact code tidy.

@soundOfDestiny
Copy link
Contributor

In addition, if we are used to depending on L0 intra compaction, it will make each L0 SST have a very large key range, which will involve very many L1 SSTs while being compacted to L1. From my perspective, I think we should reduce L0 intra compaction to avoid this situation.

@soundOfDestiny
Copy link
Contributor

Certainly, if there are too many L0 SSTs and L1 is in congestion, we have to do L0 intra compaction. However, if L1 is not in congestion and amount of SSTs in L0 is acceptable, I prefer compacting them to L1 respectively to doing L0 intra compaction first.
Note that compacting them to L1 respectively would not cause extra write amplification because with the condition that SSTs in L1 are non-overlapped, we can compact L0 SSTs with corresponding 1~2 L1 SSTs each.

Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
@soundOfDestiny
Copy link
Contributor

Suppose that there are 3 SSTs in L0 in this order:
[0, 1) in compaction, [0, 10) not in compaction, [6, 7) not in compaction
current code will collect [6, 7) as next compaction, which is incorrect

Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
Copy link
Contributor

@zwang28 zwang28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

license-eye has totally checked 759 files.

Valid Invalid Ignored Fixed
756 1 2 0
Click to see the invalid file list
  • src/meta/src/hummock/compaction/overlap_strategy.rs

Signed-off-by: Little-Wallace <bupt2013211450@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

refactor: make compact status code easy to read
3 participants