branch-4.1: [fix](cloud) Fix tablets permanently invisible to compaction scheduler due to race condition in CloudTabletMgr::get_tablet (#60832)#61612
Open
bobhan1 wants to merge 1 commit intoapache:branch-4.1from
Conversation
Contributor
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
Contributor
Author
|
run buildall |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
…r due to race condition in `CloudTabletMgr::get_tablet` (apache#60832) Issue Number: close #xxx Related PR: apache#57922 Fix a race condition in `CloudTabletMgr::get_tablet()` introduced by disappear from `_tablet_map`, making them invisible to the compaction scheduler. This leads to tables accumulating hundreds of rowsets without any compaction under high-frequency import. ## Root Cause Commit `0918952c70` refactored `get_tablet()` by moving `_cache->insert()` and `_tablet_map->put()` from inside the `SingleFlight` lambda to outside it. This introduced a race condition: 1. When N concurrent `get_tablet()` calls arrive for the same `tablet_id`, `SingleFlight` executes the load lambda only once (by the "leader"), but all N callers receive a `shared_ptr` pointing to the same `CloudTablet` object. 2. After `SingleFlight::load()` returns, all N callers independently execute: ```cpp _cache->insert(key, value, ...) // each creates a competing LRU cache entry _tablet_map->put(tablet) // each inserts into tablet_map ``` 3. **Each `_cache->insert()` evicts the previous entry for the same key. The evicted `Value`'s destructor calls `_tablet_map.erase(tablet.get())`**. The safety check `it->second.get() == tablet` was designed to prevent erasing a newer tablet object — but here **all callers share the same raw pointer from `SingleFlight`**, so the check always passes, and the erase succeeds. 4. After the last caller's old cache handle is released (when its returned shared_ptr goes out of scope), the destructor erases the entry from _tablet_map. 5. Crucially, subsequent `get_tablet()` calls find the tablet in the LRU cache (cache hit path), which never touches `_tablet_map`. So the tablet is permanently invisible to `_tablet_map` and can never re-enter it. 6. **The compaction scheduler uses `get_weak_tablets()` which iterates `_tablet_map`, so it never sees these tablets and never schedules compaction for them.** ## Before (original correct code, prior to apache#57922): ```cpp auto load_tablet = [this, &key, ...](int64_t tablet_id) { // load from meta service... // Cache insert + tablet_map put INSIDE lambda — only leader executes auto* handle = _cache->insert(key, value.release(), ...); _tablet_map->put(std::move(tablet)); return ret; }; s_singleflight_load_tablet.load(tablet_id, std::move(load_tablet)); ``` ## After apache#57922 (buggy code): ```cpp auto load_tablet = [this, ...](int64_t tablet_id) { // load from meta service... return tablet; // just return raw tablet }; auto result = s_singleflight_load_tablet.load(tablet_id, std::move(load_tablet)); // Cache insert + tablet_map put OUTSIDE lambda — ALL concurrent callers execute _cache->insert(key, value.release(), ...); _tablet_map->put(std::move(tablet)); ``` ## Fix Move `_cache->insert()` and `_tablet_map->put()` back inside the `SingleFlight` lambda, ensuring only the leader caller performs cache insertion and `_tablet_map` registration. This restores the invariant that a single `get_tablet()` cache miss produces exactly one LRU cache entry and one `_tablet_map` entry, eliminating the race condition. None - Test <!-- At least one of them must be included. --> - [ ] Regression test - [ ] Unit Test - [ ] Manual test (add detailed scripts or steps below) - [ ] No need to test or manual test. Explain why: - [ ] This is a refactor/code format and no logic has been changed. - [ ] Previous test can cover this change. - [ ] No code files have been changed. - [ ] Other reason <!-- Add your reason? --> - Behavior changed: - [ ] No. - [ ] Yes. <!-- Explain the behavior change --> - Does this need documentation? - [ ] No. - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 --> - [ ] Confirm the release note - [ ] Confirm test cases - [ ] Confirm document - [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
33d33ce to
57ee598
Compare
Contributor
Author
|
run buildall |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
pick #60832