Insert YieldToNextRound in compaction to avoid affecting online tasks #293
Insert YieldToNextRound in compaction to avoid affecting online tasks #293thweetkomputer merged 1 commit intomainfrom
Conversation
WalkthroughReplaces std::unordered_set with absl::flat_hash_set across file-GC APIs, adds cooperative yielding to GC and background compaction (including a heap-based yielding sort), and makes MappingSnapshot inherit enable_shared_from_this for shared_ptr use. Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
f35867d to
705fd21
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@include/file_gc.h`:
- Line 3: Remove the now-unused include <unordered_set> from include/file_gc.h:
locate the top-of-file include list (where `#include` <unordered_set> appears) and
delete that line since the code now uses absl::flat_hash_set<FileId> (and
references to FileId) instead of std::unordered_set, ensuring no other parts of
include/file_gc.h depend on <unordered_set>.
In `@include/tasks/task.h`:
- Around line 3-4: Remove the unused include by deleting the line '#include
<butil/time.h>' from the header; the Task-related declarations (e.g., class Task
and the Yield() and YieldToNextRound() prototypes) do not depend on butil/time.h
since their implementations in src/tasks/task.cpp only use coroutine resumption,
so simply remove that include from include/tasks/task.h to avoid an unnecessary
dependency.
In `@src/storage/shard.cpp`:
- Line 3: Remove the unused include directive "#include <butil/time.h>" from the
top of the file; the file uses std::chrono for time operations (see usages
around functions handling time near lines with std::chrono) so delete the
include to minimize headers and rebuild to ensure no butil symbols are
referenced (check functions like any time-related helpers in shard.cpp).
🧹 Nitpick comments (3)
src/tasks/background_write.cpp (2)
161-169: Consider reducing consecutive yields.Lines 161, 164, and 169 have three
YieldToNextRound()calls in quick succession with minimal work between them (sorting already yielded, andreserveis typically fast). Consider consolidating to a single yield afterHeapSortFpIdsWithYieldcompletes.♻️ Suggested consolidation
} - YieldToNextRound(); assert(fp_ids.size() == mapping_cnt); HeapSortFpIdsWithYield(fp_ids); YieldToNextRound(); constexpr uint8_t max_move_batch = max_read_pages_batch; std::vector<Page> move_batch_buf; move_batch_buf.reserve(max_move_batch); - YieldToNextRound(); std::vector<FilePageId> move_batch_fp_ids;
182-194: Redundant reset ofround_cntafter yield.When
(round_cnt & 0xFFFF) == 0is true (including whenround_cnt == 0), you yield and then resetround_cnt = 0. Since the condition already guaranteesround_cntis a multiple of 65536 (or 0), resetting to 0 loses the count history. This means:
- On the first iteration with
round_cnt = 0, you yield immediately before any work- For files with no pages in the inner loop, you'll yield on every outer iteration
If the intent is to yield periodically based on cumulative work, simply remove the reset. If the intent is to always yield after processing 65536 items, the current logic works but the initial yield (when
round_cnt == 0) may be unintentional.♻️ Option A: Remove reset to track cumulative work
size_t round_cnt = 0; for (FileId file_id = begin_file_id; file_id < end_file_id; file_id++) { - if ((round_cnt & 0xFFFF) == 0) + if (round_cnt >= 0x10000) { YieldToNextRound(); round_cnt = 0; }♻️ Option B: Initialize to 1 to avoid immediate yield
- size_t round_cnt = 0; + size_t round_cnt = 1; for (FileId file_id = begin_file_id; file_id < end_file_id; file_id++) { if ((round_cnt & 0xFFFF) == 0) { YieldToNextRound(); - round_cnt = 0; + round_cnt = 1; }src/file_gc.cpp (1)
24-41: GuardThdTask()before yielding.
ThdTask()is dereferenced unconditionally; ifGetRetainedFilesever runs outside a task context (e.g., background/offline GC or shutdown), this becomes a hard crash. Please confirm the invariant or add a nullptr guard.♻️ Defensive guard
- if ((page_id & 0xFF) == 0) - { - ThdTask()->YieldToNextRound(); - } + if ((page_id & 0xFF) == 0) + { + if (auto *task = ThdTask()) + { + task->YieldToNextRound(); + } + }
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
include/file_gc.hinclude/tasks/background_write.hinclude/tasks/task.hsrc/file_gc.cppsrc/storage/shard.cppsrc/tasks/background_write.cppsrc/tasks/write_task.cpp
🧰 Additional context used
🧬 Code graph analysis (5)
include/file_gc.h (1)
src/file_gc.cpp (4)
GetRetainedFiles(24-43)GetRetainedFiles(24-26)ExecuteCloudGC(529-572)ExecuteCloudGC(529-531)
src/file_gc.cpp (1)
src/tasks/task.cpp (2)
ThdTask(350-353)ThdTask(350-350)
include/tasks/background_write.h (1)
src/tasks/background_write.cpp (2)
HeapSortFpIdsWithYield(61-93)HeapSortFpIdsWithYield(61-62)
src/tasks/background_write.cpp (1)
src/tasks/task.cpp (2)
YieldToNextRound(29-33)YieldToNextRound(29-29)
src/tasks/write_task.cpp (1)
src/file_gc.cpp (2)
GetRetainedFiles(24-43)GetRetainedFiles(24-26)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (13)
include/tasks/background_write.h (2)
3-5: LGTM!The added includes for
<utility>and<vector>are necessary for the newHeapSortFpIdsWithYieldmethod signature.
25-27: LGTM!The private helper method declaration is well-placed and appropriately scoped. The implementation (visible in
src/tasks/background_write.cpp) correctly uses incremental heap operations with periodic yields to enable cooperative multitasking during sorting.src/tasks/background_write.cpp (3)
48-59: LGTM!The comparator correctly orders by
FilePageIdfirst, then byPageIdfor ties. Placing it in an anonymous namespace appropriately limits its linkage.
61-93: LGTM - well-designed cooperative sorting.The heap sort implementation correctly:
- Early-returns for trivial cases
- Incrementally builds the heap with
push_heap- Extracts in sorted order with
pop_heap- Yields every 256 operations to give online tasks a chance to run
95-284: Good addition of cooperative yielding in compaction.The overall approach of inserting
YieldToNextRound()at key points in the compaction flow (page collection, sorting, main loop) effectively breaks up the long-running operation to avoid starving online tasks. The log messages at start and finish provide useful observability.include/file_gc.h (5)
6-6: LGTM!
absl::flat_hash_setis a good choice overstd::unordered_setfor this use case - it provides better cache locality and typically faster iteration/lookup performance.
17-19: LGTM!The signature change to use
absl::flat_hash_set<FileId>is consistent with the implementation insrc/file_gc.cppwhich also adds cooperative yielding during the file enumeration.
28-35: LGTM!The GC execution functions consistently use the new
absl::flat_hash_set<FileId>type for retained files.
42-47: LGTM!Consistent type update for
DeleteUnreferencedLocalFiles.
74-79: LGTM!Consistent type update for
DeleteUnreferencedCloudFiles.src/file_gc.cpp (1)
11-11: Flat_hash_set migration looks consistent.
Nice end‑to‑end switch toabsl::flat_hash_setacross GC APIs. Please confirm Abseil’s container target is linked for all binaries compiling this file to avoid missing‑symbol or include issues.Also applies to: 49-50, 374-375, 452-453, 530-531
src/tasks/write_task.cpp (2)
10-12: Includes align with new usage.
The added headers match the newsnapshot_arrayandflat_hash_setusage; please ensure the Abseil container dependency is linked for this target as well.
393-413: Verify mapping snapshot lifetime across yields.
snapshot_arraystores rawMappingSnapshot*and the loop yields after each snapshot. If another task can retire snapshots during a yield, these pointers may become invalid. Please confirm snapshots are pinned/ref‑counted for the duration or hold strong refs before yielding.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
e80a5c8 to
548ca0c
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/file_gc.cpp (1)
24-41: GuardYieldToNextRoundagainst missing task context.
GetRetainedFilesis a public API (include/file_gc.h:16) with no documented task context requirement, yet it unconditionally callsThdTask()->YieldToNextRound()at line 40. SinceThdTask()(src/tasks/task.cpp:350-353) returnsshard->running_with no null check, andshard->running_is explicitly set tonullptrwhen no task is active (src/storage/shard.cpp:390), the function will crash if invoked outside task context. Add a defensive guard:- if ((page_id & 0xFF) == 0) - { - ThdTask()->YieldToNextRound(); - } + if ((page_id & 0xFF) == 0) + { + if (auto *task = ThdTask()) + { + task->YieldToNextRound(); + } + }Alternatively, document that
GetRetainedFilesrequires execution under aKvTaskcontext.
🤖 Fix all issues with AI agents
In `@src/tasks/write_task.cpp`:
- Around line 393-413: meta->mapping_snapshots_ may contain raw MappingSnapshot*
so calling mapping->shared_from_this() can throw std::bad_weak_ptr; inspect how
MappingSnapshot instances are created and owned and ensure they derive from
std::enable_shared_from_this and are always created/held in std::shared_ptr for
the duration of GC, or stop calling shared_from_this and change
meta->mapping_snapshots_ to hold std::shared_ptr<MappingSnapshot> directly.
Concretely, update the producer code that populates mapping_snapshots_ to use
std::make_shared<MappingSnapshot> (or otherwise transfer ownership into a
std::shared_ptr), change the mapping_snapshots_ container type to
std::vector<std::shared_ptr<MappingSnapshot>> (or similar) so the loop that
builds snapshot_array can copy those shared_ptrs instead of calling
MappingSnapshot::shared_from_this(), and ensure MappingSnapshot inherits
enable_shared_from_this if shared_from_this is still needed.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
include/file_gc.hinclude/storage/page_mapper.hinclude/tasks/background_write.hsrc/file_gc.cppsrc/tasks/background_write.cppsrc/tasks/write_task.cpp
🚧 Files skipped from review as they are similar to previous changes (4)
- src/tasks/background_write.cpp
- include/storage/page_mapper.h
- include/file_gc.h
- include/tasks/background_write.h
🧰 Additional context used
🧬 Code graph analysis (2)
src/tasks/write_task.cpp (1)
src/file_gc.cpp (2)
GetRetainedFiles(24-43)GetRetainedFiles(24-26)
src/file_gc.cpp (1)
src/tasks/task.cpp (2)
ThdTask(350-353)ThdTask(350-350)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (3)
src/tasks/write_task.cpp (1)
10-12: LGTM on the new includes.src/file_gc.cpp (2)
11-11: LGTM.
48-50: LGTM on the signature propagation.Also applies to: 371-375, 449-453, 529-531
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Here are some reminders before you submit the pull request
fixes eloqdb/eloqstore#issue_idctest --test-dir build/tests/Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.