Skip to content

Fix TruncateDataPage to handle empty pages and tail maintenance#212

Merged
thweetkomputer merged 1 commit intomainfrom
fix-truncate
Dec 16, 2025
Merged

Fix TruncateDataPage to handle empty pages and tail maintenance#212
thweetkomputer merged 1 commit intomainfrom
fix-truncate

Conversation

@thweetkomputer
Copy link
Collaborator

@thweetkomputer thweetkomputer commented Dec 15, 2025

Here are some reminders before you submit the pull request

  • Add tests for the change
  • Document changes
  • Reference the link of issue using fixes eloqdb/eloqstore#issue_id
  • Reference the link of RFC if exists
  • Pass ctest --test-dir build/tests/

Summary by CodeRabbit

  • Bug Fixes
    • Improved data truncation: detects overflow tails and performs cleanup only when needed.
    • Safer truncation path: empty pages are freed and page links updated to maintain correct tail pointers.
    • More robust iteration: defaults/resetting prevent uninitialized-state during iteration.
    • Tighter error handling during overflow cleanup to avoid partial/incorrect state.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 15, 2025

Walkthrough

Truncation now detects whether keys exist past the truncation point and conditionally performs overflow-value deletions and TTL updates; empty pages may be freed and previous-page links updated. Iterator and region-member initializations were adjusted, changing when certain members are zeroed or left uninitialized until Reset().

Changes

Cohort / File(s) Summary
Truncation tail detection & cleanup
batch_write_task.cpp
Introduces has_trunc_tail while scanning to trunc_pos; gates overflow-value deletion and TTL updates on its presence, aborts on non-OK overflow-delete errors, and adds logic to free an emptied page and relink the previous page (with early returns where applicable).
Iterator init changes
data_page.cpp
Removes explicit initializer assignments for curr_offset_ and curr_restart_idx_ in DataPageIter constructor; these remain uninitialized until Reset() is called. Also resets overflow_ to false in Invalidate().
Region count value-initialized
data_page.h
PageRegionIter::num_regions_ is now value-initialized to zero by default, altering default-constructed state.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Closely review batch_write_task.cpp::TruncateDataPage for control-flow correctness around has_trunc_tail, the do-while overflow/TTL loop, and early-return conditions.
  • Verify correct behavior and safety when freeing an emptied page and updating prev_page_id/tail pointers.
  • Audit error-handling changes for overflow deletion to ensure callers/transactions respond correctly to non-OK returns.
  • Check all DataPageIter usage sites to ensure no reliance on the removed constructor initializations before Reset(); confirm Invalidate()'s overflow_ reset aligns with expected iterator lifecycle.
  • Confirm PageRegionIter::num_regions_ value-initialization doesn't introduce off-by-one or iteration logic regressions.

Poem

🐰 I nibble bytes and hop through rows,
I sniff for tails where the truncation goes,
Empty pages pop and links realign,
Iterators wake only when they're fine.
Hoppity-hop — tidy data, carrot-bright! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description contains only a contributor checklist template with no actual description of the changes, objectives, or rationale for the fix. Add a meaningful description explaining what was fixed, why it was needed, and reference the issue number using 'fixes eloqdb/eloqstore#issue_id' format.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title directly and specifically describes the main change: fixing TruncateDataPage to handle empty pages and tail maintenance, which aligns with the primary modifications in batch_write_task.cpp and related files.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix-truncate

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
batch_write_task.cpp (1)

1687-1694: Variable shadowing of err is valid but reduces clarity.

At line 1687, auto [prev_page, err] creates a new err that shadows the outer one declared at line 1635 (and assigned at line 1665). While this is valid C++ and works correctly here, consider using a distinct name like load_err for improved readability.

-        auto [prev_page, err] = LoadDataPage(prev_page_id);
-        if (err != KvError::NoError)
+        auto [prev_page, load_err] = LoadDataPage(prev_page_id);
+        if (load_err != KvError::NoError)
         {
-            return {false, err};
+            return {false, load_err};
         }
         prev_page.SetNextPageId(MaxPageId);
-        err = WritePage(std::move(prev_page));
-        return {false, err};
+        KvError write_err = WritePage(std::move(prev_page));
+        return {false, write_err};
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5b59088 and c925e7f.

📒 Files selected for processing (1)
  • batch_write_task.cpp (1 hunks)
🔇 Additional comments (2)
batch_write_task.cpp (2)

1644-1658: LGTM!

The introduction of has_trunc_tail correctly separates the "preserve keys" phase from the "cleanup truncated keys" phase. The loop properly breaks when finding the first key >= trunc_pos, leaving the iterator positioned at that key for subsequent cleanup processing.


1659-1674: Proper error handling added for overflow deletion.

The conditional check ensures cleanup only runs when there's actually a tail to truncate (fixing the original bug). The do-while structure correctly processes the current iterator position first, then advances. The early return on DelOverflowValue error prevents continuing with corrupted state.

@thweetkomputer thweetkomputer changed the title fix TruncateDataPage Fix TruncateDataPage to handle empty pages and tail maintenance Dec 16, 2025
@thweetkomputer thweetkomputer merged commit 56b2e83 into main Dec 16, 2025
3 checks passed
@thweetkomputer thweetkomputer deleted the fix-truncate branch December 16, 2025 02:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants