fix: resolve many tdb bugs #35006

Merged
guanshengliang merged 16 commits into main from fix/tdb-bugs on Apr 1, 2026

localvar commented on Mar 31, 2026

Description

Summary

This PR fixes a wide range of bugs found through AI code review of source/libs/tdb/, including memory leaks, use-after-free, resource leaks on error paths, data corruption risks, and portability issues. No functional behavior changes — all fixes target correctness and robustness of existing code.

Changes by category

Memory & Resource Leaks

  • tdbBtreeDecodeCurrentCell: extract shared helper from tdbBtreeNext/tdbBtreePrev with centralized key/val cleanup, fixing overflow key/val memory leaks
  • tdbPagerFetchPage: release page refcount on init failure, uninitialized page, and pager mismatch error paths
  • tdbPagerFetchFreePage: release page refcount on init failure
  • tdbPagerOpen: free pager and close fd on all failure paths (fd invalid, file-id generation, file size query)
  • tdbPagerBegin: close journal fd and remove journal file on hashset creation failure
  • tdbPagerAbort/Restore: free pageBuf on encrypt failure
  • tdbPCacheAlterImpl: destroy already-created pages on growth failure
  • tdbEncryptPage: free encrypt buffer on CBC_Encrypt failure
  • tdbPageDefragment: free aCellIdx on early return
  • tdbTbcOpen: use tdbOsFree (not taosMemoryFree) for consistency
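
Most of the leak fixes above follow one pattern: every failure path after an acquisition must release what was already acquired. The following is a minimal sketch of that pattern, not the actual TDB code. `DemoPager`, `demoPagerOpen`, and `demoPagerClose` are hypothetical names, and standard C I/O stands in for the TDB OS layer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pager object; not the real TDB struct. */
typedef struct {
  FILE *fp;
  long  fileSize;
  void *pageBuf;
} DemoPager;

/* Returns 0 on success and -1 on failure.
 * On failure, everything acquired so far is released and *ppPager is NULL. */
int demoPagerOpen(const char *path, size_t pageSize, DemoPager **ppPager) {
  DemoPager *pPager = calloc(1, sizeof(*pPager));
  if (pPager == NULL) {
    *ppPager = NULL;
    return -1;
  }

  pPager->fp = fopen(path, "rb");
  if (pPager->fp == NULL) goto _err;            /* fd invalid */

  if (fseek(pPager->fp, 0, SEEK_END) != 0) goto _err;
  pPager->fileSize = ftell(pPager->fp);
  if (pPager->fileSize < 0) goto _err;          /* file size query failed */

  pPager->pageBuf = malloc(pageSize);
  if (pPager->pageBuf == NULL) goto _err;

  *ppPager = pPager;
  return 0;

_err:
  /* Single exit point for all failure paths. */
  free(pPager->pageBuf);                        /* free(NULL) is a no-op */
  if (pPager->fp != NULL) fclose(pPager->fp);
  free(pPager);
  *ppPager = NULL;
  return -1;
}

void demoPagerClose(DemoPager *pPager) {
  if (pPager == NULL) return;
  free(pPager->pageBuf);
  if (pPager->fp != NULL) fclose(pPager->fp);
  free(pPager);
}
```

Routing every failure through one label keeps the release order in a single place, so a newly added acquisition only needs one matching release.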

Data Corruption & Logic Bugs

  • tdbBtreeDecodePayload: refactor overflow decode — fix wrong read offsets (kLen - nLeftKeybytes) and use-after-free where pDecoder->pVal pointed into released page memory; now pre-allocates val buffer and copies data
  • tdbBtreeDecodePayload (encode path): fix 4 bugs in overflow value decode loop — variable shadowing, wrong destination offset, wrong pgno source offset, double decrement
  • tdbFreeOvflPage: fix variable shadowing that masked error propagation
  • tdbDb.c: use correct pointer (pHashNext for hash chain, pNext for list chain) when removing pager
  • tdbPagerAbort/Restore: validate journal pgno against dbOrigSize to guard against corruption
  • tdbPageDropCell: fix use-after-free in log — log pointer before freeing
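
The variable-shadowing item above (tdbFreeOvflPage) describes a bug class that is easy to reproduce in isolation. This is a sketch with hypothetical function names, not the TDB source: an inner declaration of `ret` hides the outer one, so the caller always sees success.

```c
/* Hypothetical step that always fails, standing in for a page operation. */
static int demoStepThatFails(void) { return -1; }

/* Buggy shape: the inner `int ret` shadows the outer one, so the error
 * is dropped when the inner scope ends. */
int demoFreeChainBuggy(void) {
  int ret = 0;
  for (int i = 0; i < 1; i++) {
    int ret = demoStepThatFails(); /* shadows the outer ret */
    if (ret) break;
  }
  return ret; /* still the outer ret, which was never assigned */
}

/* Fixed shape: assign to the outer variable so the error propagates. */
int demoFreeChainFixed(void) {
  int ret = 0;
  for (int i = 0; i < 1; i++) {
    ret = demoStepThatFails();
    if (ret) break;
  }
  return ret;
}
```

Compilers can flag this pattern when shadow warnings are enabled (for example `-Wshadow` in GCC and Clang).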

Hashset (open-addressing)

  • hashset remove: fix probe chain breakage by rehashing displaced entries (Robin Hood / backward-shift deletion)
  • hashset add: restore hashset state on allocation failure, add probe count guard
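
To illustrate why removal from an open-addressing table can break probe chains, here is a minimal sketch of backward-shift deletion with linear probing. It assumes a fixed power-of-two capacity, key 0 reserved as the empty marker, and an identity hash; all names are hypothetical, and the layout of the real TDB hashset may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CAP   8u /* hypothetical fixed capacity, must be a power of two */
#define DEMO_MASK  (DEMO_CAP - 1u)
#define DEMO_EMPTY 0u /* key 0 is reserved as the empty-slot marker */

typedef struct {
  uint32_t slots[DEMO_CAP];
  size_t   count;
} DemoSet;

static size_t demoHome(uint32_t key) { return key & DEMO_MASK; }

bool demoSetContains(const DemoSet *s, uint32_t key) {
  size_t i = demoHome(key);
  for (size_t n = 0; n < DEMO_CAP; n++) { /* probe count guard */
    if (s->slots[i] == DEMO_EMPTY) return false;
    if (s->slots[i] == key) return true;
    i = (i + 1) & DEMO_MASK;
  }
  return false;
}

bool demoSetAdd(DemoSet *s, uint32_t key) {
  if (key == DEMO_EMPTY) return false;
  size_t i = demoHome(key);
  for (size_t n = 0; n < DEMO_CAP; n++) { /* probe count guard */
    if (s->slots[i] == key) return true;
    if (s->slots[i] == DEMO_EMPTY) {
      /* Always keep one slot empty so every probe sequence terminates. */
      if (s->count + 1 >= DEMO_CAP) return false;
      s->slots[i] = key;
      s->count++;
      return true;
    }
    i = (i + 1) & DEMO_MASK;
  }
  return false;
}

bool demoSetRemove(DemoSet *s, uint32_t key) {
  size_t hole = demoHome(key);
  size_t n;
  for (n = 0; n < DEMO_CAP; n++) {
    if (s->slots[hole] == DEMO_EMPTY) return false;
    if (s->slots[hole] == key) break;
    hole = (hole + 1) & DEMO_MASK;
  }
  if (n == DEMO_CAP) return false;

  /* Simply clearing the slot would cut off later entries that probed past it.
   * Instead, shift back each following entry whose home slot is at or
   * before the hole (in cyclic order). */
  size_t j = (hole + 1) & DEMO_MASK;
  while (s->slots[j] != DEMO_EMPTY) {
    size_t distHole = (j - hole) & DEMO_MASK;                  /* hole -> j */
    size_t distHome = (j - demoHome(s->slots[j])) & DEMO_MASK; /* home -> j */
    if (distHome >= distHole) {
      s->slots[hole] = s->slots[j];
      hole = j;
    }
    j = (j + 1) & DEMO_MASK;
  }
  s->slots[hole] = DEMO_EMPTY;
  s->count--;
  return true;
}
```

With keys 1, 9, and 17 all hashing to slot 1, removing 1 leaves 9 and 17 findable because both are shifted back toward their home slot.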

Error Propagation

  • tdbPagerRemoveFreePage: propagate error from tdbTbPopFreePage
  • tdbPagerInitPage: release page lock on CBC_Decrypt failure
  • tdbPageCreate: return error on page lock init failure

Portability & Correctness

  • TDB_FLAG_NO macro: add parentheses to fix operator precedence (~(flags))
  • TDB_GET_U24: use byte-level reads instead of potentially unaligned 32-bit access
  • TDB_O_RDWR macro: add missing parentheses
  • tdbRealloc/tdbFree: change size arithmetic from int to size_t to avoid signed overflow
  • tdbPageCopy: use %td format specifier for ptrdiff_t
  • tdbUtil.h: move #include "tdbInt.h" inside header guard
  • tdbBtcOpen: fix variable shadowing
  • calcLocalSize: extract duplicated local-size calculation into shared helper
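
Two of the portability items above can be shown in a few lines. This is a sketch with hypothetical macro and function names; the real macro bodies are not shown on this page, and the little-endian byte order used for the 24-bit read is an assumption made only for illustration.

```c
#include <stdint.h>

/* In C, == binds tighter than &, so an unparenthesized
 * `flags & flag == 0` parses as the expression below. */
#define DEMO_FLAG_NO_BUGGY(flags, flag) ((flags) & ((flag) == 0))

/* Intended meaning: true when none of the bits in `flag` are set. */
#define DEMO_FLAG_NO_FIXED(flags, flag) ((((flags) & (flag)) == 0))

/* Read a 24-bit value one byte at a time (little-endian assumed here).
 * Unlike a cast to uint32_t*, this never performs an unaligned 32-bit
 * load and never touches a fourth byte past the value. */
static inline uint32_t demoGetU24(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}
```

For `flags = 0x2` and `flag = 0x1`, the buggy form yields 0 (bit reported as set) while the fixed form yields 1 (bit correctly reported as clear).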

Cleanup

  • Remove dead variables, stale commented-out debug code, and unused #if 0 blocks

Issue(s)

  • Close/close/Fix/fix/Resolve/resolve: Issue Link

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?

localvar requested review from a team, dapan1121 and guanshengliang as code owners March 31, 2026 02:55
Copilot AI review requested due to automatic review settings March 31, 2026 02:55
@gemini-code-assist

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

Copilot AI left a comment

Pull request overview

This PR targets correctness and robustness of the embedded TDB (B-tree) library, focusing on overflow-page handling, resource cleanup on failure paths, and portability fixes (alignment/endian/operator precedence).

Changes:

  • Fix portability and correctness issues in low-level macros and allocators (e.g., TDB_GET_U24, TDB_FLAG_NO, tdbRealloc header sizing).
  • Strengthen pager/pcache reliability by cleaning up resources on error paths (journal/file open failures, encryption failures, cache growth failures, fetch/init failures).
  • Refactor and harden B-tree overflow encode/decode logic (including new helper extraction and additional bounds/assert checks), plus propagate callback errors in table traversal.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| test/ci/count_assert.py | Updates CI allowlist for expected assert/ASSERT_CORE occurrences due to new/changed assertions. |
| source/libs/tdb/src/inc/tdbUtil.h | Moves include under header guard for safer include semantics. |
| source/libs/tdb/src/inc/tdbOs.h | Fixes macro operator precedence for TDB_O_RDWR. |
| source/libs/tdb/src/inc/tdbInt.h | Makes U24 reads unaligned-safe and fixes TDB_FLAG_NO precedence bug. |
| source/libs/tdb/src/db/tdbUtil.c | Uses size_t metadata for tdbRealloc/tdbFree for better portability. |
| source/libs/tdb/src/db/tdbTable.c | Removes dead variables, fixes allocator mismatch, and returns traversal callback errors. |
| source/libs/tdb/src/db/tdbPCache.c | Cleans up already-created pages on cache growth failure. |
| source/libs/tdb/src/db/tdbPager.c | Fixes multiple failure-path leaks, adds journal pgno validation, improves page release/unlock behavior, and reworks internal hashset. |
| source/libs/tdb/src/db/tdbPage.c | Adds overflow bounds checks, improves error handling paths, and fixes a leak in defragment failure. |
| source/libs/tdb/src/db/tdbDb.c | Fixes pager removal from hash chain by using pHashNext. |
| source/libs/tdb/src/db/tdbBtree.c | Refactors overflow payload decode/encode logic and adds/updates pgno validity assertions. |
Comments suppressed due to low confidence (1)

source/libs/tdb/src/db/tdbBtree.c:1456

  • If tdbRealloc(NULL, pBt->pageSize) fails, this branch releases ofp but returns ret (which is still 0 here). This will silently report success even though encoding cannot proceed. Return an appropriate error code (e.g., terrno / OOM) instead of ret in this failure path.
  // local buffer for cell
  SCell *pBuf = tdbRealloc(NULL, pBt->pageSize);
  if (pBuf == NULL) {
    tdbPCacheRelease(pBt->pPager->pCache, ofp, pTxn);
    return ret;
  }
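
One possible shape of the fix suggested in this comment, as a sketch only: the allocation-failure branch returns an explicit error instead of the still-zero `ret`. The function name, the `simulateOom` switch, and the use of `ENOMEM` are assumptions for illustration; the real code would return whatever error code TDB uses (the comment mentions terrno).

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical encode step. `simulateOom` forces the failure branch so the
 * behavior can be exercised without exhausting memory. */
int demoEncodeCell(size_t pageSize, int simulateOom) {
  int ret = 0;

  /* local buffer for cell */
  void *pBuf = simulateOom ? NULL : malloc(pageSize);
  if (pBuf == NULL) {
    /* Release any page references held at this point, then report the
     * failure explicitly. Returning `ret` here would report success,
     * because ret is still 0. */
    return ENOMEM;
  }

  memset(pBuf, 0, pageSize); /* stand-in for the real encoding work */

  free(pBuf);
  return ret;
}
```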

Comment thread source/libs/tdb/src/db/tdbBtree.c Outdated
Comment thread source/libs/tdb/src/db/tdbPager.c
Comment thread source/libs/tdb/src/db/tdbPage.c
localvar changed the title from "fix: resolve multiple TDB library bugs including resource leaks, overflow handling, and portability issues" to "fix: resolve many tdb bugs" on Mar 31, 2026
localvar marked this pull request as draft March 31, 2026 08:17
localvar added 16 commits April 1, 2026 08:52
== has higher precedence than &, so the macro evaluated as
(flags) & ((flag) == 0) instead of ((flags) & (flag)) == 0.
…in tdbRealloc, header guard order, variable shadowing
…t buffer on CBC_Encrypt failure in tdbEncryptPage
- remove dead pKey variable and tdbFree(pKey) call in tdbTbOpen
- use tdbOsFree instead of taosMemoryFree in tdbTbcOpen error path
- release page lock on CBC_Decrypt failure in tdbPagerInitPage
- propagate error from tdbTbPopFreePage in tdbPagerRemoveFreePage
- add overflow array bounds check in tdbPageInsertCell
- use portable %td format for ptrdiff_t in tdbPageCopy
- return error on page lock init failure in tdbPageCreate
- log pointer before freeing in tdbPageDropCell
- add missing parentheses in TDB_O_RDWR macro
- free aCellIdx on early return in tdbPageDefragment
- remove stale commented-out debug logging
- tdbPagerOpen: free pager and close fd on all failure paths (fd
  invalid, file-id generation, getFileSize)
- tdbPagerBegin: close jfd and remove journal file on hashset
  creation failure
- tdbPagerAbort/tdbPagerRestore: validate pgno from journal against
  dbOrigSize to guard against corruption
- tdbPagerFetchPage: release page refcount on init failure,
  uninitialized page, and pager mismatch error paths
- tdbPagerFetchFreePage: release page refcount on init failure
- Remove stale commented-out debug code
localvar marked this pull request as ready for review April 1, 2026 03:28
Copilot AI review requested due to automatic review settings April 1, 2026 03:28

Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Comment thread source/libs/tdb/src/db/tdbPager.c
Comment thread source/libs/tdb/src/db/tdbPage.c
Comment thread source/libs/tdb/src/db/tdbBtree.c
@guanshengliang

@localvar Are asserts enabled in the cloud environment?

guanshengliang merged commit 2f3405e into main on Apr 1, 2026
20 of 21 checks passed
localvar deleted the fix/tdb-bugs branch April 1, 2026 09:17
jiajingbin pushed a commit that referenced this pull request Apr 8, 2026
wangmm0220 pushed a commit that referenced this pull request Apr 14, 2026