
fix(stmt2): correct DECIMAL in KV+blob row build and align bind path with parsed columns #35010

Merged
guanshengliang merged 4 commits into main from fix/main/6920757476
Apr 7, 2026

Conversation

@Pengrongkun
Contributor

Description

- tRowBuildKVRowWithBlob / tRowBuildKVRowWithBlob2: copy fixed columns via
  VALUE_GET_DATUM() so DECIMAL uses pData instead of the trivial val field.
- tRowBuildFromBind2WithBlob: mirror tRowBuildFromBind2 — accept parsedCols,
  correct bufArray indexing with numOfFixedValue, TAOS_CHECK_GOTO/lino, and
  free decimal128 heap after each successful row (and on error) to plug leaks.
- parInsertStmt: pass parsedCols into tRowBuildFromBind2WithBlob.
- Add stmt2Case.stmt2_decimal_blob_interleaved in stmt2Test

Issue(s)

  • Close/close/Fix/fix/Resolve/resolve: Issue Link

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?

Pengrongkun and others added 3 commits March 30, 2026 16:32
Root cause: tRowBuildFromBind2WithBlob lacked the DECIMAL/DECIMAL64
string-to-binary conversion that exists in tRowBuildFromBind2. When a
table contains both DECIMAL and BLOB columns, the blob code path is
taken (tRowBuildFromBind2WithBlob), which treated DECIMAL as a raw
fixed-size binary type and read 16 bytes directly from the user buffer.
Since the user provides decimal values as text strings (e.g. "21.4300"),
the 15-byte buffer was too small, causing a stack-buffer-overflow.

Fix: Add pSchemaExt parameter to tRowBuildFromBind2WithBlob and add
DECIMAL/DECIMAL64 string-to-binary conversion (decimal128FromStr /
decimal64FromStr) in the fixed-size else branch, mirroring the logic
in tRowBuildFromBind2. Update the call site in parInsertStmt.c to pass
pSchemaExt.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…with parsed columns

- tRowBuildKVRowWithBlob / tRowBuildKVRowWithBlob2: copy fixed columns via
  VALUE_GET_DATUM() so DECIMAL uses pData instead of the trivial val field.
- tRowBuildFromBind2WithBlob: mirror tRowBuildFromBind2 — accept parsedCols,
  correct bufArray indexing with numOfFixedValue, TAOS_CHECK_GOTO/lino, and
  free decimal128 heap after each successful row (and on error) to plug leaks.
- parInsertStmt: pass parsedCols into tRowBuildFromBind2WithBlob.
- Add stmt2Case.stmt2_decimal_blob_interleaved in stmt2Test
Copilot AI review requested due to automatic review settings March 31, 2026 06:57
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

Contributor

Copilot AI left a comment


Pull request overview

This PR fixes stmt2 row building for DECIMAL columns in the KV+blob path and aligns the stmt2 bind-with-blob build path with the non-blob bind path by incorporating parsedCols and schema-ext (typemod) metadata.

Changes:

  • Fix fixed-column copying in tRowBuildKVRowWithBlob* to use VALUE_GET_DATUM() so DECIMAL copies from pData (not the trivial val field).
  • Update tRowBuildFromBind2WithBlob API and implementation to accept parsedCols + SSchemaExt, adjust buffer indexing, and attempt to free DECIMAL heap allocations per row.
  • Add a stmt2 regression test for DECIMAL + BLOB interleaved batch bind buffers.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

Changed files:

  • source/libs/parser/src/parInsertStmt.c: passes parsedCols and pSchemaExt into the blob row-build path.
  • source/common/src/tdataformat.c: fixes DECIMAL datum copying in the KV+blob row build; expands the bind-with-blob row builder to handle parsedCols/decimal parsing and cleanup.
  • include/common/tdataformat.h: updates the tRowBuildFromBind2WithBlob function signature to match the new parameters.
  • source/client/test/stmt2Test.cpp: adds a regression test covering stmt2 DECIMAL+BLOB interleaved binding.


Comment thread source/common/src/tdataformat.c Outdated
int32_t tRowBuildFromBind2WithBlob(SBindInfo2 *infos, int32_t numOfInfos, SSHashObj *parsedCols, bool infoSorted,
const STSchema *pTSchema, const SSchemaExt *pSchemaExt, SArray *rowArray,
bool *pOrdered, bool *pDupTs, SBlobSet *pBlobSet) {
if (infos == NULL || numOfInfos <= 0 || numOfInfos > pTSchema->numOfCols || pTSchema == NULL || rowArray == NULL) {

Copilot AI Mar 31, 2026


The parameter validation dereferences pTSchema (pTSchema->numOfCols) before checking pTSchema == NULL, which can crash on NULL schema. Reorder the checks so pTSchema is validated before any dereference.

Suggested change
if (infos == NULL || numOfInfos <= 0 || numOfInfos > pTSchema->numOfCols || pTSchema == NULL || rowArray == NULL) {
if (pTSchema == NULL || rowArray == NULL || infos == NULL || numOfInfos <= 0 ||
numOfInfos > pTSchema->numOfCols) {

Comment thread source/common/src/tdataformat.c
Comment on lines +5503 to +5511
// fix decimal memory leak
if (hasDecimal128) {
int32_t num = taosArrayGetSize(colValArray);
for (int32_t i = 0; i < num; ++i) {
SColVal *pCol = taosArrayGet(colValArray, i);
if (pCol->value.type == TSDB_DATA_TYPE_DECIMAL) {
taosMemoryFreeClear(pCol->value.pData);
}
}

Copilot AI Mar 31, 2026


The per-row DECIMAL cleanup frees pCol->value.pData for all TSDB_DATA_TYPE_DECIMAL values when hasDecimal128 is set. This can incorrectly free DECIMAL values coming from parsedCols (shallow-copied into colValArray) and then reused on subsequent rows, causing UAF/double-free. Track and free only the allocations produced by decimal128ToDataVal (e.g., keep a list of allocated pointers per row) instead of scanning/freeing every DECIMAL in colValArray.

Suggested change
// fix decimal memory leak
if (hasDecimal128) {
int32_t num = taosArrayGetSize(colValArray);
for (int32_t i = 0; i < num; ++i) {
SColVal *pCol = taosArrayGet(colValArray, i);
if (pCol->value.type == TSDB_DATA_TYPE_DECIMAL) {
taosMemoryFreeClear(pCol->value.pData);
}
}
/*
* NOTE:
* Previously, we attempted to fix a DECIMAL memory leak by freeing
* pCol->value.pData for all TSDB_DATA_TYPE_DECIMAL values in colValArray
* whenever hasDecimal128 was set. However, some DECIMAL values in
* colValArray may be shallow-copied from parsedCols and reused across
* rows, so freeing them here can cause use-after-free or double-free.
*
* To avoid these memory safety issues, we do not free DECIMAL buffers
* here; DECIMAL memory must instead be freed by the code that owns the
* allocation (e.g., when converting from decimal128). We only reset
* hasDecimal128 to maintain per-row state.
*/
if (hasDecimal128) {

Comment thread source/common/src/tdataformat.c Outdated
Comment on lines 5423 to 5427
if (value.nData > infos[iInfo].bytes - VARSTR_HEADER_SIZE) {
code = TSDB_CODE_PAR_VALUE_TOO_LONG;
uError("stmt bind param[%d] length:%d greater than type maximum lenght: %d", iInfo, value.nData,
pTSchema->columns[infos[iInfo].columnId - 1].bytes);
uError("stmt2 bind col:%d, row:%d length:%d greater than type maximum lenght: %d", iInfo, iRow,
value.nData + (uint32_t)(BLOBSTR_HEADER_SIZE), infos[iInfo].bytes);
goto _exit;

Copilot AI Mar 31, 2026


The value-too-long error path for non-BLOB var types reports value.nData + BLOBSTR_HEADER_SIZE even though the limit check uses VARSTR_HEADER_SIZE. Also the message contains a typo ("lenght"). Update the reported header/limit to match the type (VARSTR vs BLOB) and fix the typo to avoid misleading diagnostics.

Comment on lines +5433 to +5456
if (infos[iInfo].type == TSDB_DATA_TYPE_DECIMAL) {
if (!pSchemaExt) {
uError("stmt2 decimal64 type without ext schema info, cannot parse decimal values");
code = TSDB_CODE_DECIMAL_PARSE_ERROR;
goto _exit;
}
uint8_t precision = 0, scale = 0;
decimalFromTypeMod(pSchemaExt[iInfo].typeMod, &precision, &scale);
Decimal128 dec = {0};
uint8_t **data = &((uint8_t **)TARRAY_DATA(bufArray))[iInfo - numOfFixedValue];
int32_t length = infos[iInfo].bind->length[iRow];
code = decimal128FromStr(*(char **)data, length, precision, scale, &dec);
*data += length;
hasDecimal128 = true;
TAOS_CHECK_GOTO(code, &lino, _exit);

code = decimal128ToDataVal(&dec, &value);
TAOS_CHECK_GOTO(code, &lino, _exit);

} else if (infos[iInfo].type == TSDB_DATA_TYPE_DECIMAL64) {
if (!pSchemaExt) {
uError("stmt2 decimal128 type without ext schema info, cannot parse decimal values");
code = TSDB_CODE_DECIMAL_PARSE_ERROR;
goto _exit;

Copilot AI Mar 31, 2026


These DECIMAL schema-missing errors appear to have the DECIMAL/DECIMAL64 wording swapped (DECIMAL branch says "decimal64" and DECIMAL64 branch says "decimal128"). This makes troubleshooting harder; align the message with the actual type being handled.

@JinqingKuang
Contributor

Code review: fix(stmt2): correct DECIMAL in KV+blob row build and align bind path with parsed columns

Reviewed 4 files (+207/-29 lines). Found 3 correctness bugs (2 Critical, 1 High) and 2 diagnostic issues.


🔴 Critical: pTSchema is dereferenced before its NULL check (tdataformat.c:5332)

if (infos == NULL || numOfInfos <= 0 || numOfInfos > pTSchema->numOfCols || pTSchema == NULL ...

pTSchema->numOfCols is evaluated before pTSchema == NULL. If pTSchema is NULL, this line crashes outright. Move the pTSchema == NULL check ahead of any dereference.


🔴 Critical: numOfFixedValue is never reset inside the row loop, so bufArray indexing goes out of bounds (tdataformat.c:5383)

numOfFixedValue is initialized outside the row loop and incremented for every parsedCols column in every row, never returning to zero. With N parsedCols columns, it is N after row 0 and 2N after row 1; from then on iInfo - numOfFixedValue goes negative, producing out-of-bounds writes into bufArray. Any multi-row insert that includes parsedCols columns corrupts memory.

Fix: add numOfFixedValue = 0; at the top of the iRow loop (after taosArrayClear(colValArray)).


🟠 High: the DECIMAL cleanup frees pData owned by parsedCols, causing use-after-free (tdataformat.c:5511)

colValArray holds both heap-allocated entries from the bind path and shallow copies from parsedCols. The cleanup block calls taosMemoryFreeClear(pCol->value.pData) on every entry of type TSDB_DATA_TYPE_DECIMAL, including those owned by parsedCols. When the next row reads the same parsedCols entry again, its pData is already dangling. The _exit error path has the same problem.

Fix: distinguish bind-path entries from parsedCols entries, and free only the DECIMAL heap memory allocated by the bind path.


🟡 Medium: the DECIMAL / DECIMAL64 error logs have their type names swapped (tdataformat.c:5456)

The TSDB_DATA_TYPE_DECIMAL branch logs "decimal64 type" and the TSDB_DATA_TYPE_DECIMAL64 branch logs "decimal128 type". Both are reversed, which is highly misleading when troubleshooting.


🟡 Low: the VARSTR over-length log uses the wrong header constant and has a typo (tdataformat.c:5427)

The limit check uses VARSTR_HEADER_SIZE, but the error log adds BLOBSTR_HEADER_SIZE, so the printed value does not match the actual check. Both log messages also misspell "length" as "lenght".


Claude Code review

Comment thread source/common/src/tdataformat.c Outdated
int32_t tRowBuildFromBind2WithBlob(SBindInfo2 *infos, int32_t numOfInfos, SSHashObj *parsedCols, bool infoSorted,
const STSchema *pTSchema, const SSchemaExt *pSchemaExt, SArray *rowArray,
bool *pOrdered, bool *pDupTs, SBlobSet *pBlobSet) {
if (infos == NULL || numOfInfos <= 0 || numOfInfos > pTSchema->numOfCols || pTSchema == NULL || rowArray == NULL) {
Contributor


[Critical] NULL check order is wrong: pTSchema is dereferenced before it is checked

pTSchema->numOfCols is evaluated before pTSchema == NULL. If the caller passes NULL, the process crashes immediately. Corrected order:

if (infos == NULL || numOfInfos <= 0 || pTSchema == NULL || numOfInfos > pTSchema->numOfCols || rowArray == NULL) {

Comment thread source/common/src/tdataformat.c
if (pCol->value.type == TSDB_DATA_TYPE_DECIMAL) {
taosMemoryFreeClear(pCol->value.pData);
}
}
Contributor


[High] The DECIMAL cleanup frees pData owned by parsedCols, causing use-after-free

colValArray contains two kinds of entries at once:

  1. Bind path: heap-allocated by decimal128ToDataVal, and must be freed
  2. parsedCols shallow copies: pData is owned by the caller and must not be freed

The cleanup loop here calls taosMemoryFreeClear(pCol->value.pData) on every TSDB_DATA_TYPE_DECIMAL entry without distinction, which also frees pointers owned by parsedCols. When the next row reads the same entry from parsedCols again, pData is already dangling, leading to use-after-free.

Exactly the same problem exists in the _exit error path.

Fix: record which colValArray indices came from the bind path (for example with a small index set) and, during cleanup, free only those entries' pData.

if (!pSchemaExt) {
uError("stmt2 decimal128 type without ext schema info, cannot parse decimal values");
code = TSDB_CODE_DECIMAL_PARSE_ERROR;
goto _exit;
Contributor


[Medium] The DECIMAL / DECIMAL64 error logs have their type names swapped

  • The TSDB_DATA_TYPE_DECIMAL branch prints "decimal64 type" (should be decimal128/decimal)
  • The TSDB_DATA_TYPE_DECIMAL64 branch prints "decimal128 type" (should be decimal64)

Both messages are reversed and highly misleading when troubleshooting; swapping the two strings fixes it.

Copilot AI review requested due to automatic review settings April 7, 2026 02:47
@Pengrongkun Pengrongkun force-pushed the fix/main/6920757476 branch from ff83027 to 614d35b on April 7, 2026 02:47
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.



Comment on lines +5441 to +5462
uint8_t precision = 0, scale = 0;
decimalFromTypeMod(pSchemaExt[iInfo].typeMod, &precision, &scale);
Decimal128 dec = {0};
uint8_t **data = &((uint8_t **)TARRAY_DATA(bufArray))[iInfo - numOfFixedValue];
int32_t length = infos[iInfo].bind->length[iRow];
code = decimal128FromStr(*(char **)data, length, precision, scale, &dec);
*data += length;
hasDecimal128 = true;
TAOS_CHECK_GOTO(code, &lino, _exit);

code = decimal128ToDataVal(&dec, &value);
TAOS_CHECK_GOTO(code, &lino, _exit);

} else if (infos[iInfo].type == TSDB_DATA_TYPE_DECIMAL64) {
if (!pSchemaExt) {
uError("stmt2 decimal64 type without ext schema info, cannot parse decimal values");
code = TSDB_CODE_DECIMAL_PARSE_ERROR;
goto _exit;
}
uint8_t precision = 0, scale = 0;
decimalFromTypeMod(pSchemaExt[iInfo].typeMod, &precision, &scale);
Decimal64 dec = {0};

Copilot AI Apr 7, 2026


Decimal precision/scale is derived from pSchemaExt[iInfo].typeMod, but iInfo is the bind-info index, not the table column index. If bound columns are a subset or reordered (or if infos gets sorted by colId), this can pick the wrong typeMod and parse DECIMAL values incorrectly. Use the schema position for the column (e.g., infos[iInfo].columnId - 1) or locate the matching SSchemaExt entry by colId before calling decimalFromTypeMod.

Comment on lines +5333 to +5335
ASSERT_NE(row, nullptr);
ASSERT_STREQ((char*)row[0], NULL);
row = taos_fetch_row(result);

Copilot AI Apr 7, 2026


ASSERT_STREQ((char*)row[0], NULL) (and the similar check for row[2]) is undefined because ASSERT_STREQ expects two non-null C strings. For NULL field values returned by taos_fetch_row, assert the pointer is null instead (e.g., ASSERT_EQ(row[0], nullptr)).

Comment on lines +5351 to +5353
ASSERT_NE(row, nullptr);
ASSERT_STREQ((char*)row[2], NULL);
row = taos_fetch_row(result);

Copilot AI Apr 7, 2026


ASSERT_STREQ((char*)row[2], NULL) is undefined because ASSERT_STREQ expects two non-null C strings. For NULL field values returned by taos_fetch_row, assert the pointer is null instead (e.g., ASSERT_EQ(row[2], nullptr)).

@guanshengliang guanshengliang merged commit aa7cc4a into main Apr 7, 2026
15 of 16 checks passed
wangmm0220 pushed a commit that referenced this pull request Apr 14, 2026
…with parsed columns (#35010)

* fix: decimal string conversion missing in tRowBuildFromBind2WithBlob

Root cause: tRowBuildFromBind2WithBlob lacked the DECIMAL/DECIMAL64
string-to-binary conversion that exists in tRowBuildFromBind2. When a
table contains both DECIMAL and BLOB columns, the blob code path is
taken (tRowBuildFromBind2WithBlob), which treated DECIMAL as a raw
fixed-size binary type and read 16 bytes directly from the user buffer.
Since the user provides decimal values as text strings (e.g. "21.4300"),
the 15-byte buffer was too small, causing a stack-buffer-overflow.

Fix: Add pSchemaExt parameter to tRowBuildFromBind2WithBlob and add
DECIMAL/DECIMAL64 string-to-binary conversion (decimal128FromStr /
decimal64FromStr) in the fixed-size else branch, mirroring the logic
in tRowBuildFromBind2. Update the call site in parInsertStmt.c to pass
pSchemaExt.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(stmt2): correct DECIMAL in KV+blob row build and align bind path with parsed columns

- tRowBuildKVRowWithBlob / tRowBuildKVRowWithBlob2: copy fixed columns via
  VALUE_GET_DATUM() so DECIMAL uses pData instead of the trivial val field.
- tRowBuildFromBind2WithBlob: mirror tRowBuildFromBind2 — accept parsedCols,
  correct bufArray indexing with numOfFixedValue, TAOS_CHECK_GOTO/lino, and
  free decimal128 heap after each successful row (and on error) to plug leaks.
- parInsertStmt: pass parsedCols into tRowBuildFromBind2WithBlob.
- Add stmt2Case.stmt2_decimal_blob_interleaved in stmt2Test

* fix review

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>


4 participants