fix(stmt2): correct DECIMAL in KV+blob row build and align bind path with parsed columns #35010
guanshengliang merged 4 commits into main from
Conversation
Root cause: tRowBuildFromBind2WithBlob lacked the DECIMAL/DECIMAL64 string-to-binary conversion that exists in tRowBuildFromBind2. When a table contains both DECIMAL and BLOB columns, the blob code path is taken (tRowBuildFromBind2WithBlob), which treated DECIMAL as a raw fixed-size binary type and read 16 bytes directly from the user buffer. Since the user provides decimal values as text strings (e.g. "21.4300"), the 15-byte buffer was too small, causing a stack-buffer-overflow.

Fix: Add a pSchemaExt parameter to tRowBuildFromBind2WithBlob and add the DECIMAL/DECIMAL64 string-to-binary conversion (decimal128FromStr / decimal64FromStr) in the fixed-size else branch, mirroring the logic in tRowBuildFromBind2. Update the call site in parInsertStmt.c to pass pSchemaExt.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…with parsed columns

- tRowBuildKVRowWithBlob / tRowBuildKVRowWithBlob2: copy fixed columns via VALUE_GET_DATUM() so DECIMAL uses pData instead of the trivial val field.
- tRowBuildFromBind2WithBlob: mirror tRowBuildFromBind2: accept parsedCols, correct bufArray indexing with numOfFixedValue, use TAOS_CHECK_GOTO/lino, and free decimal128 heap allocations after each successful row (and on error) to plug leaks.
- parInsertStmt: pass parsedCols into tRowBuildFromBind2WithBlob.
- Add stmt2Case.stmt2_decimal_blob_interleaved in stmt2Test.
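The root cause above can be illustrated with a minimal, self-contained sketch. `parse_decimal_scaled` is a hypothetical stand-in (not the real decimal128FromStr / decimal64FromStr API): it shows why decimal text such as "21.4300" must be parsed into a scaled integer rather than read as 16 raw bytes from a buffer that only holds the string.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: parse a decimal string like "21.4300" into a scaled
 * 64-bit integer (value * 10^scale). The actual fix calls decimal128FromStr /
 * decimal64FromStr; this stand-in only illustrates that the text form must be
 * parsed, not memcpy'd as a fixed-size 16-byte binary value. */
static int parse_decimal_scaled(const char *s, int scale, int64_t *out) {
  int64_t v = 0;
  int neg = 0, frac = -1; /* frac = digits consumed after the '.' */
  if (*s == '-') { neg = 1; s++; }
  for (; *s; s++) {
    if (*s == '.') { if (frac >= 0) return -1; frac = 0; continue; }
    if (*s < '0' || *s > '9') return -1;
    if (frac >= 0 && frac == scale) return -1; /* too many fraction digits */
    v = v * 10 + (*s - '0');
    if (frac >= 0) frac++;
  }
  /* pad with zeros up to the requested scale */
  for (int i = (frac < 0 ? 0 : frac); i < scale; i++) v *= 10;
  *out = neg ? -v : v;
  return 0;
}
```

With scale 4, "21.4300" parses to the scaled integer 214300; a raw 16-byte read of the same 7-character string would run past the end of the buffer, which is exactly the overflow the fix removes.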
Pull request overview
This PR fixes stmt2 row building for DECIMAL columns in the KV+blob path and aligns the stmt2 bind-with-blob build path with the non-blob bind path by incorporating parsedCols and schema-ext (typemod) metadata.
Changes:
- Fix fixed-column copying in tRowBuildKVRowWithBlob* to use VALUE_GET_DATUM() so DECIMAL copies from pData (not the trivial val field).
- Update the tRowBuildFromBind2WithBlob API and implementation to accept parsedCols + SSchemaExt, adjust buffer indexing, and attempt to free DECIMAL heap allocations per row.
- Add a stmt2 regression test for DECIMAL + BLOB interleaved batch bind buffers.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| source/libs/parser/src/parInsertStmt.c | Passes parsedCols and pSchemaExt into the blob row-build path. |
| source/common/src/tdataformat.c | Fixes DECIMAL datum copying in KV+blob row build; expands bind-with-blob row builder to handle parsedCols/decimal parsing and cleanup. |
| include/common/tdataformat.h | Updates tRowBuildFromBind2WithBlob function signature to match new parameters. |
| source/client/test/stmt2Test.cpp | Adds a regression test covering stmt2 DECIMAL+BLOB interleaved binding. |
```c
int32_t tRowBuildFromBind2WithBlob(SBindInfo2 *infos, int32_t numOfInfos, SSHashObj *parsedCols, bool infoSorted,
                                   const STSchema *pTSchema, const SSchemaExt *pSchemaExt, SArray *rowArray,
                                   bool *pOrdered, bool *pDupTs, SBlobSet *pBlobSet) {
  if (infos == NULL || numOfInfos <= 0 || numOfInfos > pTSchema->numOfCols || pTSchema == NULL || rowArray == NULL) {
```
The parameter validation dereferences pTSchema (pTSchema->numOfCols) before checking pTSchema == NULL, which can crash on a NULL schema. Reorder the checks so pTSchema is validated before any dereference:

```c
if (pTSchema == NULL || rowArray == NULL || infos == NULL || numOfInfos <= 0 ||
    numOfInfos > pTSchema->numOfCols) {
```
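The suggested reorder works because C's `||` evaluates left to right and short-circuits. A minimal standalone demo, using a hypothetical `Schema` struct in place of STSchema:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for STSchema, just to show the ordering issue. */
typedef struct { int numOfCols; } Schema;

/* Buggy shape (shown only as a comment; calling it with s == NULL is
 * undefined behavior because s is dereferenced before the NULL test):
 *   if (n > s->numOfCols || s == NULL) ...                           */

/* Fixed shape: '||' short-circuits, so the NULL test guards the
 * dereference that follows it. */
static int validate(const Schema *s, int n) {
  if (s == NULL || n <= 0 || n > s->numOfCols) return -1;
  return 0;
}
```

With the checks in this order, `validate(NULL, 3)` returns -1 without ever touching `s->numOfCols`.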
```c
// fix decimal memory leak
if (hasDecimal128) {
  int32_t num = taosArrayGetSize(colValArray);
  for (int32_t i = 0; i < num; ++i) {
    SColVal *pCol = taosArrayGet(colValArray, i);
    if (pCol->value.type == TSDB_DATA_TYPE_DECIMAL) {
      taosMemoryFreeClear(pCol->value.pData);
    }
  }
```
The per-row DECIMAL cleanup frees pCol->value.pData for all TSDB_DATA_TYPE_DECIMAL values when hasDecimal128 is set. This can incorrectly free DECIMAL values coming from parsedCols (shallow-copied into colValArray) and then reused on subsequent rows, causing a use-after-free or double-free. Track and free only the allocations produced by decimal128ToDataVal (e.g. keep a list of allocated pointers per row) instead of scanning and freeing every DECIMAL in colValArray.

Suggested change:

```c
/*
 * NOTE:
 * Previously, we attempted to fix a DECIMAL memory leak by freeing
 * pCol->value.pData for all TSDB_DATA_TYPE_DECIMAL values in colValArray
 * whenever hasDecimal128 was set. However, some DECIMAL values in
 * colValArray may be shallow-copied from parsedCols and reused across
 * rows, so freeing them here can cause use-after-free or double-free.
 *
 * To avoid these memory safety issues, we do not free DECIMAL buffers
 * here; DECIMAL memory must instead be freed by the code that owns the
 * allocation (e.g., when converting from decimal128). We only reset
 * hasDecimal128 to maintain per-row state.
 */
if (hasDecimal128) {
```
```diff
  if (value.nData > infos[iInfo].bytes - VARSTR_HEADER_SIZE) {
    code = TSDB_CODE_PAR_VALUE_TOO_LONG;
-   uError("stmt bind param[%d] length:%d greater than type maximum lenght: %d", iInfo, value.nData,
-          pTSchema->columns[infos[iInfo].columnId - 1].bytes);
+   uError("stmt2 bind col:%d, row:%d length:%d greater than type maximum lenght: %d", iInfo, iRow,
+          value.nData + (uint32_t)(BLOBSTR_HEADER_SIZE), infos[iInfo].bytes);
    goto _exit;
```
The value-too-long error path for non-BLOB var types reports value.nData + BLOBSTR_HEADER_SIZE even though the limit check uses VARSTR_HEADER_SIZE. Also the message contains a typo ("lenght"). Update the reported header/limit to match the type (VARSTR vs BLOB) and fix the typo to avoid misleading diagnostics.
```c
if (infos[iInfo].type == TSDB_DATA_TYPE_DECIMAL) {
  if (!pSchemaExt) {
    uError("stmt2 decimal64 type without ext schema info, cannot parse decimal values");
    code = TSDB_CODE_DECIMAL_PARSE_ERROR;
    goto _exit;
  }
  uint8_t precision = 0, scale = 0;
  decimalFromTypeMod(pSchemaExt[iInfo].typeMod, &precision, &scale);
  Decimal128 dec = {0};
  uint8_t **data = &((uint8_t **)TARRAY_DATA(bufArray))[iInfo - numOfFixedValue];
  int32_t   length = infos[iInfo].bind->length[iRow];
  code = decimal128FromStr(*(char **)data, length, precision, scale, &dec);
  *data += length;
  hasDecimal128 = true;
  TAOS_CHECK_GOTO(code, &lino, _exit);

  code = decimal128ToDataVal(&dec, &value);
  TAOS_CHECK_GOTO(code, &lino, _exit);

} else if (infos[iInfo].type == TSDB_DATA_TYPE_DECIMAL64) {
  if (!pSchemaExt) {
    uError("stmt2 decimal128 type without ext schema info, cannot parse decimal values");
    code = TSDB_CODE_DECIMAL_PARSE_ERROR;
    goto _exit;
```
These DECIMAL schema-missing errors appear to have the DECIMAL/DECIMAL64 wording swapped (DECIMAL branch says "decimal64" and DECIMAL64 branch says "decimal128"). This makes troubleshooting harder; align the message with the actual type being handled.
Code review — fix(stmt2): correct DECIMAL in KV+blob row build and align bind path with parsed columns

Reviewed 4 files (+207/-29 lines). Found 3 correctness bugs (2 Critical, 1 High) and 2 diagnostic issues.

🔴 Critical
```c
int32_t tRowBuildFromBind2WithBlob(SBindInfo2 *infos, int32_t numOfInfos, SSHashObj *parsedCols, bool infoSorted,
                                   const STSchema *pTSchema, const SSchemaExt *pSchemaExt, SArray *rowArray,
                                   bool *pOrdered, bool *pDupTs, SBlobSet *pBlobSet) {
  if (infos == NULL || numOfInfos <= 0 || numOfInfos > pTSchema->numOfCols || pTSchema == NULL || rowArray == NULL) {
```
[Critical] NULL checks out of order; pTSchema is dereferenced early

pTSchema->numOfCols is evaluated before pTSchema == NULL. If the caller passes NULL, the process crashes immediately. Corrected order:

```c
if (infos == NULL || numOfInfos <= 0 || pTSchema == NULL || numOfInfos > pTSchema->numOfCols || rowArray == NULL) {
```

```c
    if (pCol->value.type == TSDB_DATA_TYPE_DECIMAL) {
      taosMemoryFreeClear(pCol->value.pData);
    }
  }
```
[High] DECIMAL cleanup frees pData owned by parsedCols → use-after-free

colValArray contains two kinds of entries:

- bind path: heap-allocated by decimal128ToDataVal; must be freed
- parsedCols shallow copies: pData is owned by the caller and must not be freed

The cleanup loop here calls taosMemoryFreeClear(pCol->value.pData) indiscriminately on every TSDB_DATA_TYPE_DECIMAL entry, which also frees pointers owned by parsedCols. When the next row reads the same entry from parsedCols again, pData is already a dangling pointer, causing a use-after-free.

The exact same problem exists in the _exit error path.

Fix: record which colValArray indices came from the bind path (e.g. with a small index set) and free only those entries' pData during cleanup.
```c
if (!pSchemaExt) {
  uError("stmt2 decimal128 type without ext schema info, cannot parse decimal values");
  code = TSDB_CODE_DECIMAL_PARSE_ERROR;
  goto _exit;
```
[Medium] DECIMAL / DECIMAL64 error-log type names are swapped

- The TSDB_DATA_TYPE_DECIMAL branch logs "decimal64 type" (should be decimal128/decimal)
- The TSDB_DATA_TYPE_DECIMAL64 branch logs "decimal128 type" (should be decimal64)

Both messages are reversed, which is misleading when troubleshooting. Swapping the two strings fixes it.
Force-pushed from ff83027 to 614d35b (Compare)
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
```c
  uint8_t precision = 0, scale = 0;
  decimalFromTypeMod(pSchemaExt[iInfo].typeMod, &precision, &scale);
  Decimal128 dec = {0};
  uint8_t **data = &((uint8_t **)TARRAY_DATA(bufArray))[iInfo - numOfFixedValue];
  int32_t   length = infos[iInfo].bind->length[iRow];
  code = decimal128FromStr(*(char **)data, length, precision, scale, &dec);
  *data += length;
  hasDecimal128 = true;
  TAOS_CHECK_GOTO(code, &lino, _exit);

  code = decimal128ToDataVal(&dec, &value);
  TAOS_CHECK_GOTO(code, &lino, _exit);

} else if (infos[iInfo].type == TSDB_DATA_TYPE_DECIMAL64) {
  if (!pSchemaExt) {
    uError("stmt2 decimal64 type without ext schema info, cannot parse decimal values");
    code = TSDB_CODE_DECIMAL_PARSE_ERROR;
    goto _exit;
  }
  uint8_t precision = 0, scale = 0;
  decimalFromTypeMod(pSchemaExt[iInfo].typeMod, &precision, &scale);
  Decimal64 dec = {0};
```
Decimal precision/scale is derived from pSchemaExt[iInfo].typeMod, but iInfo is the bind-info index, not the table column index. If bound columns are a subset or reordered (or if infos gets sorted by colId), this can pick the wrong typeMod and parse DECIMAL values incorrectly. Use the schema position for the column (e.g., infos[iInfo].columnId - 1) or locate the matching SSchemaExt entry by colId before calling decimalFromTypeMod.
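The comment's suggestion, resolving the ext-schema entry by column id rather than by bind position, can be sketched with hypothetical stand-in types (`ExtSchema`, `BindInfo`, and `extByColId` are illustrative names, not TDengine APIs):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-ins: a bind info carries the bound column's id, while
 * the ext-schema array is laid out in table-column order. */
typedef struct { int16_t colId; int32_t typeMod; } ExtSchema;
typedef struct { int16_t columnId; } BindInfo;

/* Find the ext-schema entry whose colId matches the bound column, instead
 * of assuming the bind index equals the schema index. */
static const ExtSchema *extByColId(const ExtSchema *ext, int n, int16_t colId) {
  for (int i = 0; i < n; i++)
    if (ext[i].colId == colId) return &ext[i];
  return NULL;
}
```

If the third table column is bound as the first bind info, indexing by bind position would read the first column's typeMod; the lookup by colId returns the correct entry regardless of bind order.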
```cpp
ASSERT_NE(row, nullptr);
ASSERT_STREQ((char*)row[0], NULL);
row = taos_fetch_row(result);
```

ASSERT_STREQ((char*)row[0], NULL) (and the similar check for row[2]) is undefined because ASSERT_STREQ expects two non-null C strings. For NULL field values returned by taos_fetch_row, assert the pointer is null instead (e.g., ASSERT_EQ(row[0], nullptr)).
```cpp
ASSERT_NE(row, nullptr);
ASSERT_STREQ((char*)row[2], NULL);
row = taos_fetch_row(result);
```

ASSERT_STREQ((char*)row[2], NULL) is undefined because ASSERT_STREQ expects two non-null C strings. For NULL field values returned by taos_fetch_row, assert the pointer is null instead (e.g., ASSERT_EQ(row[2], nullptr)).
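The underlying hazard is that strcmp (which string-equality assertions ultimately call) has undefined behavior on NULL. A null-safe comparison helper makes the intent explicit; `streq_nullsafe` is a hypothetical name, shown only to illustrate the pattern the review recommends:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Null-safe string equality: two NULLs compare equal, NULL never equals a
 * real string, and only two non-NULL pointers ever reach strcmp(). This
 * mirrors the reviewer's advice to assert row[i] == NULL for NULL fields
 * rather than passing NULL into a string comparison. */
static int streq_nullsafe(const char *a, const char *b) {
  if (a == NULL || b == NULL) return a == b;
  return strcmp(a, b) == 0;
}
```

In the test this means asserting the pointer itself (ASSERT_EQ(row[0], nullptr)) for NULL fields, and comparing strings only when both sides are known non-null.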
…with parsed columns (#35010)

* fix: decimal string conversion missing in tRowBuildFromBind2WithBlob

Root cause: tRowBuildFromBind2WithBlob lacked the DECIMAL/DECIMAL64 string-to-binary conversion that exists in tRowBuildFromBind2. When a table contains both DECIMAL and BLOB columns, the blob code path is taken (tRowBuildFromBind2WithBlob), which treated DECIMAL as a raw fixed-size binary type and read 16 bytes directly from the user buffer. Since the user provides decimal values as text strings (e.g. "21.4300"), the 15-byte buffer was too small, causing a stack-buffer-overflow.

Fix: Add a pSchemaExt parameter to tRowBuildFromBind2WithBlob and add the DECIMAL/DECIMAL64 string-to-binary conversion (decimal128FromStr / decimal64FromStr) in the fixed-size else branch, mirroring the logic in tRowBuildFromBind2. Update the call site in parInsertStmt.c to pass pSchemaExt.

* fix(stmt2): correct DECIMAL in KV+blob row build and align bind path with parsed columns

- tRowBuildKVRowWithBlob / tRowBuildKVRowWithBlob2: copy fixed columns via VALUE_GET_DATUM() so DECIMAL uses pData instead of the trivial val field.
- tRowBuildFromBind2WithBlob: mirror tRowBuildFromBind2: accept parsedCols, correct bufArray indexing with numOfFixedValue, use TAOS_CHECK_GOTO/lino, and free decimal128 heap allocations after each successful row (and on error) to plug leaks.
- parInsertStmt: pass parsedCols into tRowBuildFromBind2WithBlob.
- Add stmt2Case.stmt2_decimal_blob_interleaved in stmt2Test.

* fix review

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.