
[fix](insert) enable_insert_strict should not affect semantic of enable_strict_cast #63794

Open
jacktengg wants to merge 1 commit into apache:master from jacktengg:fix-insert-strict

Conversation

@jacktengg
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@jacktengg
Contributor Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31650 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0d84131c6b670a5844337a143b0e1ef2826e7423, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17825	4153	4119	4119
q2
q3	10759	1402	817	817
q4	4695	512	344	344
q5	7704	2272	2112	2112
q6	255	191	138	138
q7	972	772	651	651
q8	9386	1770	1617	1617
q9	5212	5001	5030	5001
q10	6380	2223	1894	1894
q11	441	285	246	246
q12	635	439	296	296
q13	18104	3439	2814	2814
q14	265	258	230	230
q15
q16	810	772	710	710
q17	878	938	953	938
q18	7228	5733	5617	5617
q19	1170	1261	1081	1081
q20	527	411	263	263
q21	5664	2582	2462	2462
q22	435	361	300	300
Total cold run time: 99345 ms
Total hot run time: 31650 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4440	4390	4367	4367
q2
q3	4538	4948	4353	4353
q4	2140	2210	1382	1382
q5	4531	4355	4534	4355
q6	304	210	165	165
q7	2157	2017	1684	1684
q8	2666	2349	2292	2292
q9	8393	8012	8056	8012
q10	4843	4726	4317	4317
q11	637	434	406	406
q12	756	761	536	536
q13	3347	3595	3000	3000
q14	308	299	283	283
q15
q16	714	738	652	652
q17	1419	1370	1496	1370
q18	7968	7455	7244	7244
q19	1165	1130	1103	1103
q20	2224	2203	1945	1945
q21	5343	4519	4455	4455
q22	518	463	402	402
Total cold run time: 58411 ms
Total hot run time: 52323 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171760 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0d84131c6b670a5844337a143b0e1ef2826e7423, data reload: false

query5	4346	657	514	514
query6	341	211	207	207
query7	4333	558	306	306
query8	327	231	218	218
query9	8769	4095	4061	4061
query10	451	373	303	303
query11	5776	2550	2264	2264
query12	186	128	127	127
query13	1291	621	452	452
query14	6181	5478	5177	5177
query14_1	4464	4490	4452	4452
query15	217	206	190	190
query16	1017	503	449	449
query17	1144	755	611	611
query18	2719	496	372	372
query19	224	213	173	173
query20	151	144	134	134
query21	217	142	121	121
query22	13699	13531	13375	13375
query23	17375	16458	16182	16182
query23_1	16382	16237	16408	16237
query24	7521	1769	1334	1334
query24_1	1319	1314	1348	1314
query25	554	493	411	411
query26	1344	308	170	170
query27	2713	581	337	337
query28	4411	1977	1958	1958
query29	975	600	490	490
query30	306	230	200	200
query31	1138	1076	959	959
query32	92	79	72	72
query33	534	364	294	294
query34	1180	1160	657	657
query35	781	792	698	698
query36	1404	1402	1254	1254
query37	151	100	89	89
query38	3203	3141	3165	3141
query39	920	923	875	875
query39_1	870	900	903	900
query40	222	143	124	124
query41	65	62	62	62
query42	111	109	104	104
query43	334	332	289	289
query44	
query45	213	208	198	198
query46	1065	1196	751	751
query47	2405	2396	2258	2258
query48	407	396	303	303
query49	635	504	380	380
query50	977	332	264	264
query51	4403	4337	4257	4257
query52	103	103	92	92
query53	247	275	205	205
query54	311	268	261	261
query55	96	91	85	85
query56	303	297	300	297
query57	1467	1436	1352	1352
query58	297	270	275	270
query59	1633	1663	1453	1453
query60	328	348	337	337
query61	178	177	175	175
query62	703	656	590	590
query63	252	214	206	206
query64	2437	894	621	621
query65	
query66	1666	477	364	364
query67	29698	29618	29501	29501
query68	
query69	472	359	308	308
query70	1056	1002	1013	1002
query71	305	262	267	262
query72	2981	2701	2336	2336
query73	866	767	431	431
query74	5099	4991	4752	4752
query75	2713	2595	2270	2270
query76	2313	1144	797	797
query77	405	417	334	334
query78	12231	12372	11868	11868
query79	1320	1057	705	705
query80	577	537	451	451
query81	455	326	241	241
query82	315	153	125	125
query83	361	282	250	250
query84	253	140	114	114
query85	888	543	444	444
query86	400	345	351	345
query87	3420	3382	3257	3257
query88	3618	2711	2708	2708
query89	443	415	349	349
query90	1975	175	181	175
query91	179	166	141	141
query92	77	80	74	74
query93	1480	1477	861	861
query94	538	347	313	313
query95	675	374	354	354
query96	1052	780	342	342
query97	2741	2711	2587	2587
query98	240	243	234	234
query99	1204	1161	1020	1020
Total cold run time: 253188 ms
Total hot run time: 171760 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage `` 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/61) 🎉
Increment coverage report
Complete coverage report

…le_strict_cast

Problem Summary:

`enable_insert_strict` and `enable_strict_cast` are two independent session
variables: the former controls how many error rows a load/insert tolerates
(row filtering and `insert_max_filter_ratio`), while the latter controls the
strictness of cast/conversion semantics (e.g. whether out-of-range or
overflowing casts fail or produce NULL). They were wrongly coupled:

- In FE, `SessionVariable.enableStrictCast()` returned `enableInsertStrict`
  instead of `enableStrictCast` whenever the current statement was an insert.
  As a result, toggling `enable_insert_strict` silently changed cast semantics
  for insert statements, and `enable_strict_cast` had no effect on inserts.
- In BE, `OlapTableBlockConvertor` truncated over-length string columns with a
  `substring` only when `enable_insert_strict` was false, so string truncation
  behavior was driven by the load-tolerance flag rather than by the cast/plan
  decided in FE.
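
To illustrate the FE coupling described above, here is a minimal sketch. It assumes a simplified stand-in for the session-variable holder: the class `SessionVars` and the method names `enableStrictCastBefore`/`enableStrictCastAfter` are hypothetical and are not the real `SessionVariable` code; only the two flags and the described return behavior come from this summary.

```java
// Hypothetical, simplified model of the lookup; NOT the real Doris SessionVariable class.
public class StrictCastLookupSketch {
    static final class SessionVars {
        final boolean enableInsertStrict; // load/insert error-row tolerance
        final boolean enableStrictCast;   // cast/conversion strictness

        SessionVars(boolean enableInsertStrict, boolean enableStrictCast) {
            this.enableInsertStrict = enableInsertStrict;
            this.enableStrictCast = enableStrictCast;
        }

        // Coupled behavior described in the problem summary:
        // for insert statements the insert-strict flag was returned instead.
        boolean enableStrictCastBefore(boolean isInsertStatement) {
            return isInsertStatement ? enableInsertStrict : enableStrictCast;
        }

        // Decoupled behavior: the statement kind is ignored.
        boolean enableStrictCastAfter(boolean isInsertStatement) {
            return enableStrictCast;
        }
    }

    public static void main(String[] args) {
        // enable_insert_strict = true, enable_strict_cast = false
        SessionVars vars = new SessionVars(true, false);
        System.out.println("before, insert:     " + vars.enableStrictCastBefore(true));
        System.out.println("after, insert:      " + vars.enableStrictCastAfter(true));
        System.out.println("before, non-insert: " + vars.enableStrictCastBefore(false));
    }
}
```

With `enable_insert_strict = true` and `enable_strict_cast = false`, the coupled lookup reports strict casts for an insert even though the user asked for non-strict casts; the decoupled lookup does not.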

This commit decouples the two variables:

- FE `SessionVariable.enableStrictCast()` now always returns `enableStrictCast`,
  regardless of whether the statement is an insert.
- FE `BindSink` drives string truncation off the cast semantics (via
  `truncateString`, which depends on `enableStrictCast()`), and also handles
  unbounded source string types whose length is reported as -1 (e.g.
  `string`/`text`): such sources are always longer than a bounded char/varchar
  target and are now correctly truncated with `substring`.
- BE `OlapTableBlockConvertor` no longer performs `enable_insert_strict`-gated
  substring truncation. FE is now responsible for planning the `substring`, and
  BE always validates that each row's byte length does not exceed the schema
  length (since schema length is in bytes while substring works on chars).
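
The split of responsibilities above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `BindSink` or `OlapTableBlockConvertor` code: the helpers `planSubstring`, `truncateChars`, and `fitsSchemaBytes` are hypothetical names, `truncateString` is passed in as a plain boolean (how it is derived from `enableStrictCast()` is not modeled), and UTF-8 is assumed for the byte length.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: FE decides whether to plan a substring, BE validates bytes.
public class InsertStringLengthSketch {
    static final int UNBOUNDED = -1; // length reported for string/text sources

    // FE side: should a substring be planned for a bounded char/varchar target?
    static boolean planSubstring(int sourceLen, int targetLen, boolean truncateString) {
        if (!truncateString || targetLen < 0) {
            return false; // truncation not requested, or target is not bounded
        }
        if (sourceLen == UNBOUNDED) {
            return true;  // unbounded source is treated as longer than any bounded target
        }
        return sourceLen > targetLen;
    }

    // Character-based truncation (counted in Unicode code points).
    static String truncateChars(String s, int maxChars) {
        if (s.codePointCount(0, s.length()) <= maxChars) {
            return s;
        }
        return s.substring(0, s.offsetByCodePoints(0, maxChars));
    }

    // BE side: byte-length validation against the schema length.
    static boolean fitsSchemaBytes(String s, int schemaLenBytes) {
        return s.getBytes(StandardCharsets.UTF_8).length <= schemaLenBytes;
    }

    public static void main(String[] args) {
        int target = 3;
        String ascii = truncateChars("abcdef", target);
        String multiByte = truncateChars("\u00e9\u00e9\u00e9\u00e9", target); // 2 bytes per char in UTF-8
        System.out.println(planSubstring(UNBOUNDED, target, true));
        System.out.println(fitsSchemaBytes(ascii, target));
        System.out.println(fitsSchemaBytes(multiByte, target));
    }
}
```

The multi-byte case shows why the byte check cannot be dropped: three characters survive the substring, but they occupy six bytes and still exceed a three-byte schema length.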

After the fix, `enable_strict_cast` controls cast/truncation semantics for
inserts as documented, and `enable_insert_strict` controls only error-row
filtering; neither setting affects the other.
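
As a rough illustration of the error-row side on its own, the sketch below models the tolerance decision without any cast flag in its signature. The exact rule is an assumption made for this example (strict mode rejects any filtered row; non-strict mode compares the filtered fraction against `insert_max_filter_ratio`), and the method `insertSucceeds` is hypothetical; the real Doris logic may differ in detail.

```java
// Hypothetical sketch of error-row tolerance; the decision rule is an assumption.
public class InsertFilterSketch {
    static boolean insertSucceeds(long totalRows, long filteredRows,
                                  boolean enableInsertStrict, double insertMaxFilterRatio) {
        if (filteredRows == 0) {
            return true;  // nothing was filtered, nothing to tolerate
        }
        if (enableInsertStrict) {
            return false; // assumed: strict mode tolerates no filtered rows
        }
        // assumed: non-strict mode tolerates up to the configured ratio
        return (double) filteredRows / totalRows <= insertMaxFilterRatio;
    }

    public static void main(String[] args) {
        System.out.println(insertSucceeds(100, 5, false, 0.1));
        System.out.println(insertSucceeds(100, 20, false, 0.1));
        System.out.println(insertSucceeds(100, 1, true, 1.0));
    }
}
```

Whether a given row becomes an error row in the first place is decided by the cast semantics; this function only sees the resulting counts.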

Fixed an issue where `enable_insert_strict` incorrectly changed the semantics
of `enable_strict_cast` for INSERT statements. The two session variables are
now independent.

- Test: Regression test
    - Updated and added regression tests under `regression-test/suites/load_p0`
      and `nereids_function_p0` to cover the decoupled behavior of
      `enable_insert_strict` and `enable_strict_cast`.
- Behavior changed: Yes (`enable_insert_strict` no longer affects cast
  strictness; `enable_strict_cast` now applies to INSERT statements)
@jacktengg force-pushed the fix-insert-strict branch from 0d84131 to bb5943a on May 29, 2026 07:52
@jacktengg
Contributor Author

/review

Contributor

@github-actions github-actions Bot left a comment

Code review completed. I did not find blocking correctness issues in the PR diff.

Critical checkpoint conclusions:

  • Goal and tests: The PR decouples insert row-filter strictness from strict cast semantics, moves string truncation planning to FE for inserts, and adds regression coverage for values/select/TVF cases.
  • Scope: The code changes are focused on FE insert coercion, BE load validation, session variable lookup, and matching regression tests.
  • Concurrency/lifecycle: No new concurrency, lifecycle, or static initialization concerns found.
  • Configuration/compatibility: No new configuration item or persisted format/protocol change found.
  • Parallel paths: Nereids insert and load paths were considered; stream/TVF tests were updated for the changed validation behavior.
  • Error handling/data correctness: BE still validates byte length after any FE substring, preserving byte-length enforcement for multi-byte strings.
  • Test coverage: Regression tests cover the main combinations of enable_insert_strict, enable_strict_cast, enable_insert_value_auto_cast, and insert_max_filter_ratio. I did not run the suite locally in this review environment.
  • Observability/performance: No new logging or hot-path performance issue found beyond the expected removal of BE-side substring execution.

User focus: No additional user-provided review focus was specified.

@jacktengg
Contributor Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31027 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit bb5943ab949b2a6819d177dcfc1871fbca65ea8d, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17629	4006	4003	4003
q2
q3	10761	1348	819	819
q4	4683	484	350	350
q5	7590	2418	2101	2101
q6	253	177	136	136
q7	915	786	637	637
q8	9372	1635	1511	1511
q9	6218	4995	4917	4917
q10	6433	2213	1884	1884
q11	433	272	245	245
q12	692	433	294	294
q13	18209	3397	2781	2781
q14	271	259	234	234
q15
q16	814	767	702	702
q17	1105	919	951	919
q18	7342	5860	5549	5549
q19	1352	1350	1102	1102
q20	515	390	253	253
q21	5615	2621	2290	2290
q22	427	355	300	300
Total cold run time: 100629 ms
Total hot run time: 31027 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4388	4357	4261	4261
q2
q3	4563	4966	4328	4328
q4	2098	2223	1399	1399
q5	4424	4295	4608	4295
q6	264	225	155	155
q7	1945	1875	1625	1625
q8	2519	2133	2189	2133
q9	8051	8023	8007	8007
q10	4887	4836	4303	4303
q11	560	400	375	375
q12	744	772	564	564
q13	3279	3594	3017	3017
q14	317	323	278	278
q15
q16	739	734	658	658
q17	1368	1348	1344	1344
q18	7864	7361	6716	6716
q19	1137	1105	1160	1105
q20	2210	2228	1951	1951
q21	5286	4642	4465	4465
q22	512	457	420	420
Total cold run time: 57155 ms
Total hot run time: 51399 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171986 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit bb5943ab949b2a6819d177dcfc1871fbca65ea8d, data reload: false

query5	4314	668	502	502
query6	328	215	192	192
query7	4279	562	318	318
query8	331	229	215	215
query9	8777	3967	3966	3966
query10	475	357	296	296
query11	5813	2393	2241	2241
query12	179	127	131	127
query13	1266	585	429	429
query14	6148	5446	5175	5175
query14_1	4506	4457	4487	4457
query15	215	204	189	189
query16	1026	463	437	437
query17	1167	755	614	614
query18	2756	493	363	363
query19	244	201	170	170
query20	141	138	129	129
query21	216	141	117	117
query22	13641	13601	13368	13368
query23	17366	16451	16235	16235
query23_1	16478	16355	16336	16336
query24	7550	1811	1310	1310
query24_1	1355	1338	1323	1323
query25	572	521	436	436
query26	1309	331	186	186
query27	2724	597	361	361
query28	4459	2023	2010	2010
query29	975	649	535	535
query30	304	245	203	203
query31	1142	1072	967	967
query32	82	78	73	73
query33	553	351	285	285
query34	1217	1158	661	661
query35	777	810	713	713
query36	1413	1433	1321	1321
query37	155	106	91	91
query38	3272	3253	3124	3124
query39	984	954	937	937
query39_1	918	926	924	924
query40	244	152	126	126
query41	65	68	63	63
query42	111	109	110	109
query43	329	334	314	314
query44	
query45	221	211	202	202
query46	1144	1215	753	753
query47	2432	2403	2334	2334
query48	414	426	310	310
query49	634	494	387	387
query50	1043	347	260	260
query51	4424	4376	4358	4358
query52	107	106	95	95
query53	258	285	208	208
query54	317	262	263	262
query55	95	94	89	89
query56	319	320	315	315
query57	1443	1438	1349	1349
query58	289	270	271	270
query59	1582	1635	1434	1434
query60	320	365	316	316
query61	156	147	154	147
query62	708	648	576	576
query63	241	199	205	199
query64	2379	802	693	693
query65	
query66	1671	476	356	356
query67	29622	29607	29580	29580
query68	
query69	475	336	307	307
query70	1027	975	1009	975
query71	309	266	266	266
query72	3016	2706	2398	2398
query73	852	814	436	436
query74	5151	4951	4787	4787
query75	2687	2588	2256	2256
query76	2291	1185	834	834
query77	414	409	339	339
query78	12348	12385	11883	11883
query79	1457	1077	748	748
query80	1334	542	454	454
query81	514	283	242	242
query82	977	161	119	119
query83	344	275	255	255
query84	262	141	107	107
query85	944	558	461	461
query86	456	339	337	337
query87	3432	3397	3252	3252
query88	3612	2741	2718	2718
query89	461	399	352	352
query90	1907	179	181	179
query91	181	166	142	142
query92	75	75	73	73
query93	1702	1463	859	859
query94	725	352	273	273
query95	681	398	429	398
query96	1040	787	337	337
query97	2720	2720	2654	2654
query98	237	237	228	228
query99	1180	1142	1037	1037
Total cold run time: 255512 ms
Total hot run time: 171986 ms

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.99% (20997/38891)
Line Coverage 37.53% (199001/530184)
Region Coverage 33.83% (156056/461234)
Branch Coverage 34.81% (67903/195059)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 57.94% (22070/38089)
Line Coverage 41.23% (218024/528836)
Region Coverage 37.53% (174755/465654)
Branch Coverage 38.41% (75210/195789)

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report



2 participants