Skip to content

[fix](fe) Preserve narrowing datetimev2 casts in simplify in predicate#63343

Merged
morrySnow merged 1 commit into
apache:masterfrom
starocean999:master_0413
May 20, 2026
Merged

[fix](fe) Preserve narrowing datetimev2 casts in simplify in predicate#63343
morrySnow merged 1 commit into
apache:masterfrom
starocean999:master_0413

Conversation

@starocean999
Copy link
Copy Markdown
Contributor

What problem does this PR solve?

SimplifyInPredicate could incorrectly remove narrowing DATETIMEV2 casts inside IN predicates.
For expressions such as CAST(datetimev2(6) AS DATETIMEV2(3)) IN (...), the rewrite could turn the predicate
into a direct comparison on the original DATETIMEV2(6) column, which is not semantics-preserving because the cast
uses precision reduction and rounding. The rule also used a binary-based microsecond alignment check instead of a
decimal scale check, which could incorrectly treat some literals as losslessly convertible.

This change restricts the rewrite to non-narrowing DATETIMEV2 casts and fixes the literal alignment check to use
decimal scale factors. It also adds FE unit tests and a regression case for the reported DATETIMEV2 scenario.

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@starocean999
Copy link
Copy Markdown
Contributor Author

/review

@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review result: no blocking issues found.

Critical checkpoint conclusions:

  • Goal and proof: The PR fixes SimplifyInPredicate so it no longer removes semantics-changing narrowing DATETIMEV2 casts, and fixes decimal-scale literal alignment. The added unit/regression cases cover the intended behavior, including the result-preserving narrowing-cast case.
  • Scope and clarity: The implementation is small and focused on the existing rewrite rule.
  • Concurrency/lifecycle/config/compatibility: Not applicable; this is a local FE expression rewrite and test change, with no new shared state, lifecycle management, config, storage format, or wire protocol changes.
  • Parallel paths: The modified path is the existing IN-predicate simplification path; date/datev2 behavior is unchanged, and the datetimev2 path now preserves narrowing casts.
  • Conditions: The new scale guard is the right semantic condition for avoiding narrowing cast removal, and the literal alignment check now uses decimal scale factors rather than binary factors.
  • Test coverage: Unit tests and an ordered regression test were added. I did not run the tests in this review environment.
  • Observability/transactions/data writes: Not applicable.
  • Performance: No meaningful regression identified; the added checks are constant/simple per literal and remain within the existing rewrite logic.
  • User focus points: No additional user-provided review focus was present.

Minor note: git diff --check reports a blank line at EOF in the new .out file, but this is not a correctness blocker.

@starocean999
Copy link
Copy Markdown
Contributor Author

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 42.86% (3/7) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 31526 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 8919006a89835669d7c436073db44e45750c67ad, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	18055	4055	4094	4055
q2	q3	10811	1374	816	816
q4	4688	480	344	344
q5	7640	2283	2106	2106
q6	248	176	146	146
q7	928	768	636	636
q8	9347	1782	1647	1647
q9	5178	4886	4892	4886
q10	6400	2076	1794	1794
q11	459	285	252	252
q12	644	426	298	298
q13	18124	3420	2782	2782
q14	265	259	232	232
q15	q16	826	767	711	711
q17	1006	849	902	849
q18	6993	5791	5557	5557
q19	1303	1387	1045	1045
q20	660	446	302	302
q21	6384	2900	2744	2744
q22	469	474	324	324
Total cold run time: 100428 ms
Total hot run time: 31526 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4821	4720	4864	4720
q2	q3	4876	5244	4592	4592
q4	2169	2257	1387	1387
q5	4901	4691	4573	4573
q6	282	189	145	145
q7	1884	1693	1529	1529
q8	2369	2058	2033	2033
q9	7576	7349	7179	7179
q10	4485	4396	3995	3995
q11	524	402	351	351
q12	714	724	512	512
q13	2977	3409	2781	2781
q14	271	290	251	251
q15	q16	691	700	612	612
q17	1269	1226	1236	1226
q18	7328	6769	6857	6769
q19	1138	1072	1094	1072
q20	2211	2195	1939	1939
q21	5417	4642	4518	4518
q22	538	474	402	402
Total cold run time: 56441 ms
Total hot run time: 50586 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 169594 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 8919006a89835669d7c436073db44e45750c67ad, data reload: false

query5	4334	661	519	519
query6	330	224	197	197
query7	4261	589	315	315
query8	338	233	218	218
query9	8839	3972	3952	3952
query10	446	344	303	303
query11	5826	2347	2198	2198
query12	181	124	121	121
query13	1285	646	440	440
query14	6006	5376	5085	5085
query14_1	4396	4362	4350	4350
query15	219	209	188	188
query16	998	466	456	456
query17	1130	745	591	591
query18	2470	488	354	354
query19	212	202	160	160
query20	134	131	127	127
query21	214	135	113	113
query22	13579	13526	13265	13265
query23	17294	16312	16087	16087
query23_1	16091	16122	16100	16100
query24	7439	1760	1303	1303
query24_1	1292	1286	1331	1286
query25	543	480	422	422
query26	1317	321	174	174
query27	2710	547	328	328
query28	4449	1921	1931	1921
query29	959	621	499	499
query30	305	235	199	199
query31	1130	1067	940	940
query32	88	74	74	74
query33	529	362	288	288
query34	1181	1127	623	623
query35	756	777	679	679
query36	1313	1361	1173	1173
query37	167	105	92	92
query38	3209	3120	3049	3049
query39	938	945	894	894
query39_1	894	872	871	871
query40	230	179	121	121
query41	64	63	61	61
query42	112	109	108	108
query43	320	336	289	289
query44	
query45	207	196	185	185
query46	1065	1156	705	705
query47	2330	2289	2177	2177
query48	396	397	283	283
query49	631	479	399	399
query50	954	345	244	244
query51	4331	4255	4187	4187
query52	103	103	94	94
query53	259	277	199	199
query54	325	287	253	253
query55	89	97	86	86
query56	294	301	307	301
query57	1417	1387	1308	1308
query58	296	275	270	270
query59	1569	1592	1411	1411
query60	309	325	308	308
query61	160	154	150	150
query62	682	630	561	561
query63	243	196	202	196
query64	2389	831	658	658
query65	
query66	1720	473	350	350
query67	29387	29968	29704	29704
query68	
query69	453	347	298	298
query70	1048	946	987	946
query71	304	274	272	272
query72	3046	2723	2453	2453
query73	874	752	419	419
query74	5073	4880	4730	4730
query75	2694	2584	2243	2243
query76	2282	1133	776	776
query77	404	417	342	342
query78	12196	12087	11483	11483
query79	1284	1054	758	758
query80	625	587	500	500
query81	453	283	244	244
query82	245	162	123	123
query83	280	277	257	257
query84	267	148	114	114
query85	941	619	546	546
query86	362	367	322	322
query87	3367	3392	3197	3197
query88	3556	2677	2691	2677
query89	428	392	338	338
query90	2178	180	187	180
query91	196	182	153	153
query92	84	81	72	72
query93	1422	1420	872	872
query94	558	392	315	315
query95	683	473	338	338
query96	1079	794	331	331
query97	2668	2697	2542	2542
query98	234	236	229	229
query99	1115	1102	967	967
Total cold run time: 251119 ms
Total hot run time: 169594 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 0.75% (5/671) 🎉
Increment coverage report
Complete coverage report

@starocean999 starocean999 marked this pull request as ready for review May 20, 2026 02:06
@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label May 20, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Copy Markdown
Contributor

PR approved by anyone and no changes requested.

@morrySnow morrySnow merged commit b96bb6b into apache:master May 20, 2026
36 checks passed
github-actions Bot pushed a commit that referenced this pull request May 20, 2026
#63343)

### What problem does this PR solve?
SimplifyInPredicate could incorrectly remove narrowing DATETIMEV2 casts
inside IN predicates.
For expressions such as `CAST(datetimev2(6) AS DATETIMEV2(3)) IN (...)`,
the rewrite could turn the predicate
into a direct comparison on the original DATETIMEV2(6) column, which is
not semantics-preserving because the cast
uses precision reduction and rounding. The rule also used a binary-based
microsecond alignment check instead of a
decimal scale check, which could incorrectly treat some literals as
losslessly convertible.

This change restricts the rewrite to non-narrowing DATETIMEV2 casts and
fixes the literal alignment check to use
decimal scale factors. It also adds FE unit tests and a regression case
for the reported DATETIMEV2 scenario.
github-actions Bot pushed a commit that referenced this pull request May 20, 2026
#63343)

### What problem does this PR solve?
SimplifyInPredicate could incorrectly remove narrowing DATETIMEV2 casts
inside IN predicates.
For expressions such as `CAST(datetimev2(6) AS DATETIMEV2(3)) IN (...)`,
the rewrite could turn the predicate
into a direct comparison on the original DATETIMEV2(6) column, which is
not semantics-preserving because the cast
uses precision reduction and rounding. The rule also used a binary-based
microsecond alignment check instead of a
decimal scale check, which could incorrectly treat some literals as
losslessly convertible.

This change restricts the rewrite to non-narrowing DATETIMEV2 casts and
fixes the literal alignment check to use
decimal scale factors. It also adds FE unit tests and a regression case
for the reported DATETIMEV2 scenario.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/4.0.x dev/4.1.x reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants