[fix](fe) Accept spaced TIMESTAMPTZ offsets in range partition bounds #63281

Open
starocean999 wants to merge 4 commits into apache:master from starocean999:master_0408

@starocean999
Contributor

Problem Summary:
TIMESTAMPTZ range partition bounds with explicit offsets were still handled inconsistently across FE paths. The statement
CREATE TABLE ... PARTITION BY RANGE ... VALUES LESS THAN ('2024-01-15 13:00:00 +00:00')
failed because PartitionKey reparsed the raw bound string and rejected the space before the
timezone offset. The fix also overlaps with the existing TIMESTAMPTZ partition-boundary handling, which
must preserve explicit UTC offsets during metadata reconstruction.

This PR normalizes spaced numeric timezone offsets in the TIMESTAMPTZ partition-key parsing path and
adds focused FE unit tests plus regression coverage for:

  • direct TIMESTAMPTZ partition key parsing with explicit offsets
  • CREATE TABLE with VALUES LESS THAN TIMESTAMPTZ bounds
  • CREATE TABLE LIKE round-trip preservation of TIMESTAMPTZ range boundaries
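
The normalization described above can be sketched roughly as follows. This is only an illustration, assuming a regex rewrite plus `java.time` parsing; the class `SpacedOffsetSketch` and its methods are hypothetical names and are not taken from the PR's PartitionKey changes.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the code in PartitionKey.java.
final class SpacedOffsetSketch {
    // Whitespace followed by a trailing numeric offset such as "+00:00" or "-08:00".
    private static final Pattern SPACED_OFFSET = Pattern.compile("\\s+([+-]\\d{2}:\\d{2})$");
    private static final DateTimeFormatter WITH_OFFSET =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ssxxx");
    private static final DateTimeFormatter PLAIN =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    // Drops the whitespace before a trailing numeric offset; other input is returned trimmed.
    static String normalize(String bound) {
        return SPACED_OFFSET.matcher(bound.trim()).replaceFirst("$1");
    }

    // Parses a bound that carries an explicit offset and renders the same instant in UTC.
    static String toUtcText(String bound) {
        OffsetDateTime parsed = OffsetDateTime.parse(normalize(bound), WITH_OFFSET);
        return parsed.withOffsetSameInstant(ZoneOffset.UTC).format(PLAIN);
    }

    public static void main(String[] args) {
        System.out.println(normalize("2024-01-15 13:00:00 +00:00")); // 2024-01-15 13:00:00+00:00
        System.out.println(toUtcText("2024-01-15 21:00:00 +08:00")); // 2024-01-15 13:00:00
    }
}
```

Anchoring the pattern at the end of the string keeps the space between the date and the time untouched, so a bound without an offset passes through unchanged.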

Release note: None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@starocean999
Contributor Author

/review

Contributor

@github-actions github-actions Bot left a comment

I found one correctness regression in the TIMESTAMPTZ partition-boundary conversion path that can route/query data against the wrong UTC partition boundary when the partition DDL omits an explicit offset under a non-UTC session time zone. There is also a small FE test style issue that should be fixed before merge.

Critical checkpoint conclusions:

  • Goal/test proof: The PR addresses spaced explicit TIMESTAMPTZ offsets in range partition bounds and adds FE/regression coverage for explicit offsets; however, it does not preserve the existing no-explicit-offset semantics, so the goal is only partially met.
  • Scope/focus: The code change is small and focused in catalog partition key parsing.
  • Concurrency/lifecycle/locks/config: No new concurrency, lifecycle, locking, or configuration concerns found.
  • Compatibility/persistence: No storage/EditLog format change found; the concern is behavioral compatibility of existing TIMESTAMPTZ partition DDL.
  • Parallel paths: Range LESS THAN and CREATE LIKE explicit-offset paths are covered, but the no-offset TIMESTAMPTZ path is a parallel existing behavior that now regresses.
  • Test coverage: The added tests cover explicit offsets with spaces but lack a non-UTC session case without an explicit offset, which would catch this regression.
  • Observability/performance: No additional observability or performance concerns found.
  • User focus: No additional user-provided review focus was supplied.
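
To make the no-offset concern concrete: a bound without an explicit offset has to be read in the session time zone before it is converted to UTC, and reading it as UTC directly moves the partition boundary. The sketch below assumes `java.time` as a stand-in for the FE's own literal handling; `SessionZoneBoundSketch` and `toUtcBound` are hypothetical names, not code from this PR.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not the code in PartitionKey.java.
final class SessionZoneBoundSketch {
    private static final DateTimeFormatter TEXT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    // Interprets a bound without an offset in the session zone, then renders it in UTC.
    static String toUtcBound(String bound, ZoneId sessionZone) {
        LocalDateTime local = LocalDateTime.parse(bound, TEXT);
        return local.atZone(sessionZone).withZoneSameInstant(ZoneOffset.UTC).format(TEXT);
    }

    public static void main(String[] args) {
        String bound = "2024-01-15 13:00:00";
        // Session zone UTC+8: the UTC boundary is eight hours earlier.
        System.out.println(toUtcBound(bound, ZoneId.of("Asia/Shanghai"))); // 2024-01-15 05:00:00
        // Treating the same text as UTC leaves it unchanged, which is the wrong boundary
        // for a session running in Asia/Shanghai.
        System.out.println(toUtcBound(bound, ZoneOffset.UTC)); // 2024-01-15 13:00:00
    }
}
```

A regression test for this path would set a non-UTC session time zone, create a partition with a bound that has no offset, and check that the stored boundary matches the first result rather than the second.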

Comment thread fe/fe-core/src/main/java/org/apache/doris/catalog/PartitionKey.java Outdated
@starocean999
Contributor Author

/review

Contributor

@github-actions github-actions Bot left a comment

Review completed for PR 63281 at head 739adfc.

No additional blocking issues found beyond the already-known review threads. The latest commit restores TIMESTAMPTZ partition-bound handling for values without explicit offsets and fixes the FE test indentation issue, so I am not duplicating those prior comments.

Critical checkpoint conclusions:

  • Goal and tests: The PR aims to accept TIMESTAMPTZ range partition bounds with explicit spaced offsets while preserving no-offset session-time-zone behavior. The current code and added FE/regression tests cover explicit offset create/create-like paths and no-offset session timezone conversion.
  • Scope and clarity: The change is focused on partition-bound literal conversion and metadata round-trip formatting. The helper logic is local to PartitionKey.
  • Concurrency and locking: No new shared mutable state, metadata locking, or concurrent lifecycle changes were introduced.
  • Lifecycle/static initialization: No new static initialization dependency or lifecycle-sensitive object ownership was introduced.
  • Configuration: No new configuration items were added.
  • Compatibility: The change affects FE parsing/serialization of TIMESTAMPTZ partition bounds but does not introduce storage-format or FE-BE protocol changes.
  • Parallel paths: Nereids partition definition conversion and legacy partition key parsing were both considered; the added toLegacyLiteral() path preserves TIMESTAMPTZ UTC boundary strings for round trips.
  • Conditional checks: The explicit-time-zone branch is bounded to TIMESTAMPTZ parsing and preserves the old no-offset path.
  • Test coverage: Added FE unit tests and a regression suite. Existing review comments covered missing no-offset regression coverage and indentation; both are addressed in the latest diff.
  • Observability: No new runtime operational path requiring logs or metrics was introduced.
  • Transactions/persistence/data writes: No transaction, edit-log, or data-write path changes were introduced; partition metadata formatting/parsing remains within existing DDL flows.
  • FE review checks: No new FE locking, exception-boundary, or visible-version concerns found.

User focus: No additional user-provided review focus was specified.

@starocean999
Contributor Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 30922 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 739adfcded87864c23b0267fa5959362007c018a, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17630	3899	3835	3835
q2	
q3	10744	1416	837	837
q4	4681	472	340	340
q5	7570	2277	2130	2130
q6	256	172	135	135
q7	922	760	629	629
q8	9365	1688	1654	1654
q9	6666	4889	4861	4861
q10	6460	2160	1787	1787
q11	437	272	241	241
q12	691	420	293	293
q13	18204	3387	2818	2818
q14	270	255	231	231
q15	
q16	813	787	705	705
q17	1086	907	1021	907
q18	6901	5761	5436	5436
q19	1297	1288	1079	1079
q20	507	411	265	265
q21	5535	2580	2433	2433
q22	431	358	306	306
Total cold run time: 100466 ms
Total hot run time: 30922 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4168	4092	4071	4071
q2	
q3	4460	4931	4282	4282
q4	2095	2176	1391	1391
q5	4594	4267	4284	4267
q6	235	171	130	130
q7	2197	1988	1712	1712
q8	2503	2086	2015	2015
q9	7823	7687	7771	7687
q10	4517	4539	4065	4065
q11	573	393	367	367
q12	874	748	510	510
q13	3272	3592	2985	2985
q14	295	307	263	263
q15	
q16	711	725	721	721
q17	1322	1320	1307	1307
q18	7847	7555	7057	7057
q19	1110	1086	1098	1086
q20	2211	2226	1933	1933
q21	5259	4562	4497	4497
q22	540	462	403	403
Total cold run time: 56606 ms
Total hot run time: 50749 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 169236 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 739adfcded87864c23b0267fa5959362007c018a, data reload: false

query5	4320	667	515	515
query6	353	216	202	202
query7	4219	557	328	328
query8	334	230	216	216
query9	8816	4084	4081	4081
query10	464	349	307	307
query11	5809	2398	2198	2198
query12	191	129	131	129
query13	1277	616	443	443
query14	6032	5378	5082	5082
query14_1	4385	4382	4338	4338
query15	217	203	182	182
query16	1044	461	443	443
query17	1170	752	607	607
query18	2803	508	371	371
query19	245	211	165	165
query20	141	139	131	131
query21	218	143	124	124
query22	13615	13590	13335	13335
query23	17211	16452	16011	16011
query23_1	16229	16219	16208	16208
query24	7392	1756	1288	1288
query24_1	1305	1311	1336	1311
query25	575	502	434	434
query26	1337	317	174	174
query27	2727	589	343	343
query28	4412	1963	1967	1963
query29	1019	652	509	509
query30	309	246	204	204
query31	1123	1062	958	958
query32	96	78	77	77
query33	568	371	322	322
query34	1158	1107	654	654
query35	777	798	729	729
query36	1300	1354	1235	1235
query37	163	102	93	93
query38	3220	3167	3117	3117
query39	926	909	916	909
query39_1	889	883	862	862
query40	223	153	126	126
query41	66	64	62	62
query42	113	110	110	110
query43	328	343	292	292
query44	
query45	212	207	193	193
query46	1053	1219	736	736
query47	2234	2284	2176	2176
query48	394	412	287	287
query49	628	495	381	381
query50	995	361	249	249
query51	4342	4296	4242	4242
query52	107	110	97	97
query53	263	281	208	208
query54	316	272	262	262
query55	93	90	88	88
query56	295	323	313	313
query57	1373	1389	1281	1281
query58	298	282	267	267
query59	1595	1636	1435	1435
query60	317	357	318	318
query61	158	152	155	152
query62	667	623	560	560
query63	241	204	207	204
query64	2487	842	697	697
query65	
query66	1703	484	365	365
query67	30021	29353	29831	29353
query68	
query69	466	362	316	316
query70	1028	1004	1015	1004
query71	311	277	273	273
query72	3005	2649	2365	2365
query73	865	776	444	444
query74	5075	4893	4765	4765
query75	2649	2599	2258	2258
query76	2305	1129	776	776
query77	403	406	336	336
query78	12124	12085	11468	11468
query79	1481	1014	733	733
query80	1232	534	460	460
query81	523	276	240	240
query82	932	155	122	122
query83	345	277	252	252
query84	259	137	113	113
query85	914	539	453	453
query86	456	357	336	336
query87	3518	3363	3235	3235
query88	3610	2662	2679	2662
query89	444	387	341	341
query90	1944	194	190	190
query91	189	158	137	137
query92	80	82	75	75
query93	1559	1396	833	833
query94	718	330	312	312
query95	682	386	432	386
query96	1028	757	338	338
query97	2695	2677	2580	2580
query98	238	231	233	231
query99	1088	1096	1026	1026
Total cold run time: 254127 ms
Total hot run time: 169236 ms

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 42.50% (17/40) 🎉
Increment coverage report
Complete coverage report

@starocean999
Contributor Author

/review

Contributor

@github-actions github-actions Bot left a comment

I found one blocking issue in the current PR. The main goal is clear: keep TIMESTAMPTZ partition boundaries in UTC, preserving explicit offsets and session-time-zone behavior for no-offset values. The added tests cover explicit numeric offsets, no-offset session conversion, create-like, and auto partitioning, but they miss valid named-zone spellings.

Critical checkpoint conclusions:

  • Scope/focus: The change is mostly focused and small.
  • Concurrency/lifecycle/config/persistence/protocol: No new concurrency, lifecycle, configuration, persistence, or FE-BE protocol paths are involved.
  • Data correctness: Affected by the inline issue, because valid TIMESTAMPTZ partition boundaries using named/lowercase zones can now be rejected.
  • Test coverage: The tests prove the numeric-offset and no-offset cases but do not cover named zones.
  • Observability: Not a concern for this DDL parse path.
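
The named/lowercase zone concern can be illustrated with plain `java.time`, where `ZoneId.of` matches region IDs case-sensitively. The sketch below shows one possible way to accept spellings such as `asia/shanghai` or `utc`. It is an assumption for illustration only: `ZoneNameSketch.resolve` is a hypothetical helper, and the review does not say how Doris resolves zone names internally.

```java
import java.time.DateTimeException;
import java.time.ZoneId;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper for illustration; not the code in PartitionKey.java.
final class ZoneNameSketch {
    // Resolves a zone token case-insensitively; returns empty if it is not a known zone.
    static Optional<ZoneId> resolve(String token) {
        String trimmed = token.trim();
        // First try the region IDs known to the JDK, ignoring case.
        for (String id : ZoneId.getAvailableZoneIds()) {
            if (id.equalsIgnoreCase(trimmed)) {
                return Optional.of(ZoneId.of(id));
            }
        }
        // Fall back to ZoneId.of for offsets ("+08:00") and short forms ("utc" -> "UTC").
        try {
            return Optional.of(ZoneId.of(trimmed.toUpperCase(Locale.ROOT)));
        } catch (DateTimeException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("asia/shanghai").map(ZoneId::getId).orElse("unknown")); // Asia/Shanghai
        System.out.println(resolve("not/a_zone").isPresent()); // false
    }
}
```

Test cases for this path would pair each accepted spelling with the same expected UTC boundary, so a parser that rejects a valid named zone fails visibly.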

Existing review threads: I did not duplicate the earlier no-offset semantic regression or the checkstyle indentation thread.

User focus: no additional user-provided review focus was specified.

Comment thread fe/fe-core/src/main/java/org/apache/doris/catalog/PartitionKey.java Outdated
@starocean999
Contributor Author

/review

Contributor

@github-actions github-actions Bot left a comment

Review result: no additional blocking issues found in the current PR head. The earlier inline concerns about no-offset TIMESTAMPTZ session-time-zone semantics, FE checkstyle indentation, and named/lowercase explicit time zones were already present in existing threads and appear addressed by the latest changes.

Critical checkpoints:

  • Goal and proof: the PR aims to preserve/accept TIMESTAMPTZ range partition bounds across explicit offsets, generated DDL/create-like, session-time-zone no-offset values, named/lowercase zones, and auto partition boundaries. Added FE unit tests and a regression suite cover these scenarios.
  • Scope and clarity: changes are focused on partition bound conversion/serialization paths and targeted tests.
  • Concurrency/lifecycle: no new shared mutable runtime state, locks, threads, static initialization dependency, or lifecycle-sensitive object ownership found.
  • Config/compatibility/protocol: no new config, storage format, or FE-BE thrift protocol fields found. Existing TIMESTAMPTZ metadata is normalized through current partition-key paths.
  • Parallel paths: range partition creation, less-than syntax, create-like DDL round-trip, and auto partition creation were checked. I did not find another distinct TIMESTAMPTZ partition path requiring the same fix in this PR scope.
  • Special checks: the explicit-time-zone detection is limited to choosing the existing TIMESTAMPTZ literal parser versus the existing session-time-zone conversion path; invalid inputs still fail during literal parsing.
  • Tests: new unit/regression coverage is relevant and uses ordered query output. I did not run the test suite in this review environment.
  • Observability: no new long-running or distributed runtime path requiring additional logs/metrics.
  • Transactions/persistence/data correctness: no transaction processing or persistence log write/replay changes found; partition boundary correctness remains the key data-routing concern and is covered by the added tests.
  • Performance: no hot-path repeated scans or heavy allocations beyond short literal parsing during DDL/partition creation.
  • User focus: no additional user-provided review focus was specified.

@starocean999
Contributor Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31532 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 871ed02e830857ab4a8a99febfd306b00b6346c8, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17809	3956	3935	3935
q2	
q3	10809	1395	807	807
q4	4685	473	344	344
q5	7621	2292	2145	2145
q6	235	173	138	138
q7	922	773	645	645
q8	9425	1657	1678	1657
q9	5133	4924	4936	4924
q10	6385	2076	1786	1786
q11	439	276	249	249
q12	626	425	297	297
q13	18084	3343	2772	2772
q14	266	258	239	239
q15	
q16	826	769	722	722
q17	1006	946	967	946
q18	6981	5812	5667	5667
q19	1303	1219	1058	1058
q20	505	514	306	306
q21	6182	2879	2578	2578
q22	547	370	317	317
Total cold run time: 99789 ms
Total hot run time: 31532 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4570	4542	4501	4501
q2	
q3	4882	5313	4581	4581
q4	2128	2193	1414	1414
q5	5032	4646	4634	4634
q6	230	171	134	134
q7	1893	1663	1537	1537
q8	2383	2114	2055	2055
q9	7659	7251	7209	7209
q10	4438	4391	3987	3987
q11	526	382	346	346
q12	700	723	512	512
q13	3045	3411	2776	2776
q14	280	272	258	258
q15	
q16	683	715	633	633
q17	1257	1233	1235	1233
q18	7431	6849	6892	6849
q19	1101	1079	1111	1079
q20	2186	2214	1924	1924
q21	5347	4655	4472	4472
q22	523	458	404	404
Total cold run time: 56294 ms
Total hot run time: 50538 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 170545 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 871ed02e830857ab4a8a99febfd306b00b6346c8, data reload: false

query5	4327	648	517	517
query6	324	238	222	222
query7	4232	528	294	294
query8	336	243	226	226
query9	8843	4020	4053	4020
query10	454	344	300	300
query11	5792	2346	2207	2207
query12	182	132	127	127
query13	1278	603	439	439
query14	6049	5410	5052	5052
query14_1	4363	4382	4359	4359
query15	217	207	183	183
query16	1006	466	444	444
query17	1155	752	606	606
query18	2597	494	371	371
query19	228	208	175	175
query20	142	133	129	129
query21	222	150	122	122
query22	13567	13568	13396	13396
query23	17224	16432	16036	16036
query23_1	16265	16179	16134	16134
query24	7528	1762	1287	1287
query24_1	1307	1300	1315	1300
query25	569	505	455	455
query26	1317	337	173	173
query27	2679	579	347	347
query28	4418	1945	1922	1922
query29	1027	649	531	531
query30	310	239	211	211
query31	1117	1061	933	933
query32	96	74	78	74
query33	560	364	302	302
query34	1227	1118	625	625
query35	757	773	674	674
query36	1348	1312	1203	1203
query37	154	104	99	99
query38	3196	3134	3017	3017
query39	930	932	885	885
query39_1	876	882	877	877
query40	229	148	124	124
query41	77	75	61	61
query42	112	109	107	107
query43	327	319	293	293
query44	
query45	210	204	194	194
query46	1032	1224	721	721
query47	2416	2335	2211	2211
query48	377	400	301	301
query49	628	492	384	384
query50	1005	339	265	265
query51	4310	4253	4273	4253
query52	103	105	94	94
query53	248	272	202	202
query54	315	264	254	254
query55	92	89	83	83
query56	316	317	315	315
query57	1436	1433	1307	1307
query58	292	266	263	263
query59	1566	1605	1455	1455
query60	328	330	327	327
query61	155	159	161	159
query62	671	621	536	536
query63	244	203	201	201
query64	2396	823	622	622
query65	
query66	1725	469	352	352
query67	29922	29871	29804	29804
query68	
query69	463	337	313	313
query70	1028	961	1027	961
query71	304	274	265	265
query72	2994	2657	2700	2657
query73	826	741	399	399
query74	5065	4929	4746	4746
query75	2692	2599	2261	2261
query76	2300	1132	757	757
query77	398	403	327	327
query78	12074	12150	11641	11641
query79	1473	1034	791	791
query80	689	537	449	449
query81	451	285	236	236
query82	1419	162	128	128
query83	353	278	245	245
query84	268	135	114	114
query85	901	549	457	457
query86	392	366	322	322
query87	3428	3337	3229	3229
query88	3506	2669	2636	2636
query89	449	391	334	334
query90	1946	183	172	172
query91	180	165	134	134
query92	83	79	78	78
query93	1480	1406	924	924
query94	557	337	304	304
query95	679	371	343	343
query96	1065	851	337	337
query97	2696	2677	2563	2563
query98	237	248	228	228
query99	1102	1117	1026	1026
Total cold run time: 253093 ms
Total hot run time: 170545 ms

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 33.33% (17/51) 🎉
Increment coverage report
Complete coverage report

@starocean999 marked this pull request as ready for review on May 19, 2026, 01:47