@eldenmoon
Member

What problem does this PR solve?

This commit adds regression tests for the VARIANT data type, covering the following
scenarios:

   - Data Manipulation: Tests for INSERT, UPDATE, DELETE, and INSERT INTO SELECT operations
     on tables with VARIANT columns, including variations with Merge-on-Write (MOW) tables
     and concurrent operations.
   - Predefined Schemas: Extensive tests for VARIANT columns with predefined schemas,
     covering all supported data types (including complex and nested types), type casting,
     and schema evolution (adding/dropping columns).
   - Indexing:
       - Verifies the functionality of inverted indexes on VARIANT columns, including index
         creation with field patterns (exact, wildcard, glob), multi-index support, and
         index usage in query filtering.
       - Tests inverted index behavior with custom analyzers.
       - Ensures correct query results when filtering on indexed VARIANT sub-fields using
         MATCH, =, IN, IS NULL, and array_contains.
   - Compaction: Includes tests to ensure data correctness and schema consistency after
     cumulative and full compaction on tables with VARIANT columns, especially with sparse
     columns and predefined schemas.
   - Functions: Adds tests for VARIANT-related functions like variant_type and element
     access functions.
   - Edge Cases: Covers scenarios like loading from S3, handling of NULL values, and bloom
     filters on VARIANT columns.

Together, these tests extend regression coverage for the VARIANT data type across DML,
predefined schemas, indexing, compaction, functions, and edge cases.

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

fix compaction with 0 max_subcolumns_count
@Thearas
Contributor

Thearas commented Aug 13, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@eldenmoon
Member Author

run buildall

Contributor

@csun5285 csun5285 left a comment

LGTM

@github-actions
Contributor

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 33626 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ecf1592aec1bc57c62a2cabcab696aeaad844e88, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17596	5210	4989	4989
q2	1908	311	174	174
q3	10324	1255	718	718
q4	10212	1009	515	515
q5	7521	2641	2320	2320
q6	173	159	131	131
q7	990	730	601	601
q8	9296	1307	1075	1075
q9	6958	5096	5154	5096
q10	6939	2380	1966	1966
q11	471	290	265	265
q12	347	341	219	219
q13	17781	3706	3027	3027
q14	242	250	230	230
q15	552	488	478	478
q16	424	423	384	384
q17	602	840	357	357
q18	7360	7040	6946	6946
q19	1168	952	572	572
q20	353	338	227	227
q21	3944	2558	2349	2349
q22	1062	990	987	987
Total cold run time: 106223 ms
Total hot run time: 33626 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5157	5121	5126	5121
q2	250	318	220	220
q3	2179	2710	2288	2288
q4	1362	1818	1350	1350
q5	4203	4428	4537	4428
q6	262	173	132	132
q7	2051	2013	1734	1734
q8	2726	2599	2586	2586
q9	7281	7224	7359	7224
q10	3134	3330	2857	2857
q11	580	545	527	527
q12	733	1022	650	650
q13	3496	3816	3299	3299
q14	310	303	272	272
q15	518	457	490	457
q16	461	507	451	451
q17	1185	1554	1364	1364
q18	7865	7776	7743	7743
q19	866	917	1029	917
q20	1888	1945	1801	1801
q21	4825	4382	4239	4239
q22	1059	1040	1026	1026
Total cold run time: 52391 ms
Total hot run time: 50686 ms
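
The bot does not label the columns in the report above. The following is a minimal Python sketch of how the two totals appear to be derived, assuming the first numeric column is the cold run and the next two are hot runs (the last column in the report is then the faster of the two). The name `summarize` and the column interpretation are assumptions for illustration, not taken from the tpch-tools scripts.

```python
# Round 1 rows copied from the TPC-H report: (query, cold, hot run 1, hot run 2).
# The report's last column is left out because it equals min(hot run 1, hot run 2).
ROUND_1 = [
    ("q1", 17596, 5210, 4989), ("q2", 1908, 311, 174),
    ("q3", 10324, 1255, 718), ("q4", 10212, 1009, 515),
    ("q5", 7521, 2641, 2320), ("q6", 173, 159, 131),
    ("q7", 990, 730, 601), ("q8", 9296, 1307, 1075),
    ("q9", 6958, 5096, 5154), ("q10", 6939, 2380, 1966),
    ("q11", 471, 290, 265), ("q12", 347, 341, 219),
    ("q13", 17781, 3706, 3027), ("q14", 242, 250, 230),
    ("q15", 552, 488, 478), ("q16", 424, 423, 384),
    ("q17", 602, 840, 357), ("q18", 7360, 7040, 6946),
    ("q19", 1168, 952, 572), ("q20", 353, 338, 227),
    ("q21", 3944, 2558, 2349), ("q22", 1062, 990, 987),
]


def summarize(rows):
    """Return (total cold ms, total hot ms), using the faster hot run per query."""
    cold_total = sum(cold for _, cold, _, _ in rows)
    hot_total = sum(min(hot1, hot2) for _, _, hot1, hot2 in rows)
    return cold_total, hot_total


cold_total, hot_total = summarize(ROUND_1)
print(f"Total cold run time: {cold_total} ms")
print(f"Total hot run time: {hot_total} ms")
```

Under that reading, the sketch reproduces the Round 1 totals reported above (106223 ms cold, 33626 ms hot).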

@doris-robot

TPC-DS: Total hot run time: 185026 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ecf1592aec1bc57c62a2cabcab696aeaad844e88, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1003	388	442	388
query2	6532	1720	1709	1709
query3	6752	222	216	216
query4	26292	23772	23264	23264
query5	4361	631	508	508
query6	310	220	204	204
query7	4636	513	288	288
query8	280	231	215	215
query9	8613	2865	2876	2865
query10	473	342	284	284
query11	15854	15025	15040	15025
query12	170	120	117	117
query13	1681	554	422	422
query14	9432	5828	5722	5722
query15	215	206	172	172
query16	7662	627	467	467
query17	1213	741	617	617
query18	2058	394	306	306
query19	184	191	151	151
query20	132	115	114	114
query21	214	124	143	124
query22	4021	4337	3969	3969
query23	34211	33225	33235	33225
query24	8145	2358	2342	2342
query25	535	462	396	396
query26	1233	265	170	170
query27	2709	501	348	348
query28	4315	2217	2208	2208
query29	731	554	444	444
query30	280	216	184	184
query31	912	802	738	738
query32	80	76	76	76
query33	559	385	342	342
query34	828	872	537	537
query35	783	832	768	768
query36	967	999	897	897
query37	118	104	85	85
query38	4065	4047	3934	3934
query39	1464	1430	1391	1391
query40	228	123	110	110
query41	62	57	53	53
query42	119	115	112	112
query43	504	492	483	483
query44	1327	844	832	832
query45	185	166	192	166
query46	867	1017	655	655
query47	1749	1808	1754	1754
query48	376	401	303	303
query49	712	481	394	394
query50	613	682	387	387
query51	4276	4119	4113	4113
query52	118	112	102	102
query53	235	254	192	192
query54	580	600	526	526
query55	92	87	89	87
query56	316	321	294	294
query57	1198	1195	1132	1132
query58	276	265	272	265
query59	2691	2772	2582	2582
query60	352	336	328	328
query61	130	126	129	126
query62	823	735	668	668
query63	224	189	192	189
query64	4277	1037	699	699
query65	4333	4212	4205	4205
query66	1086	413	322	322
query67	15414	15440	14967	14967
query68	8805	924	629	629
query69	483	325	287	287
query70	1216	1137	1114	1114
query71	457	321	314	314
query72	5532	4764	4700	4700
query73	719	599	353	353
query74	8980	9059	8896	8896
query75	4092	3088	2560	2560
query76	3601	1127	725	725
query77	809	433	319	319
query78	9420	9596	8806	8806
query79	1644	834	608	608
query80	632	551	542	542
query81	489	249	220	220
query82	222	135	107	107
query83	288	263	247	247
query84	299	105	83	83
query85	819	376	337	337
query86	348	316	306	306
query87	4371	4309	4109	4109
query88	2858	2253	2227	2227
query89	394	323	306	306
query90	2100	233	230	230
query91	141	140	114	114
query92	86	71	70	70
query93	1131	963	639	639
query94	677	400	270	270
query95	402	316	310	310
query96	484	613	278	278
query97	2638	2674	2550	2550
query98	237	217	205	205
query99	1452	1425	1287	1287
Total cold run time: 273012 ms
Total hot run time: 185026 ms

@doris-robot

ClickBench: Total hot run time: 32.39 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ecf1592aec1bc57c62a2cabcab696aeaad844e88, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.05	0.04	0.03
query2	0.08	0.04	0.04
query3	0.25	0.07	0.08
query4	1.62	0.11	0.11
query5	0.42	0.42	0.39
query6	1.18	0.63	0.66
query7	0.02	0.01	0.02
query8	0.05	0.04	0.03
query9	0.62	0.52	0.50
query10	0.57	0.57	0.56
query11	0.15	0.11	0.11
query12	0.15	0.11	0.11
query13	0.62	0.60	0.61
query14	0.79	0.83	0.85
query15	0.89	0.85	0.86
query16	0.38	0.38	0.38
query17	1.06	1.08	1.08
query18	0.22	0.19	0.19
query19	1.87	1.86	1.86
query20	0.01	0.02	0.01
query21	15.40	0.92	0.55
query22	0.78	1.08	0.69
query23	15.01	1.38	0.59
query24	6.46	0.86	0.55
query25	0.49	0.24	0.10
query26	0.66	0.16	0.13
query27	0.06	0.06	0.06
query28	9.53	0.99	0.44
query29	12.56	3.90	3.26
query30	3.09	3.02	2.99
query31	2.82	0.58	0.38
query32	3.23	0.55	0.47
query33	3.04	3.12	3.16
query34	16.27	5.47	4.90
query35	4.89	4.96	4.92
query36	0.69	0.50	0.49
query37	0.09	0.07	0.07
query38	0.05	0.05	0.04
query39	0.03	0.03	0.03
query40	0.18	0.14	0.14
query41	0.08	0.03	0.03
query42	0.04	0.03	0.02
query43	0.04	0.03	0.03
Total cold run time: 106.49 s
Total hot run time: 32.39 s

@doris-robot

BE UT Coverage Report

Increment line coverage 100.00% (2/2) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	59.30% (16707/28172)
Line Coverage	48.21% (151824/314912)
Region Coverage	36.99% (113775/307593)
Branch Coverage	40.00% (50590/126469)
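
Each percentage above is the covered count divided by the total count shown in parentheses. The short Python check below recomputes them, assuming the report rounds to two decimal places (rounding, unlike truncation, matches all four figures). The helper name `coverage_percent` is hypothetical.

```python
def coverage_percent(covered: int, total: int) -> str:
    """Format covered/total as a percentage rounded to two decimals."""
    return f"{covered / total * 100:.2f}%"


# (covered, total) pairs copied from the BE UT coverage report.
REPORT = {
    "Function": (16707, 28172),
    "Line": (151824, 314912),
    "Region": (113775, 307593),
    "Branch": (50590, 126469),
}

for category, (covered, total) in REPORT.items():
    print(f"{category} Coverage {coverage_percent(covered, total)} ({covered}/{total})")
```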

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (2/2) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	82.91% (22921/27647)
Line Coverage	76.39% (240541/314884)
Region Coverage	64.41% (203426/315824)
Branch Coverage	67.89% (86900/128003)

Member

@airborne12 airborne12 left a comment

LGTM

@github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Aug 13, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@eldenmoon merged commit ee5ba3c into apache:master on Aug 13, 2025
26 of 29 checks passed

Labels

approved (Indicates a PR has been approved by one committer.), reviewed

6 participants