[refine](column) use strong-typed ColumnUInt8 for null_map in ColumnNullable #63491

Merged
yiguolei merged 3 commits into apache:master from Mryange:ColumnUInt8-for-nullmap on May 26, 2026

@Mryange
Contributor

@Mryange Mryange commented May 21, 2026

What problem does this PR solve?

The null_map column in ColumnNullable was stored as a generic IColumn::WrappedPtr, requiring assert_cast<ColumnUInt8> at every call site that accessed it. This had two problems:

  1. Redundant runtime casts: Every access to get_null_map_column() or get_null_map_column_ptr() performed an assert_cast, adding unnecessary overhead on hot paths (filter evaluation, null propagation, etc.).
  2. Type safety: The generic WrappedPtr type provided no compile-time guarantee that the null map is always a ColumnUInt8, making it possible to accidentally store a wrong column type.

Root cause: The _null_map member was typed as a generic IColumn::WrappedPtr instead of the concrete ColumnUInt8::WrappedPtr.

Fix: Change _null_map to ColumnUInt8::WrappedPtr so the type is fixed at the member level. The constructor now validates and converts to ColumnUInt8::MutablePtr on entry via a helper assert_mutable_null_map(). All call sites that previously did assert_cast<ColumnUInt8> now get the correct type directly.
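A minimal sketch of the idea behind the fix, under stated assumptions: the classes below (`Column`, `UInt8Column`, `Int32Column`, `NullableColumn`, `checked_null_map`) are hypothetical toy stand-ins built on `std::shared_ptr`, not the Doris column classes or their intrusive COW pointers. The type check happens once at the construction boundary, and the accessor then needs no cast.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy stand-ins for the column hierarchy (hypothetical names).
struct Column {
    virtual ~Column() = default;
};
struct UInt8Column : Column {
    std::vector<std::uint8_t> data;
};
struct Int32Column : Column {
    std::vector<std::int32_t> data;
};

// Validate once, then hand back a typed owner.
inline std::shared_ptr<UInt8Column> checked_null_map(std::shared_ptr<Column> generic) {
    auto typed = std::dynamic_pointer_cast<UInt8Column>(std::move(generic));
    if (!typed) {
        throw std::invalid_argument("null map must be a UInt8 column");
    }
    return typed;
}

class NullableColumn {
public:
    NullableColumn(std::shared_ptr<Column> nested, std::shared_ptr<Column> null_map)
            : _nested(std::move(nested)), _null_map(checked_null_map(std::move(null_map))) {}

    // No cast at the call site: the member type already says UInt8Column.
    const UInt8Column& null_map() const { return *_null_map; }

private:
    std::shared_ptr<Column> _nested;
    std::shared_ptr<UInt8Column> _null_map; // typed member instead of a generic pointer
};
```

Constructing a `NullableColumn` with a non-`UInt8Column` null map throws in the constructor, so a wrong column type cannot be stored and later accesses carry no per-call check.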

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Mryange
Contributor Author

Mryange commented May 21, 2026

/review

Contributor

@github-actions github-actions Bot left a comment

Review result: no blocking findings found.

Critical checkpoint conclusions:

  • Goal and tests: The PR is a focused refactor to strongly type ColumnNullable null maps as ColumnUInt8; the changed code appears to accomplish that goal. Existing CI format checks pass; the macOS BE UT failure is an environment JDK mismatch, not code-related. No new test is strictly required for this narrow type refactor, but a BE compile/UT run is still the main residual validation.
  • Scope and clarity: The change is small and mostly limited to ColumnNullable, COW alias exposure, and direct call-site cast removal.
  • Concurrency: No new shared mutable state, locks, threads, or async paths are introduced.
  • Lifecycle/static initialization: No static/global lifecycle changes were introduced.
  • Configuration: No configuration items were added or changed.
  • Compatibility: No storage format, protocol, function symbol, or FE/BE compatibility impact found.
  • Parallel code paths: Reviewed representative null-map access call sites and no missed required analogous change was found.
  • Conditions/error handling: Constructor validation continues to reject constant null maps and enforces the ColumnUInt8 invariant at the construction boundary.
  • Test coverage/results: No test-result files changed. CI status shows style checks passed; the only observed failed check is unrelated runner JDK setup.
  • Observability: Not applicable for this internal column representation refactor.
  • Transaction/persistence/data writes: Not applicable; no transaction, persistence, or data visibility paths changed.
  • Performance: The refactor removes repeated null-map downcasts on hot accessors and does not introduce obvious extra work.

User focus: No additional user-provided review focus was specified.

@Mryange
Contributor Author

Mryange commented May 22, 2026

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31717 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 669ebf2546dcadf9475ff7461755e27e6f76534c, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17776	3850	3823	3823
q2	q3	10785	1421	822	822
q4	4692	482	344	344
q5	7682	2334	2150	2150
q6	242	180	138	138
q7	910	808	636	636
q8	9393	1756	1724	1724
q9	5560	4943	4886	4886
q10	6465	2070	1822	1822
q11	454	284	249	249
q12	643	424	297	297
q13	18239	3405	2806	2806
q14	257	252	230	230
q15	q16	820	780	716	716
q17	960	947	932	932
q18	6968	5775	5702	5702
q19	1160	1304	1207	1207
q20	564	468	316	316
q21	5783	2877	2597	2597
q22	459	487	320	320
Total cold run time: 99812 ms
Total hot run time: 31717 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4569	4500	4494	4494
q2	q3	4912	5280	4559	4559
q4	2141	2209	1428	1428
q5	5050	4588	4644	4588
q6	254	189	127	127
q7	1892	1755	1549	1549
q8	2438	2141	2056	2056
q9	7594	7269	7218	7218
q10	4515	4420	4013	4013
q11	526	381	346	346
q12	712	721	509	509
q13	3082	3400	2825	2825
q14	267	281	245	245
q15	q16	670	694	609	609
q17	1269	1234	1231	1231
q18	7166	6776	6790	6776
q19	1097	1111	1064	1064
q20	2224	2194	1934	1934
q21	5331	4613	4475	4475
q22	524	452	431	431
Total cold run time: 56233 ms
Total hot run time: 50477 ms
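A note on reading the Round 1 table above, as a sketch rather than a description of the benchmark scripts: the numbers are consistent with the first column being the cold run, the next two being hot runs, and the last column being the smaller of the two hot runs. That column meaning is inferred from the data, and the names `Row`, `Totals`, and `summarize` are hypothetical. The q2 and q15 rows carry no numbers in the output, so they are left out.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// One Round 1 row: assumed to be a cold run followed by two hot runs.
struct Row {
    int cold;
    int hot1;
    int hot2;
};

// Round 1 rows of the first TPC-H comment (q2 and q15 have no values there).
static const std::array<Row, 20> kRound1 = {{
        {17776, 3850, 3823}, {10785, 1421, 822}, {4692, 482, 344},   {7682, 2334, 2150},
        {242, 180, 138},     {910, 808, 636},    {9393, 1756, 1724}, {5560, 4943, 4886},
        {6465, 2070, 1822},  {454, 284, 249},    {643, 424, 297},    {18239, 3405, 2806},
        {257, 252, 230},     {820, 780, 716},    {960, 947, 932},    {6968, 5775, 5702},
        {1160, 1304, 1207},  {564, 468, 316},    {5783, 2877, 2597}, {459, 487, 320},
}};

struct Totals {
    int cold;
    int hot;
};

// Sum the cold runs and the per-query best hot run.
inline Totals summarize(const std::array<Row, 20>& rows) {
    Totals totals{0, 0};
    for (const Row& row : rows) {
        totals.cold += row.cold;
        totals.hot += std::min(row.hot1, row.hot2);
    }
    return totals;
}
```

Under that reading, `summarize(kRound1)` reproduces the two reported Round 1 totals, 99812 ms cold and 31717 ms hot.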

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171321 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 669ebf2546dcadf9475ff7461755e27e6f76534c, data reload: false

query5	4320	659	514	514
query6	331	220	198	198
query7	4319	549	301	301
query8	325	231	219	219
query9	8860	3968	4048	3968
query10	448	328	307	307
query11	5759	2362	2224	2224
query12	183	128	122	122
query13	1288	590	401	401
query14	6017	5490	5029	5029
query14_1	4339	4402	4298	4298
query15	213	208	182	182
query16	992	471	435	435
query17	961	721	591	591
query18	2443	487	357	357
query19	212	197	158	158
query20	130	131	130	130
query21	216	151	120	120
query22	13773	13536	13815	13536
query23	17208	16448	16145	16145
query23_1	16222	16266	16230	16230
query24	7509	1795	1295	1295
query24_1	1332	1307	1310	1307
query25	594	500	438	438
query26	1324	332	177	177
query27	2706	545	349	349
query28	4499	1951	1979	1951
query29	1013	634	526	526
query30	304	236	200	200
query31	1123	1077	939	939
query32	97	79	77	77
query33	553	383	310	310
query34	1169	1106	658	658
query35	795	799	685	685
query36	1346	1316	1198	1198
query37	157	108	93	93
query38	3229	3147	3092	3092
query39	936	938	889	889
query39_1	872	897	864	864
query40	234	153	130	130
query41	74	69	70	69
query42	115	116	113	113
query43	325	334	287	287
query44	
query45	216	210	210	210
query46	1100	1190	769	769
query47	2329	2261	2116	2116
query48	405	413	304	304
query49	668	521	410	410
query50	988	365	255	255
query51	4318	4268	4228	4228
query52	106	106	96	96
query53	260	296	221	221
query54	325	302	290	290
query55	100	97	89	89
query56	331	338	322	322
query57	1410	1416	1312	1312
query58	309	287	295	287
query59	1574	1629	1407	1407
query60	367	327	305	305
query61	166	159	155	155
query62	684	620	545	545
query63	247	198	208	198
query64	2390	783	626	626
query65	
query66	1725	478	353	353
query67	30414	30379	30265	30265
query68	
query69	469	350	300	300
query70	1101	971	946	946
query71	295	271	264	264
query72	3064	2706	2498	2498
query73	836	770	438	438
query74	5098	4883	4758	4758
query75	2707	2609	2277	2277
query76	2293	1141	750	750
query77	407	414	322	322
query78	12404	12297	11740	11740
query79	1518	1043	766	766
query80	686	555	459	459
query81	462	280	244	244
query82	1374	162	126	126
query83	336	276	249	249
query84	323	141	110	110
query85	888	585	451	451
query86	399	356	342	342
query87	3426	3364	3218	3218
query88	3515	2679	2659	2659
query89	448	387	336	336
query90	1849	186	190	186
query91	182	166	141	141
query92	81	77	76	76
query93	1467	1408	913	913
query94	541	340	322	322
query95	689	381	349	349
query96	1005	822	342	342
query97	2720	2696	2594	2594
query98	233	225	231	225
query99	1135	1110	986	986
Total cold run time: 254208 ms
Total hot run time: 171321 ms

HappenLee previously approved these changes May 25, 2026
Contributor

@HappenLee HappenLee left a comment

LGTM

@Mryange
Contributor Author

Mryange commented May 25, 2026

/review

Contributor

@github-actions github-actions Bot left a comment

Review result: request changes.

Critical checkpoint conclusions:

  • Goal/test: The PR aims to make ColumnNullable's null map strongly typed as ColumnUInt8. The direction is reasonable, but the PR description does not report any executed test, and I could not verify formatter compliance because this runner lacks clang-format 16.
  • Scope/focus: The core type change is focused, but the downstream call-site edits introduced widespread unformatted code.
  • Concurrency/lifecycle: No new concurrency, lock ordering, or non-obvious lifecycle ownership issue was identified in the reviewed paths. The COW ownership model remains the main relevant lifecycle concern.
  • Configuration/compatibility/protocol: No new configuration, storage format, or FE-BE protocol compatibility change was identified.
  • Parallel code paths: The null-map accessor call sites were updated broadly across column, function, table-function, and segment reader paths.
  • Error handling/data correctness: I did not identify an additional ignored Status or visibility/delete-bitmap correctness issue in this refactor.
  • Tests: No test execution is documented in the PR body. This refactor should at minimum pass BE build/style and relevant column/table-function tests.
  • Observability/performance: No new observability requirement. The intended performance improvement is removal of redundant casts, but it must not land with formatter violations.

User focus: No additional user-provided review focus was present.

Blocking issue: BE C++ formatting has not been applied. Several added lines are visibly mis-indented across touched files, which violates the repository code-style requirement. Please run the Doris BE formatter (build-support/clang-format.sh) with clang-format 16 and update the PR.

@@ -340,11 +339,10 @@
DCHECK(value_col_ptr->is_nullable())
<< "posexplode fast path requires nullable value column";
auto* val_nullable = assert_cast<ColumnNullable*>(value_col_ptr);
Contributor

This added line is over-indented relative to the surrounding block. The same formatting drift appears in several touched BE files, so please run the required Doris formatter (build-support/clang-format.sh with clang-format 16) rather than fixing only this line.

assert_cast<ColumnNullable*>(column.get())->get_nested_column_ptr().get())
->insert_range_from(*_elements_column, _cur_offset, max_step);
static_cast<ColumnUInt8*>(
static_cast<ColumnNullable*>(column.get())->get_null_map_column_ptr().get())
Contributor

This block is no longer clang-formatted: the statements inside if (_is_nullable) are aligned with the if instead of being indented under it. Please run the BE formatter for the whole PR.

@Mryange
Contributor Author

Mryange commented May 25, 2026

/review

Contributor

@github-actions github-actions Bot left a comment

I found a blocking correctness regression in the ColumnNullable refactor. The typed null-map storage is a good direction, but the adapter used in for_each_subcolumn breaks existing read-only traversal paths. Existing formatting comments were already present and are not repeated here.

Critical checkpoint conclusions:

  • Goal/test: The PR aims to make ColumnNullable null maps strongly typed as ColumnUInt8. The core goal is only partially met because one traversal path now fails; no PR test demonstrates recursive column traversal on Nullable columns.
  • Scope: The actual PR diff is focused on the null-map refactor, aside from formatting drift already reported in existing review threads.
  • Concurrency/lifecycle/config/compatibility: No new concurrency, lifecycle, config, or serialization-format compatibility concern found in the actual PR diff.
  • Parallel paths: Most null-map access call sites were updated consistently, but for_each_subcolumn needs a safe typed-to-base adapter.
  • Testing: Missing coverage for IColumn::dump_structure(), null_map_check(), column_boolean_check(), and recursive mutate/traversal on ColumnNullable after this type change.
  • Data correctness/performance/observability: No data visibility or persistence issue found; the reported issue is a runtime correctness regression in column traversal.

User focus: No additional user-provided review focus was specified.

callback(_nested_column);
callback(_null_map);
IColumn::WrappedPtr null_map(static_cast<const ColumnUInt8::Ptr&>(_null_map));
callback(null_map);
Contributor

This temporary wrapper makes read-only subcolumn traversal fail for nullable columns. IColumn::dump_structure(), count_const_column(), column_boolean_check(), and null_map_check() all call for_each_subcolumn with callbacks that only inspect the subcolumn and do not replace it. In those paths this local null_map still aliases _null_map after callback(null_map), so the final null_map.get() goes through non-const chameleon_ptr::get() and calls assert_mutable_ref() with use_count() > 1, throwing COW::assert_mutable. For example, calling dump_structure() on a ColumnNullable will now fail just because the traversal copied the null-map pointer. Please avoid creating an extra owner for the no-replacement case, or otherwise update _null_map only through a path that does not require exclusive ownership when the callback did not detach/replace the pointer.
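The failure mode described in this comment can be sketched with a toy model. Assumptions: `std::shared_ptr` and `use_count()` stand in for the intrusive COW pointer, `mutable_ref` is a hypothetical stand-in for the exclusive-ownership assertion, and the class and method names are illustrative only. This is not the patch that landed; it only shows why a temporary owner breaks read-only traversal and one way to skip the write-back when the callback leaves the pointer unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

struct Column {
    virtual ~Column() = default;
};
struct UInt8Column : Column {
    std::vector<std::uint8_t> data;
};

// Hypothetical stand-in for the exclusive-ownership assertion on mutable access.
template <typename T>
T& mutable_ref(std::shared_ptr<T>& ptr) {
    if (ptr.use_count() > 1) {
        throw std::logic_error("mutable access to a shared column");
    }
    return *ptr;
}

using SubcolumnCallback = std::function<void(std::shared_ptr<Column>&)>;

class NullableColumn {
public:
    explicit NullableColumn(std::shared_ptr<UInt8Column> null_map)
            : _null_map(std::move(null_map)) {}

    // Failure shape: `tmp` is a second owner, so the mutable access after a
    // read-only callback sees use_count() == 2 and throws.
    void for_each_subcolumn_naive(const SubcolumnCallback& callback) {
        std::shared_ptr<Column> tmp = _null_map;
        callback(tmp);
        mutable_ref(tmp);
        _null_map = std::dynamic_pointer_cast<UInt8Column>(tmp);
    }

    // Write back only when the callback actually replaced the pointer.
    void for_each_subcolumn_safe(const SubcolumnCallback& callback) {
        std::shared_ptr<Column> tmp = _null_map;
        callback(tmp);
        if (tmp.get() == _null_map.get()) {
            return; // inspected only: the member keeps its current pointer
        }
        auto typed = std::dynamic_pointer_cast<UInt8Column>(tmp);
        if (!typed) {
            throw std::invalid_argument("null map must be a UInt8 column");
        }
        _null_map = std::move(typed);
    }

    std::size_t null_map_size() const { return _null_map->data.size(); }

private:
    std::shared_ptr<UInt8Column> _null_map;
};
```

In this model an inspect-only callback throws through `for_each_subcolumn_naive` but passes through `for_each_subcolumn_safe`, while a callback that swaps in a new column still updates the member.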

@Mryange
Contributor Author

Mryange commented May 25, 2026

/review

Contributor

@github-actions github-actions Bot left a comment

Reviewed the current PR diff against the Doris code review checklist. I did not find additional non-duplicate blocking issues beyond the already-open review context. The earlier formatting concerns appear addressed in the latest head, and the nullable subcolumn traversal path now avoids updating the null-map wrapper when the callback leaves it unchanged.

Critical checkpoint conclusions:

  • Goal and coverage: the PR narrows ColumnNullable null-map storage/access to ColumnUInt8 and updates callers accordingly. The implementation is focused on COW/type-safety cleanup; I did not see new functional tests in this diff, and I did not run the full BE test suites locally.
  • Scope and clarity: the change is mostly mechanical and focused on nullable null-map typing plus necessary call-site updates.
  • Concurrency/lifecycle: no new threads, shared mutable globals, lock ordering, or non-obvious lifecycle management were introduced in the reviewed diff.
  • Configuration/compatibility: no new configs, serialization formats, persistent metadata, or FE/BE protocol fields are added.
  • Parallel paths: table-function, expression, and storage reader call sites that directly touched nullable null maps were updated consistently in the PR diff.
  • Error handling/invariants: the new null-map type checks fail fast for non-ColumnUInt8 null maps, matching Doris invariants.
  • Data correctness: storage reader changes preserve status propagation and still append into the destination null-map column; I did not identify a version/delete-bitmap/transaction visibility impact.
  • Performance/memory: no obvious new large untracked allocation or hot-path regression was identified; COW mutation paths still detach subcolumns through for_each_subcolumn.
  • Observability: no new operational path requiring logs or metrics was introduced.

User focus: no additional user-provided review focus was specified.

@Mryange
Contributor Author

Mryange commented May 25, 2026

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 32295 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 32e7ee121b288183fac97d5c132314803832fd22, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17630	4239	4219	4219
q2	q3	10806	1469	848	848
q4	4680	484	344	344
q5	7600	2348	2161	2161
q6	247	181	142	142
q7	958	786	648	648
q8	9363	1826	1752	1752
q9	5272	5037	4998	4998
q10	6396	2227	1886	1886
q11	466	280	256	256
q12	637	433	301	301
q13	18111	3433	2788	2788
q14	271	263	239	239
q15	q16	824	767	708	708
q17	949	902	956	902
q18	7157	5788	5546	5546
q19	1307	1408	1159	1159
q20	580	441	287	287
q21	6477	2997	2788	2788
q22	460	382	323	323
Total cold run time: 100191 ms
Total hot run time: 32295 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5034	5059	4862	4862
q2	q3	5027	5312	4709	4709
q4	2522	2319	1442	1442
q5	4898	4713	4862	4713
q6	239	182	131	131
q7	1885	1789	1634	1634
q8	2561	2298	2294	2294
q9	7615	7458	7488	7458
q10	4833	4721	4228	4228
q11	581	434	396	396
q12	741	756	567	567
q13	3054	3367	2783	2783
q14	283	281	255	255
q15	q16	697	696	627	627
q17	1336	1290	1302	1290
q18	7537	7003	6980	6980
q19	1138	1123	1125	1123
q20	2241	2239	1955	1955
q21	5488	4826	4642	4642
q22	551	475	430	430
Total cold run time: 58261 ms
Total hot run time: 52519 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 177509 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 32e7ee121b288183fac97d5c132314803832fd22, data reload: false

query5	4322	693	527	527
query6	340	223	203	203
query7	4260	592	332	332
query8	332	242	242	242
query9	8814	4324	4222	4222
query10	434	362	302	302
query11	5855	2634	2286	2286
query12	196	133	128	128
query13	1284	625	449	449
query14	6284	5589	5284	5284
query14_1	4617	4559	4591	4559
query15	216	213	192	192
query16	1000	471	447	447
query17	987	761	629	629
query18	2464	518	370	370
query19	229	211	170	170
query20	138	145	141	141
query21	221	150	123	123
query22	13627	13526	13313	13313
query23	17627	16704	16377	16377
query23_1	16471	16480	16467	16467
query24	7525	1837	1384	1384
query24_1	1386	1372	1370	1370
query25	597	527	442	442
query26	1314	330	176	176
query27	2708	583	350	350
query28	4510	2085	2074	2074
query29	1047	639	534	534
query30	314	251	203	203
query31	1141	1114	960	960
query32	97	79	76	76
query33	566	373	303	303
query34	1233	1202	672	672
query35	810	823	694	694
query36	1456	1449	1310	1310
query37	156	102	91	91
query38	3284	3230	3118	3118
query39	934	927	904	904
query39_1	890	871	897	871
query40	234	167	125	125
query41	64	65	63	63
query42	113	109	109	109
query43	337	340	302	302
query44	
query45	212	203	197	197
query46	1113	1280	774	774
query47	2417	2440	2323	2323
query48	421	434	297	297
query49	634	519	395	395
query50	982	353	280	280
query51	4423	4296	4297	4296
query52	107	109	95	95
query53	254	284	211	211
query54	306	278	264	264
query55	95	92	87	87
query56	309	312	314	312
query57	1471	1442	1352	1352
query58	308	278	262	262
query59	1638	1683	1502	1502
query60	327	323	345	323
query61	169	160	156	156
query62	715	666	604	604
query63	258	211	209	209
query64	2451	840	629	629
query65	
query66	1801	509	370	370
query67	32483	33231	32949	32949
query68	
query69	468	340	309	309
query70	1016	1024	1029	1024
query71	320	266	272	266
query72	3117	2799	2437	2437
query73	858	766	469	469
query74	7431	5152	4965	4965
query75	2836	2693	2320	2320
query76	2285	1249	800	800
query77	412	428	343	343
query78	12815	12644	12075	12075
query79	1473	1070	766	766
query80	666	554	448	448
query81	476	305	252	252
query82	1418	169	128	128
query83	381	284	264	264
query84	316	138	119	119
query85	937	593	468	468
query86	403	374	325	325
query87	3481	3402	3265	3265
query88	3752	2798	2765	2765
query89	475	434	382	382
query90	1909	184	191	184
query91	184	170	142	142
query92	91	77	73	73
query93	1493	1462	958	958
query94	547	356	300	300
query95	687	400	352	352
query96	1078	796	358	358
query97	2783	2733	2604	2604
query98	235	224	226	224
query99	1214	1174	1069	1069
Total cold run time: 261790 ms
Total hot run time: 177509 ms

@Mryange Mryange force-pushed the ColumnUInt8-for-nullmap branch from 32e7ee1 to 3ef87ba Compare May 26, 2026 02:53
@Mryange Mryange force-pushed the ColumnUInt8-for-nullmap branch from 3ef87ba to 1eeeefb Compare May 26, 2026 02:54
@Mryange
Contributor Author

Mryange commented May 26, 2026

/review

@Mryange
Contributor Author

Mryange commented May 26, 2026

/review

Contributor

@github-actions github-actions Bot left a comment

I found one additional blocking COW issue beyond the existing review threads.

Critical checkpoint conclusions:

  • Goal/test coverage: the PR appears to tighten ColumnNullable null-map typing and update call sites/tests, but the mutable constructor path is not covered for nested nullable inputs with a generic MutableColumnPtr null map.
  • Scope/focus: mostly focused, but the constructor ownership change is not a safe minimal refactor because it changes rvalue consumption behavior.
  • Concurrency/lifecycle/config/compatibility: no new thread concurrency, config, storage-format, or FE/BE protocol compatibility issue found in the reviewed diff.
  • Parallel paths: storage reader/table-function/expression call sites were checked; no distinct issue found there beyond the already raised formatting threads.
  • Error handling/memory/COW: the new mutable null-map cast can leave multiple owners and then call non-const WrappedPtr access that requires exclusive ownership, causing valid construction to fail.
  • Tests: existing tests do not appear to exercise this constructor failure mode. Regression output changes were reviewed; no additional issue found.
  • Observability/performance: no additional observability or performance blocker found.

User focus: no additional user-provided review focus was specified.

nested_nullable.apply_null_map(get_null_map_column());
_null_map = nested_nullable._null_map;
_nested_column = nested_nullable.get_nested_column_ptr();
}
Contributor

This delegating constructor no longer consumes the rvalue null-map pointer. assert_mutable_null_map(null_map_) creates a ColumnUInt8::MutablePtr from the raw pointer while the original MutableColumnPtr argument still owns the same object. When the nested column is already nullable, the delegated constructor immediately calls get_null_map_column() in the flattening path, and the non-const ColumnUInt8::WrappedPtr access asserts exclusive ownership; with the original argument plus _null_map, use_count() is greater than 1 and construction can throw COW::assert_mutable. Even when that path is not taken, callers passing std::move(null_map) are left with a still-owning pointer, which is a behavioral change from the old std::move(null_map_) member initialization. Please make the helper consume/reset the source MutableColumnPtr after type-checking (or otherwise move through a typed owner) so the constructed ColumnNullable is the sole owner before any non-const _null_map access.

Contributor

@github-actions github-actions Bot left a comment

Requesting changes for one COW ownership regression in the nullable-column refactor.

Critical checkpoint conclusions:

  • Goal/test: The goal is to make ColumnNullable null maps strongly typed as ColumnUInt8. The code mostly follows that goal, but the mutable constructor path breaks nested nullable flattening; I did not find a test covering this path.
  • Scope: The PR is focused, but the constructor ownership conversion is not small/safe enough because it creates an extra owner before a mutable access.
  • Concurrency: No new thread/concurrency behavior found.
  • Lifecycle/COW: This change is lifecycle-sensitive due to intrusive COW ownership; the reported issue is a refcount/lifetime regression.
  • Config/compatibility/persistence: No new configs, protocol/storage-format compatibility, or persistence changes found.
  • Parallel paths: The shared/const constructor path handles nested nullable separately; the distinct mutable constructor path has the reported failure.
  • Tests: Existing tests were only adjusted for API usage; coverage for nested nullable construction with a MutableColumnPtr null map is missing.
  • Observability/performance: No additional observability requirement found; the performance intent is reasonable once the ownership bug is fixed.

Existing review context was considered. I did not duplicate the existing for_each_subcolumn or formatting threads. User focus: no additional user-provided focus points.

throw doris::Exception(ErrorCode::INTERNAL_ERROR,
"ColumnNullable cannot have constant null map");
__builtin_unreachable();
auto& nested_nullable = assert_cast<ColumnNullable&>(*_nested_column);
Contributor

This can throw when the mutable constructor is used to flatten Nullable(Nullable(T)). The delegating constructor above calls assert_mutable_null_map(null_map_) while null_map_ still owns the same ColumnUInt8, so _null_map has an extra owner until the delegating constructor parameter is destroyed. If _nested_column is nullable, this non-const get_null_map_column() goes through ColumnUInt8::WrappedPtr::operator*() and assert_mutable_ref(), sees use_count() > 1, and fails before the null maps can be merged. A valid path is ColumnNullable::create(std::move(nullable_nested), std::move(null_map_as_mutable_column_ptr)). Please consume/move the original MutableColumnPtr before creating the typed owner, or pass the null map to apply_null_map through a const path that does not require exclusive ownership.
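The ownership point raised here can be illustrated with a toy model. Assumptions: `std::shared_ptr` stands in for the intrusive COW pointers, and `typed_without_consuming` and `typed_consuming` are hypothetical helpers, not the real `assert_mutable_null_map`. The sketch only shows the difference between type-checking a pointer while its source still owns the object and consuming the source so the typed pointer is the sole owner.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

struct Column {
    virtual ~Column() = default;
};
struct UInt8Column : Column {
    std::vector<std::uint8_t> data;
};

// Type-checks but leaves `source` as a second owner of the same object.
inline std::shared_ptr<UInt8Column> typed_without_consuming(
        const std::shared_ptr<Column>& source) {
    auto typed = std::dynamic_pointer_cast<UInt8Column>(source);
    if (!typed) {
        throw std::invalid_argument("null map must be a UInt8 column");
    }
    return typed;
}

// Type-checks, then resets the untyped source so the result is the sole owner.
inline std::shared_ptr<UInt8Column> typed_consuming(std::shared_ptr<Column>&& source) {
    auto typed = std::dynamic_pointer_cast<UInt8Column>(source);
    if (!typed) {
        throw std::invalid_argument("null map must be a UInt8 column");
    }
    source.reset(); // drop the untyped owner only after the type check passed
    return typed;
}
```

With the consuming variant, a caller that passes `std::move(null_map)` ends up with an empty pointer, and any later check for exclusive ownership on the typed pointer sees a single owner.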

@Mryange
Contributor Author

Mryange commented May 26, 2026

/review

Contributor

@github-actions github-actions Bot left a comment

Review result: no new blocking issues found in the current head beyond the already-known inline threads.

Critical checkpoint conclusions:

  • Goal/test proof: The PR changes ColumnNullable null-map storage to a concrete ColumnUInt8::WrappedPtr and updates callers. The code is consistent with that goal. Existing BE/unit coverage was adjusted at call sites; I did not run full BE tests in this runner.
  • Scope/focus: The change is mostly focused on the null-map type refactor and cast removal.
  • Concurrency/lifecycle: No new thread, static initialization, or lock lifecycle changes found. COW ownership paths were the main risk area reviewed.
  • Configuration/compatibility: No config, storage format, or FE-BE protocol compatibility changes found.
  • Parallel paths: Reviewed core column accessors plus function, table-function, and segment-reader call sites that consume nullable null maps; no distinct missing path found.
  • Error handling: Existing status-returning iterator calls remain checked. Constructor validation keeps null-map type/size checks.
  • Data correctness: Null-map propagation in nullable constructor flattening, conditional functions, table functions, and array/map/struct readers appears version-neutral and data-shape preserving.
  • Memory/performance: The refactor removes repeated null-map casts and does not add obvious hot-path allocations beyond existing pointer conversions.
  • Observability: No new observability needed for this internal refactor.
  • Tests: Existing touched tests are formatting/API updates rather than new behavioral coverage; full verification is still dependent on CI.
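The type and size checks noted in the checkpoints above can be illustrated with a small sketch of the general pattern: validate the null map once in the constructor, store it under its concrete type, and let accessors return it without a cast. This is not the Doris implementation. `Column`, `UInt8Column`, `Int32Column`, `NullableSketch`, and `check_null_map` are hypothetical names, and `std::shared_ptr` with `std::dynamic_pointer_cast` stands in for the COW pointers and `assert_cast` used in the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical generic column interface; not the Doris IColumn.
struct Column {
    virtual ~Column() = default;
    virtual std::size_t size() const = 0;
};

struct UInt8Column : Column {
    std::vector<std::uint8_t> data;
    std::size_t size() const override { return data.size(); }
};

struct Int32Column : Column {
    std::vector<std::int32_t> data;
    std::size_t size() const override { return data.size(); }
};

class NullableSketch {
public:
    // The null map arrives as a generic column and is checked exactly once here.
    NullableSketch(std::shared_ptr<Column> nested, std::shared_ptr<Column> null_map)
            : _nested(std::move(nested)), _null_map(check_null_map(null_map)) {
        assert(_nested != nullptr);
        if (_nested->size() != _null_map->size()) {
            throw std::invalid_argument("nested column and null map differ in size");
        }
    }

    // No cast needed: the member already has the concrete type.
    const std::vector<std::uint8_t>& null_map_data() const { return _null_map->data; }

private:
    static std::shared_ptr<UInt8Column> check_null_map(const std::shared_ptr<Column>& c) {
        auto typed = std::dynamic_pointer_cast<UInt8Column>(c);
        if (!typed) {
            throw std::invalid_argument("null map must be a UInt8 column");
        }
        return typed;
    }

    std::shared_ptr<Column> _nested;
    std::shared_ptr<UInt8Column> _null_map;  // concrete type fixed at the member level
};
```

With the member typed this way, passing a non-`UInt8Column` as the null map is rejected at construction rather than at each access site.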

User focus: no additional user-provided review focus was specified.

@Mryange
Contributor Author

Mryange commented May 26, 2026

run buildall

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 42.28% (63/149) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.83% (20889/38804)
Line Coverage 37.42% (197954/528973)
Region Coverage 33.69% (154981/459955)
Branch Coverage 34.70% (67487/194494)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 80.54% (120/149) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.86% (28069/38005)
Line Coverage 57.81% (305001/527621)
Region Coverage 54.98% (255316/464379)
Branch Coverage 56.53% (110360/195220)

@hello-stephen
Contributor

TPC-H: Total hot run time: 31738 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2244ac88ce443888965042aa313735e40abba622, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17733	4306	4237	4237
q2	q3	10788	1445	841	841
q4	4691	477	347	347
q5	7658	2305	2117	2117
q6	359	180	138	138
q7	944	778	638	638
q8	9586	1664	1592	1592
q9	7126	5056	5018	5018
q10	6469	2254	1851	1851
q11	436	291	253	253
q12	691	427	294	294
q13	18189	3392	2808	2808
q14	267	254	250	250
q15	q16	821	780	711	711
q17	1001	963	925	925
q18	7111	5887	5613	5613
q19	1160	1211	1094	1094
q20	513	390	259	259
q21	5744	2599	2448	2448
q22	437	361	304	304
Total cold run time: 101724 ms
Total hot run time: 31738 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4478	4435	4417	4417
q2	q3	4588	4979	4376	4376
q4	2110	2249	1433	1433
q5	4583	4401	5290	4401
q6	275	197	144	144
q7	2068	1844	1648	1648
q8	2684	2339	2200	2200
q9	8251	8097	8144	8097
q10	5030	4817	4311	4311
q11	655	459	431	431
q12	758	762	543	543
q13	3315	3653	2976	2976
q14	335	330	288	288
q15	q16	716	734	659	659
q17	1422	1397	1376	1376
q18	8132	7289	6837	6837
q19	1139	1140	1099	1099
q20	2244	2236	1972	1972
q21	5365	4698	4510	4510
q22	522	461	421	421
Total cold run time: 58670 ms
Total hot run time: 52139 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171136 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2244ac88ce443888965042aa313735e40abba622, data reload: false

query5	4348	664	520	520
query6	334	218	199	199
query7	4226	563	300	300
query8	317	227	227	227
query9	8857	4138	4133	4133
query10	454	353	297	297
query11	5837	2569	2253	2253
query12	184	132	125	125
query13	1315	619	452	452
query14	6184	5501	5214	5214
query14_1	4539	4555	4531	4531
query15	221	210	191	191
query16	995	473	460	460
query17	1151	762	604	604
query18	2472	499	369	369
query19	223	208	175	175
query20	136	127	127	127
query21	212	143	121	121
query22	13687	13622	13373	13373
query23	17340	16641	16267	16267
query23_1	16339	16423	16399	16399
query24	7523	1786	1337	1337
query24_1	1309	1312	1335	1312
query25	588	514	455	455
query26	1320	334	182	182
query27	2696	579	353	353
query28	4456	1994	1996	1994
query29	971	608	500	500
query30	303	241	201	201
query31	1132	1085	941	941
query32	89	75	75	75
query33	525	343	290	290
query34	1187	1119	655	655
query35	778	796	683	683
query36	1405	1391	1270	1270
query37	153	102	89	89
query38	3236	3189	3091	3091
query39	923	901	897	897
query39_1	871	891	859	859
query40	248	150	123	123
query41	64	64	63	63
query42	108	110	112	110
query43	332	326	298	298
query44	
query45	213	208	193	193
query46	1075	1185	724	724
query47	2346	2296	2265	2265
query48	373	419	306	306
query49	630	498	401	401
query50	953	352	250	250
query51	4414	4271	4224	4224
query52	104	103	93	93
query53	262	283	206	206
query54	310	265	266	265
query55	94	88	86	86
query56	303	301	294	294
query57	1441	1409	1294	1294
query58	295	267	260	260
query59	1616	1691	1438	1438
query60	320	325	296	296
query61	165	155	154	154
query62	697	658	573	573
query63	251	204	205	204
query64	2422	810	616	616
query65	
query66	1717	495	366	366
query67	29739	29797	29515	29515
query68	
query69	451	339	304	304
query70	1017	1041	1052	1041
query71	320	272	270	270
query72	2987	2748	2423	2423
query73	843	793	439	439
query74	5153	4943	4812	4812
query75	2875	2620	2276	2276
query76	2324	1160	749	749
query77	408	410	338	338
query78	12339	12494	11737	11737
query79	1411	1053	773	773
query80	634	527	442	442
query81	462	285	247	247
query82	636	159	125	125
query83	345	279	244	244
query84	260	147	110	110
query85	876	541	451	451
query86	421	333	328	328
query87	3475	3410	3251	3251
query88	3609	2752	2725	2725
query89	444	413	350	350
query90	1953	181	186	181
query91	177	163	142	142
query92	82	81	74	74
query93	1467	1461	905	905
query94	562	337	311	311
query95	683	486	351	351
query96	1069	813	345	345
query97	2774	2714	2603	2603
query98	243	234	226	226
query99	1181	1165	1035	1035
Total cold run time: 253829 ms
Total hot run time: 171136 ms

@yiguolei yiguolei merged commit d0536c0 into apache:master May 26, 2026
32 of 33 checks passed


5 participants