
[refactor](fe) Replace TFunctionBinaryType with Function.BinaryType to decouple from Thrift #61786

Merged
morrySnow merged 1 commit into apache:master from morrySnow:clean-expr
Mar 27, 2026

Conversation

@morrySnow
Contributor

What problem does this PR solve?

Problem Summary:

Function and its subclasses were tightly coupled to the Thrift class TFunctionBinaryType. This refactoring introduces a pure-Java BinaryType enum inside Function, replaces all internal usages of TFunctionBinaryType with it, and confines Thrift conversion to FunctionToThriftConverter only.

Changes:

  • Added Function.BinaryType enum (pure Java, no Thrift dependency) with 8 values: BUILTIN, HIVE, NATIVE, IR, RPC, JAVA_UDF, AGG_STATE, PYTHON_UDF
  • Added toThriftBinaryType() and fromThriftBinaryType() conversion methods in FunctionToThriftConverter
  • Updated 17 files to replace TFunctionBinaryType with Function.BinaryType
  • TFunctionBinaryType now only appears in FunctionToThriftConverter.java (the Thrift serialization boundary)
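
For readers without the diff at hand, the shape of the change can be sketched roughly as follows. This is a minimal illustration and not the merged code: `TFunctionBinaryType` here is a hand-written stand-in for the Thrift-generated enum, `Function` is reduced to the new enum alone, and the switch-based method bodies are an assumption. Only the enum values and the two method names come from the description above.

```java
// Hypothetical stand-in for the Thrift-generated enum (the real one is generated code).
enum TFunctionBinaryType {
    BUILTIN, HIVE, NATIVE, IR, RPC, JAVA_UDF, AGG_STATE, PYTHON_UDF
}

// Reduced stand-in for the FE Function class: only the pure-Java enum is shown.
class Function {
    enum BinaryType {
        BUILTIN, HIVE, NATIVE, IR, RPC, JAVA_UDF, AGG_STATE, PYTHON_UDF
    }
}

// Stand-in converter: method names come from the PR, bodies are assumed.
class FunctionToThriftConverter {
    // FE-internal type -> Thrift type, used when a function is sent to BE.
    static TFunctionBinaryType toThriftBinaryType(Function.BinaryType type) {
        switch (type) {
            case BUILTIN:    return TFunctionBinaryType.BUILTIN;
            case HIVE:       return TFunctionBinaryType.HIVE;
            case NATIVE:     return TFunctionBinaryType.NATIVE;
            case IR:         return TFunctionBinaryType.IR;
            case RPC:        return TFunctionBinaryType.RPC;
            case JAVA_UDF:   return TFunctionBinaryType.JAVA_UDF;
            case AGG_STATE:  return TFunctionBinaryType.AGG_STATE;
            case PYTHON_UDF: return TFunctionBinaryType.PYTHON_UDF;
            default:
                throw new IllegalArgumentException("Unknown binary type: " + type);
        }
    }

    // Thrift type -> FE-internal type, the reverse direction.
    static Function.BinaryType fromThriftBinaryType(TFunctionBinaryType thriftType) {
        switch (thriftType) {
            case BUILTIN:    return Function.BinaryType.BUILTIN;
            case HIVE:       return Function.BinaryType.HIVE;
            case NATIVE:     return Function.BinaryType.NATIVE;
            case IR:         return Function.BinaryType.IR;
            case RPC:        return Function.BinaryType.RPC;
            case JAVA_UDF:   return Function.BinaryType.JAVA_UDF;
            case AGG_STATE:  return Function.BinaryType.AGG_STATE;
            case PYTHON_UDF: return Function.BinaryType.PYTHON_UDF;
            default:
                throw new IllegalArgumentException("Unknown Thrift binary type: " + thriftType);
        }
    }
}
```

An explicit switch keeps the mapping visible at the serialization boundary, so a constant added to one enum but not the other fails loudly instead of being mapped silently.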

Release note

None

Check List (For Author)

  • Test: No need to test (pure refactoring, no behavioral change — only enum type replacement with identical values)
  • Behavior changed: No
  • Does this need documentation: No

…o decouple from Thrift

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary: Function and its subclasses were tightly coupled to the Thrift
class TFunctionBinaryType. This refactoring introduces a pure-Java BinaryType
enum inside Function, replaces all internal usages of TFunctionBinaryType with
it, and confines Thrift conversion to FunctionToThriftConverter only.

### Release note

None

### Check List (For Author)

- Test: No need to test (pure refactoring, no behavioral change)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morrySnow
Contributor Author

run buildall

@morrySnow
Contributor Author

/review

Contributor

@github-actions bot left a comment

Code Review Summary

This PR introduces a pure-Java Function.BinaryType enum to replace the Thrift-generated TFunctionBinaryType throughout the FE codebase, confining Thrift usage to FunctionToThriftConverter. This is a clean refactoring that improves separation of concerns.

Critical Checkpoint Conclusions

Goal and correctness: The goal is to decouple FE internal code from Thrift types. The code achieves this — all 17 files are updated consistently and all TFunctionBinaryType references are successfully confined to FunctionToThriftConverter.java and its test.

Modification scope: The change is minimal and focused — pure mechanical enum type replacement with identical enum names.

Serialization/persistence compatibility (EditLog/Image): No issue. Function persists via Gson (@SerializedName("bt")), and Gson serializes enums by name. The new Function.BinaryType uses the same constant names (BUILTIN, HIVE, NATIVE, IR, RPC, JAVA_UDF, AGG_STATE, PYTHON_UDF) as TFunctionBinaryType, so existing persisted metadata will deserialize correctly and rolling upgrade compatibility is preserved.
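
The compatibility argument rests on name-based enum persistence. The sketch below models that behavior with plain `Enum.name()` and `valueOf()` and does not call Gson itself. Both enums are hand-written stand-ins, and `EnumNamePersistence` with its `write`/`read` helpers is hypothetical and not part of the PR.

```java
// Hypothetical stand-in for the Thrift-generated enum that older metadata was written with.
enum TFunctionBinaryType {
    BUILTIN, HIVE, NATIVE, IR, RPC, JAVA_UDF, AGG_STATE, PYTHON_UDF
}

// Hypothetical stand-in for the new Function.BinaryType.
enum BinaryType {
    BUILTIN, HIVE, NATIVE, IR, RPC, JAVA_UDF, AGG_STATE, PYTHON_UDF
}

class EnumNamePersistence {
    // Name-based write: only the constant's name reaches the persisted form.
    static String write(Enum<?> value) {
        return value.name();
    }

    // Name-based read into the new enum; throws IllegalArgumentException on an unknown name.
    static BinaryType read(String persistedName) {
        return BinaryType.valueOf(persistedName);
    }
}
```

Under these assumptions, `EnumNamePersistence.read(EnumNamePersistence.write(TFunctionBinaryType.JAVA_UDF))` yields `BinaryType.JAVA_UDF`: the enum's class never enters the persisted form, so swapping the class while keeping the names leaves old records readable.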

Concurrency: Not applicable — no concurrent data structures are modified.

Lifecycle management: Not applicable.

Configuration items: None added.

Incompatible changes: None. Thrift wire format is unchanged since toThriftBinaryType() correctly converts at the serialization boundary.

Parallel code paths: Both cloud and non-cloud paths are handled identically since this is an enum type change only.

Test coverage: Existing tests in FunctionToSqlConverterTest and FunctionToThriftConverterTest are updated. No behavioral change means existing regression tests implicitly cover correctness.

Observability: Not applicable.

Transaction/persistence: EditLog read/write paths use Gson with identical enum names — verified compatible.

FE-BE variable passing: toThriftBinaryType() correctly maps all 8 enum values at the Thrift boundary in toThriftBase().
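
A guard along the following lines could catch future drift between the two enums, for example in a unit test. It is a sketch under stated assumptions: the enums are hand-written stand-ins, `BinaryTypeDriftCheck.coveredBy` is a hypothetical helper that does not exist in the PR, and `DriftedType` with its `FUTURE_TYPE` constant is invented purely to show the failing case.

```java
// Hypothetical stand-in for the Thrift-generated enum.
enum TFunctionBinaryType {
    BUILTIN, HIVE, NATIVE, IR, RPC, JAVA_UDF, AGG_STATE, PYTHON_UDF
}

// Hypothetical stand-in for Function.BinaryType.
enum BinaryType {
    BUILTIN, HIVE, NATIVE, IR, RPC, JAVA_UDF, AGG_STATE, PYTHON_UDF
}

// Invented enum with one extra constant, used only to demonstrate a detected drift.
enum DriftedType {
    BUILTIN, HIVE, NATIVE, IR, RPC, JAVA_UDF, AGG_STATE, PYTHON_UDF, FUTURE_TYPE
}

class BinaryTypeDriftCheck {
    // Returns true when every constant name of `source` also exists in `target`.
    static <A extends Enum<A>, B extends Enum<B>> boolean coveredBy(Class<A> source, Class<B> target) {
        for (A constant : source.getEnumConstants()) {
            try {
                Enum.valueOf(target, constant.name());
            } catch (IllegalArgumentException e) {
                return false; // a name in source has no counterpart in target
            }
        }
        return true;
    }
}
```

Running the check in both directions (`BinaryType` against `TFunctionBinaryType` and the reverse) confirms the name sets match. `coveredBy(DriftedType.class, TFunctionBinaryType.class)` returns false because `FUTURE_TYPE` has no Thrift counterpart.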

Performance: No impact — enum comparison is the same cost.

Minor Observation

fromThriftBinaryType() is currently dead code — it has zero callers in the codebase. It may be useful for future bidirectional conversion, but per the project's convention against speculative code, consider removing it or adding a comment explaining its anticipated use.

Verdict

No blocking issues found. The refactoring is correct, backward-compatible, and well-scoped.

/**
 * Converts a Thrift {@link TFunctionBinaryType} to {@link Function.BinaryType}.
 */
public static Function.BinaryType fromThriftBinaryType(TFunctionBinaryType thriftType) {
Contributor

Nit: fromThriftBinaryType() is currently dead code — there are zero callers in the entire codebase. The FE only converts Function.BinaryType → TFunctionBinaryType (when sending to BE), never the reverse direction. Consider either removing this method to keep the diff minimal and avoid unused code, or adding a brief comment explaining its anticipated future use.

@morrySnow
Contributor Author

Re: review comment about fromThriftBinaryType() being dead code —

fromThriftBinaryType() is intentionally included as the symmetric counterpart to toThriftBinaryType() to complete the bidirectional conversion API at the Thrift boundary. While there are currently no callers, this method is anticipated for future use cases such as deserializing TFunction from BE responses or Thrift RPC results back into Function.BinaryType. Keeping it avoids having to add it later when such a need arises, and the cost of the unused method is negligible.

@doris-robot

TPC-H: Total hot run time: 26036 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 6f81e5df5508804dbe0aeed7228770972d0261cf, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17612	4445	4279	4279
q2	q3	10633	755	510	510
q4	4675	342	246	246
q5	7569	1199	1016	1016
q6	180	170	143	143
q7	772	836	674	674
q8	9284	1447	1281	1281
q9	4923	4445	4683	4445
q10	6291	1914	1638	1638
q11	478	252	236	236
q12	757	579	454	454
q13	18020	2665	1952	1952
q14	227	238	201	201
q15	q16	752	744	680	680
q17	749	847	434	434
q18	5922	5493	5226	5226
q19	1192	975	598	598
q20	527	489	374	374
q21	4568	1820	1404	1404
q22	339	295	245	245
Total cold run time: 95470 ms
Total hot run time: 26036 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4995	4706	4737	4706
q2	q3	3876	4333	3870	3870
q4	913	1210	781	781
q5	4079	4424	4319	4319
q6	183	183	150	150
q7	1816	1666	1607	1607
q8	2498	2702	2536	2536
q9	7504	7398	7243	7243
q10	3767	3979	3665	3665
q11	540	450	439	439
q12	506	615	493	493
q13	2493	2891	2396	2396
q14	418	324	288	288
q15	q16	720	801	736	736
q17	1227	1356	1317	1317
q18	7264	6742	6803	6742
q19	947	943	913	913
q20	2121	2392	2097	2097
q21	4036	3561	3319	3319
q22	529	433	390	390
Total cold run time: 50432 ms
Total hot run time: 48007 ms

@doris-robot

TPC-DS: Total hot run time: 169553 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6f81e5df5508804dbe0aeed7228770972d0261cf, data reload: false

query5	4320	626	508	508
query6	348	233	212	212
query7	4204	459	275	275
query8	352	238	234	234
query9	8746	2713	2725	2713
query10	527	380	350	350
query11	7021	5085	4884	4884
query12	188	129	124	124
query13	1322	479	353	353
query14	5717	3729	3464	3464
query14_1	2974	2831	2828	2828
query15	203	194	184	184
query16	1000	464	371	371
query17	892	739	611	611
query18	2447	464	357	357
query19	219	205	182	182
query20	137	124	125	124
query21	216	131	112	112
query22	13314	14408	14622	14408
query23	16527	16450	16126	16126
query23_1	16529	16253	15978	15978
query24	7275	1622	1228	1228
query24_1	1226	1223	1235	1223
query25	549	457	412	412
query26	1232	263	153	153
query27	2771	476	296	296
query28	4476	1821	1833	1821
query29	865	563	461	461
query30	294	231	188	188
query31	1002	934	878	878
query32	83	82	70	70
query33	515	340	282	282
query34	879	872	519	519
query35	638	685	610	610
query36	1099	1117	969	969
query37	135	94	84	84
query38	2964	2877	2885	2877
query39	863	836	802	802
query39_1	793	779	792	779
query40	231	149	137	137
query41	65	60	64	60
query42	257	255	252	252
query43	247	246	228	228
query44	
query45	197	191	178	178
query46	880	985	599	599
query47	2115	2161	2076	2076
query48	326	314	223	223
query49	639	458	377	377
query50	674	277	210	210
query51	4051	4121	3999	3999
query52	262	266	257	257
query53	286	333	290	290
query54	299	273	270	270
query55	89	87	83	83
query56	306	307	299	299
query57	1909	1912	1812	1812
query58	283	272	269	269
query59	2798	2974	2739	2739
query60	323	331	304	304
query61	153	149	153	149
query62	618	585	546	546
query63	316	280	274	274
query64	5089	1266	997	997
query65	
query66	1472	449	351	351
query67	24173	24309	24205	24205
query68	
query69	397	300	295	295
query70	969	915	912	912
query71	324	312	293	293
query72	2744	2659	2425	2425
query73	534	541	326	326
query74	9624	9594	9408	9408
query75	2845	2784	2465	2465
query76	2284	1023	666	666
query77	342	378	305	305
query78	11008	11162	10499	10499
query79	1146	767	571	571
query80	1287	611	534	534
query81	555	266	223	223
query82	1016	154	117	117
query83	343	272	252	252
query84	293	119	100	100
query85	901	498	472	472
query86	410	307	288	288
query87	3090	3119	3044	3044
query88	3517	2644	2617	2617
query89	421	372	340	340
query90	2024	180	179	179
query91	171	160	137	137
query92	78	77	68	68
query93	999	844	506	506
query94	664	326	301	301
query95	583	413	316	316
query96	647	517	230	230
query97	2490	2449	2425	2425
query98	229	217	222	217
query99	997	1000	939	939
Total cold run time: 249644 ms
Total hot run time: 169553 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 35.62% (26/73) 🎉
Increment coverage report
Complete coverage report

@github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) Mar 27, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@morrySnow morrySnow merged commit 71951fe into apache:master Mar 27, 2026
29 of 31 checks passed
@morrySnow morrySnow deleted the clean-expr branch March 27, 2026 04:00

Labels

approved (Indicates a PR has been approved by one committer.), reviewed

5 participants