[improvement](maxcompute) Simplify FE block ID requests for MaxCompute writes by hubgeter · Pull Request #62880 · apache/doris

hubgeter · 2026-04-27T13:30:40Z

What problem does this PR solve?

Related PR: #62578

PR [opt](maxcompute)Allocate write block ids from FE and add catalog-level write_max_block_bytes prop. #62578 moved MaxCompute write block ID allocation from BE-local counters to
Instead of calling FE through the BE JNI C++ bridge:

MaxCompute connector Java -> BE JNI C++ -> FE
the MaxCompute connector now requests FE directly through thrift:
MaxCompute connector Java -> FE

A new MaxComputeFeClient is added under the MaxCompute connector to handle FE
methods.

Removes the hardcoded MAX_BLOCK_COUNT variable from
fe/fe-core/src/main/java/org/apache/doris/datasource/maxcompute/MCTransaction.java
and moves it to the FE config max_compute_write_max_block_count
The default value is still 20000, so the existing behavior is preserved.

Release note

None

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

…e writes

hello-stephen · 2026-04-27T13:30:45Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

hubgeter · 2026-04-27T13:30:55Z

/review

github-actions

I found two issues that should be addressed before this PR merges.

The new max_compute_write_max_block_count setting turns the previously hard FE-side block-id guard into an unconstrained runtime knob. If this is set above MaxCompute's actual per-session block cap, FE will now allocate out-of-contract block IDs and push the failure to the writer/runtime path.
MaxComputeFeClient now opens a fresh thrift socket for every block-id request. Because block IDs are requested on every writer rotation, this regresses the hot write path and increases FE connection churn compared with the old cached-client path.

Critical checkpoint conclusions:

Goal of the task: move MaxCompute block-id allocation from the BE JNI native bridge into the Java connector and keep the FE allocation semantics. The refactor mostly achieves that, and MaxComputeFeClientTest covers basic success/redirect/error retry behavior, but the two issues above mean the new path is not yet equivalent/safe enough.
Change scope: the JNI cleanup and direct FE client are reasonably focused, but the config change also widens a previously hard-enforced guard.
Concurrency: no new cross-thread locking issue was found in the touched code. MCTransaction.allocateBlockIdRange() still relies on AtomicLong, and the new FE client is used per writer instance.
Lifecycle/static init: no special lifecycle or static-initialization issue found in the reviewed paths.
Configuration: a new FE config was added. Its current shape is problematic because it makes the external block-id cap operator-configurable without preserving the hard upper bound.
Compatibility/incompatible change: no new storage-format or FE/BE thrift compatibility issue was found in the touched fields.
Parallel code paths: I did not find another live MaxCompute block-id allocation path after the JNI method removal.
Special conditions: NOT_MASTER redirect handling is still present, but the transport implementation changed enough to introduce the connection-churn regression above.
Test coverage: good unit coverage for the new redirect/retry logic; missing coverage for invalid block-count configuration and repeated block-allocation hot-path behavior.
Test result updates: none in this PR.
Observability: existing logs are sufficient for debugging this path.
Transaction/persistence/data-write correctness: block IDs are still allocated from FE transaction state, but allowing the limit to exceed the external cap can break write correctness.
Performance: per-request socket creation on block allocation is a hot-path regression.
Other issues: no additional distinct issues found.
User focus: /tmp/review_focus.txt contained no additional review focus, so there were no extra focus-specific findings.

github-actions · 2026-04-27T13:53:35Z

+     */
+    @ConfField(mutable = true, masterOnly = true, description = {
+            "Maximum number of MaxCompute Storage API write block IDs that can be allocated in one write session."})
+    public static long max_compute_write_max_block_count = 20000L;


PR #62578 made 20000 the FE-side guard for MaxCompute's per-session block-id cap. After turning that into an unconstrained config here, setting max_compute_write_max_block_count > 20000 lets allocateBlockIdRange() hand out block IDs that the previous FE-side guard explicitly refused. That moves the failure from deterministic FE validation to a later write-time error after data has already been produced. Can we keep 20000 as a hard upper bound and only make a lower soft cap configurable (or validate/clamp the config value before it is used)?

github-actions · 2026-04-27T13:53:35Z

+
+    private static <T> T callFe(TNetworkAddress address, int timeoutMs,
+            boolean useFramedTransport, FeCall<T> call) throws Exception {
+        TSocket socket = new TSocket(address.getHostname(), address.getPort(), timeoutMs);


Before this refactor, block-id allocation went through ThriftRpcHelper::rpc<FrontendServiceClient>(), so repeated requests reused Doris' FE client cache. callFe() now creates and closes a brand-new TSocket/TTransport for every requestBlockId(). Since requestBlockId() runs on every writer rotation (rotateCurrentBatchWriter()), small block sizes or many partitions now turn into one TCP connection per block. That is a hot-path regression and increases FE socket churn. Please keep a reusable client/transport per MaxComputeFeClient (or another pooled helper) instead of reconnecting for every allocation.

hubgeter · 2026-04-27T14:38:02Z

/review

hubgeter · 2026-04-27T14:38:13Z

run buildall

github-actions

Besides the two already-open review threads, I found one additional blocking issue.

Critical checkpoints

Goal / correctness: The refactor mostly achieves the FE-direct block-id path, but it does not preserve availability across FE master failover. The added tests only prove steady-state success / NOT_MASTER redirect.
Scope: The change is reasonably focused around MaxCompute block-id allocation.
Concurrency: MaxComputeFeClient is synchronized and appears single-writer scoped; I did not find a deadlock or lock-order issue in the touched code.
Lifecycle: The new client now lives for the whole writer lifetime, and that lifetime-long FE address pinning is the source of the failure below.
Config: max_compute_write_max_block_count is added; I am not duplicating the already-open thread about enforcing the historic 20000 upper bound.
Parallel code paths: I did not find another MaxCompute writer path in this PR that also needed the same fix.
Special conditions: NOT_MASTER is handled, but the old-master-down / no-redirect case is not.
Tests: Added FE unit and Java unit tests cover happy path, redirect, FE error, and retry. They do not cover master crash / heartbeat refresh behavior.
Test result files: none changed.
Observability: adequate for most FE-side failures; Java-side RPC failure logs could include txn_id / write_session_id, but that is secondary.
Transaction / persistence: no EditLog changes; external transaction bookkeeping remains in-memory as before.
Data writes: FE failover can now break further block-id allocation for an in-flight write.
FE/BE variables: the new FE address / thrift-server-type params are wired through the BE-to-Java path.
Performance: the latest revision fixes the already-open connection-churn regression by reusing a transport per writer.
Other: I did not find further distinct issues beyond the already-open threads and the inline comment below.

github-actions · 2026-04-27T15:41:22Z

+
+        Exception lastException = null;
+        for (int retryTimes = 0; retryTimes < FETCH_BLOCK_ID_MAX_RETRY_TIMES; retryTimes++) {
+            TNetworkAddress requestAddress = copyAddress(masterAddress);


Before this refactor, each block-id allocation started from ExecEnv::GetInstance()->cluster_info()->master_fe_addr, so the next allocation could follow BE heartbeat updates after a master crash. Here the request address always comes from the client's cached masterAddress, which only changes after a successful NOT_MASTER response. If the old master dies before it can return NOT_MASTER, BE can learn the new leader, but this writer never re-reads it and keeps retrying the dead endpoint for the rest of the query. Can we refresh the baseline FE address between independent requestBlockId() calls (or on RPC failures) instead of pinning the whole writer lifetime to the first address?

hello-stephen · 2026-04-27T15:54:55Z

BE UT Coverage Report

Increment line coverage 0.00% (0/5) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	53.38% (20430/38270)
Line Coverage	36.93% (192588/521457)
Region Coverage	33.21% (149637/450585)
Branch Coverage	34.36% (65541/190750)

hello-stephen · 2026-04-27T16:04:16Z

FE UT Coverage Report

Increment line coverage 0.00% (0/1) 🎉
Increment coverage report
Complete coverage report

hello-stephen · 2026-04-27T17:26:21Z

BE Regression && UT Coverage Report

Increment line coverage 0.00% (0/5) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	73.86% (27692/37495)
Line Coverage	57.65% (299855/520162)
Region Coverage	54.81% (249316/454911)
Branch Coverage	56.40% (108000/191480)

hello-stephen · 2026-04-27T17:33:55Z

FE Regression Coverage Report

Increment line coverage 0.00% (0/5) 🎉
Increment coverage report
Complete coverage report

yiguolei · 2026-04-28T11:55:26Z

+        return new MaxComputeFeClient(new TNetworkAddress(host, port), timeoutMs, serverType);
+    }
+
+    MaxComputeFeClient(TNetworkAddress masterAddress, int rpcTimeoutMs, String thriftServerType) {


为啥这么多构造函数，真的有用吗？

会 remove 的

yiguolei · 2026-04-29T05:53:49Z

run buildall

github-actions · 2026-04-29T05:54:54Z

PR approved by at least one committer and no changes requested.

ok

yiguolei · 2026-05-01T02:09:00Z

skip buildall

yiguolei · 2026-05-01T11:12:49Z

skip check_coverage

Gabriel39 · 2026-06-02T10:56:09Z

/review

github-actions

I completed a full pass over the PR and did not add duplicate inline comments. The main blocking concerns I found are already present in existing review threads and still appear unresolved in the final patch: the MaxCompute FE client still pins the initial master FE address across RPC failures, so it can keep retrying a dead former master instead of re-reading BE cluster-info after failover; and max_compute_write_max_block_count remains an unconstrained startup config, so values above MaxCompute's 20000 block/session limit can still move the failure from FE validation to later write-time errors.

Critical checkpoint conclusions:
Goal/test coverage: The PR intends to move MaxCompute block-id allocation into the Java connector and add tests for success, redirect, FE error, and RPC retry. The tests cover basic client behavior but not FE failover after a dead old master or config values above 20000.
Scope/focus: The change is mostly focused, but it changes the hot write path and FE/BE interaction behavior.
Concurrency/lifecycle: The Java client serializes access and closes reusable transport on close/errors. No new lock-order issue found, but writer lifetime now owns a persistent FE transport and the already-raised failover lifecycle concern remains.
Configuration: A new FE config is added as non-mutable, but it is still not validated/clamped against the external 20000 hard limit.
Compatibility/protocol: New FE address params are injected by BE and consumed by the Java connector; no additional FE-BE thrift mismatch found. Mixed-version behavior is intentionally rejected for missing txn_id.
Parallel paths: I found no other remaining native call sites after removing requestMaxComputeBlockId.
Error handling/observability: RPC and FE statuses are propagated as IOException, and success logging was moved to debug. Existing FE request INFO logging was outside this PR diff.
Data correctness/transactions: Block allocation remains FE-master backed and session-id checked. The unresolved config cap can still allocate IDs beyond the MaxCompute-supported range if misconfigured.
Performance: Transport reuse was added after the prior review comment, so I did not find a new connection-per-block issue.
User focus: No additional user-provided review focus was supplied.

1

[improvement](maxcompute) Simplify FE block ID requests for MaxComput…

734ce0a

…e writes

hubgeter requested review from CalvinKirs and morningman as code owners April 27, 2026 13:30

github-actions Bot requested changes Apr 27, 2026

View reviewed changes

fix comment

e508d30

github-actions Bot previously requested changes Apr 27, 2026

View reviewed changes