Search before asking
Fluss version
0.9.0 (latest release)
Please describe the bug 🐞
During a rolling upgrade of Fluss servers, the Admin client suffers from two problems:
| Exception Type | Meaning | Refresh Strategy |
| --- | --- | --- |
| NetworkException / TimeoutException | Node unreachable (IP changed) | Refresh cluster-level metadata (server list) |
| NotLeaderOrFollowerException / LeaderNotAvailableException | Leader switched | Refresh table/partition-level metadata (leader assignment) |
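To make the mapping in the table concrete, here is a minimal sketch of an error classifier. The exception classes are hypothetical stand-ins that only share names with the Fluss types, and `RefreshScope` and `classify` are illustrative names, not existing Fluss APIs.

```java
import java.util.concurrent.CompletionException;

final class RefreshPolicySketch {
    // Hypothetical stand-ins for the Fluss exception types named in the table.
    static class NetworkException extends RuntimeException {}
    static class TimeoutException extends RuntimeException {}
    static class NotLeaderOrFollowerException extends RuntimeException {}
    static class LeaderNotAvailableException extends RuntimeException {}

    /** Which part of the cached metadata an error suggests is stale. */
    enum RefreshScope { CLUSTER, TABLE_OR_PARTITION, NONE }

    static RefreshScope classify(Throwable error) {
        Throwable cause = error;
        // CompletableFuture callbacks may see the failure wrapped in CompletionException.
        while (cause instanceof CompletionException && cause.getCause() != null) {
            cause = cause.getCause();
        }
        if (cause instanceof NetworkException || cause instanceof TimeoutException) {
            // Node unreachable: the server list itself is suspect.
            return RefreshScope.CLUSTER;
        }
        if (cause instanceof NotLeaderOrFollowerException
                || cause instanceof LeaderNotAvailableException) {
            // Leader moved: only the leader assignment needs refreshing.
            return RefreshScope.TABLE_OR_PARTITION;
        }
        return RefreshScope.NONE;
    }
}
```

Keeping the classification in one place would let every gateway path share the same refresh decision, though where such a hook belongs in the client is an open design question.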
Problem 1: Stale metadata never refreshed (permanent failure)
Example: Admin#listPartitionInfos, getLatestLakeSnapshot, getLatestKvSnapshots, getTableInfo
readOnlyGateway is backed by metadataUpdater::getRandomTabletServer as its node supplier. When all tablet server IPs change (e.g., pods restarting with new IPs in Kubernetes), the cached Cluster still holds the old IPs.
The RPC fails with a network error, but nothing triggers updateMetadata() — the CompletableFuture simply completes exceptionally and the stale Cluster remains unchanged. Subsequent calls keep resolving the same stale nodes, making this a permanent failure that no amount of caller-side retries can fix.
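The missing step can be sketched as a completion hook that refreshes metadata when the RPC fails with a network error. This is a simplified model, not the Fluss implementation: `MetadataSource`, `InMemoryMetadata`, `callReadOnly`, and the local `NetworkException` are hypothetical, and the sketch assumes `updateMetadata()` can still reach the cluster through some valid address.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Function;

final class RefreshOnNetworkErrorSketch {
    // Hypothetical stand-in for the Fluss NetworkException.
    static class NetworkException extends RuntimeException {
        NetworkException(String message) { super(message); }
    }

    /** Hypothetical slice of the metadata updater used by the read-only path. */
    interface MetadataSource {
        String getRandomTabletServer();
        void updateMetadata();
    }

    /** Fake metadata for demonstration: starts stale, becomes fresh after one refresh. Not thread-safe. */
    static final class InMemoryMetadata implements MetadataSource {
        private final String freshAddress;
        private String cachedAddress;
        int refreshCount = 0;

        InMemoryMetadata(String staleAddress, String freshAddress) {
            this.cachedAddress = staleAddress;
            this.freshAddress = freshAddress;
        }

        @Override public String getRandomTabletServer() { return cachedAddress; }

        @Override public void updateMetadata() {
            cachedAddress = freshAddress;
            refreshCount++;
        }
    }

    /** Sends the RPC to a cached node; on a network error, refreshes metadata for the next call. */
    static <T> CompletableFuture<T> callReadOnly(
            MetadataSource metadata, Function<String, CompletableFuture<T>> rpc) {
        String node = metadata.getRandomTabletServer();
        return rpc.apply(node).whenComplete((result, error) -> {
            Throwable cause = error;
            while (cause instanceof CompletionException && cause.getCause() != null) {
                cause = cause.getCause();
            }
            if (cause instanceof NetworkException) {
                // The current call still fails, but the next one resolves fresh nodes.
                metadata.updateMetadata();
            }
        });
    }
}
```

In this model the first call against the stale address still completes exceptionally, but a caller-side retry now succeeds because the cached address was replaced.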
Problem 2: Stale leader routing (transient failure)
Example: Admin#listOffsets
// FlussAdmin.java:529
metadataUpdater.updateTableOrPartitionMetadata(physicalTablePath.getTablePath(), null);
// ... then prepareListOffsetsRequests() calls leaderFor() and sends to that leader
Each call starts by refreshing metadata, so it picks up the latest leader assignment. However, if a leader-follower switch occurs in the small window between the metadata refresh and the actual RPC send, the request is routed to the old leader (now a follower).
This is a transient failure — a caller-side retry will trigger a fresh metadata update at the beginning of the next call, resolve the new leader, and succeed.
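A caller-side retry of this kind might look like the sketch below. `callWithRetry` and the local `NotLeaderOrFollowerException` are hypothetical; the supplier stands for a call such as `Admin#listOffsets`, which per the description above refreshes metadata at the start of each invocation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Supplier;

final class ListOffsetsRetrySketch {
    // Hypothetical stand-in for the Fluss NotLeaderOrFollowerException.
    static class NotLeaderOrFollowerException extends RuntimeException {}

    /**
     * Invokes the call up to maxAttempts times, retrying only when it fails
     * because the request reached a server that is no longer the leader.
     */
    static <T> T callWithRetry(Supplier<CompletableFuture<T>> call, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        CompletionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Each invocation is assumed to re-resolve the leader before sending.
                return call.get().join();
            } catch (CompletionException e) {
                if (!(e.getCause() instanceof NotLeaderOrFollowerException)) {
                    throw e; // not a stale-leader error, so retrying is not expected to help
                }
                last = e;
            }
        }
        throw last;
    }
}
```

The sketch blocks on `join()` for brevity; a production client would more likely chain the retry asynchronously and add a backoff between attempts.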
Solution
We plan to address this in two steps:
- Urgently fix Problem 1: refresh metadata upon network errors on the readOnlyGateway path, so the client can recover from stale server addresses.
- Problem 2 is less urgent: we can address it later by introducing a more robust metadata mechanism at the framework level.
- Handle IP changes of the coordinator server.
- Handle leader changes.
Are you willing to submit a PR?