### Is Your Feature Request Related to a Problem?
Currently, when `enableProxyAutoRenew` is enabled, the proxy uses a short `defaultInvisibleTimeMills` (~60s) to pop messages and then periodically renews the invisible time via `changeInvisibleTime` RPC calls to the broker. This introduces several problems:
- **Handle mapping complexity:** Every `changeInvisibleTime` call returns a new receipt handle from the broker. The proxy must maintain a mapping from the original handle (known to the client) to the latest renewed handle. This adds significant memory overhead and concurrency-control complexity (`ReceiptHandleGroup` with semaphore-based locking).
- **Proxy crash makes client handles invalid:** Since the proxy pops with a short invisible time (~60s), the client receives a short-lived handle. If the proxy crashes, the renewed handle (stored only in proxy memory) is lost, and the client's original handle has already expired from the broker's perspective. The client cannot ack through another proxy, leading to unavoidable duplicate consumption.
- **Unnecessary network overhead:** The periodic `changeInvisibleTime` calls create continuous RPC traffic between proxy and broker for every un-acked message, which is wasteful when consumers are simply processing normally.
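To make the first problem concrete, here is a minimal illustrative sketch of the bookkeeping that periodic renewal forces on the proxy. It is not the actual `ReceiptHandleGroup` implementation (which uses semaphore-based locking); the class and method names below are hypothetical, and only a `ConcurrentHashMap` is used for brevity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Every changeInvisibleTime returns a fresh handle, so the proxy must track
// originalHandle -> latestHandle for every in-flight message.
public class HandleRenewSketch {
    private final Map<String, String> originalToLatest = new ConcurrentHashMap<>();

    // Called when a message is first popped with the short invisible time.
    public void onPop(String originalHandle) {
        originalToLatest.put(originalHandle, originalHandle);
    }

    // Called after each successful changeInvisibleTime renewal.
    public void onRenew(String originalHandle, String renewedHandle) {
        originalToLatest.replace(originalHandle, renewedHandle);
    }

    // The client acks with the handle it knows (the original one); the proxy
    // must translate it to the latest renewed handle before calling the broker.
    public String resolveForAck(String originalHandle) {
        return originalToLatest.get(originalHandle);
    }
}
```

Because this map lives only in proxy memory, a proxy crash loses every renewed handle, which is exactly the second failure mode described above.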
### Describe the Solution You'd Like
Add a dynamic switch `enableGrpcChannelReceiptHandleRenew` (default `true` for backward compatibility) in `ProxyConfig`. This switch specifically targets gRPC (`GrpcClientChannel`) consumers:
- When `enableGrpcChannelReceiptHandleRenew = true`: behavior remains unchanged (short invisible time + periodic renew) for gRPC clients.
- When `enableGrpcChannelReceiptHandleRenew = false`:
  - The proxy uses the consumer group's `consumeTimeoutMinute` (from `SubscriptionGroupConfig`) as the invisible time during pop for gRPC clients, instead of the short `defaultInvisibleTimeMills`.
  - The proxy does not periodically renew handles via `changeInvisibleTime` for `GrpcClientChannel` instances.
  - Non-gRPC channels (e.g., Remoting) are unaffected and continue to renew normally.
  - The proxy still saves handles and monitors client connectivity. When a client disconnects, the proxy uses the saved (unchanged) handle to nack/`changeInvisibleTime`, enabling fast message re-delivery.
  - The client's original receipt handle remains valid for the full consume timeout duration, so even if the proxy crashes, the client can reconnect to another proxy and ack successfully.
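The pop-time decision above could be sketched as follows. This is an assumption about how the branch might look, not the actual `ProxyConfig`/`SubscriptionGroupConfig` wiring; the method and parameter names are illustrative.

```java
import java.util.concurrent.TimeUnit;

public class InvisibleTimeSketch {
    // Returns the invisible time (ms) to use when popping for a gRPC channel.
    public static long invisibleTimeMillis(boolean enableGrpcChannelReceiptHandleRenew,
                                           long defaultInvisibleTimeMills,
                                           long consumeTimeoutMinute) {
        if (enableGrpcChannelReceiptHandleRenew) {
            // Current behavior: short invisible time, renewed periodically.
            return defaultInvisibleTimeMills;
        }
        // Proposed behavior: cover the whole consume timeout up front so the
        // client's original handle stays valid even if this proxy crashes.
        return TimeUnit.MINUTES.toMillis(consumeTimeoutMinute);
    }
}
```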
This approach preserves fast crash recovery for client failures (proxy detects disconnect and nacks) while also being resilient to proxy failures (client handle stays valid).
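The disconnect path could look roughly like the sketch below: the proxy keeps the client's original (never-renewed) handle and, when the gRPC channel closes, immediately makes the message visible again instead of waiting out the full consume timeout. Class and method names here are hypothetical, not the actual proxy API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DisconnectNackSketch {
    private final Map<String, String> savedHandleByChannel = new ConcurrentHashMap<>();

    // Save the unchanged handle when the message is popped for a client channel.
    public void onPop(String channelId, String receiptHandle) {
        savedHandleByChannel.put(channelId, receiptHandle);
    }

    // Invoked by the connectivity monitor when a client channel closes.
    // Returning the saved handle stands in for calling
    // changeInvisibleTime(handle, 0) against the broker to nack it.
    public String onChannelClose(String channelId) {
        return savedHandleByChannel.remove(channelId);
    }
}
```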
### Describe Alternatives You've Considered
- **Keep the current renew mechanism as-is:** This works but has the drawbacks described above, particularly the proxy-crash scenario where client handles become invalid.
- **Persist renewed handles to external storage (e.g., Redis):** This would survive proxy crashes but adds an infrastructure dependency and latency. The proposed solution achieves the same resilience without any external dependency.
- **Use a fixed long invisible time without saving handles:** This would lose the ability to nack on client disconnect, meaning client crashes would require waiting for the full invisible time to expire before re-delivery.
### Additional Context
Failure scenario comparison:
| Scenario | Current (renew=true) | Proposed (renew=false) |
| --- | --- | --- |
| Client crash, proxy alive | Proxy stops renewing → ~1 min recovery | Proxy nacks → fast recovery |
| Proxy crash, client alive | Client's short-lived handle expired, renewed handle lost → duplicate consumption | Client's handle still valid → ack succeeds via another proxy |
| Both crash | ~1 min recovery (short invisible time) | Wait for consume timeout (configurable, e.g. 15 min) |
The probability of simultaneous proxy + client failure is low, making the proposed approach a better trade-off for most production scenarios.