Description
What version of gRPC are you using?
client: v1.70.0, server: v1.61.1, v1.63.2, v1.66.0.
What version of Go are you using (go version
)?
go version go1.22.11 linux/amd64
What operating system (Linux, Windows, …) and version?
several Linux versions, including: kernel 4.15.0-112-generic on Ubuntu
What did you do?
gRPC client with a custom resolver (using pick_first
LB), connecting to a multi-node server cluster. this codebase appeared to be working fine for about 6 years, up to and including gRPC v1.63.2 - or at least the issue below was never brought to our attention before. since upgrading to v1.70.0, the same code appears to occasionally enter infinite loop calling the custom resolver's ResolveNow()
method. the issue is non-deterministic, but has been observed 4-5 times in the past month, AFAICT - always on network transients (impaired connectivity to the gRPC servers cluster).
What did you expect to see?
the problem appears some time after the initial conn was established, lost, re-established, etc. after a net hiccup, i'd expect the lower gRPC layers to keep trying to dial the addresses returned by ResolveNow()
as per the backoff and connect timeout config - which it normally does, and has been doing for years. an example of this is at the top of the attached log.
What did you see instead?
occasionally (but rarely, despite forcibly injecting various network faults), the gRPC code enters an infinite loop of calling ResolveNow()
at a rate of >15 thousand calls/second, pegging several CPU cores (>210% CPU on an otherwise near-idle machine). the process does not recover. the issue appears to manifest somewhere around here:
2025-03-07T12:48:28.072 INFO [core] [Channel #1 SubChannel #2]Subchannel Connectivity change to IDLE
2025-03-07T12:48:28.072 INFO resolver.ResolveNow() {"lbgrpc": "1i855hu", "targets": "192.168.19.217:443,192.168.19.218:443,192.168.19.220:443,192.168.25.23:443,192.168.26.25:443,192.168.29.142:443,192.168.29.144:443,192.168.16.233:443,192.168.17.106:443,192.168.17.107:443,192.168.17.30:443,192.168.19.213:443,192.168.19.214:443,192.168.19.215:443", "calls-last-sec": 0}
2025-03-07T12:48:28.072 INFO [transport] [client-transport 0xc000bc46c8] loopyWriter exiting with error: received GOAWAY with no active streams
2025-03-07T12:48:28.072 INFO [pick-first-lb] [pick-first-lb 0xc000a37d70] Received SubConn state update: 0xc0002bb5e0, {ConnectivityState:IDLE ConnectionError:<nil> connectedAddress:{Addr: ServerName: Attributes:<nil> BalancerAttributes:<nil> Metadata:<nil>}}
2025-03-07T12:48:28.073 INFO [core] blockingPicker: the picked transport is not ready, loop back to repick
2025-03-07T12:48:28.073 INFO [core] [Channel #1]Resolver state updated: { "Addresses": [ { "Addr": "192.168.19.217:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.218:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.220:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.25.23:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.26.25:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.142:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.144:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.16.233:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.106:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.107:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.30:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.213:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.214:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.215:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": null, "ServiceConfig": null, "Attributes": null} ()
2025-03-07T12:48:28.073 INFO [core] [Channel #1]Channel Connectivity change to IDLE
2025-03-07T12:48:28.073 INFO [transport] [client-transport 0xc000bc46c8] Closing: connection error: desc = "error reading from server: read tcp 10.212.1.6:32844->192.168.19.215:443: use of closed network connection"
2025-03-07T12:48:28.073 INFO [pick-first-lb] [pick-first-lb 0xc000a37d70] Received new config { "shuffleAddressList": false}, resolver state { "Addresses": [ { "Addr": "192.168.19.217:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.218:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.220:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.25.23:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.26.25:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.142:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.144:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.16.233:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.106:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.107:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.30:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.213:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.214:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.215:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": null, "ServiceConfig": null, "Attributes": null}
2025-03-07T12:48:28.073 INFO [core] [Channel #1 SubChannel #2]addrConn: updateAddrs addrs (5 of 14): [{Addr: "192.168.19.217:443", ServerName: "", } {Addr: "192.168.19.218:443", ServerName: "", } {Addr: "192.168.19.220:443", ServerName: "", } {Addr: "192.168.25.23:443", ServerName: "", } {Addr: "192.168.26.25:443", ServerName: "", }]
2025-03-07T12:48:28.073 INFO [core] [Channel #1 SubChannel #2]Subchannel Connectivity change to CONNECTING
2025-03-07T12:48:28.073 INFO [core] [Channel #1 SubChannel #2]Subchannel picks a new address "192.168.19.215:443" to connect
2025-03-07T12:48:28.073 INFO [pick-first-lb] [pick-first-lb 0xc000a37d70] Received SubConn state update: 0xc0002bb5e0, {ConnectivityState:CONNECTING ConnectionError:<nil> connectedAddress:{Addr: ServerName: Attributes:<nil> BalancerAttributes:<nil> Metadata:<nil>}}
2025-03-07T12:48:28.073 INFO [core] [Channel #1]Channel Connectivity change to CONNECTING
2025-03-07T12:48:28.073 INFO [core] [Channel #1 SubChannel #2]Subchannel picks a new address "192.168.19.217:443" to connect
2025-03-07T12:48:28.073 INFO [core] Creating new client transport to "{Addr: \"192.168.19.215:443\", ServerName: \"lb-resolver\", }": connection error: desc = "transport: Error while dialing: dial tcp 192.168.19.215:443: operation was canceled"
2025-03-07T12:48:28.073 WARN [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "192.168.19.215:443", ServerName: "lb-resolver", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 192.168.19.215:443: operation was canceled"
2025-03-07T12:48:28.073 INFO resolver.ResolveNow() {"lbgrpc": "1i855hu", "targets": "192.168.19.218:443,192.168.19.220:443,192.168.25.23:443,192.168.26.25:443,192.168.29.142:443,192.168.29.144:443,192.168.16.233:443,192.168.17.106:443,192.168.17.107:443,192.168.17.30:443,192.168.19.213:443,192.168.19.214:443,192.168.19.215:443,192.168.19.217:443", "calls-last-sec": 1}
2025-03-07T12:48:28.074 INFO [core] [Channel #1]Resolver state updated: { "Addresses": [ { "Addr": "192.168.19.218:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.220:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.25.23:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.26.25:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.142:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.144:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.16.233:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.106:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.107:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.30:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.213:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.214:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.215:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.217:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": null, "ServiceConfig": null, "Attributes": null} ()
2025-03-07T12:48:28.074 INFO [pick-first-lb] [pick-first-lb 0xc000a37d70] Received new config { "shuffleAddressList": false}, resolver state { "Addresses": [ { "Addr": "192.168.19.218:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.220:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.25.23:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.26.25:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.142:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.144:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.16.233:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.106:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.107:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.30:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.213:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.214:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.215:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.217:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": null, "ServiceConfig": null, "Attributes": null}
2025-03-07T12:48:28.074 INFO [core] [Channel #1 SubChannel #2]addrConn: updateAddrs addrs (5 of 14): [{Addr: "192.168.19.218:443", ServerName: "", } {Addr: "192.168.19.220:443", ServerName: "", } {Addr: "192.168.25.23:443", ServerName: "", } {Addr: "192.168.26.25:443", ServerName: "", } {Addr: "192.168.29.142:443", ServerName: "", }]
2025-03-07T12:48:28.074 INFO [core] Creating new client transport to "{Addr: \"192.168.19.217:443\", ServerName: \"lb-resolver\", }": connection error: desc = "transport: Error while dialing: dial tcp 192.168.19.217:443: operation was canceled"
2025-03-07T12:48:28.074 WARN [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "192.168.19.217:443", ServerName: "lb-resolver", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 192.168.19.217:443: operation was canceled"
2025-03-07T12:48:28.074 INFO resolver.ResolveNow() {"lbgrpc": "1i855hu", "targets": "192.168.19.220:443,192.168.25.23:443,192.168.26.25:443,192.168.29.142:443,192.168.29.144:443,192.168.16.233:443,192.168.17.106:443,192.168.17.107:443,192.168.17.30:443,192.168.19.213:443,192.168.19.214:443,192.168.19.215:443,192.168.19.217:443,192.168.19.218:443", "calls-last-sec": 2}
2025-03-07T12:48:28.074 INFO [core] [Channel #1 SubChannel #2]Subchannel picks a new address "192.168.19.218:443" to connect
2025-03-07T12:48:28.074 INFO [core] [Channel #1]Resolver state updated: { "Addresses": [ { "Addr": "192.168.19.220:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.25.23:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.26.25:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.142:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.144:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.16.233:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.106:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.107:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.30:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.213:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.214:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.215:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.217:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.218:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": null, "ServiceConfig": null, "Attributes": null} ()
2025-03-07T12:48:28.074 INFO [pick-first-lb] [pick-first-lb 0xc000a37d70] Received new config { "shuffleAddressList": false}, resolver state { "Addresses": [ { "Addr": "192.168.19.220:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.25.23:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.26.25:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.142:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.144:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.16.233:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.106:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.107:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.30:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.213:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.214:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.215:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.217:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.218:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": null, "ServiceConfig": null, "Attributes": null}
2025-03-07T12:48:28.074 INFO [core] [Channel #1 SubChannel #2]addrConn: updateAddrs addrs (5 of 14): [{Addr: "192.168.19.220:443", ServerName: "", } {Addr: "192.168.25.23:443", ServerName: "", } {Addr: "192.168.26.25:443", ServerName: "", } {Addr: "192.168.29.142:443", ServerName: "", } {Addr: "192.168.29.144:443", ServerName: "", }]
2025-03-07T12:48:28.075 INFO [core] Creating new client transport to "{Addr: \"192.168.19.218:443\", ServerName: \"lb-resolver\", }": connection error: desc = "transport: Error while dialing: dial tcp 192.168.19.218:443: operation was canceled"
2025-03-07T12:48:28.075 WARN [core] [Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "192.168.19.218:443", ServerName: "lb-resolver", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 192.168.19.218:443: operation was canceled"
2025-03-07T12:48:28.075 INFO resolver.ResolveNow() {"lbgrpc": "1i855hu", "targets": "192.168.25.23:443,192.168.26.25:443,192.168.29.142:443,192.168.29.144:443,192.168.16.233:443,192.168.17.106:443,192.168.17.107:443,192.168.17.30:443,192.168.19.213:443,192.168.19.214:443,192.168.19.215:443,192.168.19.217:443,192.168.19.218:443,192.168.19.220:443", "calls-last-sec": 3}
2025-03-07T12:48:28.075 INFO [core] [Channel #1]Resolver state updated: { "Addresses": [ { "Addr": "192.168.25.23:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.26.25:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.142:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.144:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.16.233:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.106:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.107:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.30:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.213:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.214:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.215:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.217:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.218:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.220:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": null, "ServiceConfig": null, "Attributes": null} ()
2025-03-07T12:48:28.075 INFO [core] [Channel #1 SubChannel #2]Subchannel picks a new address "192.168.19.220:443" to connect
2025-03-07T12:48:28.075 INFO [pick-first-lb] [pick-first-lb 0xc000a37d70] Received new config { "shuffleAddressList": false}, resolver state { "Addresses": [ { "Addr": "192.168.25.23:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.26.25:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.142:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.29.144:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.16.233:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.106:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.107:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.17.30:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.213:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.214:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.215:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.217:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.218:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null }, { "Addr": "192.168.19.220:443", "ServerName": "", "Attributes": null, "BalancerAttributes": null, "Metadata": null } ], "Endpoints": null, "ServiceConfig": null, "Attributes": null}
2025-03-07T12:48:28.075 INFO [core] [Channel #1 SubChannel #2]addrConn: updateAddrs addrs (5 of 14): [{Addr: "192.168.25.23:443", ServerName: "", } {Addr: "192.168.26.25:443", ServerName: "", } {Addr: "192.168.29.142:443", ServerName: "", } {Addr: "192.168.29.144:443", ServerName: "", } {Addr: "192.168.16.233:443", ServerName: "", }]
see attached log, the relevant portion (reproduced above) starts at around PROBLEM MANIFESTS AROUND HERE
text.
sample.2025-03-07.12-54-08.log.gz
the only unusual entry in the log that i see before this ResolveNow()
infinite loop but don't normally find in the logs on any other net transients (which are handled gracefully) is this one (UPDATED): [core] blockingPicker: the picked transport is not ready, loop back to repick
loopyWriter exiting with error: connection error: desc = "keepalive ping failed to receive ACK within timeout".
potentially relevant gRPC client options:
kal := keepalive.ClientParameters{
Time: 10 * time.Second,
Timeout: 10 * time.Second,
PermitWithoutStream: true,
}
dialBackoffConfig := backoff.Config{
BaseDelay: 10 * time.Millisecond,
Multiplier: 5,
Jitter: 0.1,
MaxDelay: 3 * time.Second,
}
cp := grpc.ConnectParams{
Backoff: dialBackoffConfig,
MinConnectTimeout: 1 * time.Second,
}
opts := []grpc.DialOption{
grpc.WithBlock(),
grpc.WithDisableRetry(),
grpc.WithDefaultCallOptions(grpc.WaitForReady(true)),
grpc.WithKeepaliveParams(kal),
grpc.WithConnectParams(cp),
grpc.WithResolvers(customResolver),
}