doctor: add QUIC reachability check for the API port #819
The existing HTTPS/HTTP probes in `miren doctor server` only cover TCP. A host firewall (UFW, firewalld, cloud security group) that allows TCP but not UDP on the API port — a common pitfall when operators add `8443/tcp` rules out of habit — silently drops every API request while passing the existing endpoint checks. Adds a `[✓|✗] QUIC` endpoint to the same Endpoints block, modeled on the existing checkHTTPS/checkHTTP helpers. Dials QUIC with the same ALPN the real server advertises (http3.NextProtoH3) so a successful handshake also exercises the TLS path. On timeout, the failure detail points directly at the most likely cause. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Walkthrough: This PR extends the doctor server command with QUIC/UDP connectivity diagnostics for the cluster API. The implementation adds the necessary imports for context and QUIC libraries.
Thanks @jcasimir!
Summary
Adds a `[✓|✗] QUIC` endpoint check to `miren doctor server`, alongside the existing HTTPS/HTTP checks.

The existing endpoint probes only cover TCP. If a host firewall (UFW, firewalld, cloud security group) allows TCP on the API port but not UDP, a common configuration when operators add `8443/tcp` rules out of habit, every external `miren deploy` and `miren cluster export-address` silently fails with the client-side timeout `error performing http request: timeout: no recent network activity`. The server side shows zero observable signal: no errors, no warnings, no requests visible in `miren logs system`, because packets are dropped at netfilter INPUT before reaching the socket.

This is documented in `docs/docs/firewall.md` but is easy to miss, and the existing `miren doctor server` shows all green for HTTPS/HTTP while the actual API port is unreachable.

What it does

Adds `checkQUIC(addr string) (bool, string)` modeled on `checkHTTPS`/`checkHTTP`:

- Dials QUIC with ALPN `http3.NextProtoH3` (the same ALPN the real RPC server advertises in `pkg/rpc/state.go`), so a successful handshake also exercises the TLS path.
- On `context.DeadlineExceeded` or the QUIC idle-timeout error, the detail message points directly at the most likely cause (a host firewall blocking inbound UDP).
- […] `cli/commands/cluster_add.go`.

Usage
No CLI change. Just run `miren doctor server` as before.
Output now includes the QUIC line in the Endpoints block.
Test plan
- `go build ./cmd/miren`: compiles
- `go vet ./cli/commands/`: clean
- `golangci-lint run --timeout=10m ./cli/commands/...`: 0 issues
- `[✓] QUIC handshake ok`
- `[✗] QUIC no response on udp/8443 (host firewall may be blocking inbound UDP)`

No new tests in `doctor_server_test.go`: the existing `checkHTTPS`/`checkHTTP` helpers also have no tests and I didn't want to grow this PR's scope. Happy to add a table-driven test in a follow-up if maintainers want one.

Followups (out of scope here)
Things I considered and deliberately left out to keep the patch focused:

- […] `miren server` start that mentions UDP is the API port, surfaced in journalctl.
- […] `showFixConnectivityHelp` to specify `lsof -i UDP:8443` (currently only suggests `:443` TCP).
- […] `miren server install` time.

Happy to PR any of these separately if there's interest.
🤖 Generated with Claude Code