chore: code review fixes for DHCP races, atomic state, and platform robustness#3
Merged
Conversation
handleRequest used to call pool.isFree() and then pool.markUsed() under
two separate lock acquisitions, so two concurrent REQUESTs for the same
free IP could both pass the free check and both commit, producing a
double-allocated IP and a phantom lease until eviction. tryClaim folds
the check-and-commit into one critical section so exactly one racer
wins; the loser is NAKed.
While here, drop the platform.IP4ToUint32(v4.String()) round-trip on the
hot pool path and pack the 4-byte IPv4 directly with
binary.BigEndian.Uint32, and switch pool.used to map[uint32]struct{}
since the bool value was unused. Add pool concurrency tests covering
the race, idempotency, and out-of-pool input.
The previous Save was a single os.WriteFile, so a crash mid-write would leave behind a half-written pool.json that Load would refuse to parse, forcing a manual recovery. Mirror dhcp/lease.go's pattern: write to pool.json.tmp first and rename over pool.json so the final file is either complete or unchanged. Add a pool_test.go covering Save/Load roundtrip, the atomic-write recovery path (a bogus .tmp must be ignored by Load), the missing-file case, and idempotent Delete.
cleanupLoop and handleRelease used to call pool.release before delRoute. Releasing first opens a tight window: a concurrent DISCOVER can claim the freed IP and install its own /32 route, only for our late delRoute to remove that brand-new route and blackhole the new lease's traffic. Swap the order so the route teardown completes against the lease that owned it, then the IP returns to the free pool.
leaseStore.add used to silently overwrite an existing entry when the same MAC ACKed a different IP. The old IP's /32 route and pool.used slot were left dangling — the route blackholed nothing useful but locked the IP out of reallocation until cleanupLoop noticed the original expiry, which never happens because the entry was overwritten. Replace the []string return with []evictedLease that carries both the displaced MAC and IP. handleRequest now distinguishes the two cases: same-MAC-different-IP triggers delRoute + pool.release for the old IP (orphan cleanup); other-MAC-same-IP keeps the route (still valid for the new lease) and just logs. Add lease_test.go covering both eviction paths, renewal (no eviction), and the save/load roundtrip.
`gcloud compute instances network-interfaces update --aliases=...` is a full replacement, so assignAliasIP used to overwrite any aliases the operator had configured on nic0 (or that a previous run of cocoon-net left in place). Mirror Teardown's read-modify-write: describe nic0 first, return early if our entry is already bound, otherwise append to the existing list and write back the union. This makes init/adopt safe to re-run and safe on shared-NIC topologies.
Volcengine.env was only ever written. The `ve` child binary is the sole consumer and reads credentials/region from the process environment loadEnv exports, not from the parent struct. Strip the field and rewrite loadEnv to return just an error: the side effect (os.Setenv) is now its declared purpose, which matches what callers actually rely on.
teardown deleted pool.json but left the daemon's leases.json behind in the same state dir. A subsequent init that recreated pool.json would then have the daemon restore stale leases (and stale /32 routes) for the new pool. Remove leases.json alongside pool.json; NotFound is ignored so teardown stays idempotent.
A bare or whitespace-only --dns flag (or trailing commas like "a,")
used to yield []string{""} or a slice with empty entries, which then
slipped through parseIPs as zero valid IPs and was silently treated
as "no DNS configured" instead of failing or falling back. Filter
empties after TrimSpace and return nil for the all-empty case so the
existing len()==0 check in daemon.go takes the fallback path.
Add helpers_test.go covering the empty, whitespace-only, mixed-empty,
and well-formed cases.
SubnetIPs used to swallow gateway parse errors (an empty or malformed gateway just meant "every host IP is fair game") and would happily hand out the broadcast address as a usable host. Reject an unparseable gateway up front so misconfigured callers fail fast, and stop iteration before the broadcast so /24 and /28 pools never lease out network|~mask. Add ip_test.go with /24 and /28 round-trips that pin the gateway, broadcast, and network-address exclusions plus error coverage for empty/invalid gateway, invalid CIDR, and IPv6 input.
Previously, the first sysctl write that failed (typically a per-interface rp_filter entry for an ENI that had been detached out-of-band) returned an error and aborted node.Setup, leaving the daemon unrunnable until an operator cleared the stale interface reference. Log the failure and continue so transient stale entries do not block startup; the iptables/route layer surfaces real connectivity problems if any. Drop the now-unused error return. Drive-by: check commonlog.Setup's error in cmd.root so the lint suite (which surfaced the pre-existing errcheck violation when the cache was cleared) stays clean.
CI was failing because the pinned cocoon-common v0.2.1 (eaa6e94) ships log.Setup as void, but cmd/root.go in this branch now checks its returned error. The newer pseudo-version (4c7f78a) carries the fix(log)! breaking change that returns the error, so cocoon-net builds again on CI.
Drop comments that restate the code or narrate the fix: - evictedLease godoc collapses to bullet-only WHY (no rehash of cleanup rules already encoded in request.go) - request.go inline now refers to evictedLease instead of duplicating the orphan-cleanup explanation - pool_test/lease_test test-name doc comments lose the 'with the old isFree+markUsed split' / 'finding flags' narration - volcengine env.go loadEnv godoc trims the redundant 'returns no value because' clause
Picks up the latest chore-code-review-202605 head from cocoon-common so cocoon-net tracks the same code-review pass (commonlog.Setup error return, DetectNodeIP error return, HasCocoonTolerationKey rename, OwnerDeploymentName (string, bool)). No source-level changes are needed here — cmd/root.go and the platform helpers in this branch already match the new signatures.
Split sysctl tuning into required global keys (ip_forward, all/default/cni0 rp_filter) and best-effort per-interface keys. A write failure on a required key now returns an error from Setup instead of being logged-and-continued, so the daemon won't come up "successfully" with broken forwarding. Also trim verbose comments across node/cmd packages.
There was a problem hiding this comment.
Pull request overview
This PR applies a set of correctness and robustness fixes across the DHCP server, persisted state handling, and cloud-platform integration, with new unit tests added to cover previously untested, race-prone paths.
Changes:
- Fix DHCP REQUEST race conditions by atomically claiming pool IPs and improving lease eviction cleanup (including route/pool orphan cleanup on same-MAC rebind).
- Make pool state persistence crash-safe via atomic tmp+rename writes and expand teardown cleanup to remove persisted leases.
- Improve platform behavior (GKE alias IP updates without clobbering existing aliases; Volcengine env loading as a side-effect-only path; safer SubnetIPs generation) and add focused tests.
Reviewed changes
Copilot reviewed 22 out of 23 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| pool/pool.go | Make pool state Save atomic using tmp + rename; clarify Load behavior with leftover tmp files. |
| pool/pool_test.go | Add Save/Load roundtrip + atomic tmp ignore tests and Delete idempotency coverage. |
| platform/volcengine/volcengine.go | Remove unused env field; New() now loads env once and returns an empty handle. |
| platform/volcengine/env.go | Convert loadEnv into side-effect-only env exporter (no returned config struct). |
| platform/ip.go | Harden SubnetIPs (gateway parsing validation, IPv4-only, skip broadcast). |
| platform/ip_test.go | Add coverage for SubnetIPs (broadcast/gateway/network exclusions, error cases) + existing helpers. |
| platform/gke/gcloud.go | Prevent alias clobbering by read-modify-write merge of nic0 aliases; no-op if already present. |
| node/sysctl.go | Split required vs best-effort sysctl writes; log-and-continue for per-interface sysctls. |
| node/node.go | Adjust Setup doc comments to reflect ordering constraints more succinctly. |
| go.mod | Bump cocoon-common dependency version. |
| go.sum | Update checksums to match the bumped cocoon-common version. |
| dhcp/request.go | Use atomic pool.tryClaim to eliminate REQUEST TOCTOU; release claims on build/route failure; cleanup orphaned IPs on same-MAC rebind. |
| dhcp/release.go | Fix ordering: delete /32 route before releasing IP back to pool. |
| dhcp/pool.go | Use uint32 key directly from packed IPv4; add tryClaim; change used map to set semantics. |
| dhcp/pool_test.go | Add concurrency/idempotency tests for tryClaim and allocation behavior. |
| dhcp/lease.go | Return structured eviction info (MAC+IP) from add() to enable cleanup on rebind/conflict. |
| dhcp/lease_test.go | Add tests for eviction cases and save/load roundtrip behavior. |
| dhcp/cleanup.go | Fix ordering in cleanup: delete /32 route before releasing IP for expired leases. |
| cmd/teardown.go | Remove leases.json alongside pool.json during teardown. |
| cmd/root.go | Handle and report commonlog.Setup errors instead of ignoring them. |
| cmd/helpers.go | Make splitTrim drop empty entries and return nil for empty/whitespace-only input. |
| cmd/helpers_test.go | Add table tests for splitTrim behavior. |
| cmd/daemon.go | Minor comment adjustments (no functional change). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
ipKey: a 16-byte net.IP doesn't read out-of-bounds; it reads from the wrong offset and yields a junk key. Rewording reflects the real failure. SubnetIPs: comment claimed iteration stops "one address before broadcast"; actual code iterates through and skips bcast via `continue`. Trim the doc to describe what the code does.
assignAliasIP: re-running with a different --subnet would leave the old cocoon-pods alias in place and add a second one. Drop existing entries under our range name before merging so a new cidr replaces the old; keep operator-managed entries under other names. SubnetIPs: reject a gateway outside the CIDR or equal to network / broadcast. The skip-on-equality only mattered when the gateway was a real host; misconfigs now fail fast instead of silently producing a pool that includes the actual gateway.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
A batch of correctness and resilience fixes surfaced by code review. The headline change is the DHCP REQUEST race (a TOCTOU between
isFreeandmarkUsedthat let two clients win the same IP); the rest are smaller robustness fixes on the state, route, and platform paths.DHCP (race + ordering)
pool.tryClaimfolds the validate-and-commit into one critical section so two concurrent REQUESTs for the same free IP cannot both pass. Also drop theIP4ToUint32(v4.String())round-trip and key the pool directly withbinary.BigEndian.Uint32on the 4-byte form./32route before releasing the IP incleanupLoopandhandleRelease; releasing first opened a window where a concurrent DISCOVER could install its own route and have ours wipe it.leaseStore.addnow returns[]evictedLease(MAC + IP) so the caller can clean the orphaned route + pool slot when the same MAC rebinds to a different IP. Previously the entry was silently overwritten and the old IP was stranded.State (atomic + cleanup)
pool.State.Savewrites topool.json.tmpand renames overpool.json— mirrors the lease store. A crash mid-write no longer leaves a half-written file Load refuses to parse.cmd teardownremovesleases.jsonalongsidepool.jsonso re-init doesn't restore stale leases.Platform
assignAliasIPreads existing nic0 aliases, returns early if our entry is already bound, otherwise appends —gcloud --aliases=...is a full replacement, so previously a re-run clobbered operator-configured aliases.envfield that was only ever written;loadEnvis now a side-effect-only function that exports env vars for thevechild binary (its only consumer).platform.SubnetIPsvalidates the gateway (errors on empty/malformed) and stops iteration before the broadcast.CLI / misc
splitTrimfilters empty entries so an empty--dnsdoesn't slip through as[""].node.setupSysctllogs-and-continues on per-key write failure (per-interface keys vanish when ENIs are detached out-of-band; aborting startup on those was painful).commonlog.Setup's error incmd.root(pre-existing lint issue surfaced when the cache was cleared).Tests
0 unit tests existed before; added focused coverage for the most race-prone and untested areas:
dhcp/pool_test.go—tryClaimconcurrency + idempotencydhcp/lease_test.go— eviction paths (same-MAC-rebind, other-MAC-same-IP, expired-other-MAC), save/loadpool/pool_test.go— Save/Load roundtrip, atomic-write recovery (bogus.tmpmust be ignored)cmd/helpers_test.go—splitTrimtable testsplatform/ip_test.go—SubnetIPsfor /24 + /28 with explicit broadcast / gateway / network exclusionsTest plan
make lintpasses onGOOS=linuxandGOOS=darwin(verified locally — 0 issues both targets)go test -race -count=1 ./...clean (verified locally)GOOS=linuxandGOOS=darwinpool.json/leases.jsonbehind