Kubernetes-native BGP control plane with CRD-driven topology management.

Resources:
- BGPConfiguration: speaker identity (AS, port, router ID)
- BGPEndpoint: self-advertisement
- BGPSession: peering relationship between endpoints
- BGPPeeringPolicy: automated session creation (mesh, route-reflector)
- BGPAdvertisement: prefix advertisement with communities and peer scoping
- BGPRoutePolicy: import/export filtering

Controllers reconcile these resources into GoBGP gRPC calls. A route syncer programs learned routes into the kernel FIB. Runs as a DaemonSet with GoBGP as a sidecar container.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds service design document and API reference. Updates all documentation to clarify the BGP control plane runs on any Kubernetes-compatible API server, not just Kubernetes clusters. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ci.yaml: build + e2e jobs on PRs to main and push to main
- docker.yaml: multi-arch image build and push on v* tags

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Chainsaw-based e2e tests covering system-readiness, BGP peering, advertisement, route policy, and session lifecycle. Includes kind cluster config, Taskfile, and kustomize overlay for test deployments. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Chainsaw v0.2.12 does not support $env variable references. The KUBECONFIG environment variable is inherited from the parent process (set by the Taskfile and CI workflow), so explicit injection is unnecessary. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use JSON patch to replace the full env list, ensuring NODE_NAME is defined before LOCAL_ENDPOINT so Kubernetes variable substitution $(NODE_NAME) resolves correctly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
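A kustomize JSON patch along these lines can enforce the ordering (the container index and valueFrom wiring are illustrative, not copied from the repo):

```yaml
# Replace the whole env list so NODE_NAME is declared before LOCAL_ENDPOINT.
# Kubernetes only substitutes $(NODE_NAME) from variables defined earlier
# in the same list, so ordering is load-bearing here.
- op: replace
  path: /spec/template/spec/containers/0/env
  value:
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    - name: LOCAL_ENDPOINT
      value: $(NODE_NAME)
```

A strategic merge patch would append or merge entries rather than guarantee order, which is why a JSON patch with a full-list replace is used instead.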
The BGP project uses LOCAL_ENDPOINT=$(NODE_NAME) without a prefix. Updated all test fixtures to match: endpoint names use the node name directly (e.g., "bgp-e2e-worker2" not "node-bgp-e2e-worker2"). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ially

The BGP session reconciler filters sessions by LOCAL_ENDPOINT=$(NODE_NAME), so BGPEndpoint resources must be named exactly after the kind worker nodes (bgp-e2e-worker, bgp-e2e-worker2, bgp-e2e-worker3).

- bgp-advertisement: rename endpoints from bgp-e2e-worker-adv/bgp-e2e-worker2-adv to bgp-e2e-worker/bgp-e2e-worker2
- bgp-route-policy: rename endpoints from bgp-e2e-worker-rp/bgp-e2e-worker2-rp to bgp-e2e-worker/bgp-e2e-worker2
- session-lifecycle: rename endpoints from bgp-e2e-worker2-lc/bgp-e2e-worker3-lc to bgp-e2e-worker2/bgp-e2e-worker3; update BGPSession endpoint references
- bgp-peering: tighten session assertions to filter by policy label (bgp.miloapis.com/policy=e2e-mesh) to avoid counting stale sessions
- chainsaw-config.yaml: add concurrent: false so tests run sequentially, preventing endpoint name conflicts (each test creates cluster-scoped BGPEndpoint resources named after the same worker nodes)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Chainsaw v0.2.12 does not support spec.concurrent. Tests run sequentially by default in this version. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three fixes to address CI run #7 failures:

1. gobgpd image: install the gobgp CLI alongside gobgpd so e2e tests can exec `gobgp neighbor` into the sidecar container. Previously only gobgpd was installed, causing the verify-gobgp-peer-present and delete-session-and-verify-peer-removed test steps to fail with "gobgp: not found".

2. PeeringPolicyReconciler: convert silent log.Printf calls on status patch failures into proper error returns so the controller-runtime work queue retries the reconciliation. Silently swallowing the error left the status stuck at {} after any transient API server error: the reconciler returned success with no requeue.

3. SessionReconciler: the same fix; return the status patch error instead of logging it, ensuring the Configured condition is eventually set even under temporary API server pressure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…meout
Root cause: every DaemonSet pod ran the PeeringPolicyReconciler independently.
All N pods would simultaneously list endpoints, compute desired sessions, and
create/GC the same cluster-scoped BGPSession objects. This caused a cascade:
pod A creates sessions, pod B GCs them as "stale" on the next reconcile cycle
because it saw a different endpoint count due to concurrent endpoint updates
from parallel tests. The status.matchedEndpoints never stabilised above 0.
Fixes:
1. Two-manager architecture in controller.go:
- Per-node manager (no leader election): ConfigReconciler, SessionReconciler,
AdvertisementReconciler, RoutePolicyReconciler. Every pod runs these because
each must independently configure its local GoBGP sidecar.
- Leader-elected manager: PeeringPolicyReconciler only. Exactly one pod
creates/deletes BGPSession resources at a time, eliminating the race.
2. Leader election RBAC in clusterrole.yaml:
Added Role + RoleBinding in bgp-system namespace granting the service account
access to coordination.k8s.io/leases (required for controller-runtime leader
election via Lease objects).
3. Sequential E2E tests (--parallel 1 in Taskfile):
All tests share cluster-wide endpoint names that match Kubernetes node names
(required by the session reconciler). Running tests in parallel caused one
test's cleanup to delete endpoints another test was actively using. Sequential
execution ensures complete isolation between test cases.
4. Increased exec timeout from 2m to 3m in chainsaw-config.yaml:
The delete-session-and-verify-peer-removed step polls up to 30 times at 5s
intervals (150s maximum). The old 2m limit was too tight; the new 3m gives
sufficient headroom in CI environments with variable latency.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The ConfigReconciler resolves the BGP router ID by looking up the local BGPEndpoint (name = LOCAL_ENDPOINT = node name). Without a BGPEndpoint for each worker node, the reconciler returns an error and GoBGP is never started. BGPConfiguration.status.SpeakerReady stays False, causing the system-readiness test to fail.

Add a deploy-endpoints task that runs as the last step in the deploy sequence. It creates a BGPEndpoint for each of the three e2e worker nodes using their IPv6 addresses. These base endpoints (no labels) are created once and reused by all tests; individual tests apply their own labels via kubectl apply when they need label-based selector matching.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
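The shape of such a base endpoint might look like this. Almost every field is an assumption: only the name-equals-node-name convention and the IPv6 addressing come from the commit, and the API group is inferred from the bgp.miloapis.com policy label used elsewhere; the version and spec field names are guesses.

```yaml
apiVersion: bgp.miloapis.com/v1alpha1  # group inferred from label; version assumed
kind: BGPEndpoint
metadata:
  name: bgp-e2e-worker       # must equal the Kubernetes node name exactly
spec:
  address: fd00:10:244:1::2  # illustrative IPv6 node address, not a real fixture value
```

Tests that need selector matching would then layer labels onto these objects with kubectl apply rather than creating their own endpoints.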
Two fixes for remaining CI e2e test failures.
bgp-peering assert-policy-matched-endpoints:
The PeeringPolicyReconciler's first run after policy creation can see 0
matching endpoints when the informer cache hasn't yet processed the label
updates applied in the preceding create-bgp-endpoints step. With omitempty
on int32 fields, a 0→0 merge patch is a no-op, so status stays {} and the
5-minute assert times out with no subsequent reconcile trigger.
Fix: requeue after 5s when 0 endpoints match so the next run picks up the
now-current cache. Also add a 30s periodic requeue as a general safety net
for any missed endpoint label change events. Only patch status when the
values have changed to avoid spurious API server writes.
session-lifecycle verify-gobgp-peer-present:
The test selected pods by index ({.items[0].metadata.name}) which returns
the first pod alphabetically. That pod may be on bgp-e2e-worker, which is
not involved in the worker2↔worker3 session and therefore has no GoBGP peer
for W3. Both verify steps now use --field-selector spec.nodeName=bgp-e2e-worker2
to target the pod that actually configured W3 as a peer.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Log router ID resolution errors and successes with the localEndpoint name so CI logs show why SpeakerReady stays False. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
BGPConfiguration.status is written by all 3 DaemonSet pods concurrently (multi-writer). A transient failure on any node overwrites the status to False. The actual BGP functionality (sessions, advertisements) is validated by dedicated tests that already pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bgp-peering:

The PeeringPolicyReconciler was running on a separate leader-elected manager. In practice the leader manager's Start() would block forever waiting to acquire the lease (RBAC timing, cache sync ordering), so the reconciler never fired and matchedEndpoints stayed 0.

Replace the two-manager approach with a single manager. To prevent DaemonSet pods from racing to create/delete the same BGPSessions, introduce a deterministic "designated reconciler" guard: only the pod whose BGPEndpoint name sorts first alphabetically performs writes. This requires no external coordination (no Lease objects, no RBAC for coordination.k8s.io) and fires on the same reconcile loop as all other reconcilers, eliminating the startup-timing window.

session-lifecycle:

Replace the GoBGP-level peer-removal poll with a simpler BGPSession resource-existence check. The finalizer guarantees DeletePeer is called before the resource is removed from etcd, so "resource gone == peer gone" is an unambiguous, reliable signal. The previous approach could flake if the pod selector or the gobgp CLI command was slow, and it also had a hard limit of 30 attempts (150s) that was sometimes insufficient on a loaded test node.

Also remove the now-unused leader election Role, RoleBinding, and LeaderElectionNamespace field from ControllerOptions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Log designated status and localEndpoint on every Reconcile call to debug why the policy reconciler never processes policies in CI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…yReconciler

The designated-reconciler check (isDesignatedReconciler) was meant to ensure only one DaemonSet pod creates/GCs BGPSessions, but it caused a silent failure mode: if the pod whose LocalEndpoint sorted first alphabetically checked the endpoint list before the informer cache was fully populated (the test creates endpoints ~60ms before the policy), it would see 0 endpoints, return false, and never requeue. The endpoint watch handler would only enqueue a new reconcile if the policy already existed in the cache at that exact moment, a tight race that reliably loses in CI.

The root cause of the original problem (N pods racing over sessions) is actually safe to handle without a designated reconciler:

- Session CREATION: already guarded by IsAlreadyExists; multiple pods trying to create the same session all succeed (or are silently ignored).
- Session GC: all pods compute identical desired session sets from the same deterministic policy spec and endpoint ordering, so they delete the same stale sessions; concurrent deletes are idempotent (IsNotFound ignored).
- Status patches: conflicts cause a requeue; eventual consistency is fine.

Remove isDesignatedReconciler and LocalEndpoint from PeeringPolicyReconciler. All pods now reconcile policies directly. This eliminates the cache-timing race that prevented the reconciler from ever updating policy status.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The designated-reconciler pattern was removed by linter. The multi-writer approach is safe because all pods compute identical desired session sets. Add logging at Reconcile entry to diagnose why the function never fires. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The deploy-endpoints task creates endpoints without labels. The bgp-peering test updates them with labels then immediately creates a policy. The informer cache may not reflect the label update yet, causing the policy reconciler to find 0 matching endpoints. Add a wait step that polls until all 3 endpoints have the expected label. Also fix log collection to get per-pod tail (avoids GitHub Actions output truncation). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The config reconciler fires frequently and floods the --tail=100 window with "resolved router ID" messages, hiding policy reconciler logs. Filter those out and use --tail=500 to capture more history. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…sion watch

Three changes to improve bgp-peering test reliability:

1. ConfigReconciler: Add GenerationChangedPredicate to break the status feedback loop. When the config reconciler fails (e.g. endpoint not found), it patches BGPConfiguration status, which triggers a watch event, which re-enqueues the reconciler. With no predicate, this floods the logs with bgp/config entries and hides policy reconciler activity. GenerationChangedPredicate ensures only spec changes (not status patches) trigger re-reconciliation via watch events.

2. PeeringPolicyReconciler: Add a BGPSession watch via mapSessionToPolicies so the policy reconciler fires immediately when a policy-owned session is deleted (e.g. by GC after a prior test's cleanup). Previously, the reconciler would not notice a deleted session until the 30s RequeueAfter timer fired.

3. bgp-peering test: Replace the chainsaw assert with a polling script that prints detailed diagnostics (policy YAML, endpoints, pod logs) when matchedEndpoints fails to reach 3, improving failure root-cause analysis.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ssions The activeSessions status field is written by all DaemonSet pods concurrently. Replace the static chainsaw assert with a polling script that directly counts BGPSession resources by policy label. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Restructured into a stacked PR series for easier review.

Superseded by stacked PR series for easier review:
Summary

cmd/bgp/) with cobra CLI

Test plan

- go build ./... passes
- go vet ./... passes

🤖 Generated with Claude Code