test(e2e): add Cilium-CNI e2e suite#491
Open
kvaps wants to merge 9 commits into squat:main from
This was referenced Apr 28, 2026
fbfd0c0 to dfe4661
Mirrors e2e/full-mesh.sh and e2e/location-mesh.sh for the new
--mesh-granularity=cross mode introduced by the preceding commits.
setup_suite annotates the kind nodes into two locations (control-plane
and the first worker as loc-a, the second worker as loc-b) so the test
exercises the case "cross" is meant to handle: direct WireGuard tunnels
between locations, native CNI inside a location.

Tests:
- test_cross_mesh_connectivity: pings + adjacency matrix
- test_cross_mesh_peer: kgctl peer create/showconf
- test_mesh_granularity_auto_detect: kgctl graph auto-detection
- test_cross_peer_topology: sanity that loc-a nodes see only the loc-b
  node as a peer (and vice versa), distinguishing "cross" from "full"
  (where every node is a peer) and "location" (where non-leaders have
  no peers at all)

The new suite is wired into the existing e2e make target between
location-mesh.sh and multi-cluster.sh.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
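The node-to-location split described for setup_suite can be sketched roughly as follows. This is an illustration only, not the code from the commit: the helper names location_for_index and annotate_cmd are hypothetical, the node names follow kind's default naming, and the sketch assumes locations are assigned through Kilo's kilo.squat.ai/location node annotation. The helpers only print the kubectl command so the mapping can be checked without a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not taken from the PR).
# Index 0 = control-plane, 1 = first worker, 2 = second worker.
location_for_index() {
    case "$1" in
        0|1) echo "loc-a" ;;
        2)   echo "loc-b" ;;
        *)   echo "unknown node index: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the kubectl command that would place a node in its
# location. Assumes the kilo.squat.ai/location annotation is used.
annotate_cmd() {
    local node="$1" index="$2" loc
    loc="$(location_for_index "$index")" || return 1
    echo "kubectl annotate node ${node} kilo.squat.ai/location=${loc} --overwrite"
}

# Example (node names assume kind's default cluster name):
#   annotate_cmd kind-control-plane 0
#   annotate_cmd kind-worker 1
#   annotate_cmd kind-worker2 2
```

Piping the printed commands to a shell, or replacing echo with a direct call, would apply the annotations on a real cluster.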
The "cross from {a,b,c,d}" test cases added in 3590b12 predate the
cniCompatibilityIPs field on segment, introduced by Cilium support
in squat#409. Each segment in the cross test cases describes a single
node, so the expected value mirrors the existing full/location
cases: []*net.IPNet{nil}.
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
The cross granularity intentionally removes the WireGuard tunnel
between nodes that share a location and relies on the underlying CNI to
carry intra-location pod traffic over its own overlay (e.g. Cilium
VXLAN). The bridge CNI used by the e2e harness has no such overlay, so
check_ping/check_adjacent cannot succeed on this cluster: they were
timing out trying to reach the same-location worker.

Keep the topology checks (peer count per node, kgctl graph auto-detect,
kgctl peer create), which validate the cross routing logic without
depending on a CNI overlay. End-to-end connectivity under cross is
covered by the Cilium-CNI suite added separately.

Also clean up the location annotations in teardown_suite so the suites
that follow (multi-cluster, handlers, kgctl) start from the same
node-annotation state they used to.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
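To make the "peer count per node" check concrete, the expected number of peers under cross can be derived from the location assignment alone: a node peers with every node outside its own location. The function below is a hypothetical sketch of that arithmetic, not the assertion used in the suite; the name expected_cross_peers and the node:location argument format are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the PR).
# usage: expected_cross_peers <node> <node:location>...
# Prints how many peers <node> should have under cross granularity,
# i.e. the number of nodes whose location differs from its own.
expected_cross_peers() {
    local self="$1"; shift
    local self_loc="" entry count=0

    # Find the location of the node we are asking about.
    for entry in "$@"; do
        if [ "${entry%%:*}" = "$self" ]; then
            self_loc="${entry#*:}"
        fi
    done
    [ -n "$self_loc" ] || return 1

    # Count every node that lives in a different location.
    for entry in "$@"; do
        if [ "${entry#*:}" != "$self_loc" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Example with the three-node layout from the suite:
#   expected_cross_peers cp cp:loc-a w1:loc-a w2:loc-b
#   expected_cross_peers w2 cp:loc-a w1:loc-a w2:loc-b
```

With the layout above, the two loc-a nodes each expect one peer and the loc-b node expects two, which matches the description of test_cross_peer_topology.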
5e22206 to 82e994d
Just removing the location annotations leaves the DaemonSet in
--mesh-granularity=cross. The handler tests that follow assume the
control-plane WireGuard IP is 10.4.0.1 (the leader of a single-location
mesh) and time out when cross's per-node leader assignment hands that
IP to a different node.

Roll the DaemonSet back to --mesh-granularity=location in the teardown
so the cluster state mirrors what location-mesh.sh leaves behind, which
is the working baseline expected by multi-cluster.sh, handlers.sh and
kgctl.sh.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
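One way to picture the rollback is as a text rewrite of the granularity flag in the DaemonSet manifest. The filter below is a hypothetical sketch and may differ from what teardown_suite actually does; the function name set_granularity, and the namespace and DaemonSet name in the usage comment, are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the PR).
# Reads a manifest on stdin and rewrites any --mesh-granularity=<value>
# flag to the requested value, leaving everything else untouched.
set_granularity() {
    sed -E "s/--mesh-granularity=[a-z]+/--mesh-granularity=$1/"
}

# Possible use in a teardown, assuming the DaemonSet is named "kilo" and
# lives in kube-system:
#   kubectl -n kube-system get ds kilo -o yaml \
#       | set_granularity location \
#       | kubectl apply -f -
```

Whatever mechanism is used, the point of the commit is that both the annotations and the flag have to be reset before the later suites run.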
Adds a new e2e harness that brings up a kind cluster with Cilium as the
CNI (VXLAN overlay, default kube-proxy) and runs Kilo on top with
--cni=false --compatibility=cilium. This validates the cross
granularity + Cilium combination, which is the configuration shipped by
downstream platforms but not exercised by the existing bridge-CNI
suite.

Files:
- e2e/kilo-kind-cilium.yaml: Kilo DaemonSet for the Cilium-CNI cluster
  (no kilo CNI ConfigMap, no install-cni init container, --cni=false,
  --compatibility=cilium)
- e2e/lib.sh: install_cilium() helper (Helm, Cilium 1.16.5, VXLAN
  tunnelProtocol, IPAM kubernetes, host firewall off) and
  create_cilium_cluster()
- e2e/cilium-setup.sh, cilium-cross-mesh.sh, cilium-teardown.sh:
  three-stage suite mirroring the existing setup/...mesh/teardown
  pattern
- Makefile: new `e2e-cilium` target, kept separate from `e2e` because
  the Cilium cluster is incompatible with the Kilo bridge CNI used by
  the default suite

Kube-proxy replacement is intentionally left at the default (off) for
this baseline; KPR coverage can be added in a follow-up.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
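For readers unfamiliar with the Cilium chart, the settings listed for install_cilium() map onto Helm values roughly as below. This is a sketch and not the helper from e2e/lib.sh: the function name cilium_helm_args, the release name, the kube-system namespace and the --wait flag are assumptions, while the version and the three --set values come from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the PR).
# Prints, one per line, the arguments for a Helm install of Cilium with
# the settings named in the commit message. Nothing is executed here.
cilium_helm_args() {
    local version="${1:-1.16.5}"
    printf '%s\n' \
        upgrade --install cilium cilium/cilium \
        --namespace kube-system \
        --version "$version" \
        --set tunnelProtocol=vxlan \
        --set ipam.mode=kubernetes \
        --set hostFirewall.enabled=false \
        --wait
}

# Possible use (kubeProxyReplacement is deliberately not set, so it stays
# at the chart default):
#   helm repo add cilium https://helm.cilium.io/
#   helm $(cilium_helm_args)
```

Printing the arguments instead of calling helm directly keeps the sketch inspectable on machines without Helm or a cluster.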
82e994d to 72eabe0
Adds a new `e2e-cilium` job that mirrors the existing `e2e` job but
runs `make e2e-cilium` against the Cilium-CNI test cluster. Helm is
installed via azure/setup-helm because nscloud runners do not have it
preinstalled and lib.sh's install_cilium uses the Helm Cilium chart.

Without this job the new Cilium e2e harness added in the previous
commit is never executed in CI.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
72eabe0 to 2864940
Summary
Adds an e2e suite that runs Kilo with --compatibility=cilium on top of
a kind cluster where Cilium is the CNI. The existing e2e suites only
cover the Kilo bridge CNI path, so the --compatibility=cilium mode
(used by downstream platforms shipping Kilo + Cilium) currently has no
end-to-end coverage in upstream CI.
Depends on
This PR is stacked on top of #490 (which adds the cross granularity
and its bridge-CNI e2e tests). Once #490 is merged this branch will
rebase down to just the Cilium-specific commit.
What's added
- e2e/kilo-kind-cilium.yaml: Kilo DaemonSet for the Cilium-CNI
  cluster: no kilo CNI ConfigMap, no install-cni init container,
  --cni=false --compatibility=cilium
- e2e/lib.sh: install_cilium() (Helm, Cilium 1.16.5, VXLAN
  tunnelProtocol, ipam.mode=kubernetes, host firewall off) and
  create_cilium_cluster() mirroring create_cluster()
- e2e/cilium-setup.sh, cilium-cross-mesh.sh, cilium-teardown.sh:
  three-stage suite mirroring the bridge setup/...mesh/teardown
  pattern
- Makefile: new `e2e-cilium` target, kept separate from `e2e` because
  the Cilium cluster is incompatible with Kilo's bridge CNI used by
  the default suite
Scope
This is a baseline. Coverage is intentionally narrow:
- cross granularity is covered; full and location with Cilium CNI are
  reasonable follow-ups in the same harness.
- kubeProxyReplacement is off, so this exercises Kilo's Cilium-overlay
  handling without entangling Cilium's eBPF service LB. KPR coverage
  can be added with a values flag in a follow-up.
Validation
The CI runner needs helm available. Most GitHub-hosted Linux runners
have it preinstalled; if not, a setup-helm step will need adding to
the workflow that triggers the new make target. I couldn't run
`make e2e-cilium` locally (Docker Desktop on macOS doesn't reproduce
the Linux kind+Cilium path well), so behaviour will first be observed
in upstream CI; tuning (Cilium version pin, MTU, timeouts) may be
needed once it runs there.
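Since the suite depends on tools that may or may not be present on a given runner, a preflight check along these lines could make a missing helm fail fast with a clear message. This is a hypothetical sketch, not part of the PR; the function name require_tools and the tool list in the usage comment are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the PR).
# Returns non-zero and names each missing tool if any of the given
# commands cannot be found on PATH.
require_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing required tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible use at the top of a setup script:
#   require_tools docker kind kubectl helm || exit 1
```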
Refs