components/linux: add ConfigureStaticRoutes + CheckRouteOverlap #151
Open: chokevin wants to merge 12 commits into Azure:main from chokevin:chokevin/configure-static-routes
Commits
1c595a3 components/linux: add ConfigureStaticRoutes (chokevin)
be0d8aa address review feedback: detect script changes, fail-fast on missing … (chokevin)
00f41a2 address Copilot review: resolve dev dynamically, ip -4 route replace (chokevin)
03e8dde components/linux: add CheckRouteOverlap action (chokevin)
971393e components/linux: require explicit opt-in for static routes (chokevin)
06abb17 components/linux: address Copilot follow-up findings (chokevin)
79973ef components/linux: simplify generated route scripts (chokevin)
fa5a4f9 fix(route-overlap): validate mode and skip fork e2e (chokevin)
ede2b8f refactor(linux): embed route scripts as bash templates (chokevin)
a7ae0a9 revert(ci): keep e2e job enabled on PRs (chokevin)
764e270 fix(linux): handle default-dev routes and masked probes (chokevin)
242eb12 fix(linux): clarify no-route diagnostics and guidance (chokevin)
components/linux/v20260301/assets/check-route-overlap.service
15 changes: 15 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| [Unit] | ||
| Description=AKSFlexNode IPv4 route overlap pre-flight check | ||
| After=network-online.target static-routes.service | ||
| Wants=network-online.target | ||
| Before=kubelet.service | ||
| | ||
| [Service] | ||
| Type=oneshot | ||
| RemainAfterExit=yes | ||
| ExecStart=/bin/bash /etc/aks-flex-node/check-route-overlap.sh | ||
| StandardOutput=journal | ||
| StandardError=journal | ||
| | ||
| [Install] | ||
| RequiredBy=kubelet.service | ||
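As a reading aid, here is a minimal sketch of how an operator might inspect the outcome of this pre-flight check. It relies only on the marker files that the check script in this PR writes under `/run/aks-flex-node` (`route-overlap.ok` and `route-overlap.detected`). The function name `route_overlap_status` and its optional directory argument are hypothetical and not part of the PR.

```bash
# route_overlap_status [dir]: hypothetical helper, not part of the PR.
# Reports the result recorded by the check script via its marker files.
route_overlap_status() {
  local dir="${1:-/run/aks-flex-node}"
  if [ -f "$dir/route-overlap.detected" ]; then
    # The script appends one diagnostic line per failing CIDR to this file.
    echo "detected"
    cat "$dir/route-overlap.detected"
  elif [ -f "$dir/route-overlap.ok" ]; then
    echo "ok"
  else
    # Neither marker exists: the check has not run since the last boot.
    echo "not-run"
  fi
}
```

Called with no argument, the helper reads the real marker directory. Because the unit sends stdout and stderr to the journal, the same diagnostics should also appear in the journal entries for the unit.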
components/linux/v20260301/assets/check-route-overlap.sh.tpl
54 changes: 54 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,54 @@ | ||
| #!/bin/bash | ||
| # Generated by AKSFlexNode CheckRouteOverlap. Do not edit. mode={{ .ModeLabel }} | ||
| set -eu | ||
| PATH=/usr/sbin:/sbin:/usr/bin:/bin:${PATH:-} | ||
| | ||
| mkdir -p /run/aks-flex-node | ||
| rm -f /run/aks-flex-node/route-overlap.detected | ||
| rm -f /run/aks-flex-node/route-overlap.ok | ||
| | ||
| DEFAULT_DEV=$(ip -4 route show default 2>/dev/null | awk '/^default / {for (i=1;i<=NF;i++) if ($i=="dev") {print $(i+1); exit}}') | ||
| if [ -z "$DEFAULT_DEV" ]; then | ||
| echo "check-route-overlap: no IPv4 default route; cannot determine outbound interface" >&2 | ||
| echo "no-default-route" > /run/aks-flex-node/route-overlap.detected | ||
| exit {{ .FailExit }} | ||
| fi | ||
| | ||
| {{- if .HasEntries }} | ||
| bad=0 | ||
| while IFS='|' read -r CIDR PROBE; do | ||
| [ -z "$CIDR" ] && continue | ||
| ACTUAL=$(ip -4 route get "$PROBE" 2>/dev/null | awk '{for (i=1;i<=NF;i++) if ($i=="dev") {print $(i+1); exit}}') | ||
| if [ -z "$ACTUAL" ]; then ACTUAL="<no-route>"; fi | ||
| if [ "$ACTUAL" != "$DEFAULT_DEV" ]; then | ||
| if [ "$ACTUAL" = "<no-route>" ]; then | ||
| msg="NO-ROUTE: expected CIDR $CIDR (probe $PROBE) has no IPv4 route; expected via $DEFAULT_DEV" | ||
| else | ||
| msg="OVERLAP: expected CIDR $CIDR (probe $PROBE) routes via $ACTUAL, expected $DEFAULT_DEV" | ||
| fi | ||
| echo "$msg" >&2 | ||
| echo "$msg" >> /run/aks-flex-node/route-overlap.detected | ||
| bad=1 | ||
| fi | ||
| done <<'EOF' | ||
| {{ .Entries }} | ||
| EOF | ||
| | ||
| if [ "$bad" -eq 1 ]; then | ||
| cat >&2 <<'EOF' | ||
| Action: configure spec.staticRoutes on the NodeClass with more-specific | ||
| routes for the affected CIDRs, or rebuild the cluster on a non-overlapping | ||
| VNet CIDR. For each affected CIDR, add a spec.staticRoutes entry with the | ||
| destination CIDR and next-hop/default gateway for the node's normal outbound interface. | ||
| EOF | ||
| exit {{ .FailExit }} | ||
| fi | ||
| | ||
| echo "check-route-overlap: all expected CIDRs route via $DEFAULT_DEV" | ||
| touch /run/aks-flex-node/route-overlap.ok | ||
| exit 0 | ||
| {{- else }} | ||
| echo "check-route-overlap: no expected CIDRs configured; nothing to check (default dev: $DEFAULT_DEV)" | ||
| touch /run/aks-flex-node/route-overlap.ok | ||
| exit 0 | ||
| {{- end }} | ||
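The device lookup in this template hinges on one awk program that scans each field for the literal `dev` and prints the field after it. The sketch below runs that same program against canned text shaped like `ip -4 route get` output, so it can be tried without touching a routing table. The wrapper name `extract_dev` and the sample addresses and interface names are made up for illustration.

```bash
# extract_dev: hypothetical wrapper around the awk program used in the template.
# Reads route text on stdin and prints the token that follows the first "dev".
extract_dev() {
  awk '{for (i=1;i<=NF;i++) if ($i=="dev") {print $(i+1); exit}}'
}

# Sample text shaped like `ip -4 route get` output (addresses are invented).
via_gateway='10.0.5.10 via 10.1.0.1 dev eth0 src 10.1.0.4 uid 0
    cache'
on_link='10.0.5.10 dev ib0 src 10.0.0.4 uid 0
    cache'

printf '%s\n' "$via_gateway" | extract_dev   # prints eth0
printf '%s\n' "$on_link" | extract_dev       # prints ib0
```

When the input contains no `dev` token the program prints nothing, which is the case the template maps to `<no-route>`.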
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| [Unit] | ||
| Description=Install AKSFlexNode static routes | ||
| # Route install must happen before kubelet tries to reach cluster services | ||
| # whose CIDR may otherwise be shadowed by provider-installed connected | ||
| # routes (e.g. the Azure InfiniBand fabric /16 on ND-isr SKUs). | ||
| Before=kubelet.service | ||
| After=network-online.target | ||
| Wants=network-online.target | ||
| | ||
| [Service] | ||
| Type=oneshot | ||
| ExecStart=/etc/aks-flex-node/static-routes.sh | ||
| RemainAfterExit=yes | ||
| | ||
| [Install] | ||
| RequiredBy=kubelet.service | ||
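The comment in this unit refers to a cluster CIDR being shadowed by a broader connected route. IPv4 route lookup prefers the longest matching prefix, which is why a more-specific static route can steer traffic back to the normal outbound interface. The following is a simplified sketch of that selection rule only (it ignores metrics, policy routing, and route types). `ip_to_int`, `pick_dev`, and all addresses and interface names are hypothetical.

```bash
# ip_to_int A.B.C.D: prints the dotted-quad address as a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# pick_dev ADDR: reads "CIDR DEV" lines on stdin and prints the device of the
# longest prefix that contains ADDR (a simplified model of route selection).
pick_dev() {
  local addr best_len=-1 best_dev="" cidr dev net len mask
  addr=$(ip_to_int "$1")
  while read -r cidr dev; do
    [ -z "$cidr" ] && continue
    net=$(ip_to_int "${cidr%/*}")
    len=${cidr#*/}
    if [ "$len" -eq 0 ]; then
      mask=0
    else
      # 4294967295 is 0xFFFFFFFF; keep only the top $len bits.
      mask=$(( (4294967295 << (32 - len)) & 4294967295 ))
    fi
    if [ $(( addr & mask )) -eq $(( net & mask )) ] && [ "$len" -gt "$best_len" ]; then
      best_len=$len
      best_dev=$dev
    fi
  done
  echo "$best_dev"
}

# A connected /16 on ib0 shadows the default route for 10.0.5.10 ...
shadowed='0.0.0.0/0 eth0
10.0.0.0/16 ib0'
# ... until a more-specific /24 static route points back at eth0.
fixed="$shadowed
10.0.5.0/24 eth0"

printf '%s\n' "$shadowed" | pick_dev 10.0.5.10   # prints ib0
printf '%s\n' "$fixed" | pick_dev 10.0.5.10      # prints eth0
```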
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,59 @@ | ||
| #!/bin/bash | ||
| # Generated by AKSFlexNode ConfigureStaticRoutes. Do not edit. | ||
| set -eu | ||
| PATH=/usr/sbin:/sbin:/usr/bin:/bin:${PATH:-} | ||
| | ||
| # resolve_default_gw <dev>: prints the default gateway for <dev> after retrying | ||
| # for up to ~30s, in case cloud-init / DHCP has not installed it yet. | ||
| resolve_default_gw() { | ||
| local dev="$1" | ||
| local i gw | ||
| for i in $(seq 1 30); do | ||
| gw=$(ip -4 route show default dev "$dev" 2>/dev/null | awk '/^default via/ {print $3; exit}') | ||
| if [ -n "$gw" ]; then echo "$gw"; return 0; fi | ||
| sleep 1 | ||
| done | ||
| return 1 | ||
| } | ||
| | ||
| # resolve_default_dev: prints the outbound interface of the IPv4 default | ||
| # route (e.g. eth0, ens3, enp0s6). Retries up to ~30s for DHCP. | ||
| resolve_default_dev() { | ||
| local i dev | ||
| for i in $(seq 1 30); do | ||
| dev=$(ip -4 route show default 2>/dev/null | awk '/^default / {for (i=1;i<=NF;i++) if ($i=="dev") {print $(i+1); exit}}') | ||
| if [ -n "$dev" ]; then echo "$dev"; return 0; fi | ||
| sleep 1 | ||
| done | ||
| return 1 | ||
| } | ||
| | ||
| {{- if .HasEntries }} | ||
| DEFAULT_DEV="" | ||
| resolve_default_dev_cached() { | ||
| if [ -n "$DEFAULT_DEV" ]; then echo "$DEFAULT_DEV"; return 0; fi | ||
| DEFAULT_DEV=$(resolve_default_dev) || return 1 | ||
| echo "$DEFAULT_DEV" | ||
| } | ||
| | ||
| while IFS='|' read -r DEST DEV GW METRIC; do | ||
| [ -z "$DEST" ] && continue | ||
| if [ "$DEV" = "{{ .AutoDevToken }}" ]; then | ||
| DEV=$(resolve_default_dev_cached) || { echo "no default IPv4 route; cannot install route $DEST" >&2; exit 1; } | ||
| fi | ||
| if [ "$GW" = "{{ .AutoGWToken }}" ]; then | ||
| GW=$(resolve_default_gw "$DEV") || { echo "no default gateway on $DEV after 30s; cannot install route $DEST" >&2; exit 1; } | ||
| fi | ||
| if [ "$METRIC" -gt 0 ]; then | ||
| ip -4 route replace "$DEST" via "$GW" dev "$DEV" metric "$METRIC" | ||
| else | ||
| ip -4 route replace "$DEST" via "$GW" dev "$DEV" | ||
| fi | ||
| done <<'EOF' | ||
| {{ .Entries }} | ||
| EOF | ||
| exit 0 | ||
| {{- else }} | ||
| # No routes configured; nothing to do. | ||
| exit 0 | ||
| {{- end }} | ||
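To see what the install loop would do for a given set of entries, the sketch below parses the same `DEST|DEV|GW|METRIC` line format and prints the `ip -4 route replace` command instead of running it. It assumes one entry per line, as the `read` loop implies, and it skips the auto-resolution of the device and gateway tokens, so every field must be a literal value. `plan_routes` and the sample entries are hypothetical.

```bash
# plan_routes: hypothetical dry-run of the install loop in the template.
# Reads DEST|DEV|GW|METRIC lines on stdin and prints the command that the
# generated script would execute for each entry.
plan_routes() {
  local dest dev gw metric
  while IFS='|' read -r dest dev gw metric; do
    [ -z "$dest" ] && continue
    if [ "${metric:-0}" -gt 0 ]; then
      echo "ip -4 route replace $dest via $gw dev $dev metric $metric"
    else
      # A metric of 0 means the kernel default is used, so the flag is omitted.
      echo "ip -4 route replace $dest via $gw dev $dev"
    fi
  done
}

# Invented entries: one without a metric, one with metric 50.
plan_routes <<'EOF'
10.0.5.0/24|eth0|10.1.0.1|0
10.2.0.0/24|eth0|10.1.0.1|50
EOF
```

Printing the commands first makes it easy to review an entry list before the real script changes the routing table. Since `ip route replace` creates or overwrites a route, re-running the script with the same entries should be safe.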