
bug: k3s gateway fails on Jetson Orin Nano: iptables nf_tables / legacy conflict makes OpenShell unusable #407

@guinava


Agent Diagnostic

Investigated using Claude as coding agent. No OpenShell skills were available locally (fresh install, no repo clone for agent investigation).
What was tried:

Switched the host to iptables-legacy via update-alternatives --set iptables /usr/sbin/iptables-legacy
Loaded all required kernel modules: ip_tables, iptable_filter, iptable_nat, iptable_mangle, xt_comment, xt_conntrack, xt_mark, br_netfilter
Blacklisted nf_tables kernel module and rebooted — confirmed lsmod | grep nf_tables returns empty and host iptables --version reports v1.8.7 (legacy)
Configured cgroupns=host in Docker daemon via nemoclaw setup-spark
Ran openshell gateway stop/destroy and cleaned all containers between attempts
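
For anyone reproducing this, here is a minimal sketch of how the module check could be scripted. The function name missing_modules is hypothetical and not part of OpenShell or NemoClaw. It reads lsmod-style output on stdin and prints each module from the list above that is not loaded.

```shell
# Hypothetical helper (not part of OpenShell): report which of the legacy
# netfilter modules listed above are absent from lsmod-style output on stdin.
REQUIRED_MODULES="ip_tables iptable_filter iptable_nat iptable_mangle xt_comment xt_conntrack xt_mark br_netfilter"

missing_modules() {
    # Keep only the module-name column and drop the "Module Size Used by" header.
    loaded=$(awk '$1 != "Module" { print $1 }')
    for mod in $REQUIRED_MODULES; do
        # -x matches whole lines, so "ip_tables" never matches "iptable_nat".
        if ! printf '%s\n' "$loaded" | grep -qx "$mod"; then
            printf '%s\n' "$mod"
        fi
    done
}
```

Typical use would be lsmod | missing_modules. Modules compiled into the kernel do not appear in lsmod, so a module reported as missing here may still be available.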

What was found:

With nf_tables loaded: k3s kube-router panics at network_policy_controller.go:412 — RULE_INSERT failed (No such file or directory) because nf_tables extensions inside the container conflict with host iptables-legacy tables
With nf_tables blacklisted: k3s kube-proxy exits with iptables is not available on this host: error listing chain "POSTROUTING" in table "nat": exit status 4: Warning: iptables-legacy tables present, use iptables-legacy to see them — iptables v1.8.10 (nf_tables): Could not fetch rule set generation id: Invalid argument
The container bundles iptables v1.8.10 (nf_tables) internally, which requires the nf_tables kernel module. But when nf_tables is loaded on host, kube-router crashes. This creates an unresolvable conflict on Jetson Orin.

The agent could not resolve this because the iptables binary inside the gateway container (v1.8.10 nf_tables) is incompatible with the Jetson kernel in both states (nf_tables loaded and unloaded). This requires a fix in the gateway container image itself (e.g., bundling iptables-legacy or using --prefer-bundled-bin with legacy mode).
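
The backend mismatch described above can be checked mechanically. The sketch below is illustrative only. The helper names iptables_backend and compare_backends are hypothetical. It assumes the version line has the form "iptables vX.Y.Z (backend)", as in the outputs quoted in this issue.

```shell
# Hypothetical helper: extract the backend ("legacy" or "nf_tables") from an
# `iptables --version` line such as "iptables v1.8.10 (nf_tables)".
iptables_backend() {
    case "$1" in
        *"(nf_tables)"*) echo "nf_tables" ;;
        *"(legacy)"*)    echo "legacy" ;;
        *)               echo "unknown" ;;
    esac
}

# Hypothetical helper: compare a host version line ($1) with a container
# version line ($2) and report whether the two backends agree.
compare_backends() {
    host=$(iptables_backend "$1")
    container=$(iptables_backend "$2")
    if [ "$host" = "$container" ]; then
        echo "match: $host"
    else
        echo "mismatch: host=$host container=$container"
    fi
}
```

With the versions reported here, compare_backends "iptables v1.8.7 (legacy)" "iptables v1.8.10 (nf_tables)" prints a mismatch. The container's version line might be obtainable with docker run --rm --entrypoint /usr/sbin/iptables ghcr.io/nvidia/openshell/cluster:0.0.8 --version, but I have not verified that this image allows overriding the entrypoint in that way.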

Description

What happened:
openshell gateway start --name nemoclaw (triggered via nemoclaw onboard) fails every time on Jetson Orin Nano. The k3s container exits with either:

nf_tables loaded: panic: network_policy_controller.go:412 — Failed to run iptables command — RULE_INSERT failed (No such file or directory)
nf_tables blacklisted: kube-proxy exited: iptables is not available on this host — iptables v1.8.10 (nf_tables): Could not fetch rule set generation id: Invalid argument

The gateway container bundles iptables v1.8.10 (nf_tables), which is incompatible with the Jetson Orin kernel (5.15.x-tegra) in either configuration.

What I expected:
nemoclaw onboard should complete successfully and start the OpenShell gateway on Jetson Orin Nano, as Jetson is an NVIDIA platform and NemoClaw targets NVIDIA hardware.

Reproduction Steps

1. Start from a fresh Jetson Orin Nano with JetPack 6.x (Ubuntu 22.04, kernel 5.15.x-tegra)
2. Install Docker (default JetPack install)
3. Install OpenShell:

curl -LsSf https://raw.githubusercontent.com/NVIDIA/OpenShell/main/install.sh | sudo sh

4. Install NemoClaw:

git clone https://github.com/NVIDIA/NemoClaw.git
cd NemoClaw
sudo ./install.sh

5. Run sudo nemoclaw setup-spark (configures cgroupns=host)
6. Run sudo nemoclaw onboard
7. Step [2/7] "Starting OpenShell gateway" fails at "Initializing environment"

Attempted workarounds (all failed):

update-alternatives --set iptables /usr/sbin/iptables-legacy
Blacklisting nf_tables + reboot
Loading all legacy iptable_* and xt_* modules manually
Combinations of the above

Environment

Device: NVIDIA Jetson Orin Nano
OS: Ubuntu 22.04 (JetPack 6.x)
Kernel: 5.15.148-tegra (aarch64)
Docker: Docker Engine 28.x with cgroupns=host configured
OpenShell: v0.0.7 (installed via official install.sh)
NemoClaw: installed from source (git clone, main branch, March 2026)
Host iptables: v1.8.7 (legacy) after update-alternatives
Container iptables: v1.8.10 (nf_tables) — bundled inside gateway image ghcr.io/nvidia/openshell/cluster:0.0.8
Node.js: v22.22.1

Logs

Attempt 1: nf_tables loaded (default)
F0317 18:20:46.858772  86 network_policy_controller.go:412] Failed to run iptables
command to insert in INPUT chain running [/usr/sbin/iptables -t filter -I INPUT 1 -m comment
--comment kube-router netpol - 4IA2OSFRMVNDXBVV -j KUBE-ROUTER-INPUT --wait]: exit status 4:
Warning: Extension comment revision 0 not supported, missing kernel module?
iptables v1.8.10 (nf_tables): RULE_INSERT failed (No such file or directory): rule in chain INPUT

panic: network_policy_controller.go:412

Attempt 2: nf_tables blacklisted + iptables-legacy on host

Host verification:
$ lsmod | grep nf_tables
(empty)
$ iptables --version
iptables v1.8.7 (legacy)

Gateway error:
time="2026-03-17T18:35:04Z" level=error msg="Shutdown request received: kube-proxy exited:
iptables is not available on this host : error listing chain \"POSTROUTING\" in table
\"nat\": exit status 4: # Warning: iptables-legacy tables present, use iptables-legacy to see
them\niptables v1.8.10 (nf_tables): Could not fetch rule set generation id: Invalid argument\n"
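
To tell the two failure modes apart quickly when retrying, the log text can be matched against the two error strings shown above. This is a rough sketch. The function name classify_gateway_failure and its output labels are hypothetical. It only recognizes the two messages quoted in this issue.

```shell
# Hypothetical helper: classify gateway log text read from stdin into one of
# the two failure modes reported in this issue.
classify_gateway_failure() {
    log=$(cat)
    case "$log" in
        *"RULE_INSERT failed"*)
            # Seen when nf_tables is loaded on the host (Attempt 1).
            echo "kube-router: RULE_INSERT failed" ;;
        *"Could not fetch rule set generation id"*)
            # Seen when nf_tables is blacklisted on the host (Attempt 2).
            echo "kube-proxy: nf_tables backend unavailable" ;;
        *)
            echo "unrecognized" ;;
    esac
}
```

It could be fed from a saved log file, for example classify_gateway_failure < gateway.log, where gateway.log is a placeholder file name.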

Agent-First Checklist

  • I pointed my agent at the repo and had it investigate this issue
  • I loaded relevant skills (e.g., debug-openshell-cluster, debug-inference, openshell-cli)
  • My agent could not resolve this — the diagnostic above explains why
