Routed inference falls back to direct provider calls when Hermes agent bypassing network policy egress controls #5464
-
|
I'm running NemoClaw with Hermes as the active agent via NEMOCLAW_AGENT=hermes and I noticed through packet inspection that some inference calls are going directly to the upstream provider endpoint rather than through NemoClaw's inference router. This seems to correlate with credential refresh cycles. My concern is that if inference traffic is bypassing the router, it's also bypassing whatever egress network policy rules I've configured. Is the router expected to be in the call path 100% of the time, or are there legitimate fallback conditions where it steps aside? |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment
-
|
It is likely due to credential environment propagation during the onboard flow. You can reference the recent fix in fix(inference) prs to address a case where the inference router was not correctly receiving the credential environment variables when the agent was invoked through certain shell paths. The router is intended to be in the call path unconditionally. To verify whether the router is actually mediating your calls, check the router process logs directly rather than relying on onboard validation passing, since validation only confirms the router responds at startup and not that every subsequent call routes correctly. Make sure the NVIDIA_INFERENCE_CREDENTIAL environmenvt variable is visible inside the sandbox at runtime and not just during the onboard phase. If you are on a version prior to the credential env fix, pulling the latest and re-running onboard should resolve the intermittent direct-egress behavior. If it persists after updating, open a GitHub Issue with your platform details and a sanitized version of your network policy config since this would indicate a different code path is triggering the fallback |
Beta Was this translation helpful? Give feedback.
It is likely due to credential environment propagation during the onboard flow. You can reference the recent fix in fix(inference) prs to address a case where the inference router was not correctly receiving the credential environment variables when the agent was invoked through certain shell paths. The router is intended to be in the call path unconditionally.
To verify whether the router is actually mediating your calls, check the router process logs directly rather than relying on onboard validation passing, since validation only confirms the router responds at startup and not that every subsequent call routes correctly. Make sure the NVIDIA_INFERENCE_CREDENTIAL environmenvt variable i…