AgentHarness with openclaw backend fails all LLM calls — inference provider never attached to sandbox #1965

@nloke

Problem

When deploying an AgentHarness using the openclaw backend, the agent starts and Slack connects successfully, but all LLM calls fail with a Connection error. The agent is effectively a dead bot.

Background: OpenShell Security Model

openclaw uses an HTTP proxy (HTTPS_PROXY=http://10.200.0.1:3128) to resolve credentials at request time. The openclaw.json config stores an unresolved placeholder:

"apiKey": "openshell:resolve:env:OPENAI_API_KEY"

When openclaw (running as UID 998) makes an LLM call, the proxy intercepts the Bearer openshell:resolve:env:OPENAI_API_KEY header, looks up OPENAI_API_KEY from the sandbox's attached inference provider, and replaces it with the real key before forwarding to the upstream LLM gateway.
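The substitution described above can be sketched roughly as follows. This is only an illustration of the idea, not the OpenShell proxy's code: `resolveAuthHeader`, `ErrNoProvider`, and the `providerEnv` map (standing in for whatever credentials the attached providers expose) are all hypothetical names.

```go
package main

import (
	"errors"
	"strings"
)

// placeholderPrefix marks a credential that must be resolved at request time.
const placeholderPrefix = "openshell:resolve:env:"

// ErrNoProvider is a hypothetical error for a placeholder that no attached
// provider can satisfy.
var ErrNoProvider = errors.New("no attached provider supplies this credential")

// resolveAuthHeader rewrites "Bearer openshell:resolve:env:NAME" into
// "Bearer <real key>". providerEnv is an assumed stand-in for the variables
// exposed by the sandbox's attached providers. Headers that do not carry a
// placeholder are returned unchanged.
func resolveAuthHeader(header string, providerEnv map[string]string) (string, error) {
	const scheme = "Bearer "
	if !strings.HasPrefix(header, scheme) {
		return header, nil
	}
	token := strings.TrimPrefix(header, scheme)
	if !strings.HasPrefix(token, placeholderPrefix) {
		return header, nil
	}
	name := strings.TrimPrefix(token, placeholderPrefix)
	secret, ok := providerEnv[name]
	if !ok {
		// With no inference provider attached there is nothing to substitute.
		return "", ErrNoProvider
	}
	return scheme + secret, nil
}
```

With an empty `providerEnv`, the call for `OPENAI_API_KEY` returns `ErrNoProvider`, which corresponds to the failure mode reported in this issue.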

Root Cause

ClawBackend.EnsureAgentHarness calls attachMessagingProviders to attach Slack providers but never creates or attaches an inference provider. Because the sandbox is created without one, the proxy has nothing to resolve the placeholder against and closes the connection.

BuildBootstrapJSON (in openclaw/bootstrap.go) correctly writes openshell:resolve:env:OPENAI_API_KEY into openclaw.json, but the corresponding provider that the proxy needs to resolve it is never registered in OpenShell.
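One way to surface this mismatch before the first LLM call would be to compare the placeholders in the generated config against the variables the attached providers supply. The sketch below is an assumption about how such a check could look, not existing kagent code: `unresolvedPlaceholders` and `collectPlaceholders` are hypothetical helpers, and the config is walked generically because the full openclaw.json schema is not shown in this issue.

```go
package main

import (
	"encoding/json"
	"sort"
	"strings"
)

// resolvePrefix is the placeholder prefix written into openclaw.json.
const resolvePrefix = "openshell:resolve:env:"

// collectPlaceholders walks a decoded JSON value and records every variable
// name that appears after the resolve prefix.
func collectPlaceholders(v any, seen map[string]bool) {
	switch t := v.(type) {
	case string:
		if strings.HasPrefix(t, resolvePrefix) {
			seen[strings.TrimPrefix(t, resolvePrefix)] = true
		}
	case map[string]any:
		for _, child := range t {
			collectPlaceholders(child, seen)
		}
	case []any:
		for _, child := range t {
			collectPlaceholders(child, seen)
		}
	}
}

// unresolvedPlaceholders returns, in sorted order, the placeholder names in
// configJSON that are not covered by providerEnv (an assumed set of variable
// names supplied by the providers attached to the sandbox).
func unresolvedPlaceholders(configJSON []byte, providerEnv map[string]bool) ([]string, error) {
	var doc any
	if err := json.Unmarshal(configJSON, &doc); err != nil {
		return nil, err
	}
	seen := map[string]bool{}
	collectPlaceholders(doc, seen)
	var missing []string
	for name := range seen {
		if !providerEnv[name] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing, nil
}
```

For the deployment described here, the check would report `OPENAI_API_KEY` as missing, since no inference provider is attached.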

Steps to Reproduce

  1. Create a ModelConfig pointing to an OpenAI-compatible gateway with an API key stored in a k8s secret
  2. Deploy an AgentHarness with backend: openclaw
  3. Wait for AgentHarness to reach Ready=True
  4. Send a message to the bot via Slack
  5. Observe Connection error in openclaw logs — LLM call never completes

Expected Behaviour

The sandbox should be created with an inference provider attached so the proxy can resolve OPENAI_API_KEY at request time and forward the LLM call successfully.

Additional Note

There is a related issue with AgentHarness phase reporting (phase=UNSPECIFIED instead of a usable phase) — see #1958. Both issues must be resolved for a functional openclaw deployment.

Proposed Fix

See draft PR #1964 for a working implementation. The fix wires upsertInferenceProviderForHarness into EnsureAgentHarness: it reads the ModelConfig, resolves the API key from the referenced k8s secret, upserts the provider via OpenShell gRPC, and passes it into attachMessagingProviders so the sandbox is created with the provider attached. The sandbox process never holds the real key, which preserves the OpenShell security model.
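The flow described above might look roughly like the sketch below. It is not the code in PR #1964: `SecretRef`, `readSecretFunc`, `upsertProviderFunc`, `upsertInferenceProvider`, `sandboxProviders`, and the `-inference` naming are all hypothetical stand-ins. The Kubernetes secret lookup and the OpenShell gRPC call are passed in as functions so the control flow can be read in isolation, and context propagation is left out for brevity.

```go
package main

import "fmt"

// SecretRef is a hypothetical pointer to one key inside a k8s secret,
// as a ModelConfig might reference it.
type SecretRef struct{ Name, Key string }

// readSecretFunc fetches a secret value (a Kubernetes client call in practice).
type readSecretFunc func(ref SecretRef) (string, error)

// upsertProviderFunc creates or updates a provider (an OpenShell gRPC call in
// practice). env holds the variables the proxy may resolve for the sandbox.
type upsertProviderFunc func(name string, env map[string]string) error

// upsertInferenceProvider resolves the API key and registers it as a provider,
// returning the provider name to attach at sandbox creation.
func upsertInferenceProvider(harness string, ref SecretRef, read readSecretFunc, upsert upsertProviderFunc) (string, error) {
	apiKey, err := read(ref)
	if err != nil {
		return "", fmt.Errorf("read API key from secret %q: %w", ref.Name, err)
	}
	if apiKey == "" {
		return "", fmt.Errorf("secret %q key %q is empty", ref.Name, ref.Key)
	}
	name := harness + "-inference" // assumed naming scheme
	if err := upsert(name, map[string]string{"OPENAI_API_KEY": apiKey}); err != nil {
		return "", fmt.Errorf("upsert provider %q: %w", name, err)
	}
	return name, nil
}

// sandboxProviders returns the provider list for sandbox creation: the
// messaging providers followed by the inference provider. The input slice
// is copied, not modified.
func sandboxProviders(messaging []string, inference string) []string {
	out := make([]string, 0, len(messaging)+1)
	out = append(out, messaging...)
	return append(out, inference)
}
```

In this shape the real key only travels from the controller to OpenShell; the sandbox config keeps the placeholder, consistent with the security model described earlier.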

Environment

  • kagent commit: 940c478
  • openclaw: v2026.5.4
  • Backend: openclaw (OpenShell)
  • LLM gateway: OpenAI-compatible (LiteLLM)
