[Bug]: ENOTSUP when acquiring session lock file: implementation uses hard link (fs.link), breaks on SMB/NFS/virtiofs #81089

@david-garcia-garcia

Description

Bug type

Crash (process/app exits or hangs)

Beta release blocker

No

Summary

When OpenClaw refreshes or coordinates work around the on-disk cache for aggregated session usage / cost data, it acquires a lock file under the agent sessions directory (e.g. ~/.openclaw/agents/<agentId>/sessions/). That lock is implemented with fs.promises.link() (a POSIX hard link) from a uniquely named temp file to the final *.lock path.

On several common filesystem and mount types, link(2) is not supported and Node reports ENOTSUP, so lock acquisition fails outright instead of competing cleanly for the lock.

Error (example)

Error: ENOTSUP: operation not supported on socket, link '/home/node/.openclaw/agents/main/sessions/.usage-cost-cache.json.lock.<pid>.<hrtime>.tmp' -> '/home/node/.openclaw/agents/main/sessions/.usage-cost-cache.json.lock': code=ENOTSUP

(Paths vary by agentId and user; the important parts are the link call targeting *.lock and code=ENOTSUP.)

Expected behavior

Lock acquisition should work on typical local POSIX filesystems and common container writable layers, and should use a portable primitive (or fallback) on environments where hard links are unavailable (SMB/CIFS, some NFS setups, virtiofs/9p shared folders, certain FUSE stacks, etc.).

Actual behavior

link() fails with ENOTSUP, so the refresh lock cannot be created and dependent code paths error.

Root cause

In src/infra/session-cost-usage.ts, writeUsageCostCacheLockAtomically:

  1. Writes the lock payload to a temp file with exclusive create (writeFile + flag: "wx").
  2. Calls fs.promises.link(tempPath, lockPath) so concurrent processes get EEXIST if the lock already exists.

Step 2 requires hard-link support. Many network and shared-folder filesystems return ENOTSUP for link.

The lock file sits next to the usage cache: .usage-cost-cache.json and .usage-cost-cache.json.lock (see resolveUsageCostCacheLockPath / USAGE_COST_CACHE_FILE in the same module).

Note: The "operation not supported on socket" fragment is libuv's generic description for ENOTSUP and is misleading here; the salient point is ENOTSUP on link, not that the path is a socket.

Environments where this shows up

  • Docker (or similar) with a bind mount to a host directory on SMB, exFAT, or another FS without hard links.
  • VM shared folders (virtiofs, 9p).
  • NFS setups where link is unsupported or rejected.

Suggested fix directions (for maintainers)

  1. Portable exclusive create: open(lockPath, O_CREAT | O_EXCL | O_WRONLY) (or equivalent); on EEXIST, treat as “busy” and reuse existing stale-lock cleanup logic.
  2. Fallback: try link first; on ENOTSUP or other known "unsupported" errors, fall back to O_EXCL create.
  3. Docs: if behavior can’t be fully unified, document that ~/.openclaw (or at least agents/*/sessions) must live on a filesystem that supports hard links until the implementation changes.
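
Directions 1 and 2 can be sketched together as below. This is a minimal illustration, not the code in src/infra/session-cost-usage.ts: the names acquireLock, createLockExclusive, LockResult, and LINK_UNSUPPORTED are hypothetical, and the set of error codes treated as "hard links unavailable" is an assumption that maintainers would need to confirm against the filesystems they care about.

```typescript
import { promises as fs } from "node:fs";

// Assumed (not exhaustive) set of codes meaning "link is unavailable on this filesystem".
const LINK_UNSUPPORTED = new Set(["ENOTSUP", "EOPNOTSUPP", "EPERM", "ENOSYS"]);

type LockResult = "acquired" | "busy";

function errorCode(err: unknown): string | undefined {
  return typeof err === "object" && err !== null && "code" in err
    ? String((err as { code: unknown }).code)
    : undefined;
}

// Direction 1: exclusive create. The "wx" flag opens with O_CREAT | O_EXCL | O_WRONLY,
// so a second creator gets EEXIST instead of overwriting the lock.
async function createLockExclusive(lockPath: string, payload: string): Promise<LockResult> {
  try {
    await fs.writeFile(lockPath, payload, { flag: "wx" });
    return "acquired";
  } catch (err) {
    if (errorCode(err) === "EEXIST") return "busy";
    throw err;
  }
}

// Direction 2: keep the temp-file + link path where it works, and fall back to
// exclusive create only when link itself is rejected as unsupported.
async function acquireLock(lockPath: string, payload: string): Promise<LockResult> {
  const tempPath = `${lockPath}.${process.pid}.${process.hrtime.bigint()}.tmp`;
  await fs.writeFile(tempPath, payload, { flag: "wx" });
  try {
    await fs.link(tempPath, lockPath);
    return "acquired";
  } catch (err) {
    const code = errorCode(err);
    if (code === "EEXIST") return "busy";
    if (code !== undefined && LINK_UNSUPPORTED.has(code)) {
      return await createLockExclusive(lockPath, payload);
    }
    throw err;
  } finally {
    // The temp file is only a staging name; remove it on every path.
    await fs.rm(tempPath, { force: true });
  }
}
```

One trade-off to keep in mind: with exclusive create the lock file exists for a moment before its payload is written, whereas the link approach makes the lock appear with its content already in place. Any stale-lock logic that reads the payload would need to tolerate an empty or partial file on the fallback path.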

Workaround (operators)

Until fixed: keep ~/.openclaw on a normal local filesystem inside the container/VM (e.g. a Docker volume on ext4/xfs), not an SMB/NFS/host shared-folder mount.
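
To check whether a candidate state directory is in the failing class before pointing OpenClaw at it, the ln sanity check from the reproduction steps can also be run from Node. The sketch below is illustrative only: supportsHardLinks is a hypothetical helper, not part of OpenClaw, and the error codes it treats as "unsupported" are an assumption.

```typescript
import { promises as fs } from "node:fs";
import * as path from "node:path";

// Returns true if a hard link can be created inside `dir`, false if the
// filesystem rejects link as unsupported. Other errors are rethrown.
async function supportsHardLinks(dir: string): Promise<boolean> {
  const source = path.join(dir, `.hardlink-probe.${process.pid}.${Date.now()}`);
  const target = `${source}.link`;
  await fs.writeFile(source, "probe", { flag: "wx" });
  try {
    await fs.link(source, target);
    return true;
  } catch (err) {
    const code = (err as NodeJS.ErrnoException).code;
    // Assumed list of "link not available here" codes.
    if (code === "ENOTSUP" || code === "EOPNOTSUPP" || code === "EPERM" || code === "ENOSYS") {
      return false;
    }
    throw err;
  } finally {
    // Clean up both probe files regardless of the outcome.
    await fs.rm(target, { force: true });
    await fs.rm(source, { force: true });
  }
}
```

For example, calling supportsHardLinks("/home/node/.openclaw/agents/main/sessions") inside the container and getting false would indicate that the mount is affected by this issue.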

Related surface

Failure is at lock creation for the usage-cost cache refresh, not necessarily at reading/writing .usage-cost-cache.json itself (cache body uses replaceFileAtomic from @openclaw/fs-safe/atomic).

Steps to reproduce

Steps to reproduce (ENOTSUP / hard-link session lock)

  1. Use a directory on a filesystem that does not support hard links (any one of these is enough):

    • Docker Desktop (Windows/macOS): bind-mount host data into the container where the host path is on SMB/CIFS, NTFS accessed via a share, or another limited FS.
    • Linux: place (or bind-mount) ~/.openclaw on SMB (//server/share/...), NFS with restricted link semantics, or exFAT.
    • VM: guest ~/.openclaw on a virtiofs / VirtualBox shared folder / 9p mount.
  2. Optional sanity check on that same directory (proves the FS, not OpenClaw):

    mkdir -p /path/to/openclaw-test
    echo test > /path/to/openclaw-test/a && ln /path/to/openclaw-test/a /path/to/openclaw-test/b || true

    If ln fails with “Operation not supported” (or similar), you are in the failing class of filesystems.

  3. Point OpenClaw’s state at that directory, e.g.:

    • Container: -v /your/smb/mapped/path:/home/node/.openclaw (adjust target path to the user your image runs as).
    • Bare metal: ensure the agent’s sessions tree resolves under that mount (typically ~/.openclaw/agents/<agentId>/sessions/).
  4. Trigger a usage-cost cache refresh: open any UI/API path that loads cost or usage summaries from cache with refresh, or run any flow that refreshes the on-disk usage cache for session transcripts. The exact surface depends on the OpenClaw version; the gateway plus session usage/cost views are typical.

  5. Observe logs or stderr for an error containing:

    • link
    • .usage-cost-cache.json.lock
    • ENOTSUP
      (wording may include “operation not supported on socket”.)

Example error:

Error: ENOTSUP: operation not supported on socket, link '.../.usage-cost-cache.json.lock.<pid>.<hrtime>.tmp' -> '.../.usage-cost-cache.json.lock': code=ENOTSUP

Expected behavior

Lock acquisition should not depend on filesystem specifics such as hard-link support.

Actual behavior

Crash

OpenClaw version

v2026.5.10-beta.5

Operating system

Ubuntu 24.04

Install method

custom

Model

N/A

Provider / routing chain

N/A

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Error: ENOTSUP: operation not supported on socket, link '/home/node/.openclaw/agents/main/sessions/.usage-cost-cache.json.lock.167.222258340581245.tmp' -> '/home/node/.openclaw/agents/main/sessions/.usage-cost-cache.json.lock': code=ENOTSUP

Impact and severity

No response

Additional information

No response

Metadata

    Labels

    P2 (Normal backlog priority with limited blast radius), bug (Something isn't working), bug:crash (Process/app exits unexpectedly or hangs)
