Testing

Honest disclosure: Scarf's test coverage is minimal. The scarfTests/ and scarfUITests/ targets exist but contain placeholder tests only. The project has historically relied on dogfooding — the maintainer runs Scarf against their own daily Hermes install and against a remote dogfooding host.

This page documents what's in place and where contributions would help most.

Frameworks

When tests are added, the standard is Swift Testing (@Suite / @Test macros), not XCTest. Per the project conventions (CLAUDE.md):

Use @Suite and @Test macros for all new tests.
Protocol-oriented services for testability — the ServerTransport protocol is the obvious mocking seam.
No timing-dependent tests: use polling with early exit, not Task.sleep + assertion.
Singleton state isolation: call cleanup methods + await Task.yield() before assertions.
No print() in production code — use os.Logger. print() is fine in #Preview and test helpers.
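
As an illustration of the "polling with early exit" convention, here is a minimal sketch in Swift Testing. The waitUntil helper and the Counter actor are hypothetical names invented for this example and are not part of Scarf. The helper does sleep between polls, but only in short intervals and it returns as soon as the condition holds, which is different from a single fixed Task.sleep followed by an assertion.

```swift
import Testing

/// Hypothetical helper (not part of Scarf): polls `condition` until it
/// returns true or `timeout` elapses, returning early on success.
func waitUntil(
    timeout: Duration = .seconds(2),
    interval: Duration = .milliseconds(10),
    _ condition: @Sendable () async -> Bool
) async -> Bool {
    let clock = ContinuousClock()
    let deadline = clock.now.advanced(by: timeout)
    while clock.now < deadline {
        if await condition() { return true }   // early exit: no fixed wait
        try? await Task.sleep(for: interval)    // short pause between polls
    }
    return await condition()                    // one final check at the deadline
}

/// Hypothetical stand-in for state that is updated asynchronously.
actor Counter {
    private(set) var value = 0
    func increment() { value += 1 }
}

@Suite struct PollingExampleTests {
    @Test func backgroundWorkIsObservedWithoutFixedSleep() async {
        let counter = Counter()
        Task { await counter.increment() }      // work finishes at an unknown time

        let reached = await waitUntil { await counter.value == 1 }
        #expect(reached)
    }

    @Test func helperGivesUpAfterTimeout() async {
        let reached = await waitUntil(timeout: .milliseconds(50)) { false }
        #expect(!reached)
    }
}
```

The timeout only bounds the failure case; a passing test finishes as soon as the condition becomes true, so the suite stays fast and does not depend on machine speed.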

Running

xcodebuild test -project scarf/scarf.xcodeproj -scheme scarf

Or in Xcode: ⌘U.

What would be high-value to add

If you're looking for a contribution, these are the gaps that would matter most:

ServerTransport mock + LocalTransport smoke tests — every service depends on transport, so a MockTransport unlocks unit-testing all of them.
HermesEnvService round-trip tests — non-destructive .env editing has tricky comment / blank-line preservation; would benefit from regression coverage.
ACPClient JSON-RPC framing tests — feed canned JSON-RPC byte streams and assert events emitted.
HermesPathSet path-resolution tests — local vs. remote home, binary hint precedence.
HermesConfig decoding tests — load representative config.yaml fixtures and check field mapping.
SSHTransport shell-quoting tests — shellQuote and remotePathArg are correctness-critical and pure functions.
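
To make the first item above more concrete, here is a rough sketch of how a mock transport could unlock service-level unit tests. Everything in it is an assumption for illustration: ExampleTransport stands in for ServerTransport, whose real requirements are not shown on this page, and StatusService, the "status" command string, and the "ok" reply are invented placeholders rather than real Scarf or Hermes behavior.

```swift
import Foundation
import Testing

// Hypothetical stand-in for Scarf's ServerTransport. The real protocol's
// requirements may look quite different.
protocol ExampleTransport: Sendable {
    func run(_ command: String) async throws -> String
}

enum MockError: Error {
    case unexpectedCommand(String)
}

// Records every command it receives and answers from a table of canned replies.
actor MockTransport: ExampleTransport {
    private(set) var commands: [String] = []
    private let replies: [String: String]

    init(replies: [String: String]) {
        self.replies = replies
    }

    func run(_ command: String) async throws -> String {
        commands.append(command)
        guard let reply = replies[command] else {
            throw MockError.unexpectedCommand(command)
        }
        return reply
    }
}

// Hypothetical service that depends only on the transport abstraction.
struct StatusService {
    let transport: any ExampleTransport

    func isHealthy() async throws -> Bool {
        let output = try await transport.run("status")   // placeholder command
        return output.trimmingCharacters(in: .whitespacesAndNewlines) == "ok"
    }
}

@Suite struct StatusServiceTests {
    @Test func reportsHealthyWhenTransportRepliesOk() async throws {
        let mock = MockTransport(replies: ["status": "ok\n"])
        let service = StatusService(transport: mock)

        let healthy = try await service.isHealthy()
        #expect(healthy)

        let commands = await mock.commands
        #expect(commands == ["status"])
    }

    @Test func propagatesTransportErrors() async {
        let service = StatusService(transport: MockTransport(replies: [:]))
        await #expect(throws: MockError.self) {
            _ = try await service.isHealthy()
        }
    }
}
```

Because the service only sees the protocol, the same test shape should carry over to any service that takes its transport as a dependency; only the canned replies change.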

Manual verification flows

For any change that touches behavior, run the manual checklist the maintainer uses before tagging a release:

Open a local window — Dashboard loads, Sessions browser populates, Memory editor opens.
Open a remote window — same Dashboard / Sessions / Memory, but against the dogfooding host.
Send a Rich Chat message — receives a streamed response, reasoning shows if the model emits it.
Edit and save a memory file — change appears in Hermes on next agent turn.
Run a Cron job — appears in the Cron view, has correct delivery channel.
Toggle a tool in Tools — hermes tools enable/disable runs and the dot color updates.

Why so little automated coverage?

The app is a thin GUI over Hermes, so most behavior depends on (a) the OS file system, (b) SQLite, (c) SSH, and (d) a long-running subprocess speaking JSON-RPC. Mocking these well is non-trivial, and historically the cost has been higher than the bug rate justified. The ServerTransport protocol now makes mocking cheaper, so the gap is finally worth filling.

Last updated: 2026-04-20 — Scarf v2.0.1

Wiki edited via the local .wiki-worktree/ clone. See Wiki Maintenance for the workflow. Last sync: 2026-04-20.
