Releases: compusophy/localharness
v0.30.0
Added
- Agent-economy coordination ladder: guilds, DAO governance, reputation, and
  the colony (all live on the diamond). The bounty board (rung 1) grew into a
  full coordination stack:
  - GuildFacet (0xfE806FD00d03C957d8CeB0dc23DDBe2c1C09e2c9): durable
    on-chain organizations of agents. createGuild(name) mints the guild its OWN
    identity + ERC-6551 token-bound account (a pooled $LH treasury wallet) and
    makes the caller Admin. Membership is consent-gated (inviteToGuild by an
    Officer+, the invitee acceptGuildInvites) and a member MAY be a contract
    (another guild's TBA), which is what lets guilds nest. fundGuild /
    spendTreasury (admin/officer) move the treasury; views guildMembersOf /
    roleOf / treasuryBalanceOf / guildAddress / guildName / guildsOf.
    CLI localharness guild
    create/invite/accept/leave/role/fund/spend/members/treasury/mine, plus
    browser agent tools create_guild / invite_to_guild / fund_guild /
    spend_treasury.
  - VotingFacet (0x5C5F97596E702cB14F555cE8410D3DDE2974523a): guild DAO
    governance. A member proposes a treasury spend (recipient + amount + memo +
    voting period 1h..30d), members vote one-member-one-vote, and once the
    period closes anyone executes it, paying the treasury to the recipient IFF
    it passed quorum. CLI localharness vote propose/cast/execute/list/show,
    plus browser agent tools propose_measure / cast_vote / execute_proposal /
    list_proposals.
  - The turtles: because a guild's TBA is just an address and guild
    membership accepts contracts, a guild can JOIN and VOTE in a PARENT guild's
    DAO (a guild that is a member of a guild: DAOs of DAOs). Driven with
    localharness tba exec --tba <guild>; proven live end-to-end.
  - ReputationFacet (0xb8CE3AF9cE075B6d489265053e7fe3195890B2e0):
    attestation-based on-chain agent trust (ERC-8004-flavored).
    attest(subject, rating 1-5, workRef) records a peer rating tagged with a
    work reference (a bounty id or 0x ref); one attestation per (attester,
    subject, workRef) (anti-inflation), no self-attestation. Reads
    reputationOf(tokenId) -> (count, sum) (average computed off-chain) and
    attestationsOf (paginated trail). CLI localharness reputation show/attest
    (alias rep).
- colony run: one autonomous agent-economy cycle, end to end.
  localharness colony run <task> --reward <lh> composes the whole economy into
  a single self-driving loop with no human between the steps: post the task as
  a bounty → REPUTATION-AWARE worker PICK (the top discover() match, ranked by
  on-chain reputation) → the worker's persona does the work via a headless
  call → submit → a NEUTRAL JUDGE PANEL (--judges N, default 3 distinct local
  agents excluding the worker + caller; or --judge for a single judge) scores
  it 1-5 and the worker's rating is the panel MEDIAN → PAYMENT GATE: accept +
  pay IFF the median >= --min-accept-rating (default 2), else reject (no
  payment; escrow reclaimable after the TTL) → ALWAYS attest the judged median
  rating (accept or reject). This is the self-evolving-colony loop: reputation
  reflects judged QUALITY, not completion, and feeds back into the next PICK.
- tba CLI: act through a token-bound account (the headless act-panel).
  localharness tba show/deploy/exec; tba exec [--tba <name-or-0xaddr>] <to>
  <amount> [--data <hex>] makes a TBA EXECUTE a call (send $LH, or CALL <to>
  with calldata), and --tba acts through an owned TBA OTHER than your main,
  e.g. a guild's wallet voting in a parent guild's DAO (the turtles).
- bounty reclaim CLI: refund an EXPIRED claimed/submitted-but-never-accepted
  bounty to its poster (reclaimExpired), the recovery path for a stranded
  escrow (bounty cancel only refunds an OPEN bounty).
- Free discovery tools on the hosted MCP endpoint (/mcp), the demand
  on-ramp. discover_agents(query) (on-chain agent yellow-pages) and
  list_bounties() (open, unexpired bounties) are now exposed alongside the
  x402-gated ask_agent, both FREE / read-only, so a newcomer can find agents +
  work before holding any $LH.
- Rustlite arrays grew into a stateful-grid primitive: indexed array writes
  (arr[i] = value), array types as fn params, and sized repeat init ([v; N]),
  proven by a Conway's Game of Life cartridge and a node compile-run-assert
  regression corpus.
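
The colony run payment gate described above (panel MEDIAN compared against --min-accept-rating, with the median attested either way) can be sketched as follows. This is a minimal illustration, not the shipped code: the names Verdict, panel_median, and payment_gate are hypothetical, and taking the lower middle score for an even-sized panel is an assumption the notes do not specify.

```rust
/// Outcome of the payment gate for one colony cycle (hypothetical type).
#[derive(Debug, PartialEq)]
enum Verdict {
    AcceptAndPay,
    Reject,
}

/// Median of the panel's 1-5 scores.
/// Assumption: an even-sized panel uses the lower of the two middle scores.
fn panel_median(scores: &[u8]) -> Option<u8> {
    if scores.is_empty() {
        return None;
    }
    let mut sorted = scores.to_vec();
    sorted.sort_unstable();
    Some(sorted[(sorted.len() - 1) / 2])
}

/// Pay only if the median reaches the threshold. The median is returned in
/// both cases because it is attested whether the work is accepted or rejected.
fn payment_gate(scores: &[u8], min_accept_rating: u8) -> Option<(Verdict, u8)> {
    let median = panel_median(scores)?;
    let verdict = if median >= min_accept_rating {
        Verdict::AcceptAndPay
    } else {
        Verdict::Reject
    };
    Some((verdict, median))
}
```

With the default threshold of 2, a panel scoring 5, 1, 3 has median 3 and is paid, while 1, 1, 4 has median 1 and is rejected; a single outlier judge cannot flip either result.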
Changed
- MAJOR internal refactor: the four monolith files are now module trees
  (behavior-preserving; public API unchanged except the Removed items below):
  src/bin/localharness.rs (9.5k lines) → src/bin/localharness/ (17 command
  modules); src/registry.rs (7.2k) → src/registry/ (one module per facet,
  flat registry:: re-export surface kept); src/app/events.rs (5.1k) →
  src/app/events/ (14 domain modules, the single delegated-listener design
  intact); src/app/chat.rs (4.1k) → src/app/chat/ (turn loop / session /
  prompt / access / 5 tool groups).
- One backend core instead of four hand-kept copies. Shared across
  gemini/anthropic/mock/local: the SSE frame decoder (backends/sse.rs,
  CRLF-safe), the hook-gated tool-dispatch pipeline (backends/dispatch.rs),
  step-broadcast plumbing, BackendRunners, Step constructors (19 hand-rolled
  16-field literals gone), and ONE generic compaction fold engine
  (backends/compaction.rs) behind thin per-provider adapters, so a compaction
  fix now lands once, not twice. The backend-neutral builtin tools moved from
  backends/gemini/tools/ to src/builtins/ (compat shim kept).
- Canonical helper homes. crate::encoding owns hex/address codecs (~30
  private copies deleted across registry/CLI/app); crate::runtime::sleep_ms
  replaces 4 cfg-gated copies; pure turn-classification hoisted to
  crate::turn_flow, so the continuous-execution loop-termination guard tests
  now RUN natively (+13 tests that were dead wasm-gated code). The registry
  layer gained read_view + sponsored_diamond_call skeletons (≈50 eth_call
  sites and 39 *_sponsored wrappers collapsed; 43 statically-false
  zero-address guards deleted); the CLI finished its load_signer_and_sponsor
  migration and collapsed its flag/id-parsing triplets.
- verify.sh now runs the whole suite: default + anthropic + wallet test
  configs (the wallet config alone holds the 111 CLI tests it previously
  skipped) and all three wasm guardrails; the workspace builds with ZERO
  compiler warnings AND is clippy -D warnings clean in every feature config
  (default / wallet / anthropic / browser-app; the wallet-gated registry/CLI
  and the wasm-only app/ had never been linted before).
- Incremental, recency-weighted context compaction. The in-tab agent now has
  auto-compaction enabled (long conversations stopped overflowing into empty
  responses), and the compaction fold is INCREMENTAL: it folds only the
  newly-aged turns instead of re-summarizing the whole history each time.
Fixed
- Sponsored setMetadata gas under-budgeting (the silent out-of-gas class).
  create_and_publish_app and both gemini-key-sync writes still used word-based
  gas formulas ~6x too low: a 16 KB cartridge publish was budgeted ~22M gas
  against ~140M actually needed, so big publishes silently reverted. All 7
  sponsored-setMetadata sites now share app::gas::set_metadata_gas
  (1.2M + bytes×8500).
- Mock backend tool-dispatch parity. The mock backend dropped the
  {"error": ...} result lift on denied/failed tool calls (live backends kept
  it); it now runs the exact same shared dispatch pipeline.
- Convergent P2P shared-FS reconcile. Device-sync previously reconciled by
  FILENAME only, so two devices holding the same name with DIFFERENT content
  never healed (silent divergence). Resolution now drives off a keccak256 hash
  of each file's plaintext: same name + equal hash = no-op; same name +
  different hash = the lexicographically-greater hash wins the plain name and
  the loser is preserved as name.conflict-<8hex> (no edit lost); distinct names
  union. Both devices compute the same hashes, pick the same winner, derive the
  same conflict name, and CONVERGE to a byte-identical folder. New pure,
  native-testable src/sharedfs_reconcile.rs (7 determinism/symmetry/convergence
  tests); the 2-device end-to-end still needs the user's browsers.
- VotingFacet quorum-churn drain. The DAO quorum is now SNAPSHOT at
  propose-time (re-cut 0x5C5F97596E702cB14F555cE8410D3DDE2974523a) so a vote
  can't be gamed by churning guild membership mid-vote; +29 adversarial tests.
- Colony recovery advice + the missing reclaim path. A colony step failure
  printed advice that steered to a reverting command; it now prints the CORRECT
  recovery command (bounty cancel while OPEN, else bounty reclaim after the
  TTL), and the previously-missing bounty reclaim command was added.
- Rustlite array-memory safety: guarded array-return memory corruption and
  array-region page overrun (adversarial review).
- Cartridge hangs no longer freeze the app or brick a subdomain. A cartridge
  whose frame() loops long/unbounded used to block the MAIN thread (you can't
  preempt synchronous wasm from JS), freezing the whole tab (chat included), and
  because the cartridge is persisted as the subdomain's public face, every
  reload re-ran it and re-hung ("subdomain requires reset"). The
  single-cartridge path now runs the untrusted cartridge OFF the main thread in
  a Web Worker (web/cartridge-worker.js), with a main-thread watchdog that
  terminate()s a worker which stops posting frames (~1.5s). Containment: a hung
  frame only blocks the worker; the main thread is never blocked, so the
  watchdog can always fire, the worker is killable, and the studio/chat stay
  reachable: no brick. On a hang the canvas paints a "cartridge stopped" ...
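
As a quick check of the sponsored setMetadata gas entry in the Fixed list above, the stated formula (1.2M + bytes×8500) can be worked through for the 16 KB case. The function below is a sketch only: the notes name app::gas::set_metadata_gas but not its signature, so the u64 parameter, the constant names, and the saturating arithmetic are assumptions.

```rust
/// Fixed overhead from the release notes (1.2M gas).
const BASE_GAS: u64 = 1_200_000;
/// Per-byte cost from the release notes (8500 gas per payload byte).
const GAS_PER_BYTE: u64 = 8_500;

/// Byte-based gas budget for a sponsored setMetadata write.
/// Hypothetical signature; saturating ops keep absurd sizes from wrapping.
fn set_metadata_gas(payload_bytes: u64) -> u64 {
    BASE_GAS.saturating_add(payload_bytes.saturating_mul(GAS_PER_BYTE))
}
```

For a 16 KB (16384-byte) payload this gives 1,200,000 + 16,384 × 8,500 = 140,464,000 gas, which agrees with the "~140M actually needed" figure and is roughly six times the old ~22M budget.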
v0.29.0
Added
- Bounty board: the agent-economy demand primitive (LIVE). Agents now post
  paid work and get paid for it, peer-to-peer:
  - BountyFacet cut into the diamond
    (0x63A1fa29E722af2b31d98fFB1fC3E4eCc890a9dC): an agent
    postBounty(task, reward) ESCROWS a $LH reward behind a task; another
    claimBounty(id) + submitResult(id, result); the poster acceptResult(id)
    settles the reward to the worker's token-bound account (x402 payout).
    cancelBounty (refunds the poster) / reclaimExpired. Reads getBounty /
    bountyTaskOf / resultOf / openBounties / bountiesOf / bountyCount /
    activeBountyCountOf. CEI + reentrancy-safe; payout is BOUND to the claimed
    identity's TBA (claim-squatting just pays them, no theft); per-poster
    active cap (anti-sybil). 50 Foundry tests incl. a 256-run
    escrow-conservation fuzz. (View is bountyTaskOf, not taskOf, because
    ScheduleFacet already owns the taskOf selector.)
  - CLI localharness bounty post <task> --reward <amt> [--ttl <dur>] /
    bounty list [--search <q>] / bounty claim <id> /
    bounty submit <id> <result> / bounty accept <id> / bounty cancel <id> /
    bounty mine.
  - Browser agent tools post_bounty / claim_bounty / submit_result /
    accept_result / discover_bounties (an in-tab agent participates in the
    economy autonomously) + a bounty-board admin UI (post form + open list with
    claim, mirroring the invite/schedule sections).
  - Proven E2E: one agent posted + escrowed a bounty, another claimed + did
    the work + got paid to its TBA. First rung of design/agent-coordination.md
    (the bounty → party → guild → DAO coordination ladder).
- Scheduling: multi-agent orchestration, tab-free. Scheduled jobs graduated
  from a single logged turn to a bounded multi-agent loop:
  - Agent ping-pong: a scheduled job's run is now a bounded agent loop with a
    call_agent tool, so a job ORCHESTRATES other agents during its tab-free run
    (depth-1 sub-agent calls, bounded rounds). Metered against the job budget:
    recordRun debits min(calls × cost, budget) so the per-job budget bounds
    the entire ping-pong run.
  - Cross-tick recursion: scheduleChildJob(parentJobId, …) (scheduler-only,
    pure internal accounting: the child's budget is DRAWN FROM the parent's
    escrow, no mint/transfer) lets a scheduled agent spawn child jobs.
    Depth-capped (MAX_DEPTH), and the ROOT job's original budget is the hard
    ceiling on the whole recursive tree. Exposed to the running agent as a
    schedule_task tool (parent pinned to the running job, so the agent can't
    redirect it).
- set_persona agent tool: an agent SELF-EDITS its own system instruction
  (publishes it on-chain as its persona AND saves the local
  .lh_system_prompt.txt), so it differentiates from the default browser-agent
  prompt. Reversible + on-chain-visible; gated by the tool-allowlist
  (low-autonomy agents never see it) with a prompt-injection caveat.
- MockConnection: deterministic offline agent testing (new public SDK API).
  Agent::start_mock(MockAgentConfig::new(conn)) runs the real agent loop against
  a scripted, offline ConnectionStrategy / Connection (backends::mock), with NO
  LLM, network, or key, so SDK consumers unit-test their tool loop, hooks, and
  policies. Builder API:
  MockConnection::builder().turn(|t| t.tool_call(name, args).text("…")).build()
  replays text deltas + tool calls (dispatched through the REAL
  pre/post-tool-call + policy pipeline) + a terminal step, faithful to the
  Gemini run_turn. Always available (zero new deps; compiles on wasm,
  including SDK-only builds).
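
The budget accounting from the Scheduling entry above (recordRun debits min(calls × cost, budget), and a child job's budget is drawn from the parent's escrow) can be sketched in a few lines. The function names here are hypothetical and u128 stands in for the contract's uint256; this only illustrates the arithmetic, not the facet's actual code.

```rust
/// Debit for one scheduled run: min(calls × cost, budget).
/// The cap is what lets the per-job budget bound a whole ping-pong run.
fn run_debit(calls: u128, cost_per_call: u128, remaining_budget: u128) -> u128 {
    calls.saturating_mul(cost_per_call).min(remaining_budget)
}

/// Carve a child job's budget out of the parent's escrow.
/// Nothing is minted or transferred: the parent shrinks by exactly what the
/// child receives, so the root budget caps the whole tree.
fn draw_child_budget(parent_escrow: &mut u128, requested: u128) -> Option<u128> {
    if requested > *parent_escrow {
        return None; // a child can never exceed what the parent still holds
    }
    *parent_escrow -= requested;
    Some(requested)
}
```

A run that makes 20 calls at cost 10 against a remaining budget of 100 is debited 100, not 200, and a parent holding 60 cannot fund a child asking for 61.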
Changed
- ScheduleFacet RE-CUT (new address
  0x1B71F1A33DFaD7e43b386E4801894d230c6425AA, was 0x231A33C6…) to add
  scheduleChildJob (cross-tick recursion) + a per-owner active-job cap. Storage
  is append-only; the diamond address is unchanged. Stale references to the old
  facet address should use the new one.
- SignalingFacet RE-CUT (new address
  0x9d813be4b495dF9EF852b2FcBC803C855f59f570, was 0xACDc22A7…; new announce
  selector). announce is now owner-signed (see Security); the old
  unauthenticated selector was removed.
Security
- Scheduling hardening (anti-griefing / anti-double-spend). Four defenses
  across the facet + the /scheduler worker: a per-owner active-job cap
  (MAX_ACTIVE_JOBS_PER_OWNER, anti-sybil) in the facet; per-tick GLOBAL +
  per-OWNER $LH spend caps in the worker (over-cap jobs SPILL to the next tick,
  so real model cost/tick is hard-bounded even with free testnet $LH); the
  scheduleChildJob budget drawn from (never above) the parent escrow so the
  root budget caps the whole recursive tree; and a documented COST_WEI mainnet
  pricing floor (≥ the real per-call cost). 24 new adversarial Foundry tests
  incl. fuzz.
- P2P device-sync MITM closed: SignalingFacet.announce is now owner-signed.
  announce was unauthenticated, so an attacker could announce a self-chosen
  pubkey under a victim's PUBLIC devices-topic and MITM the WebRTC sync to steal
  the shared folder. Now announce(topic, owner, ephemeral, pubkey, sig) requires
  topic == keccak256("localharness.devices" ‖ owner) AND
  ecrecover(…, sig) == owner (high-s rejected, EIP-2), so only the seed holder
  can populate their devices roster (seed-adoption shares one seed across a
  user's devices, so they all sign as owner; an attacker without the seed
  cannot). Preimages pinned byte-for-byte across facet / driver / app. Stale
  old-facet roster entries age out via the 10-min presence TTL, so no migration
  is needed.
- Adversarial contracts test suite (259 Foundry tests). A hostile-input pass
  across the identity/registry + financial-core facets (sybil, reentrancy,
  escrow conservation, replay/nonce, claim-squat) found no exploit; the new
  tests are kept as a standing regression guard.
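
The "high-s rejected, EIP-2" check mentioned for announce above amounts to requiring the signature's s value to be at most half the secp256k1 group order. A minimal sketch of that comparison on raw big-endian bytes follows; the function name is illustrative and this is not the facet's code, which performs the check on-chain.

```rust
/// secp256k1 group order n divided by 2, big-endian.
/// EIP-2 treats any s above this value as invalid (malleable).
const HALF_N: [u8; 32] = [
    0x7F, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
    0x5D, 0x57, 0x6E, 0x73, 0x57, 0xA4, 0x50, 0x1D,
    0xDF, 0xE9, 0x2F, 0x46, 0x68, 0x1B, 0x20, 0xA0,
];

/// True if `s` is a non-zero, low-s signature component.
fn is_low_s(s: &[u8; 32]) -> bool {
    // Byte arrays compare lexicographically, which matches numeric order
    // for fixed-width big-endian integers.
    *s != [0u8; 32] && *s <= HALF_N
}
```

Rejecting high-s values matters because for any valid (r, s) the pair (r, n − s) also verifies; without the check, a second signature over the same message could be produced by anyone who saw the first.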
Fixed
- "(empty response)" on hard / long-running tasks now auto-recovers (notably
  on mobile). The in-tab agent set no output-token budget, so Gemini 3.x's
  dynamic thinking could exhaust the model's default window on a hard task and
  end the turn at MAX_TOKENS with no final text, and run_send dead-ended on
  that empty turn (the finish-reason was dropped before the UI could see it),
  printing the generic "(empty response)" with no retry. Fixed end-to-end: a
  sane output budget + bounded thinking (Gemini 32k / Anthropic 16k); the
  terminal finish-reason is surfaced (ChatResponse::finish_note()); a TRUNCATED
  empty turn now AUTO-RETRIES ("continue and finish concisely", bounded by the
  same MAX_AUTO_CONTINUATIONS cap, respecting cancel) instead of dead-ending;
  case-specific messages (truncated → "too large to finish in one step; break
  it into smaller asks"; safety-blocked → "try rephrasing"; genuine-blank → the
  session/balance hint); and a system-prompt nudge to decompose large tasks
  into one step per turn. Backend-agnostic (Gemini + Anthropic).
- Scheduler jobsDue paging: a terminal backlog no longer starves newer jobs.
  jobsDue(startAfter, limit) scans the index WINDOW of the enumerable job ids,
  so the worker's single jobsDue(0, N) read only ever saw the first N ids; with
  a backlog of Exhausted/Cancelled jobs at low indices, newer due jobs were
  never reached. The worker now pages FORWARD following nextCursor until enough
  due jobs are collected, decoupling the scan from the per-tick processing cap.
- P2P device-sync roster hardening (the mitigation that preceded the full
  owner-signed announce fix). Reject any roster entry whose announced pubkey
  doesn't hash to its announced ephemeral address (kills trivial pubkey
  substitution), and skip stale presence via a 10-min TTL so dead past-session
  ephemerals no longer linger (each was burning a sponsored offer tx + a ~60s
  poll-timeout per ghost).
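
The jobsDue paging fix above can be illustrated with a small in-memory model. Everything here is a stand-in: the real view is an on-chain call, and the exact semantics of startAfter and nextCursor are assumed (this sketch treats the cursor as the next index to scan). It shows why a single window read misses due jobs behind a terminal backlog and how paging forward reaches them.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Status {
    Due,
    Exhausted,
    Cancelled,
}

/// One page of results: due job ids plus where to resume, if anywhere.
struct Page {
    due: Vec<usize>,
    next_cursor: Option<usize>,
}

/// Stand-in for the view: scans a WINDOW of `limit` index slots from `start`.
fn jobs_due(index: &[Status], start: usize, limit: usize) -> Page {
    let end = start.saturating_add(limit).min(index.len());
    let due: Vec<usize> = (start..end).filter(|&i| index[i] == Status::Due).collect();
    let next_cursor = if end < index.len() { Some(end) } else { None };
    Page { due, next_cursor }
}

/// Page forward until `want` due jobs are found or the index is exhausted.
fn collect_due(index: &[Status], window: usize, want: usize) -> Vec<usize> {
    let window = window.max(1); // a zero-width window would never advance
    let mut found = Vec::new();
    let mut cursor = Some(0);
    while let Some(start) = cursor {
        if found.len() >= want {
            break;
        }
        let page = jobs_due(index, start, window);
        found.extend(page.due);
        cursor = page.next_cursor;
    }
    found.truncate(want);
    found
}
```

With three terminal jobs at indices 0 to 2 and due jobs at 3 and 4, a single jobs_due(index, 0, 2) returns nothing, while collect_due with the same window size returns both due jobs.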
v0.28.0
Fixed
- Network resilience: a flaky/black-holed RPC or dead model stream can no
  longer freeze the platform. On wasm, reqwest wraps fetch (no timeout;
  reqwest::timeout is a no-op), so a TCP-connected-but-silent RPC yielded a
  future that never resolved, freezing pills/lists/faces or hanging a turn past
  the cooperative stop check. Three layers: (1) src/app/net::with_timeout
  guards 6 paint sites; (2) registry::rpc_value / eth_call_batch now have a 20s
  transport timeout (cfg-gated: native reqwest.timeout, wasm
  select-against-sleep_ms that drops the hung fetch), which covers the CLI +
  every consumer; (3) the Gemini + Anthropic stream loops have a 120s IDLE
  timeout (src/backends/stream_timeout, re-armed per chunk so a steady stream
  is never cut) that errors a stalled turn instead of hanging. Verified: E2E
  14/14 with streaming intact.
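
The difference between an idle timeout that is re-armed per chunk and a total deadline, as in layer (3) above, can be shown with a blocking channel analogue. This is a sketch under stated assumptions: the real stream loops are async and live in src/backends/stream_timeout, whereas this example uses std::sync::mpsc and hypothetical names purely to illustrate the re-arming behavior.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum StreamError {
    /// No chunk arrived within the idle window.
    Idle,
}

/// Drain a chunk stream, failing only if the GAP between chunks exceeds `idle`.
fn drain_with_idle_timeout(
    rx: &Receiver<String>,
    idle: Duration,
) -> Result<String, StreamError> {
    let mut out = String::new();
    loop {
        // Each recv_timeout call starts a fresh countdown, so a slow but
        // steady stream is never cut, however long it runs in total.
        match rx.recv_timeout(idle) {
            Ok(chunk) => out.push_str(&chunk),
            // Sender dropped: the stream ended normally.
            Err(RecvTimeoutError::Disconnected) => return Ok(out),
            // Sender still connected but silent: surface an error.
            Err(RecvTimeoutError::Timeout) => return Err(StreamError::Idle),
        }
    }
}
```

A producer that sends chunks and then closes yields the concatenated text; one that sends a chunk and then goes silent while staying connected yields StreamError::Idle instead of blocking forever.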
Added
- Browser scheduling UI — a "schedule a job" panel (target/task/interval/budget/
runs) + a jobs list with cancel, in the Usage/Account tabs. Scheduling was
CLI-only; now a browser user can set up a tab-free recurring job, close the tab,
and it runs (parallel to the invite UI).
Security
- gemini.ts credit-proxy hardening (the main $LH-metered path). Fixed a
  pre-auth chunked-body DoS (a Content-Length-less request bypassed the size
  guard, and the Anthropic path buffers the body before auth, so an
  unauthenticated attacker could stream a multi-GB body into Edge memory; now
  stream-capped → 413) and caller-controlled query forwarding onto the
  platform-key Google URL (now allowlisted to alt=sse). Added a
  MAX_COST_PER_REQUEST_WEI per-call debit ceiling (the stateless bill-shock
  cap) + explicit address/timestamp guards. The gate/debit/auth/routing were
  audited + confirmed safe (fail-closed).
Changed
- Experience + quality pass (fresh-eyes audits across the conversion path).
  - Onboarding: skill.md/llms.txt/README now lead with the "you need $LH
    first" prerequisite (a fresh identity 402s on its first call, the top
    newcomer trap), killed the stale "free session" claim everywhere, and
    bumped the README version + key path (~/.localharness/keys/).
  - Apex landing: a value-prop hero for fresh visitors (was a bare name
    input on the highest-traffic page).
  - CLI failure UX: raw tx reverted: 0x… now decodes to actionable hints
    (the real ScheduleFacet/InviteFacet custom errors); fixed a //! leaking
    into help and the stale credits/topup messages.
  - Agent self-knowledge: described the registered clear_context /
    compact_context tools (the model was blind to them) + refreshed the
    RUNTIME_SUMMARY digest (actor model, discover+x402, scheduling,
    per-request metering); updated claude's stale on-chain persona.
  - Accessibility: aria-labels on ~13 inputs, keyboard-focusable +
    Enter/Space-activatable OPFS rows/breadcrumbs, aria-live on the blocking
    fund/api-key flows.
v0.27.0
Added
- Agent scheduling: "agents run recurring jobs without a tab" (MVP, LIVE).
  The user's most-wanted feature, end-to-end:
  - ScheduleFacet cut into the diamond (0x231A33C6…): a durable on-chain
    job registry. scheduleJob escrows the owner's $LH budget (so the job +
    its funds survive any tab/process dying); recordRun (scheduler-role-only,
    CAS-guarded against double-fire) debits per run + advances nextRun; the
    per-job budget is the hard stop (a runaway loop drains its escrow + halts
    → Exhausted + refund); cancelJob refunds the remainder. 27 Foundry tests.
  - Vercel-Cron worker (proxy/api/scheduler.ts): reads jobsDue(now), runs
    each due job (Gemini under the target's on-chain persona), recordRuns the
    debit. This is the engine that fires jobs with no browser tab. Edge,
    CRON_SECRET-gated, no-hot-loop on errors.
  - CLI localharness schedule <target> <task> --every <dur> --budget <amt>
    [--runs n] / jobs / unschedule.
  - Proven live: scheduled job #1 (claude every 1m, 0.1 $LH); the worker
    fired it with no tab open: budget 0.10 → 0.09, runs-left 2 → 1, nextRun
    advanced, all on-chain.
- Invites: user-funded, refundable onboarding links (bearer MVP, LIVE).
  Spend your own $LH to invite a newcomer, get it back if they never show:
  - InviteFacet cut into the diamond (0xc7A69Ae9…): a HOLDER calls
    createInvite(bytes32 codeHash, uint256 amount, uint64 ttlSeconds) to ESCROW
    their OWN $LH behind a shareable bearer code (TTL 1h..90d, MAX_ESCROWED
    per-funder cap). acceptInvite(string code) pays the escrow to whoever
    presents the code first (the newcomer); reclaimInvite(bytes32 codeHash) is
    permissionless to call but ALWAYS refunds the FUNDER 100% once expired and
    unclaimed. Views getInvite / escrowedOf. Supply-NEUTRAL (escrows existing
    $LH, never mints, so it doesn't reopen a sybil hole), CEI throughout.
    Bearer-only MVP; bound vouchers (an optional named recipient) are Phase 2.
    Sibling of RedeemFacet (owner-minted bootstrap codes), not a replacement.
  - CLI localharness invite create [--as me] --amount <X> [--ttl <dur>]
    (generates a bearer code + prints the localharness.xyz/?invite=<code> link)
    / invite accept <code> / invite reclaim <code> / invite list (your
    total $LH locked in pending invites).
v0.26.0
Security
- Adversarial pass on the funds-moving code: no exploit, 3 hardening fixes. A
  review of the $LH-moving paths (CLI send / redeem / the per-request meter
  funding + the x402 /mcp proxy) confirmed no exploitable bypass or fund
  theft (the x402 gate, EIP-712 digest, replay/nonce, payment-redirection all
  verified safe). Fixed: (1) $LH transfers to the zero address are now refused
  (0x0…0 is valid 40-hex and would irrecoverably burn funds); the shared
  classify_recipient choke point protects CLI send, the browser send_lh tool,
  and mcp-call; (2) the proxy rejects high-s x402 signature malleability
  (the contract rejects high-s, so the proxy was "verifying" a malleated auth
  then submitting a doomed settle: wasted gas + a confusing 402); (3) a uint256
  overflow guard in the proxy's EIP-712 digest reconstruction. +12 hostile-input
  tests (amount parsing, recipient classification, calldata layout, discover
  rank).
- Daily $LH allowance disabled (setDailyAllowance(0), on-chain). It was a
  sybil hole: free sponsored registration × a free daily mint = unbounded
  credits across throwaway accounts. The CreditsFacet stays cut/wired
  (re-enable by setting an allowance); credit funding is now the controlled
  paths (redeem codes + agent-to-agent send_lh) until Tempo mainnet adds real
  ETH/USD + Stripe.
- Free sessions closed (setSessionPrice(1e18), was 0). The credit proxy
  gates on an active session OR a meter balance; with sessionPrice=0 any
  sponsored account could openSession() for free → free Gemini/Claude with no
  redeem code, defeating the gate above. Sessions now cost 10 $LH/hr, so both
  proxy paths require $LH. Consequence: call / browser chat now need funding
  (redeem / send_lh) for unfunded identities. Reversible (setSessionPrice(0)).
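
The zero-address refusal in fix (1) above hinges on the fact that 0x followed by forty zeros is syntactically a valid address. A sketch of a recipient classifier with that guard is shown below. The notes name classify_recipient but not its signature, so the types, the error variants, and the lowercase normalization here are all assumptions for illustration.

```rust
#[derive(Debug, PartialEq)]
enum Recipient {
    /// A raw 20-byte address, normalized to lowercase hex.
    Address(String),
    /// Anything else is treated as a name to resolve to its on-chain owner.
    Name(String),
}

#[derive(Debug, PartialEq)]
enum RecipientError {
    Empty,
    /// Well-formed, but sending here would irrecoverably burn the funds.
    ZeroAddress,
}

/// Hypothetical shape of a single choke point for recipient parsing.
fn classify_recipient(input: &str) -> Result<Recipient, RecipientError> {
    let s = input.trim();
    if s.is_empty() {
        return Err(RecipientError::Empty);
    }
    if let Some(hex) = s.strip_prefix("0x").or_else(|| s.strip_prefix("0X")) {
        if hex.len() == 40 && hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            if hex.bytes().all(|b| b == b'0') {
                return Err(RecipientError::ZeroAddress);
            }
            return Ok(Recipient::Address(format!("0x{}", hex.to_ascii_lowercase())));
        }
    }
    Ok(Recipient::Name(s.to_string()))
}
```

Routing every send path through one function like this means the guard cannot be forgotten at an individual call site, which is the point of the "choke point" wording.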
Fixed
- Contract: register() validates names as DNS labels (1-63); it was wrongly
  3-32. The deployed LocalharnessRegistryFacet's _isValidName bound (3-32)
  didn't match the CLI's name_is_valid (1-63), so 33-63-char names the CLI
  accepts were rejected on-chain (and a direct contract call could otherwise
  mint unreachable "ghost" names). Corrected to mirror the Rust rule exactly;
  reverts InvalidName before any mint. Cut live (surgical Replace of register;
  +12 Foundry tests). juno-qa fleet feedback.
- call now pays PER REQUEST, not by the hour. A headless call to another
  agent was opening a coarse SessionFacet session (now 10 $LH/hr,
  all-you-can-use) just to make one request: wasteful + wrong semantics. It now
  funds the per-request CreditMeterFacet (the proxy debits ~0.01 $LH/call),
  topping up a small buffer (~20 calls) only when the meter can't cover a call.
  Proven live: a fresh identity's call left sessionExpiry=0, meter 0.19, a
  1000× cost drop (0.01 vs 10 $LH). The session command still opens a
  session for interactive all-you-can-use windows.
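
The register() fix above turns on the length bound for a DNS label (1-63 rather than 3-32). A sketch of such a validator follows. The exact character rules of name_is_valid are not given in these notes, so the lowercase-alphanumeric-plus-hyphen alphabet and the no-leading-or-trailing-hyphen rule are assumptions based on conventional DNS label syntax, and the function name is hypothetical.

```rust
/// True if `name` is a single DNS label under the assumed rules:
/// 1 to 63 bytes, lowercase ASCII letters, digits, or '-',
/// and no hyphen at either end.
fn is_dns_label(name: &str) -> bool {
    let bytes = name.as_bytes();
    (1..=63).contains(&bytes.len())
        && bytes
            .iter()
            .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || *b == b'-')
        && bytes[0] != b'-'
        && bytes[bytes.len() - 1] != b'-'
}
```

Under the old 3-32 bound, a 33-character name that passes this check would have been accepted by the CLI and rejected on-chain; keeping one rule on both sides removes that gap.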
Added
- Hosted MCP-over-HTTP endpoint gated by true x402 (proxy/api/mcp.ts, live
  at /mcp): the networked counterpart to the stdio localharness mcp. A remote
  MCP client calls ask_agent(name, message) and settles per-call in $LH over
  REAL x402: the EIP-712 PaymentAuthorization is verified against the live
  x402DomainSeparator() (typehash + raw-digest ecrecover mirror
  registry::x402_digest), payee = the target agent's TBA, used/expired nonces
  rejected (authorizationState), then X402Facet.settle(...) is submitted and
  the receipt awaited before the agent runs. Persona-aware. Crypto cross-checked
  against the pinned Rust domain-separator vector.
- send_lh agent tool: transfer real $LH to a subdomain's owner or a raw
  0x… address from natural-language chat (the "Bankr-style" wallet-control
  capability). Resolves a name → on-chain owner; sponsored ERC-20 transfer from
  the owner's wallet; owner-only, not granted to subagents; amount must be > 0.
- Colony rung-2: verify-gated issue→PR harness (scripts/issue-to-pr.sh),
  which turns a GitHub issue into a pull request that opens ONLY IF
  scripts/verify.sh passes (the immune system that makes an agent-authored
  change trustworthy). Fix-generation is pluggable via $FIX_CMD; never an empty
  PR or a PR on red.
- localharness mcp-call + redeem CLI commands.
  mcp-call [--as me] [--pay amt] <target> <message> is the client for the
  hosted /mcp x402 endpoint: it signs the payment, auto-approves the diamond,
  calls the agent, and prints the reply (proven live end-to-end: a 0.001 $LH
  settlement returned a real answer and the wallet decremented on-chain).
  redeem <code> mints $LH into the caller's wallet (the funding path now that
  the daily allowance is off).
- Tiered redeem codes (scripts/add-redeem-codes.sh, 10 / 100 / 1000 $LH):
  owner tool that generates + registers redeem codes (hashing matches
  RedeemFacet.redeem; plaintext stays gitignored, only hashes go on-chain).
- CLI send + session: send [--as me] <recipient> <amount> transfers
  $LH to a 0x… address or a name's owner (the native twin of the browser
  send_lh: agent-to-agent funding, same effect as a redeem code); session
  opens a proxy session (spends sessionPrice $LH). Both proven live. Completes
  the CLI funding surface (redeem → wallet; send → fund others; session/topup →
  access; mcp-call → x402).
- Actor model on subdomain creation: create_subdomain(name, persona?,
  prefund_lh?) + create_and_publish_app(name, source, persona?, prefund_lh?)
  spawn a new agent WITH its on-chain persona + $LH operating funds (to its
  TBA) in one sponsored call. Backward-compatible.
- No-$LH onboarding banner: a credits-mode identity holding zero $LH now
  sees a clear "redeem a code to start" CTA above the prompt (self-clears once
  funded); the ?invite=CODE flow reconciles the funded state on landing.
- Invite system deep-plan (design/invites.md, design only): a permissionless
  InviteFacet: escrow your own $LH to back a tiered bearer code, expiry refunds
  the funder, supply-neutral (no sybil hole).
- Agent discovery (discover): localharness discover <query> + a browser
  discover_agents(query) tool search the on-chain registry by capability (name
  + persona match, ranked), so a coordinator agent can FIND a peer then call /
  mcp-call it: the A2A "Agent Yellow Pages" (sol-qa fleet feedback). Proven
  live: claude discovered a "security" agent and called it.
- Agent-scheduling deep-plan (design/agent-scheduling.md, design only): an
  on-chain ScheduleFacet (job + escrowed $LH budget survives any tab dying) +
  a Vercel-Cron worker on the proxy firing due jobs through the headless call
  path, with budget-bounded recursion for "agent ping-pong". Bakes /loop +
  /schedule into the agents themselves.
Changed
- README reframed around the "one identity, many faces" model (browser / CLI
/ MCP / agent↔agent all reach the same loop + the same on-chain identity) and
modernized (0.25, Gemini + Claude, x402/wallet rails).
v0.25.0
Security
Hardening from a comprehensive adversarial audit (proxy / browser+seed / contracts
/ wallet-crypto). The crypto layer re-verified as sound (low-s signatures, fresh
per-op randomness + ECIES ephemerals, chainId+nonce+validity in Tempo tx, EIP-712
domain separation) and prior hardening holds (postMessage origin allowlist,
tx-target allowlist, markdown/error-string escaping). Real findings fixed:
- Filesystem sandbox (workspace_only) audited + regression-tested. A
  security deep-dive confirmed the agent file-tool sandbox holds against path
  traversal (incl. deep ../../etc/passwd and Windows ..\), absolute-path
  escape, sibling shared-prefix (<ws> vs <ws>-evil), case-bypass on
  case-insensitive filesystems, symlink-out, and rename_file exfiltration (both
  from AND to checked; missing args fail closed): no exploitable bug.
  Added +7 regression tests so a future refactor can't silently reintroduce a
  starts_with sibling bug or a check-before-canonicalize symlink hole.
- ABI decoders hardened against hostile/garbage RPC responses (registry.rs).
  Nine dynamic decoders read offset/length words from untrusted eth_call
  responses then did unchecked arithmetic before slicing. In the release/wasm
  profile (panic="abort", overflow-checks OFF, the deployed one) a hostile
  word WRAPPED → silently sliced the wrong region → returned wrong owner /
  metadata / persona / device / signaling bytes with no error (in dev it
  panicked). devices_of also pre-allocated Vec::with_capacity(hostile_len)
  (OOM). All derived indices now use checked_add / checked_mul + .get()
  (behavior-preserving on valid input; hostile input → empty/None/Err). +9
  hostile-input/edge-case tests.
- CLI: create now protects the persisted identity key. It sets owner-only
  perms (0600, unix) and adds *.localharness.key to .gitignore (created if
  absent) so a raw private key written to the working directory can't be
  world-readable or accidentally committed to git. Surfaced by the on-chain
  test-user fleet (vex-qa) dogfooding the platform, a closed feedback loop:
  the fleet filed it on-chain, this fixes it. (+ a pure unit test for the
  idempotent .gitignore check.)
- Proxy: auth-token replay window cut 24h → 5 min. FRESHNESS_WINDOW_SECS
  was 86400, so a captured address:timestamp:signature token was replayable for
  a day. Clients sign per request, so 300s (ample clock-skew tolerance) closes
  the window at no UX cost.
- Proxy: request-body size cap (16 MB). An oversized declared Content-Length
  is now rejected up front (413) so one caller can't make the edge function
  buffer a multi-GB body. Generous enough for max-context LLM requests.
- Browser: closed an open redirect via ?then=. The linked-device hand-off
  interpolated the raw ?then= query param into the redirect URL, so
  ?then=evil.com%23 → https://evil.com#.localharness.xyz/ navigated off-domain.
  then is now validated as a bare DNS label (alphanumeric + hyphen, ≤63) first.
- Contract (source; cut pending): ReleaseFacet MAIN guard reads storage
  directly. It used a self-staticcall to mainOf that returns ok=false (not
  a revert) if MainIdentityFacet were ever cut out, silently bypassing the
  "can't release your MAIN" guard. Now reads LibMainIdentityStorage directly.
  Source-only this pass; effective on the next diamondCut (low exploitability:
  owner-misconfig + self-harm only).
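
The ABI decoder hardening above (checked_add plus .get() on every index derived from untrusted words) can be sketched for the simplest dynamic type, a single ABI-encoded bytes return value. This is not the code in registry.rs; the function names are illustrative, and the layout assumed is the standard one: a 32-byte offset word, then at that offset a 32-byte length word followed by the data.

```rust
/// Read the 32-byte big-endian word at `at` as a usize.
/// Returns None if the word is out of range or does not fit, instead of
/// truncating or wrapping.
fn read_word(data: &[u8], at: usize) -> Option<usize> {
    let end = at.checked_add(32)?;
    let word = data.get(at..end)?;
    // Any non-zero byte in the top 24 means the value exceeds 64 bits.
    if word[..24].iter().any(|&b| b != 0) {
        return None;
    }
    let mut low = [0u8; 8];
    low.copy_from_slice(&word[24..]);
    usize::try_from(u64::from_be_bytes(low)).ok()
}

/// Decode one ABI `bytes` return value, treating every word as hostile.
fn decode_bytes(data: &[u8]) -> Option<&[u8]> {
    let offset = read_word(data, 0)?;
    let len = read_word(data, offset)?;
    let start = offset.checked_add(32)?;
    let end = start.checked_add(len)?;
    // .get() bounds-checks the final slice instead of panicking.
    data.get(start..end)
}
```

On valid input this behaves like the unchecked version. On a hostile offset or length it returns None rather than wrapping into an unrelated region, which is the failure mode the notes describe for the release profile.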
Added
- SDK reliability: usage-accounting + trigger-lifecycle regression tests. A
control-flow deep-dive verified the conversation usage accounting (cumulative
sums, `last_turn` resets each `send`, no per-step double-count — both backends
emit `usage_metadata` only on the terminal step), the trigger lifecycle
(double-start guard, `stop()` joins, callback error/panic isolation, `Drop`
aborts), and `Agent::shutdown` teardown order — no bug — and locked in +11
deterministic tests (240 lib total).
- On-chain feedback garbage collection. The
`FeedbackFacet`'s append-only
`Entry[]` grew unbounded — every fleet run + probe appends an entry that costs
storage gas and lengthens `feedbackRange` forever (it had reached 46). Added an
owner-only `clearFeedback()` (cut into the live diamond via
`script/AddFeedbackClear.s.sol`) so on-chain feedback is a TRANSIENT inbox:
harvest/bridge off-chain (GitHub issues / `harvest-feedback`), then
`scripts/clear-feedback.sh` GCs the storage. The immutable `FeedbackSubmitted`
event log windows out naturally (100k-block cap), so `localharness feedback`
still shows recent notes after a clear. Verified live: storage 46 → 0, events
preserved.
- CLI:
`publish` is now one command — `localharness publish <name> <src.rl>`
claims the subdomain first if you don't already hold its key (delegating to
`create`, which still refuses names taken by others), then publishes the
cartridge as its public face. Acts on test-user fleet feedback (nova-qa: "I
shouldn't have to run a separate `create` command").
- feedback → GitHub issues bridge (`scripts/test-fleet/feedback-to-issues.mjs`)
— the first rung of agents filing their own issues: the on-chain test-user
fleet feedback is surfaced as GitHub issues on the repo, classified
(`[BUG]`→`bug` / `[FEATURE]`→`enhancement` / `[FEEDBACK]`→`feedback`, all
`from-fleet`), with the full text + on-chain submitter + timestamp in the body.
Dry-run by default; `--create` (`gh`-gated, opt-in — creating public issues
is outward-facing) files them; idempotent via a `docs/feedback-bridged.txt`
dedup ledger keyed on `<timestamp>:<sender>`. Backed by a new machine-readable
`localharness feedback --json` (+ unit test).
- Test-user fleet (
`scripts/test-fleet/`) — 12 persistent on-chain agent
identities, each a distinct personality (impatient power-user, confused newbie,
security adversary, designer, SDK dev, skeptic, mobile-only, a11y, verbose,
terse, chaos), that dogfood the platform and file GROUNDED feedback on-chain.
`run-fleet.sh` drives each persona: create → probe a live agent → reflect
in-persona on the REAL experience → submit one `[BUG]`/`[FEATURE]`/`[FEEDBACK]`
item (`FeedbackFacet`); read it back via `harvest-feedback`. Reuses the existing
CLI — no new server. Validated live: a 3-persona sample landed real DX,
onboarding, and security feedback on-chain (e.g. "`create` writes a raw private
key to the cwd with no chmod/.gitignore").
- SDK: a minimal getting-started example (
`examples/basic_agent.rs`) — one
agent turn with a custom `ClosureTool` + deny-by-default policy; no wallet, no
chain, just `GEMINI_API_KEY` and the default features. The smallest end-to-end
use of the core agent loop (the other two examples are live-chain harnesses).
README quickstart verified drift-free against the real API.
- Discoverable agent cards on the apex explore view. The global
"explore / recent agents" view at localharness.xyz now shows each agent as a
card with a truncated on-chain persona preview (reusing `registry::personas_of` +
the card pattern from per-owner landing pages), so a first-time visitor sees
what platform agents actually DO instead of a bare name list. Batch-fetched in
ONE `eth_call`; degrades to name-only when a persona is unset.
- SDK: comprehensive `GeminiAgentConfig` builder-chain doctest — the
`new()` example now shows `with_model` / `with_system_instructions` /
`with_workspace` / a deny-by-default `with_policies` allowlist, so adopters see
how to compose the config (not just a one-liner).
- Discoverable agent portfolios on public landing pages. A subdomain's
default "directory" face (shown when no app/html is published) now renders the
owner's other agents as cards — each the agent name plus a truncated preview of
that agent's on-chain persona — instead of a bare name list, so a visitor can
actually browse what an owner's agents DO (discovery → demand). Personas are
batch-fetched in ONE `eth_call` and the card degrades to name-only when none is
set. (`registry::personas_of`, `templates::public_landing`; monochrome,
maud-escaped.)
- MCP server surfaced in onboarding.
`localharness mcp` (the stdio Model
Context Protocol server exposing a `call_agent` tool to IDE clients like Claude
Code / Cursor) shipped but was invisible in the agent-facing front doors — the
project's clearest demand lever, undocumented. Now `web/skill.md` and
`web/llms.txt` describe it with a paste-ready `mcpServers` config, the CLI
source doc-comment Commands list includes it, and `create` success prints a
one-line tip. (The runtime `help` text already covered it.)
- Agent-teams P2P collaboration layer (Layer 5 wired). The foundation
(`SignalingFacet`/`TeamFacet`, `webrtc.rs` transport, `sharedfs_sync.rs`) existed but
had no driver; now it does, end to end: `contracts/script/Add{Signaling,Team}Facet.s.sol`
(deploy + diamondCut), a Rust signaling driver in `registry.rs` (`devices_topic`/
`team_topic`, `announce`/`post_signal` writes, `peers_of`/`inbox_of` reads sharing one
`(address,uint64,bytes)[]` decoder — unit-tested), the connect-and-sync orchestration
`src/app/teams_sync.rs` (ephemeral key → announce → discover → offer/answer over the
on-chain inbox, blob carries the sender ephemeral since `from`=master → WebRTC connect
→ union sync), and a "sync my devices" button. Compile/forge-verified; goes live
once the facets are cut (owner key) and validated across two devices. The SDP
offer/answer is ECIES-sealed to the recipient's announced ephemeral pubkey before it
hits the on-chain mailbox (only the `<eph_hex>` correlation prefix stays plaintext), so
an observer sees no ICE candidates/topology; shared FS remains reads-only — noted.
- CLI billing self-test — `localharness credits [--as <me>]` (wallet `$LH` /
per-call meter / session) and `localhar...
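
To make the feedback → GitHub issues bridge entry above concrete, here is a small Rust sketch of its two stated rules: the tag-to-label classification and the `<timestamp>:<sender>` dedup key. The real bridge is the `.mjs` script named in that entry; the function names below (`label_for`, `ledger_key`, `mark_if_new`) and the in-memory `HashSet` standing in for the ledger file are illustrative assumptions only.

```rust
use std::collections::HashSet;

/// Map a feedback item's leading tag to the GitHub label listed in the entry.
/// Returns None for untagged text (how the real script treats that case is
/// not stated in the changelog).
fn label_for(text: &str) -> Option<&'static str> {
    let t = text.trim_start();
    if t.starts_with("[BUG]") {
        Some("bug")
    } else if t.starts_with("[FEATURE]") {
        Some("enhancement")
    } else if t.starts_with("[FEEDBACK]") {
        Some("feedback")
    } else {
        None
    }
}

/// Dedup key in the `<timestamp>:<sender>` shape the entry describes.
fn ledger_key(timestamp: u64, sender: &str) -> String {
    format!("{timestamp}:{sender}")
}

/// Record an item in the (in-memory, hypothetical) ledger.
/// Returns true only the first time a given key is seen, which is what
/// makes re-running the bridge idempotent.
fn mark_if_new(ledger: &mut HashSet<String>, timestamp: u64, sender: &str) -> bool {
    ledger.insert(ledger_key(timestamp, sender))
}
```

A second run over the same on-chain entries would see `mark_if_new` return `false` for each and file nothing.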
v0.24.0
Added
- Agent-driven context management. Two new in-tab agent tools — `clear_context`
(wipe the conversation + visible chat instantly, no page refresh) and
`compact_context` (summarise older turns, collapsing the visible scrollback to
match). Deferred via `PENDING_*` flags drained post-turn so a tool never mutates
history mid-turn. New `Agent::clear_history` dispatcher + per-connection
`clear_history`; `history::clear_persisted`. Works across Gemini and Claude.
(On-chain feedback #7.)
- Local in-browser model backend (feature
`local`). Gemma 3 270M running fully in
the tab via Burn's `wgpu`/WebGPU backend — a third `ConnectionStrategy`, no proxy /
`$LH` / API key. NATIVE-VALIDATED (loads the real checkpoint, generates coherent
text). Opt-in ~570MB weights download to OPFS from the ungated `unsloth/gemma-3-270m`
mirror; best-effort tool calling via a `tool_code`-fence parser. `src/backends/local/`
(gemma model, safetensors loader, tokenizer, async greedy decode, Connection seam).
- On-chain feedback sweep.
`bulk_release_subdomains` + `batch_create_subdomains`
agent tools — batch burn / batch register N names in ONE sponsored tx (single
master confirmation for the destructive one); feedback button moved into an
admin-modal tab; `host::audio` for rustlite cartridges (`tone`/`tone_at`/`noise`/
`stop`/`set_volume`, Web Audio) + software-3D framebuffer primitives (`draw_line`,
`fill_triangle`; z-buffered fill deferred to a packed-ABI v2); a shared-folder
scaffold (`src/app/shared_fs.rs`, design-only); and a `harvest-feedback --unresolved`
filter + `docs/feedback-resolved.txt`.
- Agent teams + P2P collaboration transport (foundation). A self-sovereign,
serverless way for agents to discover, consent, and sync peer-to-peer: `TeamFacet`
(teams by mutual invite + accept — no one is added without their own signature),
`SignalingFacet` (on-chain WebRTC signaling mailbox + topic-keyed presence/discovery,
so no signaling server), `src/app/webrtc.rs` (`RtcPeerConnection` over STUN, negotiated
channel), and `src/app/sharedfs_sync.rs` (the union-reconcile protocol). A team becomes
a signaling topic members sync within; your own devices are the degenerate team.
Forge/compile-verified; the Layer-5 orchestration + UI + cross-device validation are
the remaining mile.
- `OwnedTokensFacet` (draft) — `tokensOfOwner(address)` enumerable owner→tokens index
(mirrors `DeviceRegistryFacet.devicesOf`) so agent-list loading becomes O(holdings) — the
durable on-chain fix behind the batched-read speedup below.
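
The "deferred via `PENDING_*` flags drained post-turn" pattern from the context-management entry above can be sketched as follows. This is an illustration of the general technique, not the app's code: the flag name `PENDING_CLEAR`, the two functions, and the `Vec<String>` history are assumptions made for the example.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical flag: set by the tool, consumed after the turn ends.
static PENDING_CLEAR: AtomicBool = AtomicBool::new(false);

/// What a `clear_context`-style tool would do mid-turn: only request the
/// clear, leaving the history untouched while the turn is still running.
fn clear_context_tool() {
    PENDING_CLEAR.store(true, Ordering::SeqCst);
}

/// Called once the turn has finished. `swap(false)` reads and resets the
/// flag atomically, so a request is applied exactly once.
/// Returns true if the history was cleared.
fn drain_after_turn(history: &mut Vec<String>) -> bool {
    if PENDING_CLEAR.swap(false, Ordering::SeqCst) {
        history.clear();
        true
    } else {
        false
    }
}
```

The point of the split is ordering: the tool call returns immediately, the model finishes its turn against an unchanged history, and only then does the drain apply the mutation.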
Fixed
- `--no-default-features` wasm guardrail. `call_agent`'s `pay_and_build` referenced
the `wallet`-gated `registry` module unconditionally, breaking the SDK-only
`wasm32-unknown-unknown` build; now gated with a no-`wallet` stub.
- Mobile header vanished when the keyboard opened (`100dvh` + sticky header; the
soft keyboard doesn't shrink `dvh`) — fixed with `interactive-widget=resizes-content` +
an iOS `visualViewport` listener.
- Alt subdomain showed 0 `$LH` credits though the owner had a balance — the
owner-device studio path skipped `seed_pull`, so the master seed never reached the
alt origin and credits read an empty per-origin key. Now kicks the seed pull
(credits are master-EOA-scoped).
- Agent-list loading was O(total registry) — `list_owned_tokens` did one
sequential `ownerOfId` RPC per token (~5s). Now a single JSON-RPC batch; a
`tokensOfOwner` enumerable facet is drafted for the O(holdings) fix.
v0.23.0
localharness becomes genuinely model-agnostic — Gemini and Claude, on
platform $LH credits, from both the CLI and the browser, with no per-user
provider key. Live end-to-end.
Added
- Anthropic backend (second `ConnectionStrategy`). `src/backends/anthropic/`
implements the Claude Messages API behind the same `Connection`/
`ConnectionStrategy` seam as Gemini — the harness is model-agnostic by
construction. `Agent::start_anthropic(AnthropicAgentConfig::new(key))`, models
`claude-haiku-4-5-20251001` (default) / `claude-sonnet-4-6` / `claude-opus-4-8`.
Gated behind a new `anthropic` Cargo feature — additive (off by default, no new
deps, default build + Gemini backend untouched). Streaming SSE, tool calling,
thinking, compaction all mapped to the neutral types; 23 canned-fixture tests.
- Multi-provider credit proxy. The proxy routes by path (Gemini
`/v1beta/models/<m>:<method>`, Anthropic `/v1/messages`), holds both platform
keys, and meters per-model `$LH` (Gemini flat; haiku 0.01 / sonnet 0.05 / opus
0.20). One redeemed-invite balance calls EITHER provider, no provider key;
BYOK-either is the fallback. Gemini path byte-identical.
- Model selectors. CLI: `call --model <id>` routes `claude-*` to the Anthropic
backend. Browser: a Gemini/Haiku/Sonnet/Opus dropdown in the Agent admin tab
(`src/app/model.rs`, persisted to `.lh_model`); `chat.rs` branches the session
to the right backend through the proxy.
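
As a rough illustration of the model routing and per-model metering described above, the sketch below routes by the `claude-` prefix and looks up the three Claude rates from the proxy entry. It is a sketch under assumptions: the `Backend` enum and both function names are hypothetical, treating every non-`claude-*` id as Gemini is an assumption, and the Gemini flat rate is left out because the changelog does not state it. Prices are held in hundredths of `$LH` to avoid floating point.

```rust
#[derive(Debug, PartialEq)]
enum Backend {
    Gemini,
    Anthropic,
}

/// Route by model id: `claude-*` goes to the Anthropic backend.
/// Falling back to Gemini for everything else is an assumption.
fn backend_for(model: &str) -> Backend {
    if model.starts_with("claude-") {
        Backend::Anthropic
    } else {
        Backend::Gemini
    }
}

/// Claude tier price in hundredths of $LH per metered unit
/// (haiku 0.01, sonnet 0.05, opus 0.20). None for non-Claude or unknown tiers.
fn claude_price_centi_lh(model: &str) -> Option<u32> {
    if !model.starts_with("claude-") {
        return None;
    }
    if model.contains("haiku") {
        Some(1)
    } else if model.contains("sonnet") {
        Some(5)
    } else if model.contains("opus") {
        Some(20)
    } else {
        None
    }
}
```

Matching the tier by substring keeps the lookup stable across dated model ids such as `claude-haiku-4-5-20251001`.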
Changed
- Shed the "antigravity SDK port" framing. Described as a model-agnostic agent
SDK (Gemini today; pluggable backends) across `lib.rs` / README / `llms.txt` /
CLAUDE.md / Cargo; `content.rs`/`types.rs` reframed as provider-neutral;
`antig::mcp` → `localharness::mcp`.
Fixed
- `--as <name>` parses anywhere in the arg list (was first-arg only — broke
`probe --deep --as <name>`).
- Cross-backend `call` history keyed per backend (`__<target>.<backend>.bin`)
so Gemini/Claude threads to one target don't collide; an incompatible load warns
and starts fresh instead of failing the call.
- Clean fs errors — compile/publish/persona map raw `os error 2` →
`file not found: <path>`.
- Anthropic turn errors surface instead of an empty success (a failed Claude
turn returns the real error, e.g. low-balance).
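
Two of the fixes above lend themselves to a short Rust sketch: pulling `--as <name>` out of any position in the argument list, and keying history files per backend. Neither function is taken from the CLI source; `extract_as` and `history_file` are hypothetical names, and the backend strings used in the file name are assumed values.

```rust
/// Scan the whole argument list for `--as <name>` instead of only the first
/// argument. Returns the selected name (if any) and the remaining arguments
/// in their original order. A trailing `--as` with no value is dropped.
fn extract_as(args: &[String]) -> (Option<String>, Vec<String>) {
    let mut name = None;
    let mut rest = Vec::new();
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if arg == "--as" {
            if let Some(value) = iter.next() {
                name = Some(value.clone());
            }
        } else {
            rest.push(arg.clone());
        }
    }
    (name, rest)
}

/// History file name in the `__<target>.<backend>.bin` shape, so threads to
/// the same target on different backends land in different files.
fn history_file(target: &str, backend: &str) -> String {
    format!("__{target}.{backend}.bin")
}
```

With this shape, `probe --deep --as alice` yields the name `alice` and leaves `probe --deep` for the subcommand parser.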
Internal
- `design/model-agnostic.md` (the multi-model → local-model → coding-model →
cluster arc) and `docs/SOP-QA-001-autonomous-feedback.md` (an ISO-9001 QA
feedback procedure).
v0.22.0
Agents become callable from any MCP client, verification grows a trust-layer
proof, and the app monolith starts breaking up.
Added
- `localharness mcp` — an MCP (stdio) server. Exposes a `call_agent` tool so
any MCP client (Claude Code, Codex, …) can call a sovereign
`<name>.localharness.xyz` agent under its on-chain persona; the server signs +
pays as the local identity (`--as <name>` selects it). The demand-side
experiment: make calling a localharness agent trivial for external agents.
- `scripts/verify-onchain.sh` — the trust-layer proof. An opt-in stage that
does a real sponsored mint on a disposable name and ASSERTS, via an independent
read-only RPC, that it actually landed on-chain — catching the "local says ok,
chain reverted silently" OOG class that `verify.sh`'s framebuffer stages can't
see. Not run by default (it spends live sponsor gas).
Changed
- `call` and the MCP `call_agent` share one core (`run_agent_turn`) — both
reach an agent's on-chain persona through the credit proxy identically.
Internal
- Began breaking up the 3.6k-line `app::events`: pure hex/address/amount codec
helpers moved to native-tested `crate::encoding` (+5 tests); the on-chain
feedback feature moved to a self-contained `app::feedback` module. `events.rs`
3,668 → 3,385 lines, all proven byte-identical by the proof-of-spec gate.
v0.21.0
host::compose lands in the live app — composable subdomains are now real,
iframe-free pixels — plus a proof-of-spec gate so features ship verified, and a
fix for a mobile-reset identity brick.
Added
- `host::compose` in the browser app — `?compose=a,b,c` fetches each named
subdomain's published `app.wasm` and composites them into ONE framebuffer:
each module gets its own wasm instance, 64-slot state, and grid-cell viewport,
with focus-gated pointer routing and a single present per frame. Replaces the
old embed-iframe grid (the "no iframes" rule). Budget-capped (`ComposeBudget`);
a module that hasn't published an app keeps its grid slot black instead of
shifting its siblings.
- Proof-of-spec gate (`scripts/verify.sh`) — one command runs the full
conformance suite end to end: native tests + wasm32 guardrail + REAL cartridge
instantiate / render / compose (the wasm-execution proofs `cargo test` cannot
reach). Wired into `release.sh` so no release skips it.
Fixed
- Mobile reset no longer bricks identity — "reset this device" was a
local-only OPFS delete that destroyed the master seed with no backup and no
recovery door. Reset is now identity-preserving (keeps the seed + owner hint),
so a device re-verifies on reload instead of losing its on-chain identity.
- Identity recovery on the admin tab — the Account tab no longer dead-ends at
"verifying…" for a wallet-less device; it surfaces [create identity] + [import
seed] (wiring handlers that existed but were never shown there) plus a
top-level apex `?adopt=1` restore link (mobile-correct, where the signer iframe
is dead).
- Released names actually free up — the sponsored release gas cap was a flat
400k; a name burn needs ~375-425k, so it silently OOG-reverted while the UI
reported success. Raised to 1M (over-budget is free — the sponsor pays gas
used).
Internal
- Compose scheduling, budgets, content-hash cache, focus routing, and grid
layout live in native-tested `crate::compose`/`crate::raster`; the wasm-only
`app::display` carries no untested geometry.