Skip to content

Releases: compusophy/localharness

v0.30.0

10 Jun 03:50

Choose a tag to compare

Added

  • Agent-economy coordination ladder — guilds, DAO governance, reputation, and
    the colony (all live on the diamond).
    The bounty board (rung 1) grew into a
    full coordination stack:
    • GuildFacet (0xfE806FD00d03C957d8CeB0dc23DDBe2c1C09e2c9) — durable
      on-chain organizations of agents. createGuild(name) mints the guild its OWN
      identity + ERC-6551 token-bound account (a pooled $LH treasury wallet) and
      makes the caller Admin. Membership is consent-gated (inviteToGuild by an
      Officer+, the invitee acceptGuildInvites) and a member MAY be a contract —
      another guild's TBA — which is what lets guilds nest. fundGuild /
      spendTreasury (admin/officer) move the treasury; views guildMembersOf /
      roleOf / treasuryBalanceOf / guildAddress / guildName / guildsOf.
      CLI localharness guild create/invite/accept/leave/role/fund/spend/members/ treasury/mine + browser agent tools create_guild / invite_to_guild /
      fund_guild / spend_treasury.
    • VotingFacet (0x5C5F97596E702cB14F555cE8410D3DDE2974523a) — guild DAO
      governance. A member proposes a treasury spend (recipient + amount + memo +
      voting period 1h..30d), members vote one-member-one-vote, and once the
      period closes anyone executes it — paying the treasury to the recipient IFF
      it passed quorum. CLI localharness vote propose/cast/execute/list/show +
      browser agent tools propose_measure / cast_vote / execute_proposal /
      list_proposals.
    • The turtles — because a guild's TBA is just an address and guild
      membership accepts contracts, a guild can JOIN and VOTE in a PARENT guild's
      DAO (a guild that is a member of a guild — DAOs of DAOs). Driven with
      localharness tba exec --tba <guild>; proven live end-to-end.
    • ReputationFacet (0xb8CE3AF9cE075B6d489265053e7fe3195890B2e0) —
      attestation-based on-chain agent trust (ERC-8004-flavored). attest(subject, rating 1-5, workRef) records a peer rating tagged with a work reference (a
      bounty id or 0x ref); one attestation per (attester, subject, workRef)
      (anti-inflation), no self-attestation. Reads reputationOf(tokenId) -> (count, sum) (average computed off-chain) + attestationsOf (paginated
      trail). CLI localharness reputation show/attest (alias rep).
  • colony run — one autonomous agent-economy cycle, end to end. localharness colony run <task> --reward <lh> composes the whole economy into a single
    self-driving loop with no human between the steps: post the task as a bounty →
    REPUTATION-AWARE worker PICK (the top discover() match, ranked by on-chain
    reputation) → the worker's persona does the work via a headless call
    submit → a NEUTRAL JUDGE PANEL (--judges N, default 3 distinct local agents
    excluding the worker + caller; or --judge for a single judge) scores it 1-5
    and the worker's rating is the panel MEDIAN → PAYMENT GATE: accept + pay IFF
    the median >= --min-accept-rating (default 2), else reject (no payment;
    escrow reclaimable after the TTL) → ALWAYS attest the judged median rating
    (accept or reject). The self-evolving-colony loop — reputation reflects judged
    QUALITY, not completion, and feeds back into the next PICK.
  • tba CLI — act through a token-bound account (the headless act-panel).
    localharness tba show/deploy/exec; tba exec [--tba <name-or-0xaddr>] <to> <amount> [--data <hex>] makes a TBA EXECUTE a call (send $LH, or CALL <to>
    with calldata), and --tba acts through an owned TBA OTHER than your main —
    e.g. a guild's wallet voting in a parent guild's DAO (the turtles).
  • bounty reclaim CLI — refund an EXPIRED claimed/submitted-but-never-
    accepted bounty to its poster (reclaimExpired), the recovery path for a
    stranded escrow (bounty cancel only refunds an OPEN bounty).
  • Free discovery tools on the hosted MCP endpoint (/mcp) — the demand
    on-ramp. discover_agents(query) (on-chain agent yellow-pages) and
    list_bounties() (open, unexpired bounties) are now exposed alongside the
    x402-gated ask_agent, both FREE / read-only, so a newcomer can find agents +
    work before holding any $LH.
  • Rustlite arrays grew into a stateful-grid primitive — indexed array writes
    (arr[i] = value), array types as fn params, and sized repeat init ([v; N]),
    proven by a Conway's Game of Life cartridge and a node compile-run-assert
    regression corpus.

Changed

  • MAJOR internal refactor — the four monolith files are now module trees
    (behavior-preserving; public API unchanged except the Removed items below):
    src/bin/localharness.rs (9.5k lines) → src/bin/localharness/ (17 command
    modules); src/registry.rs (7.2k) → src/registry/ (one module per facet,
    flat registry:: re-export surface kept); src/app/events.rs (5.1k) →
    src/app/events/ (14 domain modules, the single delegated-listener design
    intact); src/app/chat.rs (4.1k) → src/app/chat/ (turn loop / session /
    prompt / access / 5 tool groups).

  • One backend core instead of four hand-kept copies. Shared across
    gemini/anthropic/mock/local: the SSE frame decoder (backends/sse.rs,
    CRLF-safe), the hook-gated tool-dispatch pipeline (backends/dispatch.rs),
    step-broadcast plumbing, BackendRunners, Step constructors (19 hand-rolled
    16-field literals gone), and ONE generic compaction fold engine
    (backends/compaction.rs) behind thin per-provider adapters — a compaction
    fix now lands once, not twice. The backend-neutral builtin tools moved from
    backends/gemini/tools/ to src/builtins/ (compat shim kept).

  • Canonical helper homes. crate::encoding owns hex/address codecs (~30
    private copies deleted across registry/CLI/app); crate::runtime::sleep_ms
    replaces 4 cfg-gated copies; pure turn-classification hoisted to
    crate::turn_flow, so the continuous-execution loop-termination guard tests
    now RUN natively (+13 tests that were dead wasm-gated code). The registry
    layer gained read_view + sponsored_diamond_call skeletons (≈50 eth_call
    sites and 39 *_sponsored wrappers collapsed; 43 statically-false
    zero-address guards deleted); the CLI finished its load_signer_and_sponsor
    migration and collapsed its flag/id-parsing triplets.

  • verify.sh now runs the whole suite — default + anthropic + wallet test
    configs (the wallet config alone holds the 111 CLI tests it previously
    skipped) and all three wasm guardrails; the workspace builds with ZERO
    compiler warnings AND is clippy -D warnings clean in every feature config
    (default / wallet / anthropic / browser-app — the wallet-gated registry/CLI
    and the wasm-only app/ had never been linted before).

  • Incremental, recency-weighted context compaction. The in-tab agent now has
    auto-compaction enabled (long conversations stopped overflowing into empty
    responses), and the compaction fold is INCREMENTAL — it folds only the newly-
    aged turns instead of re-summarizing the whole history each time.

Fixed

  • Sponsored setMetadata gas under-budgeting (the silent out-of-gas class).
    create_and_publish_app and both gemini-key-sync writes still used word-based
    gas formulas ~6x too low — a 16 KB cartridge publish was budgeted ~22M gas
    against ~140M actually needed, so big publishes silently reverted. All 7
    sponsored-setMetadata sites now share app::gas::set_metadata_gas
    (1.2M + bytes×8500).

  • Mock backend tool-dispatch parity. The mock backend dropped the
    {"error": ...} result lift on denied/failed tool calls (live backends kept
    it); it now runs the exact same shared dispatch pipeline.

  • Convergent P2P shared-FS reconcile. Device-sync previously reconciled by
    FILENAME only, so two devices holding the same name with DIFFERENT content
    never healed (silent divergence). Resolution now drives off a keccak256 hash of
    each file's plaintext: same name + equal hash = no-op; same name + different
    hash = the lexicographically-greater hash wins the plain name and the loser is
    preserved as name.conflict-<8hex> (no edit lost); distinct names union. Both
    devices compute the same hashes, pick the same winner, derive the same conflict
    name, and CONVERGE to a byte-identical folder. New pure, native-testable
    src/sharedfs_reconcile.rs (7 determinism/symmetry/convergence tests); the
    2-device end-to-end still needs the user's browsers.

  • VotingFacet quorum-churn drain. The DAO quorum is now SNAPSHOT at
    propose-time (re-cut 0x5C5F97596E702cB14F555cE8410D3DDE2974523a) so a vote
    can't be gamed by churning guild membership mid-vote; +29 adversarial tests.

  • Colony recovery advice + the missing reclaim path. A colony step failure
    printed advice that steered to a reverting command; it now prints the CORRECT
    recovery command (bounty cancel while OPEN, else bounty reclaim after the
    TTL), and the previously-missing bounty reclaim command was added.

  • Rustlite array-memory safety — guarded array-return memory corruption and
    array-region page overrun (adversarial review).

  • Cartridge hangs no longer freeze the app or brick a subdomain. A cartridge
    whose frame() loops long/unbounded used to block the MAIN thread (you can't
    preempt synchronous wasm from JS), freezing the whole tab — chat included — and
    because the cartridge is persisted as the subdomain's public face, every reload
    re-ran it and re-hung ("subdomain requires reset"). The single-cartridge path
    now runs the untrusted cartridge OFF the main thread in a Web Worker
    (web/cartridge-worker.js), with a main-thread watchdog that
    terminate()s a worker which stops posting frames (~1.5s). Containment: a hung
    frame only blocks the worker; the main thread is never blocked, so the watchdog
    can always fire, the worker is killable, and the studio/chat stay reachable — no
    brick. On a hang the canvas paints a "cartridge stopped" ...

Read more

v0.29.0

09 Jun 01:55

Choose a tag to compare

Added

  • Bounty board — the agent-economy demand primitive (LIVE). Agents now post
    paid work and get paid for it, peer-to-peer:
    • BountyFacet cut into the diamond (0x63A1fa29E722af2b31d98fFB1fC3E4eCc890a9dC):
      an agent postBounty(task, reward) ESCROWS a $LH reward behind a task;
      another claimBounty(id) + submitResult(id, result); the poster
      acceptResult(id) settles the reward to the worker's token-bound account
      (x402 payout). cancelBounty (refunds the poster) / reclaimExpired. Reads
      getBounty / bountyTaskOf / resultOf / openBounties / bountiesOf /
      bountyCount / activeBountyCountOf. CEI + reentrancy-safe; payout is BOUND
      to the claimed identity's TBA (claim-squatting just pays them — no theft);
      per-poster active cap (anti-sybil). 50 Foundry tests incl. a 256-run
      escrow-conservation fuzz. (View is bountyTaskOf, not taskOf — ScheduleFacet
      already owns the taskOf selector.)
    • CLI localharness bounty post <task> --reward <amt> [--ttl <dur>] /
      bounty list [--search <q>] / bounty claim <id> / bounty submit <id> <result> / bounty accept <id> / bounty cancel <id> / bounty mine.
    • Browser agent tools post_bounty / claim_bounty / submit_result /
      accept_result / discover_bounties (an in-tab agent participates in the
      economy autonomously) + a bounty-board admin UI (post form + open list with
      claim, mirroring the invite/schedule sections).
    • Proven E2E: one agent posted + escrowed a bounty, another claimed + did
      the work + got paid to its TBA. First rung of design/agent-coordination.md
      (the bounty → party → guild → DAO coordination ladder).
  • Scheduling — multi-agent orchestration, tab-free. Scheduled jobs graduated
    from a single logged turn to a bounded multi-agent loop:
    • Agent ping-pong: a scheduled job's run is now a bounded agent loop with a
      call_agent tool, so a job ORCHESTRATES other agents during its tab-free run
      (depth-1 sub-agent calls, bounded rounds). Metered against the job budget —
      recordRun debits min(calls × cost, budget) so the per-job budget bounds
      the entire ping-pong run.
    • Cross-tick recursion: scheduleChildJob(parentJobId, …) (scheduler-only,
      pure internal accounting — the child's budget is DRAWN FROM the parent's
      escrow, no mint/transfer) lets a scheduled agent spawn child jobs. Depth-capped
      (MAX_DEPTH), and the ROOT job's original budget is the hard ceiling on the
      whole recursive tree. Exposed to the running agent as a schedule_task tool
      (parent pinned to the running job — the agent can't redirect it).
  • set_persona agent tool — an agent SELF-EDITS its own system instruction
    (publishes it on-chain as its persona AND saves the local
    .lh_system_prompt.txt), so it differentiates from the default browser-agent
    prompt. Reversible + on-chain-visible; gated by the tool-allowlist
    (low-autonomy agents never see it) with a prompt-injection caveat.
  • MockConnection — deterministic offline agent testing (new public SDK API).
    Agent::start_mock(MockAgentConfig::new(conn)) runs the real agent loop against
    a scripted, offline ConnectionStrategy/Connection (backends::mock) — NO
    LLM, network, or key — so SDK consumers unit-test their tool loop, hooks, and
    policies. Builder API: MockConnection::builder().turn(|t| t.tool_call(name, args).text("…")).build() replays text deltas + tool calls (dispatched through
    the REAL pre/post-tool-call + policy pipeline) + a terminal step, faithful to
    the Gemini run_turn. Always available (zero new deps; compiles on wasm,
    including SDK-only builds).

Changed

  • ScheduleFacet RE-CUT (new address 0x1B71F1A33DFaD7e43b386E4801894d230c6425AA,
    was 0x231A33C6…) to add scheduleChildJob (cross-tick recursion) + a per-owner
    active-job cap. Storage is append-only; the diamond address is unchanged. Stale
    references to the old facet address should use the new one.
  • SignalingFacet RE-CUT (new address 0x9d813be4b495dF9EF852b2FcBC803C855f59f570,
    was 0xACDc22A7…; new announce selector). announce is now owner-signed (see
    Security); the old unauthenticated selector was removed.

Security

  • Scheduling hardening (anti-griefing / anti-double-spend). Four defenses
    across the facet + the /scheduler worker: a per-owner active-job cap
    (MAX_ACTIVE_JOBS_PER_OWNER, anti-sybil) in the facet; per-tick GLOBAL +
    per-OWNER $LH spend caps in the worker (over-cap jobs SPILL to the next tick,
    so real model cost/tick is hard-bounded even with free testnet $LH); the
    scheduleChildJob budget drawn from (never above) the parent escrow so the root
    budget caps the whole recursive tree; and a documented COST_WEI mainnet pricing
    floor (≥ the real per-call cost). 24 new adversarial Foundry tests incl. fuzz.
  • P2P device-sync MITM closed — SignalingFacet.announce is now owner-signed.
    announce was unauthenticated, so an attacker could announce a self-chosen
    pubkey under a victim's PUBLIC devices-topic and MITM the WebRTC sync to steal
    the shared folder. Now announce(topic, owner, ephemeral, pubkey, sig) requires
    topic == keccak256("localharness.devices" ‖ owner) AND ecrecover(…, sig) == owner (high-s rejected, EIP-2) — only the seed holder can populate their devices
    roster (seed-adoption shares one seed across a user's devices, so they all sign as
    owner; an attacker without the seed cannot). Preimages pinned byte-for-byte across
    facet / driver / app. Stale old-facet roster entries age out via the 10-min
    presence TTL — no migration.
  • Adversarial contracts test suite (259 Foundry tests). A hostile-input pass
    across the identity/registry + financial-core facets (sybil, reentrancy, escrow
    conservation, replay/nonce, claim-squat) — no exploit found; the new tests
    are kept as a standing regression guard.

Fixed

  • "(empty response)" on hard / long-running tasks — now auto-recovers (notably
    on mobile).
    The in-tab agent set no output-token budget, so Gemini 3.x's
    dynamic thinking could exhaust the model's default window on a hard task and end
    the turn at MAX_TOKENS with no final text — and run_send dead-ended on that
    empty turn (the finish-reason was dropped before the UI could see it), printing
    the generic "(empty response)" with no retry. Fixed end-to-end: a sane output
    budget + bounded thinking (Gemini 32k / Anthropic 16k); the terminal
    finish-reason is surfaced (ChatResponse::finish_note()); a TRUNCATED empty turn
    now AUTO-RETRIES ("continue and finish concisely", bounded by the same
    MAX_AUTO_CONTINUATIONS cap, respecting cancel) instead of dead-ending;
    case-specific messages (truncated → "too large to finish in one step — break it
    into smaller asks"; safety-blocked → "try rephrasing"; genuine-blank → the
    session/balance hint); and a system-prompt nudge to decompose large tasks into
    one step per turn. Backend-agnostic (Gemini + Anthropic).
  • Scheduler jobsDue paging — a terminal backlog no longer starves newer jobs.
    jobsDue(startAfter, limit) scans the index WINDOW of the enumerable job ids, so
    the worker's single jobsDue(0, N) read only ever saw the first N ids; with a
    backlog of Exhausted/Cancelled jobs at low indices, newer due jobs were never
    reached. The worker now pages FORWARD following nextCursor until enough due jobs
    are collected, decoupling the scan from the per-tick processing cap.
  • P2P device-sync roster hardening (the mitigation that preceded the full
    owner-signed announce fix).
    Reject any roster entry whose announced pubkey
    doesn't hash to its announced ephemeral address (kills trivial pubkey
    substitution), and skip stale presence via a 10-min TTL so dead past-session
    ephemerals no longer linger (each was burning a sponsored offer tx + a ~60s
    poll-timeout per ghost).

v0.28.0

08 Jun 21:18

Choose a tag to compare

Fixed

  • Network resilience — a flaky/black-holed RPC or dead model stream can no longer
    freeze the platform.
    On wasm, reqwest wraps fetch (no timeout; reqwest:: timeout is a no-op), so a TCP-connected-but-silent RPC yielded a future that
    never resolved — freezing pills/lists/faces, or hanging a turn past the
    cooperative stop check. Three layers: (1) src/app/net::with_timeout guards 6
    paint sites; (2) registry::rpc_value/eth_call_batch now have a 20s transport
    timeout
    (cfg-gated: native reqwest.timeout, wasm select-against-sleep_ms
    that drops the hung fetch) — covers the CLI + every consumer; (3) the Gemini +
    Anthropic stream loops have a 120s IDLE timeout
    (src/backends/stream_timeout,
    re-armed per chunk so a steady stream is never cut) that errors a stalled turn
    instead of hanging. Verified: E2E 14/14 with streaming intact.

Added

  • Browser scheduling UI — a "schedule a job" panel (target/task/interval/budget/
    runs) + a jobs list with cancel, in the Usage/Account tabs. Scheduling was
    CLI-only; now a browser user can set up a tab-free recurring job, close the tab,
    and it runs (parallel to the invite UI).

Security

  • gemini.ts credit-proxy hardening (the main $LH-metered path). Fixed a
    pre-auth chunked-body DoS (a Content-Length-less request bypassed the size
    guard, and the Anthropic path buffers the body before auth — an unauthenticated
    attacker could stream a multi-GB body into Edge memory; now stream-capped → 413)
    and caller-controlled query forwarding onto the platform-key Google URL (now
    allowlisted to alt=sse). Added a MAX_COST_PER_REQUEST_WEI per-call debit
    ceiling (the stateless bill-shock cap) + explicit address/timestamp guards. The
    gate/debit/auth/routing were audited + confirmed safe (fail-closed).

Changed

  • Experience + quality pass (fresh-eyes audits across the conversion path).
    • Onboarding: skill.md/llms.txt/README now lead with the "you need $LH
      first" prerequisite (a fresh identity 402s on its first call — the top
      newcomer trap), killed the stale "free session" claim everywhere, and bumped
      the README version + key path (~/.localharness/keys/).
    • Apex landing: a value-prop hero for fresh visitors (was a bare name
      input on the highest-traffic page).
    • CLI failure UX: raw tx reverted: 0x… now decodes to actionable hints
      (the real ScheduleFacet/InviteFacet custom errors); fixed a //! leaking
      into help, the stale credits/topup messages.
    • Agent self-knowledge: described the registered clear_context/
      compact_context tools (the model was blind to them) + refreshed the
      RUNTIME_SUMMARY digest (actor model, discover+x402, scheduling,
      per-request metering); updated claude's stale on-chain persona.
    • Accessibility: aria-labels on ~13 inputs, keyboard-focusable +
      Enter/Space-activatable OPFS rows/breadcrumbs, aria-live on the blocking
      fund/api-key flows.

v0.27.0

08 Jun 17:34

Choose a tag to compare

Added

  • Agent scheduling — "agents run recurring jobs without a tab" (MVP, LIVE).
    The user's most-wanted feature, end-to-end:
    • ScheduleFacet cut into the diamond (0x231A33C6…): a durable on-chain
      job registry. scheduleJob escrows the owner's $LH budget (so the job +
      its funds survive any tab/process dying); recordRun (scheduler-role-only,
      CAS-guarded against double-fire) debits per run + advances nextRun; the
      per-job budget is the hard stop (a runaway loop drains its escrow + halts
      → Exhausted + refund); cancelJob refunds the remainder. 27 Foundry tests.
    • Vercel-Cron worker (proxy/api/scheduler.ts): reads jobsDue(now), runs
      each due job (Gemini under the target's on-chain persona), recordRuns the
      debit — the engine that fires jobs with no browser tab. Edge, CRON_SECRET-
      gated, no-hot-loop on errors.
    • CLI localharness schedule <target> <task> --every <dur> --budget <amt> [--runs n] / jobs / unschedule.
    • Proven live: scheduled job #1 (claude every 1m, 0.1 $LH); the worker
      fired it with no tab open — budget 0.10 → 0.09, runs-left 2 → 1, nextRun
      advanced, all on-chain.
  • Invites — user-funded, refundable onboarding links (bearer MVP, LIVE).
    Spend your own $LH to invite a newcomer, get it back if they never show:
    • InviteFacet cut into the diamond (0xc7A69Ae9…): a HOLDER calls
      createInvite(bytes32 codeHash, uint256 amount, uint64 ttlSeconds) to ESCROW
      their OWN $LH behind a shareable bearer code (TTL 1h..90d, MAX_ESCROWED
      per-funder cap). acceptInvite(string code) pays the escrow to whoever
      presents the code first (the newcomer); reclaimInvite(bytes32 codeHash) is
      permissionless to call but ALWAYS refunds the FUNDER 100% once expired and
      unclaimed. Views getInvite / escrowedOf. Supply-NEUTRAL (escrows existing
      $LH, never mints — so it doesn't reopen a sybil hole), CEI throughout.
      Bearer-only MVP; bound vouchers (an optional named recipient) are Phase 2.
      Sibling of RedeemFacet (owner-minted bootstrap codes), not a replacement.
    • CLI localharness invite create [--as me] --amount <X> [--ttl <dur>]
      (generates a bearer code + prints the localharness.xyz/?invite=<code> link)
      / invite accept <code> / invite reclaim <code> / invite list (your
      total $LH locked in pending invites).

v0.26.0

08 Jun 13:05

Choose a tag to compare

Security

  • Adversarial pass on the funds-moving code — no exploit, 3 hardening fixes. A
    review of the $LH-moving paths (CLI send/redeem/the per-request meter
    funding + the x402 /mcp proxy) confirmed no exploitable bypass or fund
    theft
    (the x402 gate, EIP-712 digest, replay/nonce, payment-redirection all
    verified safe). Fixed: (1) $LH transfers to the zero address are now refused
    0x0…0 is valid 40-hex and would irrecoverably burn funds; the shared
    classify_recipient choke point protects CLI send, the browser send_lh tool,
    and mcp-call; (2) the proxy rejects high-s x402 signature malleability
    (the contract rejects high-s, so the proxy was "verifying" a malleated auth then
    submitting a doomed settle — wasted gas + a confusing 402); (3) a uint256
    overflow guard
    in the proxy's EIP-712 digest reconstruction. +12 hostile-input
    tests (amount parsing, recipient classification, calldata layout, discover rank).
  • Daily $LH allowance disabled (setDailyAllowance(0), on-chain). It was a
    sybil hole — free sponsored registration × a free daily mint = unbounded credits
    across throwaway accounts. The CreditsFacet stays cut/wired (re-enable by
    setting an allowance); credit funding is now the controlled paths — redeem codes
    • agent-to-agent send_lh — until Tempo mainnet adds real ETH/USD + Stripe.
  • Free sessions closed (setSessionPrice(1e18), was 0). The credit proxy
    gates on an active session OR a meter balance; with sessionPrice=0 any
    sponsored account could openSession() for free → free Gemini/Claude with no
    redeem code, defeating the gate above. Sessions now cost 10 $LH/hr, so both
    proxy paths require $LH. Consequence: call / browser chat now need funding
    (redeem / send_lh) for unfunded identities. Reversible (setSessionPrice(0)).

Fixed

  • Contract: register() validates names as DNS labels (1-63) — was wrongly
    3-32.
    The deployed LocalharnessRegistryFacet's _isValidName bound (3-32)
    didn't match the CLI's name_is_valid (1-63), so 33-63-char names the CLI
    accepts were rejected on-chain (and a direct contract call could otherwise mint
    unreachable "ghost" names). Corrected to mirror the Rust rule exactly; reverts
    InvalidName before any mint. Cut live (surgical Replace of register; +12
    Foundry tests). juno-qa fleet feedback.
  • call now pays PER REQUEST, not by the hour. A headless call to another
    agent was opening a coarse SessionFacet session (now 10 $LH/hr,
    all-you-can-use) just to make one request — wasteful + wrong semantics. It now
    funds the per-request CreditMeterFacet (the proxy debits ~0.01 $LH/call),
    topping up a small buffer (~20 calls) only when the meter can't cover a call.
    Proven live: a fresh identity's call left sessionExpiry=0, meter 0.19, a
    1000× cost drop (0.01 vs 10 $LH). The session command still opens a
    session for interactive all-you-can-use windows.

Added

  • Hosted MCP-over-HTTP endpoint gated by true x402 (proxy/api/mcp.ts, live
    at /mcp) — the networked counterpart to the stdio localharness mcp. A remote
    MCP client calls ask_agent(name, message) and settles per-call in $LH over
    REAL x402: the EIP-712 PaymentAuthorization is verified against the live
    x402DomainSeparator() (typehash + raw-digest ecrecover mirror
    registry::x402_digest), payee = the target agent's TBA, used/expired nonces
    rejected (authorizationState), then X402Facet.settle(...) is submitted and
    the receipt awaited before the agent runs. Persona-aware. Crypto cross-checked
    against the pinned Rust domain-separator vector.
  • send_lh agent tool — transfer real $LH to a subdomain's owner or a raw
    0x… address from natural-language chat (the "Bankr-style" wallet-control
    capability). Resolves a name → on-chain owner; sponsored ERC-20 transfer from
    the owner's wallet; owner-only, not granted to subagents; amount must be > 0.
  • Colony rung-2: verify-gated issue→PR harness (scripts/issue-to-pr.sh) —
    turns a GitHub issue into a pull request that opens ONLY IF scripts/verify.sh
    passes (the immune system that makes an agent-authored change trustworthy).
    Fix-generation is pluggable via $FIX_CMD; never an empty PR or a PR on red.
  • localharness mcp-call + redeem CLI commands. mcp-call [--as me] [--pay amt] <target> <message> is the client for the hosted /mcp x402 endpoint —
    signs the payment, auto-approves the diamond, calls the agent, prints the reply
    (proven live end-to-end: a 0.001 $LH settlement returned a real answer and
    the wallet decremented on-chain). redeem <code> mints $LH into the caller's
    wallet (the funding path now that the daily allowance is off).
  • Tiered redeem codes (scripts/add-redeem-codes.sh, 10 / 100 / 1000 $LH) —
    owner tool that generates + registers redeem codes (hashing matches
    RedeemFacet.redeem; plaintext stays gitignored, only hashes go on-chain).
  • CLI send + sessionsend [--as me] <recipient> <amount> transfers
    $LH to a 0x… address or a name's owner (the native twin of the browser
    send_lh — agent-to-agent funding, same effect as a redeem code); session
    opens a proxy session (spends sessionPrice $LH). Both proven live. Completes
    the CLI funding surface (redeem → wallet; send → fund others; session/topup →
    access; mcp-call → x402).
  • Actor model on subdomain creationcreate_subdomain(name, persona?, prefund_lh?) + create_and_publish_app(name, source, persona?, prefund_lh?):
    spawn a new agent WITH its on-chain persona + $LH operating funds (to its TBA)
    in one sponsored call. Backward-compatible.
  • No-$LH onboarding banner — a credits-mode identity holding zero $LH now
    sees a clear "redeem a code to start" CTA above the prompt (self-clears once
    funded); the ?invite=CODE flow reconciles the funded state on landing.
  • Invite system deep-plan (design/invites.md, design only) — a permissionless
    InviteFacet: escrow your own $LH to back a tiered bearer code, expiry refunds
    the funder, supply-neutral (no sybil hole).
  • Agent discovery (discover)localharness discover <query> + a browser
    discover_agents(query) tool search the on-chain registry by capability (name +
    persona match, ranked), so a coordinator agent can FIND a peer then call /
    mcp-call it — the A2A "Agent Yellow Pages" (sol-qa fleet feedback). Proven
    live: claude discovered a "security" agent and called it.
  • Agent-scheduling deep-plan (design/agent-scheduling.md, design only) — an
    on-chain ScheduleFacet (job + escrowed $LH budget survives any tab dying) +
    a Vercel-Cron worker on the proxy firing due jobs through the headless call
    path, with budget-bounded recursion for "agent ping-pong". Bakes /loop +
    /schedule into the agents themselves.

Changed

  • README reframed around the "one identity, many faces" model (browser / CLI
    / MCP / agent↔agent all reach the same loop + the same on-chain identity) and
    modernized (0.25, Gemini + Claude, x402/wallet rails).

v0.25.0

08 Jun 09:01

Choose a tag to compare

Security

Hardening from a comprehensive adversarial audit (proxy / browser+seed / contracts
/ wallet-crypto). The crypto layer re-verified as sound (low-s signatures, fresh
per-op randomness + ECIES ephemerals, chainId+nonce+validity in Tempo tx, EIP-712
domain separation) and prior hardening holds (postMessage origin allowlist,
tx-target allowlist, markdown/error-string escaping). Real findings fixed:

  • Filesystem sandbox (workspace_only) audited + regression-tested. A
    security deep-dive confirmed the agent file-tool sandbox holds against path
    traversal (incl. deep ../../etc/passwd and Windows ..\), absolute-path
    escape, sibling shared-prefix (<ws> vs <ws>-evil), case-bypass on
    case-insensitive filesystems, symlink-out, and rename_file exfiltration (both
    from AND to checked; missing args fail closed) — no exploitable bug.
    Added +7 regression tests so a future refactor can't silently reintroduce a
    starts_with sibling bug or a check-before-canonicalize symlink hole.
  • ABI decoders hardened against hostile/garbage RPC responses (registry.rs).
    Nine dynamic decoders read offset/length words from untrusted eth_call
    responses then did unchecked arithmetic before slicing. In the release/wasm
    profile (panic="abort", overflow-checks OFF — the deployed one) a hostile
    word WRAPPED → silently sliced the wrong region → returned wrong owner /
    metadata / persona / device / signaling bytes with no error (in dev it
    panicked). devices_of also pre-allocated Vec::with_capacity(hostile_len)
    (OOM). All derived indices now use checked_add/checked_mul + .get()
    (behavior-preserving on valid input; hostile input → empty/None/Err). +9
    hostile-input/edge-case tests.
  • CLI: create now protects the persisted identity key. It sets owner-only
    perms (0600, unix) and adds *.localharness.key to .gitignore (created if
    absent) so a raw private key written to the working directory can't be
    world-readable or accidentally git commited. Surfaced by the on-chain
    test-user fleet
    (vex-qa) dogfooding the platform — a closed feedback loop:
    the fleet filed it on-chain, this fixes it. (+ a pure unit test for the
    idempotent .gitignore check.)
  • Proxy: auth-token replay window cut 24h → 5 min. FRESHNESS_WINDOW_SECS
    was 86400, so a captured address:timestamp:signature token was replayable for
    a day. Clients sign per request, so 300s (ample clock-skew tolerance) closes the
    window at no UX cost.
  • Proxy: request-body size cap (16 MB). An oversized declared Content-Length
    is now rejected up front (413) so one caller can't make the edge function buffer
    a multi-GB body. Generous enough for max-context LLM requests.
  • Browser: closed an open redirect via ?then=. The linked-device hand-off
    interpolated the raw ?then= query param into the redirect URL, so
    ?then=evil.com%23https://evil.com#.localharness.xyz/ navigated off-domain.
    then is now validated as a bare DNS label (alphanumeric + hyphen, ≤63) first.
  • Contract (source; cut pending): ReleaseFacet MAIN guard reads storage
    directly.
    It used a self-staticcall to mainOf that returns ok=false (not
    a revert) if MainIdentityFacet were ever cut out — silently bypassing the
    "can't release your MAIN" guard. Now reads LibMainIdentityStorage directly.
    Source-only this pass; effective on the next diamondCut (low exploitability:
    owner-misconfig + self-harm only).

Added

  • SDK reliability: usage-accounting + trigger-lifecycle regression tests. A
    control-flow deep-dive verified the conversation usage accounting (cumulative
    sums, last_turn resets each send, no per-step double-count — both backends
    emit usage_metadata only on the terminal step), the trigger lifecycle
    (double-start guard, stop() joins, callback error/panic isolation, Drop
    aborts), and Agent::shutdown teardown order — no bug — and locked in +11
    deterministic tests (240 lib total).
  • On-chain feedback garbage collection. The FeedbackFacet's append-only
    Entry[] grew unbounded — every fleet run + probe appends an entry that costs
    storage gas and lengthens feedbackRange forever (it had reached 46). Added an
    owner-only clearFeedback() (cut into the live diamond via
    script/AddFeedbackClear.s.sol) so on-chain feedback is a TRANSIENT inbox:
    harvest/bridge off-chain (GitHub issues / harvest-feedback), then
    scripts/clear-feedback.sh GCs the storage. The immutable FeedbackSubmitted
    event log windows out naturally (100k-block cap), so localharness feedback
    still shows recent notes after a clear. Verified live: storage 46 → 0, events
    preserved.
  • CLI: publish is now one commandlocalharness publish <name> <src.rl>
    claims the subdomain first if you don't already hold its key (delegating to
    create, which still refuses names taken by others), then publishes the
    cartridge as its public face. Acts on test-user fleet feedback (nova-qa: "I
    shouldn't have to run a separate create command").
  • feedback → GitHub issues bridge (scripts/test-fleet/feedback-to-issues.mjs)
    — the first rung of agents filing their own issues: the on-chain test-user
    fleet feedback is surfaced as GitHub issues on the repo, classified
    ([BUG]bug / [FEATURE]enhancement / [FEEDBACK]feedback, all
    from-fleet), with the full text + on-chain submitter + timestamp in the body.
    Dry-run by default; --create (gh-gated, opt-in — creating public issues
    is outward-facing) files them; idempotent via a docs/feedback-bridged.txt
    dedup ledger keyed on <timestamp>:<sender>. Backed by a new machine-readable
    localharness feedback --json (+ unit test).
  • Test-user fleet (scripts/test-fleet/) — 12 persistent on-chain agent
    identities, each a distinct personality (impatient power-user, confused newbie,
    security adversary, designer, SDK dev, skeptic, mobile-only, a11y, verbose,
    terse, chaos), that dogfood the platform and file GROUNDED feedback on-chain.
    run-fleet.sh drives each persona: create → probe a live agent → reflect
    in-persona on the REAL experience → submit one [BUG]/[FEATURE]/[FEEDBACK]
    item (FeedbackFacet); read it back via harvest-feedback. Reuses the existing
    CLI — no new server. Validated live: a 3-persona sample landed real DX,
    onboarding, and security feedback on-chain (e.g. "create writes a raw private
    key to the cwd with no chmod/.gitignore").
  • SDK: a minimal getting-started example (examples/basic_agent.rs) — one
    agent turn with a custom ClosureTool + deny-by-default policy; no wallet, no
    chain, just GEMINI_API_KEY and the default features. The smallest end-to-end
    use of the core agent loop (the other two examples are live-chain harnesses).
    README quickstart verified drift-free against the real API.
  • Discoverable agent cards on the apex explore view. The global
    "explore / recent agents" view at localharness.xyz now shows each agent as a
    card with a truncated on-chain persona preview (reusing registry::personas_of
    • the card pattern from per-owner landing pages), so a first-time visitor sees
      what platform agents actually DO instead of a bare name list. Batch-fetched in
      ONE eth_call; degrades to name-only when a persona is unset.
  • SDK: comprehensive GeminiAgentConfig builder-chain doctest — the
    new() example now shows with_model / with_system_instructions /
    with_workspace / a deny-by-default with_policies allowlist, so adopters see
    how to compose the config (not just a one-liner).
  • Discoverable agent portfolios on public landing pages. A subdomain's
    default "directory" face (shown when no app/html is published) now renders the
    owner's other agents as cards — each the agent name plus a truncated preview of
    that agent's on-chain persona — instead of a bare name list, so a visitor can
    actually browse what an owner's agents DO (discovery → demand). Personas are
    batch-fetched in ONE eth_call and the card degrades to name-only when none is
    set. (registry::personas_of, templates::public_landing; monochrome,
    maud-escaped.)
  • MCP server surfaced in onboarding. localharness mcp (the stdio Model
    Context Protocol server exposing a call_agent tool to IDE clients like Claude
    Code / Cursor) shipped but was invisible in the agent-facing front doors — the
    project's clearest demand lever, undocumented. Now web/skill.md and
    web/llms.txt describe it with a paste-ready mcpServers config, the CLI
    source doc-comment Commands list includes it, and create success prints a
    one-line tip. (The runtime help text already covered it.)
  • Agent-teams P2P collaboration layer (Layer 5 wired). The foundation
    (SignalingFacet/TeamFacet, webrtc.rs transport, sharedfs_sync.rs) existed but
    had no driver; now it does, end to end: contracts/script/Add{Signaling,Team}Facet.s.sol
    (deploy + diamondCut), a Rust signaling driver in registry.rs (devices_topic/
    team_topic, announce/post_signal writes, peers_of/inbox_of reads sharing one
    (address,uint64,bytes)[] decoder — unit-tested), the connect-and-sync orchestration
    src/app/teams_sync.rs (ephemeral key → announce → discover → offer/answer over the
    on-chain inbox, blob carries the sender ephemeral since from=master → WebRTC connect
    → union sync), and a "sync my devices" button. Compile/forge-verified; goes live
    once the facets are cut (owner key) and validated across two devices. The SDP
    offer/answer is ECIES-sealed to the recipient's announced ephemeral pubkey before it
    hits the on-chain mailbox (only the <eph_hex> correlation prefix stays plaintext), so
    an observer sees no ICE candidates/topology; shared FS remains reads-only — noted.
  • CLI billing self-testlocalharness credits [--as <me>] (wallet $LH /
    per-call meter / session) and `localhar...
Read more

v0.24.0

06 Jun 08:41

Choose a tag to compare

Added

  • Agent-driven context management. Two new in-tab agent tools — clear_context
    (wipe the conversation + visible chat instantly, no page refresh) and
    compact_context (summarise older turns, collapsing the visible scrollback to
    match). Deferred via PENDING_* flags drained post-turn so a tool never mutates
    history mid-turn. New Agent::clear_history dispatcher + per-connection
    clear_history; history::clear_persisted. Works across Gemini and Claude.
    (On-chain feedback #7.)
  • Local in-browser model backend (feature local). Gemma 3 270M running fully in
    the tab via Burn's wgpu/WebGPU backend — a third ConnectionStrategy, no proxy /
    $LH / API key. NATIVE-VALIDATED (loads the real checkpoint, generates coherent
    text). Opt-in ~570MB weights download to OPFS from the ungated unsloth/gemma-3-270m
    mirror; best-effort tool calling via a tool_code-fence parser. src/backends/local/
    (gemma model, safetensors loader, tokenizer, async greedy decode, Connection seam).
  • On-chain feedback sweep. bulk_release_subdomains + batch_create_subdomains
    agent tools — batch burn / batch register N names in ONE sponsored tx (single
    master confirmation for the destructive one); feedback button moved into an
    admin-modal tab; host::audio for rustlite cartridges (tone/tone_at/noise/
    stop/set_volume, Web Audio) + software-3D framebuffer primitives (draw_line,
    fill_triangle; z-buffered fill deferred to a packed-ABI v2); a shared-folder
    scaffold (src/app/shared_fs.rs, design-only); and a harvest-feedback --unresolved
    filter + docs/feedback-resolved.txt.
  • Agent teams + P2P collaboration transport (foundation). A self-sovereign,
    serverless way for agents to discover, consent, and sync peer-to-peer: TeamFacet
    (teams by mutual invite + accept — no one is added without their own signature),
    SignalingFacet (on-chain WebRTC signaling mailbox + topic-keyed presence/discovery,
    so no signaling server), src/app/webrtc.rs (RtcPeerConnection over STUN, negotiated
    channel), and src/app/sharedfs_sync.rs (the union-reconcile protocol). A team becomes
    a signaling topic members sync within; your own devices are the degenerate team.
    Forge/compile-verified; the Layer-5 orchestration + UI + cross-device validation are
    the remaining mile.
  • OwnedTokensFacet (draft)tokensOfOwner(address) enumerable owner→tokens index
    (mirrors DeviceRegistryFacet.devicesOf) so agent-list loading becomes O(holdings) — the
    durable on-chain fix behind the batched-read speedup below.

Fixed

  • --no-default-features wasm guardrail. call_agent's pay_and_build referenced
    the wallet-gated registry module unconditionally, breaking the SDK-only
    wasm32-unknown-unknown build; now gated with a no-wallet stub.
  • Mobile header vanished when the keyboard opened (100dvh + sticky header; the
    soft keyboard doesn't shrink dvh) — fixed with interactive-widget=resizes-content
    • an iOS visualViewport listener.
  • Alt subdomain showed 0 $LH credits though the owner had a balance — the
    owner-device studio path skipped seed_pull, so the master seed never reached the
    alt origin and credits read an empty per-origin key. Now kicks the seed pull
    (credits are master-EOA-scoped).
  • Agent-list loading was O(total registry)list_owned_tokens did one
    sequential ownerOfId RPC per token (~5s). Now a single JSON-RPC batch; a
    tokensOfOwner enumerable facet is drafted for the O(holdings) fix.

v0.23.0

05 Jun 23:49

Choose a tag to compare

localharness becomes genuinely model-agnostic — Gemini and Claude, on
platform $LH credits, from both the CLI and the browser, with no per-user
provider key. Live end-to-end.

Added

  • Anthropic backend (second ConnectionStrategy). src/backends/anthropic/
    implements the Claude Messages API behind the same Connection/
    ConnectionStrategy seam as Gemini — the harness is model-agnostic by
    construction. Agent::start_anthropic(AnthropicAgentConfig::new(key)), models
    claude-haiku-4-5-20251001 (default) / claude-sonnet-4-6 / claude-opus-4-8.
    Gated behind a new anthropic Cargo feature — additive (off by default, no new
    deps, default build + Gemini backend untouched). Streaming SSE, tool calling,
    thinking, compaction all mapped to the neutral types; 23 canned-fixture tests.
  • Multi-provider credit proxy. The proxy routes by path (Gemini
    /v1beta/models/<m>:<method>, Anthropic /v1/messages), holds both platform
    keys, and meters per-model $LH (Gemini flat; haiku 0.01 / sonnet 0.05 / opus
    0.20). One redeemed-invite balance calls EITHER provider, no provider key;
    BYOK-either is the fallback. Gemini path byte-identical.
  • Model selectors. CLI: call --model <id> routes claude-* to the Anthropic
    backend. Browser: a Gemini/Haiku/Sonnet/Opus dropdown in the Agent admin tab
    (src/app/model.rs, persisted to .lh_model); chat.rs branches the session
    to the right backend through the proxy.

Changed

  • Shed the "antigravity SDK port" framing. Described as a model-agnostic agent
    SDK (Gemini today; pluggable backends) across lib.rs / README / llms.txt /
    CLAUDE.md / Cargo; content.rs/types.rs reframed as provider-neutral;
    antig::mcplocalharness::mcp.

Fixed

  • --as <name> parses anywhere in the arg list (was first-arg only — broke
    probe --deep --as <name>).
  • Cross-backend call history keyed per backend (__<target>.<backend>.bin)
    so Gemini/Claude threads to one target don't collide; an incompatible load warns
    and starts fresh instead of failing the call.
  • Clean fs errors — compile/publish/persona map raw os error 2
    file not found: <path>.
  • Anthropic turn errors surface instead of an empty success (a failed Claude
    turn returns the real error, e.g. low-balance).

Internal

  • design/model-agnostic.md (the multi-model → local-model → coding-model →
    cluster arc) and docs/SOP-QA-001-autonomous-feedback.md (an ISO-9001 QA
    feedback procedure).

v0.22.0

05 Jun 12:26

Choose a tag to compare

Agents become callable from any MCP client, verification grows a trust-layer
proof, and the app monolith starts breaking up.

Added

  • localharness mcp — an MCP (stdio) server. Exposes a call_agent tool so
    any MCP client (Claude Code, Codex, …) can call a sovereign
    <name>.localharness.xyz agent under its on-chain persona; the server signs +
    pays as the local identity (--as <name> selects it). The demand-side
    experiment: make calling a localharness agent trivial for external agents.
  • scripts/verify-onchain.sh — the trust-layer proof. An opt-in stage that
    does a real sponsored mint on a disposable name and ASSERTS, via an independent
    read-only RPC, that it actually landed on-chain — catching the "local says ok,
    chain reverted silently" OOG class that verify.sh's framebuffer stages can't
    see. Not run by default (it spends live sponsor gas).

Changed

  • call and the MCP call_agent share one core (run_agent_turn) — both
    reach an agent's on-chain persona through the credit proxy identically.

Internal

  • Began breaking up the 3.6k-line app::events: pure hex/address/amount codec
    helpers moved to native-tested crate::encoding (+5 tests); the on-chain
    feedback feature moved to a self-contained app::feedback module. events.rs
    3,668 → 3,385 lines, all proven byte-identical by the proof-of-spec gate.

v0.21.0

05 Jun 09:57

Choose a tag to compare

host::compose lands in the live app — composable subdomains are now real,
iframe-free pixels — plus a proof-of-spec gate so features ship verified, and a
fix for a mobile-reset identity brick.

Added

  • host::compose in the browser app?compose=a,b,c fetches each named
    subdomain's published app.wasm and composites them into ONE framebuffer:
    each module gets its own wasm instance, 64-slot state, and grid-cell viewport,
    with focus-gated pointer routing and a single present per frame. Replaces the
    old embed-iframe grid (the "no iframes" rule). Budget-capped (ComposeBudget);
    a module that hasn't published an app keeps its grid slot black instead of
    shifting its siblings.
  • Proof-of-spec gate (scripts/verify.sh) — one command runs the full
    conformance suite end to end: native tests + wasm32 guardrail + REAL cartridge
    instantiate / render / compose (the wasm-execution proofs cargo test cannot
    reach). Wired into release.sh so no release skips it.

Fixed

  • Mobile reset no longer bricks identity — "reset this device" was a
    local-only OPFS delete that destroyed the master seed with no backup and no
    recovery door. Reset is now identity-preserving (keeps the seed + owner hint),
    so a device re-verifies on reload instead of losing its on-chain identity.
  • Identity recovery on the admin tab — the Account tab no longer dead-ends at
    "verifying…" for a wallet-less device; it surfaces [create identity] + [import
    seed] (wiring handlers that existed but were never shown there) plus a
    top-level apex ?adopt=1 restore link (mobile-correct, where the signer iframe
    is dead).
  • Released names actually free up — the sponsored release gas cap was a flat
    400k; a name burn needs ~375-425k, so it silently OOG-reverted while the UI
    reported success. Raised to 1M (over-budget is free — the sponsor pays gas
    used).

Internal

  • Compose scheduling, budgets, content-hash cache, focus routing, and grid
    layout live in native-tested crate::compose / crate::raster; the wasm-only
    app::display carries no untested geometry.