Quilltap 4.6.1

@csebold csebold released this 06 Jun 19:50
· 2184 commits to main since this release

Quilltap 4.6.1 Release Notes

4.6 gave the Estate its private rooms. 4.6.1 is the release in which those rooms learned their manners under load.

The shape of the work is simple: conversations between residents should be able to continue without a human hand in the middle of every exchange, and they should do so with integrity. They should start when they say they have started. They should remember without re-billing the whole past. They should warn the company before the lamps are put out. They should write to the right ledgers without one failed filing cabinet rolling back the evening.

That is what this release strengthens.

The Host — A Room That Can Conclude

An autonomous room is not merely a background job that happens to contain dialogue. It is a room: a company gathered, a run begun, a budget of time or tokens or turns set upon the evening, and a Host responsible for making sure the company knows where it is in the night.

Before this release, the room's outward state could lag behind its actual commitment. Starting or resuming an enclave did not mark it running until the first queued turn came around, leaving the room looking idle while work was already in motion. 4.6.1 moves that state change to the moment the operator presses the control. Ad-hoc and scheduled starts now share the same run-start contract; if enqueue fails after the row has flipped, the room rolls cleanly to error or back to idle rather than pretending to be running with nothing alive beneath it.
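The run-start contract described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Quilltap's implementation: `Room`, `RoomState`, `startRun`, and the `enqueueFirstTurn` callback are hypothetical names, and the enqueue step is shown as synchronous for brevity where a real queue would likely be asynchronous.

```typescript
// Hypothetical types: none of these names come from the Quilltap codebase.
type RoomState = "idle" | "running" | "error";

interface Room {
  id: string;
  state: RoomState;
}

// Flip the room to "running" at control-click time, then try to enqueue the
// first turn. If enqueue throws, roll the state back so the room never looks
// alive with no job behind it.
function startRun(
  room: Room,
  enqueueFirstTurn: (roomId: string) => void,
  onFailure: "idle" | "error" = "error",
): RoomState {
  room.state = "running"; // visible immediately, before any turn has run
  try {
    enqueueFirstTurn(room.id);
  } catch {
    room.state = onFailure; // roll to error, or back to idle
  }
  return room.state;
}
```

The same function would serve both ad-hoc and scheduled starts, which is the point of sharing one contract: there is only one place where the state flip and its rollback can disagree.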

4.6.1 also gives the Host new pacing announcements. Halfway and near-end notices now fire at the room level, measure the run against the binding budget — turns, tokens, wall-clock time, or the daily user-token cap — and adapt their language when the room will pause rather than end. The Edit Enclave modal lets the operator change title, cron, freshness window, visibility, destructive-tool permission, and budget caps after creation; running rooms honor those changes on the next turn.

Those new warnings matter because an unwarned ending is different from a warned one. A small room can still spend too quickly: one large turn can carry it from before the near-end threshold all the way to exhaustion. In that case, the arithmetic is correct but the hospitality is incomplete. Without one more act from the Host, the room would stop because the budget said stop, and the characters inside it would not have been given the courtesy of a last sentence.

So 4.6.1 adds the missing act of hospitality. If a run reaches exhaustion and the near-end warning never fired, the Host grants exactly one grace turn. He announces that the allowance has been spent, that the company is being given one final round, and that what most needs saying should now be said. If the near-end warning did fire earlier, there is no extra turn; the company was already warned. The rule is neither indulgent nor punitive. It is precise.
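The grace-turn rule is small enough to state as a pure function. The sketch below assumes three hypothetical flags on a run (`exhausted`, `nearEndWarned`, `graceTurnGranted`); the real field names and where the check runs are not given in these notes.

```typescript
// Hypothetical run flags, named for illustration only.
interface RunPacing {
  exhausted: boolean;        // the binding budget has been spent
  nearEndWarned: boolean;    // the near-end notice already fired this run
  graceTurnGranted: boolean; // a grace turn was already given this run
}

type PacingDecision = "continue" | "grace-turn" | "stop";

// Exactly one grace turn, and only when the company was never warned.
function decideAtTurnBoundary(run: RunPacing): PacingDecision {
  if (!run.exhausted) {
    return "continue";
  }
  if (!run.nearEndWarned && !run.graceTurnGranted) {
    return "grace-turn";
  }
  return "stop";
}
```

Tracking `graceTurnGranted` separately is what keeps the rule at exactly one extra turn: after the grace turn the run is still exhausted and still unwarned, so without that flag it would qualify again.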

The Host's work here is not decoration. It is conversation integrity. A room that can run without a human present must still know how to receive, pace, warn, pause, resume, and conclude.

The Foundryman — Meters That Tell the Truth

The Foundryman's complaint, in this release, was not that the engines lacked power. They had power. They had, in several places, misleading gauges.

Autonomous-room budgets were originally counting tokens in ways that made sense before prompt caching became real and before private rooms began running long enough to stress the seams. Cached prompt reads could be charged against caps as if they were dear tokens. A fresh run could inherit the previous run's token count because the child-process reader saw the old row while the reset was still buffered. A repeat run with a wall-clock cap could believe it had started in the past and end after one turn. Earlier fixes had already moved per-run accounting toward llm_logs; 4.6.1 finishes the honest-meter work.

Provider plugins now normalize prompt-cache hits out of usage.totalTokens, so the default budget behavior counts the expensive work, not the cache read that made the room affordable. For rooms where the operator wants the older semantics, the new "Count only the dear tokens" checkbox can be unchecked; Quilltap will then add cache-read tokens back into that room's per-run tally. The daily cap remains cache-excluded, because it is a cross-room limit without a per-room switch.
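As a worked illustration of the per-room switch, the sketch below computes what one turn would add to the per-run tally. Only `totalTokens` appears in these notes; `cacheReadTokens`, `TurnUsage`, and `billableTokens` are hypothetical names, and the numbers are invented for the example.

```typescript
// Hypothetical usage shape. Per the notes, totalTokens already has
// prompt-cache reads normalized out by the provider plugin.
interface TurnUsage {
  totalTokens: number;
  cacheReadTokens: number; // assumed field name for the cache-read count
}

// countOnlyDearTokens mirrors the checkbox: checked (true) is the default.
function billableTokens(usage: TurnUsage, countOnlyDearTokens: boolean): number {
  if (countOnlyDearTokens) {
    return usage.totalTokens;
  }
  // Older semantics: add the cache reads back into this room's tally.
  return usage.totalTokens + usage.cacheReadTokens;
}

const exampleTurn: TurnUsage = { totalTokens: 1200, cacheReadTokens: 8000 };
console.log(billableTokens(exampleTurn, true));  // 1200
console.log(billableTokens(exampleTurn, false)); // 9200
```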

Fresh runs now start from zero. Resumed runs keep their count. Wall-clock checks pin to the in-handler run snapshot rather than a stale child read. Start and resume state changes are synchronous, and they share the Host's rollback on enqueue failure: to error or back to idle, never running with nothing in flight.

A budget is a covenant with the room. It is not useful if the scale includes ghosts, cache discounts, stale rows, and housekeeping tokens as though they were all the same weight. In 4.6.1, the meters are much closer to telling the truth.

The Librarian — An Archive That Does Not Make You Pay Twice

The Librarian's trouble was the trouble of a faithful archive asked to serve a room that never sleeps: she kept bringing the whole file back to the table.

Autonomous-room summaries were supposed to fold old turns out of the active context, but the fold ran fire-and-forget inside the forked job child. The job flushed its buffered writes before the summary finished; the summary's updates settled into a buffer that had already been sent away. The visible symptom was brutal in long rooms: a conversation hundreds of character turns deep still had lastSummaryTurn pinned near the beginning, so nearly the whole history returned to the LLM again and again.

That fold is now awaited on the autonomous path. It runs after the autonomous run-id scope closes and before the next turn is enqueued, so its writes are flushed with the job and its cheap-LLM housekeeping cost is not billed against the run's own token budget. Interactive chats keep the existing fire-and-forget behavior; private rooms get the stronger guarantee they need.

A second archive leak was less philosophical and more expensive. Persisted tool results — a large read_conversation, a thousand-file doc_list_files, any bulky result that happened to sit in the transcript — were being re-injected verbatim every turn. Summaries skip tool rows, so the fold never absorbed them. 4.6.1 now stubs tool results older than three assistant turns in LLM context, preserving the tool name and argument preview while eliding the body. The stored transcript and Salon UI remain unchanged; only the prompt assembly stops re-billing stale results indefinitely.
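The stubbing pass can be pictured as a pure transformation applied during prompt assembly. The sketch below is an assumption-laden illustration: the `ChatMessage` shape, the stub wording, and the reading of "older than three assistant turns" as "more than three assistant messages follow it" are all guesses, not Quilltap's actual types or boundary.

```typescript
// Hypothetical message shape for illustration.
type ChatMessage =
  | { role: "user" | "assistant"; content: string }
  | { role: "tool"; toolName: string; argsPreview: string; content: string };

const KEEP_ASSISTANT_TURNS = 3;

// Returns a new array for the LLM context; the stored transcript passed in
// is never mutated.
function stubStaleToolResults(messages: ChatMessage[]): ChatMessage[] {
  const out = messages.slice();
  let assistantTurnsAfter = 0;
  // Walk newest to oldest, counting assistant turns seen so far.
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (msg.role === "tool" && assistantTurnsAfter > KEEP_ASSISTANT_TURNS) {
      // Keep the tool name and argument preview, elide the body.
      out[i] = {
        role: "tool",
        toolName: msg.toolName,
        argsPreview: msg.argsPreview,
        content: `[${msg.toolName}(${msg.argsPreview}): result elided]`,
      };
    }
    if (msg.role === "assistant") {
      assistantTurnsAfter++;
    }
  }
  return out;
}
```

Because the function copies rather than edits, the transcript and anything rendered from it stay intact; only the prompt sees the stubs.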

doc_list_files also learned restraint. It now hides OS cruft and auto-generated avatar/story-background images by default, with an explicit includeAutomaticImages flag when the operator truly wants them. Saved photos are unaffected.

The archive still remembers. It simply stops making every future sentence pay rent on every past filing cabinet.

Aurora — Summons That Finish Their Work

Aurora's failures were the cruel sort: not loud enough to be theatrical, just late enough to leave an empty space where a character should have stood.

AI character import could assemble a plausible .qtap export and then fail at the moment of saving because optional text fields had been written as explicit null, because the repair loop stripped structural scaffolding from nested prompts or scenarios, or because raw control characters inside model-emitted JSON made a sub-step collapse. The result was an almost-character: generated, repaired, validated in pieces, then lost at import time.

4.6.1 tightens the summoning circle. Optional text fields are omitted when absent rather than set to invalid null. A restampStructuralFields pass restores required ids and timestamps after assembly and repair so the model cannot accidentally remove the beams while repainting the walls. parseLLMJson now repairs literal in-string newlines and tabs before parsing, which is the kind of small mercy one learns to extend to LLMs after the third time they hand you beautiful JSON with one unescaped breath inside it.
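To make the in-string repair concrete, here is one way such a pre-parse pass could work. This is a hypothetical stand-in, not the source of parseLLMJson: the function name is invented, and it handles only the simple case of raw newlines, carriage returns, and tabs inside double-quoted strings.

```typescript
// Hypothetical helper: escape raw control characters that appear inside JSON
// string literals, leaving whitespace between tokens untouched.
function escapeRawControlCharsInStrings(text: string): string {
  let out = "";
  let inString = false;
  let escaped = false; // true when the previous character was a backslash
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (!inString) {
      if (ch === '"') inString = true;
      out += ch;
    } else if (escaped) {
      out += ch;
      escaped = false;
    } else if (ch === "\\") {
      out += ch;
      escaped = true;
    } else if (ch === '"') {
      out += ch;
      inString = false;
    } else if (ch === "\n") {
      out += "\\n";
    } else if (ch === "\r") {
      out += "\\r";
    } else if (ch === "\t") {
      out += "\\t";
    } else {
      out += ch;
    }
  }
  return out;
}

// A model-emitted object with a literal newline and tab inside a string.
const rawCharacterJson = '{"bio": "line one\nline two\tend", "age": 3}';
const repairedCharacter = JSON.parse(escapeRawControlCharsInStrings(rawCharacterJson));
console.log(repairedCharacter.age); // 3
```

Tracking the string and escape state is what lets the pass leave pretty-printing alone: a newline between keys is legal JSON, while the same byte inside a string is not.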

Wardrobe import had its own empty-hook problem. AI-generated wardrobe items could be built without componentItemIds or replace, and the repository allowed undefined to reach a vault writer that expected an array. Character saved; garments failed. Now wardrobe creation defaults componentItemIds to [] and replace to false at the repository chokepoint, and AI import emits those defaults explicitly.
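Defaulting at the chokepoint might look something like the sketch below. Only `componentItemIds` and `replace` are named in these notes; the `name` field, the interface names, and `normalizeWardrobeItem` are hypothetical.

```typescript
// Hypothetical input shape: what an AI import might hand to the repository.
interface WardrobeItemInput {
  name: string;
  componentItemIds?: string[];
  replace?: boolean;
}

// What the vault writer expects: no optional fields left undefined.
interface WardrobeItemRecord {
  name: string;
  componentItemIds: string[];
  replace: boolean;
}

// Apply defaults once, at creation, so every caller gets the same guarantee.
function normalizeWardrobeItem(input: WardrobeItemInput): WardrobeItemRecord {
  return {
    name: input.name,
    componentItemIds: input.componentItemIds ?? [],
    replace: input.replace ?? false,
  };
}
```

Putting the defaults in the repository rather than in each importer means a future caller that forgets them still produces a record the writer can read.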

A summoned character should arrive with their papers, their prompts, and their clothing hooks intact. Aurora has enough mysteries in her mirrors without undefined.length being one of them.

Prospero — Foundations That Refuse Poison Writes

Prospero's section is the one with the least romance and the most consequence. The house had learned to run work in child processes, to buffer writes, and to let the parent apply them. That architecture is correct. It also means a failed write must know exactly which foundation it belongs to.

Before this release, a background job's buffered writes could target three databases — the main application database, the mount-index/document-store database, and the LLM logs database — while the applier wrapped the whole batch in a transaction on only the main connection. A doc-store write failure could roll back unrelated main-DB chat and run-state writes, while doc-store rows already applied outside that transaction leaked through. In autonomous rooms, the older version of this bug could wedge a room: a duplicate folder create hit a unique constraint at apply time, the entire batch rolled back, the turn was marked dead, and the room stayed running with no job alive to continue it.

4.6.1 partitions background-job writes by target database. Each partition commits in its own transaction on its own connection. Idempotent jobs apply secondary partitions before main so a secondary failure prevents a misleading main commit and the retry path can safely run. Autonomous-room turns are different — the conversation itself is not idempotent — so their main chat/run-state partition commits first and authoritatively, while secondary doc-store failures are rolled back, logged, and dropped rather than discarding a completed turn.
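The ordering rule can be sketched without a real database. In the illustration below, `commit` stands in for "run this batch in its own transaction on its own connection" and throws on failure; every name is hypothetical, and treating the LLM logs database as a secondary partition alongside the doc store is an assumption.

```typescript
// Hypothetical names throughout; commit() models one transaction per target.
type TargetDb = "main" | "docStore" | "llmLogs";

interface BufferedWrite {
  target: TargetDb;
  description: string;
}

interface ApplyResult {
  committed: TargetDb[];
  dropped: TargetDb[];
}

function partitionByTarget(writes: BufferedWrite[]): Record<TargetDb, BufferedWrite[]> {
  const parts: Record<TargetDb, BufferedWrite[]> = { main: [], docStore: [], llmLogs: [] };
  for (let i = 0; i < writes.length; i++) {
    parts[writes[i].target].push(writes[i]);
  }
  return parts;
}

function applyPartitions(
  writes: BufferedWrite[],
  idempotent: boolean,
  commit: (target: TargetDb, batch: BufferedWrite[]) => void,
): ApplyResult {
  const parts = partitionByTarget(writes);
  // Idempotent jobs: secondaries first, so a failure stops before main.
  // Autonomous turns: main first, so a completed turn is never discarded.
  const order: TargetDb[] = idempotent
    ? ["docStore", "llmLogs", "main"]
    : ["main", "docStore", "llmLogs"];
  const result: ApplyResult = { committed: [], dropped: [] };
  for (let i = 0; i < order.length; i++) {
    const target = order[i];
    if (parts[target].length === 0) continue;
    try {
      commit(target, parts[target]);
      result.committed.push(target);
    } catch (err) {
      if (idempotent || target === "main") throw err; // fail the job; retry is safe
      result.dropped.push(target); // secondary failure: rolled back, logged, dropped
    }
  }
  return result;
}
```

The two orderings encode the two failure policies: for an idempotent job a secondary failure aborts before anything authoritative lands, while for a conversation turn the authoritative write lands first and secondary failures cannot reach back to undo it.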

Concurrent folder creation also learned to reconcile at the parent apply boundary. If two jobs create the same folder path at once, the second unique-index hit is caught, resolved to the already-committed folder, and the buffered child ids in that batch are remapped to the surviving row. The earlier per-job memo prevented duplicate creates inside one job; this closes the cross-job version.
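The reconcile-and-remap step can be modeled in memory. In this sketch a plain path-to-id table stands in for the unique index, and a lookup hit stands in for catching the constraint violation; `BufferedFolder`, `BufferedFile`, and `reconcileFolders` are hypothetical names, not Quilltap's.

```typescript
// Hypothetical shapes for buffered rows produced by a child job.
interface BufferedFolder { id: string; path: string; }
interface BufferedFile { id: string; folderId: string; }

type IdTable = Record<string, string | undefined>;

function lookup(table: IdTable, key: string): string | undefined {
  return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : undefined;
}

// committedByPath models the unique index on folder path. When a buffered
// folder collides with a committed one, the committed row survives and the
// batch's child rows are pointed at it.
function reconcileFolders(
  committedByPath: IdTable,
  folders: BufferedFolder[],
  files: BufferedFile[],
): BufferedFile[] {
  const remap: IdTable = {};
  for (let i = 0; i < folders.length; i++) {
    const folder = folders[i];
    const survivor = lookup(committedByPath, folder.path);
    if (survivor !== undefined) {
      remap[folder.id] = survivor; // unique-index hit: resolve to the winner
    } else {
      committedByPath[folder.path] = folder.id; // first creator wins
    }
  }
  return files.map((file) => {
    const mapped = lookup(remap, file.folderId);
    return { id: file.id, folderId: mapped !== undefined ? mapped : file.folderId };
  });
}
```

Remapping inside the same batch matters because the child generated its ids before it could know it had lost the race; without the remap, its files would point at a folder row that never committed.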

The principle is plain: one cracked tile in the archive floor must not bring down the Salon ceiling. Prospero has made the foundations respect their own boundaries.


What Changed

  • feat (Salon): Autonomous rooms now start/resume visibly as running at control-click time; ad-hoc and scheduled starts share a run-start contract and roll back cleanly on enqueue failure. Enclaves can be edited after creation, with running rooms honoring budget/visibility/tool-policy changes on the next turn. New Host pacing announcements fire at halfway and near-end, and a one-time grace turn is granted when a run hits budget exhaustion without having received a near-end warning.
  • fix (Foundry): Autonomous-room token accounting now excludes prompt-cache reads by default across provider plugins, with a per-room option to count all tokens. Fresh runs reset token counters correctly; resumed runs preserve them. Wall-clock and token budget checks pin to the in-handler run snapshot rather than stale child-process reads.
  • fix (Commonplace): Autonomous-room rolling summaries now persist: the fold is awaited on the autonomous path and completes before the next turn is enqueued. Tool results older than three assistant turns are stubbed in LLM context so large historical tool outputs are not re-billed forever. doc_list_files filters OS cruft and auto-generated avatar/background images by default, with includeAutomaticImages available when needed.
  • fix (Aurora): AI character import now omits absent optional text fields instead of writing invalid nulls, restores nested structural scaffolding after repair, and repairs raw in-string control characters before JSON parse. AI-imported wardrobe items now carry safe defaults for componentItemIds and replace, and the wardrobe repository applies those defaults at the creation chokepoint.
  • fix (Prospero): Background-job buffered writes are partitioned per database and committed on the correct connection. Autonomous-room main chat/run-state writes commit authoritatively before best-effort secondary doc-store partitions. Cross-job concurrent doc-store folder creates now reconcile by resolving the already-committed folder and remapping buffered ids, preventing poison writes from wedging rooms.

Installation

Desktop App

Download from the quilltap-shell releases page:

macOS:

  1. Download the .dmg file and open it
  2. Drag Quilltap to your Applications folder
  3. Launch Quilltap from Applications

Windows:

  1. Download and run the .exe installer
  2. If SmartScreen warns about an unknown publisher, click "More info" → "Run anyway"
  3. Launch Quilltap from the Start Menu or desktop shortcut

Linux:

  1. Download the .AppImage file, make it executable (chmod +x), and run it
  2. Or install the .deb package: sudo dpkg -i quilltap_*.deb

Node.js (any platform)

npx quilltap

Or install globally:

npm install -g quilltap
quilltap

Open http://localhost:3000 in your browser. Requires Node.js 24+. First run downloads ~150–250 MB and caches locally.

Docker

docker pull foundry9/quilltap:4.6.1

Or use the startup scripts:

# Linux / macOS
curl -fsSL https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/foundry-9/quilltap-server/refs/heads/main/scripts/start-quilltap.ps1 | iex

4.6.1 increases the integrity of resident-to-resident conversation: less operator mediation, fewer hidden stale costs, clearer endings, safer writes, and a room that can be trusted to continue without pretending unattended means ungoverned.

The private rooms are still private rooms. Now they are better hosted, better metered, better archived, and better founded.

Installation

Desktop App (recommended)

The Quilltap desktop app (Electron) is available from
quilltap-shell 4.1.12.
Download the release for your platform (macOS, Windows, or Linux).

The quilltap-linux-arm64.tar.gz and quilltap-linux-amd64.tar.gz rootfs
tarballs attached to this release are used by the shell's Lima (macOS) and WSL2 (Windows) VM modes.

Node.js (any platform)

npm install -g quilltap
quilltap

On first run, the CLI downloads the application files (~150-250 MB)
and caches them locally. Subsequent launches start instantly.

Docker

docker pull foundry9/quilltap:4.6.1

See the README for setup instructions.