server@0.73.0
Minor Changes
- ea9f56b: Gram Functions tool-call and resource-read POSTs now retry on a saturated runner's `429 + Retry-After` and Fly's `503` (both guaranteed before the function runs) instead of surfacing transient saturation as a hard failure, with jittered backoff to spread simultaneous retries and avoid a thundering herd. Transport errors that are transparently retried now log at `WARN` rather than `ERROR`, so recovered attempts no longer look like failures while the final unrecovered failure is still logged as an error.
- c1ef552: `remoteSessionClients` and the org-admin client views now source the `user_session_issuer` relationship entirely from the join table. The `RemoteSessionClient` result replaces the single `user_session_issuer_id` with a `user_session_issuer_ids` array (breaking), create/clone accept zero or more `user_session_issuer_ids` so a client can be created standalone, and a client's issuer attachments are now managed through the new `attachUserSessionIssuer`/`detachUserSessionIssuer` endpoints instead of `update`. No more reads or writes of the legacy `remote_session_clients.user_session_issuer_id` column.
- 4b45485: `chat.load` now returns a `totals` object with whole-generation trace-entry counts (`total`, `user_messages`, `assistant_messages`, `tool_calls`, `tool_results`, `risk_only`). Because the detail-sheet transcript is paginated, the filter bar previously derived its counts from the loaded page, showing e.g. "Showing 150 of 150 entries" on a 19k-message chat, and a risk count that disagreed with the (generation-scoped) risk-only transcript. The dashboard now renders these counts from the server totals. Totals are scoped to the returned generation so they stay consistent with the messages on screen.
- 1ba5adb: feat(dashboard): search within a chat thread. The chat detail sheet gains a find-in-conversation bar backed by full-thread server-side text search (`chat.load` `query` param returns the messages matching the query plus surrounding context, mirroring the risk-windowed view). Jump between matches with the prev/next controls or Enter/Shift+Enter (wrapping at the ends), Escape clears. The active match is highlighted bright yellow and the rest pale (across message text, tool names, and tool argument/output sections), and the tool holding the active match expands, collapsing again as you navigate away.
- 0d23d1f: Add `mcp_server_id` as an optional filter on the observability overview query surface (getObservabilityOverview), threaded through the ClickHouse telemetry builders, the Goa payload, and the logs platform tool. A single `mcp_server_id` scopes a fronting MCP server's activity across both remote-backed and toolset-backed sources.
- ef2f5ef: Add an organization-level observability mode that makes generated hook plugins fully non-blocking. When enabled, hooks only observe and report and can never deny or delay a tool call. Defaults off, preserving existing behavior. Toggle it from the organization logging settings.
- 6f3180d: chat.load now paginates a generation's messages by `seq` keyset (`limit`, `before_seq`, `after_seq`) and exposes each message's `seq` plus `has_more_before`/`has_more_after`. A new `risk_only` flag returns just the messages with active risk findings padded with surrounding context, grouped into contiguous `risk_segments` that can be expanded on demand. The chat detail sheet consumes this with a virtualized transcript (@tanstack/react-virtual, constant DOM node count regardless of how many pages are loaded) and infinite scroll (scroll up to load older messages, anchored so the viewport doesn't jump), and renders the risk-only view as expandable segments with load-above/below and gap-fill controls.
- 465ac0d: Function deployments now prefer the operator-set `memory_mib_override`/`scale_override` columns over the config-driven memory and scale, and carry those overrides forward across redeploys so they are not reset by a later customer deploy.
- a942a2a: Add a common webhook-trigger abstraction and use it to ship Slack, Linear, and GitHub webhook triggers. A new `HMACScheme` + `WebhookVendor` spec in `triggers/webhook.go` centralizes signature verification (HMAC-SHA256/SHA1, hex/base64, prefix, timestamped templates with replay window) and envelope assembly, so a new webhook source lands as a small vendor file describing its signing scheme, event types, and an ingest function. Slack is rebuilt on the abstraction (no behavior change); Linear (HMAC-SHA256 hex over the bare body, `Linear-Delivery` dedup, comments fold onto their parent issue's conversation) and GitHub (`sha256=`-prefixed hex, `X-GitHub-Delivery` dedup, PR/review/comment correlation onto the PR, pushes onto repo+branch) are added as new triggers. All three share the same default-deny event-type allowlist + CEL filter semantics.
Patch Changes
- d6d459e: Assistants now reap individual stopped runtime VMs once they've been idle for 14 days, instead of waiting for the entire assistant to fall silent for a week. Busy projects no longer accumulate orphaned per-thread Fly machines, and the next event on a dormant thread cold-launches into the same Fly app, keeping its IP and secrets.
- f0b8e05: Assistants now pick up MCP server additions and removals on the next turn instead of only on a fresh runtime bootstrap. The per-turn dispatch sends the current MCP set to the runner, which reconciles its live connections without recycling the VM. Previously a newly attached integration (e.g. GitHub MCP) stayed invisible to the running assistant until the runtime was restarted, leaving the model unable to use it or to invoke `mcp_force_reconnect` for it.
- 23000bc: Isolate Claude Code session identity per `session.id` when an OpenTelemetry Collector or gateway re-batches multiple sessions into one OTLP logs export, so a session is never cached or authorized with another session's `user.email`/`organization.id`.
- 84df8f5: Gram Functions tool calls now size their Fly concurrency limits to real execution capacity (so memory bumps no longer inflate the request cap), return a retryable `429 + Retry-After` when a runner is saturated instead of dropping the connection, and retry tool-call POSTs only on safe pre-response transport errors.
- 2fe346b: Public MCP and OAuth routes now start a fresh server-side trace per request and record the inbound W3C trace context as a span link, instead of adopting the client-supplied `traceparent` as the span parent. This stops third parties from merging unrelated requests into one trace or steering our trace ids, and drops client-supplied `baggage` on those routes before it reaches handlers. The trusted `/rpc` and `/admin` surfaces keep end-to-end parent-child trace continuity and their inbound baggage unchanged.
- b0002bc: The Challenge UI now suppresses challenges raised by users outside the organization. Previously, when a Speakeasy staff member impersonated a customer org, their authz decisions appeared as challenge entries, and because internal users switch accounts frequently, these entries repeatedly cluttered the list. `access.listChallenges` and `access.listChallengeBuckets` now only return challenges whose principal is an active member of the organization or has no Gram user identity (e.g. API keys and external end-users); challenges from Gram users who are not members of the org are filtered out in ClickHouse so counts and pagination stay correct.
- d9604a2: fix(assistants): stop a single bad assistant turn from tearing down and recreating its runtime forever. Errors returned by a live runtime are now treated as terminal (and capped) instead of being mistaken for a dead machine, and a hard ceiling fails an event after repeated teardowns so a stuck event can no longer churn machines indefinitely.
- 3955c10: Improve performance on the tool logs page.
- b968804: Exclude tool lists from the registry list view to reduce the response size and make the catalog experience more reliable in flaky network conditions.
- 44acd27: Deleting a chat that backs an active assistant is now blocked and returns a conflict. Previously the chat could be soft-deleted out from under a running assistant, which broke the assistant's ability to load its conversation and could leave it silently wedged.
- e0da996: A chat that backs an active assistant now clears its soft-deleted state automatically when it receives another message, so an assistant whose chat was deleted out from under it recovers instead of staying wedged. Chats with no active assistant are left deleted, so this never resurrects a chat a user intentionally deleted.
- 081259c: Costs and session views now show a correct total token count for AI-coding sessions (Claude Code, etc.). These providers report input and output tokens but never emit `gen_ai.usage.total_tokens`, which previously made per-session and per-user totals read "0 tokens". The telemetry queries now derive the total from input + output when the provider omits an explicit total, while sessions that do carry one are unchanged.
- 9da601f: fix(assistants): stop assistant threads from getting stuck when a model response is cut off mid-tool-call. A truncated generation used to be saved with malformed tool-call arguments, which made the thread fail and retry forever (silent assistants, wedged cron digests). Such generations are now dropped at capture while the preceding messages are kept, so the thread stays usable.
- 6453492: fix(hooks): harden hook ingest against transient connection resets. Plugin hook senders now retry a dropped request with backoff instead of blocking the tool call or silently losing the event, and the server de-duplicates redelivered events so a retry is recorded exactly once across all coding assistants.
- 789beea: Improve failure handling and diagnostics for plugin and server-generated hooks.
  - The Cursor hook now fails closed (emits a `deny` with a readable reason) when Gram is unreachable or returns an error, instead of silently allowing the call and bypassing blocking policies. Only a `2xx` is treated as a decision; a `3xx` (e.g. an unfollowed redirect) now fails closed too.
  - Hook success is restricted to `2xx` across the Claude and Cursor hooks (previously `2xx`–`3xx`).
  - The Cursor hook surfaces missing credentials, accepts both `GRAM_HOOKS_*` and legacy `GRAM_API_KEY`/`GRAM_PROJECT_SLUG` env vars, and passes its API key via a mode-600 `curl` config file instead of the command line.
  - The Claude hook now explains `mktemp` failures instead of blocking with an empty reason.
  - The MCP inventory payload is sent on stdin (`--data-binary @-`) instead of as a command-line argument, so large inventories no longer risk an `ARG_MAX` failure that silently drops telemetry.
  - The fire-and-forget MCP inventory and identity scripts gain an opt-in `GRAM_HOOKS_DEBUG=1` channel that reports why inventory or user attribution was skipped.
- 365542d: fix(hooks): clearer message when an MCP tool call can't be verified. The deny reason now tells you to restart Claude or run /reload-plugins instead of suggesting the session is still initializing, and includes an error code so you can tell why the call couldn't be verified.
- bb7592f: Add a nullable `match_config` JSONB column to `risk_custom_detection_rules`. Detection rules will evaluate this structured condition config instead of the single `regex` pattern; `regex` is retained (nullable) as a fallback until a later backfill+contract migration. Schema-only.
- 4576472: Rename the internal `mcpname` package to `toolref` and route the Codex hook's MCP tool-name attribution through `toolref.AttributeTool` instead of a hand-rolled `mcp__<server>__<tool>` split. No behavior change.
- 3ec3917: User sessions enhancements: facet filters (status, client, user, MCP server) on the User Sessions page; a sessions panel on each MCP server's Authentication tab; revoke via right-click and ⋮ menus with brand-themed status badges; and two read-only assistant platform tools (list_user_sessions, get_user_session).
- 3ec3917: Add user sessions feed: enrich the userSessions list API with issuer slug, client name, resolved subject identity, and a status filter; add a filterable User Sessions page (under the org Identity nav group) with revoke.
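
The total-token fallback described in 081259c (use the explicit total when the provider reports one, otherwise derive it from input + output) can be sketched in Go as follows. This is an illustration only: the release note says the real change lives in the telemetry queries, and the `usage` type and `effectiveTotal` function here are hypothetical names introduced for the example.

```go
package main

import "fmt"

// usage is a hypothetical per-session token record. TotalTokens is nil when
// the provider never emitted an explicit total.
type usage struct {
	InputTokens  int64
	OutputTokens int64
	TotalTokens  *int64
}

// effectiveTotal returns the explicit total when present and falls back to
// input + output otherwise, so sessions that already carry a total are
// left unchanged.
func effectiveTotal(u usage) int64 {
	if u.TotalTokens != nil {
		return *u.TotalTokens
	}
	return u.InputTokens + u.OutputTokens
}

func main() {
	// A session that reports only input and output tokens.
	derived := usage{InputTokens: 1200, OutputTokens: 300}
	fmt.Println(effectiveTotal(derived)) // prints 1500

	// A session that reports an explicit total.
	explicit := int64(2000)
	reported := usage{InputTokens: 1200, OutputTokens: 300, TotalTokens: &explicit}
	fmt.Println(effectiveTotal(reported)) // prints 2000
}
```

A pointer is used for the explicit total so that "not reported" can be told apart from a reported value of zero.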