server@0.73.0

gram-bot released this 24 Jun 21:40

· 32 commits to main since this release

b864320

Minor Changes

ea9f56b: Gram Functions tool-call and resource-read POSTs now retry on a saturated runner's 429 + Retry-After and Fly's 503 (both guaranteed before the function runs) instead of surfacing transient saturation as a hard failure, with jittered backoff to spread simultaneous retries and avoid a thundering herd. Transport errors that are transparently retried now log at WARN rather than ERROR, so recovered attempts no longer look like failures while the final unrecovered failure is still logged as an error.
c1ef552: remoteSessionClients and the org-admin client views now source the user_session_issuer relationship entirely from the join table. The RemoteSessionClient result replaces the single user_session_issuer_id with a user_session_issuer_ids array (breaking), create/clone accept zero or more user_session_issuer_ids so a client can be created standalone, and a client's issuer attachments are now managed through the new attachUserSessionIssuer / detachUserSessionIssuer endpoints instead of update. No more reads or writes of the legacy remote_session_clients.user_session_issuer_id column.
4b45485: chat.load now returns a totals object with whole-generation trace-entry counts (total, user_messages, assistant_messages, tool_calls, tool_results, risk_only). Because the detail-sheet transcript is paginated, the filter bar previously derived its counts from the loaded page — showing e.g. "Showing 150 of 150 entries" on a 19k-message chat, and a risk count that disagreed with the (generation-scoped) risk-only transcript. The dashboard now renders these counts from the server totals. Totals are scoped to the returned generation so they stay consistent with the messages on screen.
1ba5adb: feat(dashboard): search within a chat thread. The chat detail sheet gains a find-in-conversation bar backed by full-thread server-side text search (chat.load query param returns the messages matching the query plus surrounding context, mirroring the risk-windowed view). Jump between matches with the prev/next controls or Enter/Shift+Enter (wrapping at the ends), Escape clears. The active match is highlighted bright yellow and the rest pale — across message text, tool names, and tool argument/output sections — and the tool holding the active match expands, collapsing again as you navigate away.
0d23d1f: Add mcp_server_id as an optional filter on the observability overview query surface (getObservabilityOverview), threaded through the ClickHouse telemetry builders, the Goa payload, and the logs platform tool. A single mcp_server_id scopes a fronting MCP server's activity across both remote-backed and toolset-backed sources.
ef2f5ef: Add an organization-level observability mode that makes generated hook plugins fully non-blocking. When enabled, hooks only observe and report and can never deny or delay a tool call. Defaults off, preserving existing behavior. Toggle it from the organization logging settings.
6f3180d: chat.load now paginates a generation's messages by seq keyset (limit, before_seq, after_seq) and exposes each message's seq plus has_more_before/has_more_after. A new risk_only flag returns just the messages with active risk findings padded with surrounding context, grouped into contiguous risk_segments that can be expanded on demand. The chat detail sheet consumes this with a virtualized transcript (@tanstack/react-virtual, constant DOM node count regardless of how many pages are loaded) and infinite scroll (scroll up to load older messages, anchored so the viewport doesn't jump), and renders the risk-only view as expandable segments with load-above/below and gap-fill controls.
465ac0d: Function deployments now prefer the operator-set memory_mib_override / scale_override columns over the config-driven memory and scale, and carry those overrides forward across redeploys so they are not reset by a later customer deploy.
a942a2a: Add a common webhook-trigger abstraction and use it to ship Slack, Linear, and GitHub webhook triggers. A new HMACScheme + WebhookVendor spec in triggers/webhook.go centralizes signature verification (HMAC-SHA256/SHA1, hex/base64, prefix, timestamped templates with replay window) and envelope assembly, so a new webhook source lands as a small vendor file describing its signing scheme, event types, and an ingest function. Slack is rebuilt on the abstraction (no behavior change); Linear (HMAC-SHA256 hex over the bare body, Linear-Delivery dedup, comments fold onto their parent issue's conversation) and GitHub (sha256=-prefixed hex, X-GitHub-Delivery dedup, PR/review/comment correlation onto the PR, pushes onto repo+branch) are added as new triggers. All three share the same default-deny event-type allowlist + CEL filter semantics.

Patch Changes

d6d459e: assistants now reap individual stopped runtime VMs once they've been idle for 14 days, instead of waiting for the entire assistant to fall silent for a week. Busy projects no longer accumulate orphaned per-thread Fly machines, and the next event on a dormant thread cold-launches into the same Fly app — keeping its IP and secrets.
f0b8e05: Assistants now pick up MCP server additions and removals on the next turn instead of only on a fresh runtime bootstrap. The per-turn dispatch sends the current MCP set to the runner, which reconciles its live connections without recycling the VM. Previously a newly attached integration (e.g. GitHub MCP) stayed invisible to the running assistant until the runtime was restarted, leaving the model unable to use it or to invoke mcp_force_reconnect for it.
23000bc: Isolate Claude Code session identity per session.id when an OpenTelemetry Collector or gateway re-batches multiple sessions into one OTLP logs export, so a session is never cached or authorized with another session's user.email / organization.id.
84df8f5: Gram Functions tool calls now size their Fly concurrency limits to real execution capacity (so memory bumps no longer inflate the request cap), return a retryable 429 + Retry-After when a runner is saturated instead of dropping the connection, and retry tool-call POSTs only on safe pre-response transport errors.
2fe346b: Public MCP and OAuth routes now start a fresh server-side trace per request and record the inbound W3C trace context as a span link, instead of adopting the client-supplied traceparent as the span parent. This stops third parties from merging unrelated requests into one trace or steering our trace ids, and drops client-supplied baggage on those routes before it reaches handlers. The trusted /rpc and /admin surfaces keep end-to-end parent-child trace continuity and their inbound baggage unchanged.
b0002bc: The Challenge UI now suppresses challenges raised by users outside the organization. Previously, when a Speakeasy staff member impersonated a customer org their authz decisions appeared as challenge entries — and because internal users switch accounts frequently, these entries repeatedly cluttered the list. access.listChallenges and access.listChallengeBuckets now only return challenges whose principal is an active member of the organization or has no Gram user identity (e.g. API keys and external end-users); challenges from Gram users who are not members of the org are filtered out in ClickHouse so counts and pagination stay correct.
d9604a2: fix(assistants): stop a single bad assistant turn from tearing down and recreating its runtime forever. Errors returned by a live runtime are now treated as terminal (and capped) instead of being mistaken for a dead machine, and a hard ceiling fails an event after repeated teardowns so a stuck event can no longer churn machines indefinitely.
3955c10: Better performance on tool logs page
b968804: Exclude tools lists from registry list view to lean out the response size and make the catalog experience more reliable in flake-y network conditions
44acd27: Deleting a chat that backs an active assistant is now blocked and returns a conflict. Previously the chat could be soft-deleted out from under a running assistant, which broke the assistant's ability to load its conversation and could leave it silently wedged.
e0da996: A chat that backs an active assistant now clears its soft-deleted state automatically when it receives another message, so an assistant whose chat was deleted out from under it recovers instead of staying wedged. Chats with no active assistant are left deleted, so this never resurrects a chat a user intentionally deleted.
081259c: Costs and session views now show a correct total token count for AI-coding sessions (Claude Code, etc.). These providers report input and output tokens but never emit gen_ai.usage.total_tokens, which previously made per-session and per-user totals read "0 tokens". The telemetry queries now derive the total from input + output when the provider omits an explicit total, while sessions that do carry one are unchanged.
9da601f: fix(assistants): stop assistant threads from getting stuck when a model response is cut off mid-tool-call. A truncated generation used to be saved with malformed tool-call arguments, which made the thread fail and retry forever (silent assistants, wedged cron digests). Such generations are now dropped at capture while the preceding messages are kept, so the thread stays usable.
6453492: fix(hooks): harden hook ingest against transient connection resets. Plugin hook senders now retry a dropped request with backoff instead of blocking the tool call or silently losing the event, and the server de-duplicates redelivered events so a retry is recorded exactly once across all coding assistants.
789beea: Improve failure handling and diagnostics for plugin and server-generated hooks.
- The Cursor hook now fails closed (emits a deny with a readable reason) when Gram is unreachable or returns an error, instead of silently allowing the call and bypassing blocking policies. Only a 2xx is treated as a decision; a 3xx (e.g. an unfollowed redirect) now fails closed too.
- Hook success is restricted to 2xx across the Claude and Cursor hooks (previously 2xx–3xx).
- The Cursor hook surfaces missing credentials, accepts both GRAM_HOOKS_* and legacy GRAM_API_KEY/GRAM_PROJECT_SLUG env vars, and passes its API key via a mode-600 curl config file instead of the command line.
- The Claude hook now explains mktemp failures instead of blocking with an empty reason.
- The MCP inventory payload is sent on stdin (--data-binary @-) instead of as a command-line argument, so large inventories no longer risk an ARG_MAX failure that silently drops telemetry.
- The fire-and-forget MCP inventory and identity scripts gain an opt-in GRAM_HOOKS_DEBUG=1 channel that reports why inventory or user attribution was skipped.
365542d: fix(hooks): clearer message when an MCP tool call can't be verified. The deny reason now tells you to restart Claude or run /reload-plugins instead of suggesting the session is still initializing, and includes an error code so you can tell why the call couldn't be verified.
bb7592f: Add a nullable match_config JSONB column to risk_custom_detection_rules.
Detection rules will evaluate this structured condition config instead of the
single regex pattern; regex is retained (nullable) as a fallback until a
later backfill+contract migration. Schema-only.
4576472: Rename the internal mcpname package to toolref and route the Codex hook's
MCP tool-name attribution through toolref.AttributeTool instead of a
hand-rolled mcp__<server>__<tool> split. No behavior change.
3ec3917: User sessions enhancements: facet filters (status, client, user, MCP server) on the User Sessions page; a sessions panel on each MCP server's Authentication tab; revoke via right-click and ⋮ menus with brand-themed status badges; and two read-only assistant platform tools (list_user_sessions, get_user_session).
3ec3917: Add user sessions feed: enrich the userSessions list API with issuer slug, client name, resolved subject identity, and a status filter; add a filterable User Sessions page (under the org Identity nav group) with revoke.

Assets 2