Core Capability Negotiation

Mike Strobel edited this page Jun 26, 2026 · 1 revision

Capability negotiation

Every terminal is a little different — one speaks the Kitty keyboard protocol, the next only sends legacy key codes; one renders 24-bit color, the next quantizes to 256. Capability negotiation is how Cursorial finds out what this terminal can actually do, switches on the opt-in protocols your app wants, and hands you back a single snapshot you can branch on. It is the first thing that happens in a session, and in the common case TerminalSession does it for you.

This page covers the negotiator directly — useful when you're driving your own transports, want a read-only "what is this terminal?" probe, or need to understand exactly what a session enabled and will restore.

Detector and negotiator

ITerminalNegotiator does two jobs in one pass:

  • Detection — probe the terminal for its identity (name, version, family) and its passive capabilities (color depth, Sixel support, the modifier protocols it honors).
  • Negotiation — actively enable the opt-in protocols your app requested (mouse tracking, focus events, bracketed paste, the Kitty keyboard protocol, Win32 input mode, synchronized output) and record each one so it can be reversed later.

VtTerminalNegotiator is the concrete implementation. It drives the VT/ANSI probe-and-respond handshake over an IInputByteSource + IOutputByteSink pair and applies the enable sequences for the protocols you asked for. (On Windows-family terminals it folds the Win32-specific opt-in — Win32 Input Mode — into the same flow, gated on family identification so it doesn't claim a feature the terminal silently ignores.)

If you've used a capability-probing layer in another stack, the shape is familiar: one orchestrated handshake, one capability object out, one teardown that puts everything back.

Realized, not advertised

This is the single most important contract. The returned TerminalCapabilities reflects what the terminal actually honored, not what it claimed:

  • A terminal that advertises Sixel but never paints it is reported as not having it.
  • A protocol opt-in that the terminal swallowed without effect comes back off.
  • Truecolor can be empirically verified by a palette round-trip rather than trusted from a TERM string (the VerifyTruecolorViaRoundtrip option, on by default).

You can therefore branch on a capability flag directly — no second-guessing, no defensive re-validation:

if (caps.Output.Color.Depth >= ColorDepth.Truecolor)
    background = Color.FromRgb(20, 24, 40);     // safe: the terminal proved it
else
    background = Color.Palette(17);              // fall back to a 256-color index

The handshake

Detection is a probe-and-response exchange: the negotiator writes query sequences, then reads the replies framed back by the input parser. The backbone is the XTVERSION + DA1 sentinel pattern: an XTVERSION query, which identifying terminals answer with a name/version string, followed by a DA1 (Primary Device Attributes) query, which all VT terminals answer. Because a terminal processes queries in the order they arrive, the DA1 reply works as a sentinel: once it shows up, the negotiator knows the round-trip is complete and any terminal that was going to respond to the earlier queries already has. A probe that gets no answer within ProbeTimeout (default 500 ms) is treated as unsupported.
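As a rough illustration of the sentinel pattern, the sketch below builds the two queries and scans a reply buffer for their answers. It is not Cursorial's implementation: ProbeResult, HandshakeSketch, and ParseReplies are hypothetical names, and the parser assumes the whole reply has already been buffered as a string. The escape sequences are the standard xterm forms: CSI > 0 q for XTVERSION (answered with DCS > | text ST) and CSI c for DA1 (answered with CSI ? ... c).

```csharp
#nullable enable
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical result type for this sketch; not part of Cursorial.
public sealed record ProbeResult(string? XtVersion, int[] Da1Attributes, bool SentinelSeen);

public static class HandshakeSketch
{
    // XTVERSION query, then the DA1 query whose reply acts as the sentinel.
    public const string Queries = "\u001b[>0q" + "\u001b[c";

    // XTVERSION reply: DCS > | text ST   (ST is ESC followed by a backslash)
    private static readonly Regex XtVersionReply =
        new(@"\u001bP>\|(?<text>[^\u001b]*)\u001b\\");

    // DA1 reply: CSI ? p1 ; p2 ; ... c
    private static readonly Regex Da1Reply =
        new(@"\u001b\[\?(?<params>[0-9;]*)c");

    public static ProbeResult ParseReplies(string buffer)
    {
        var da1 = Da1Reply.Match(buffer);
        if (!da1.Success)
            return new ProbeResult(null, Array.Empty<int>(), SentinelSeen: false);

        // Only replies that arrived before the sentinel count as answered.
        var beforeSentinel = buffer[..da1.Index];
        var xt = XtVersionReply.Match(beforeSentinel);

        var attributes = da1.Groups["params"].Value
            .Split(';', StringSplitOptions.RemoveEmptyEntries)
            .Select(p => int.Parse(p))
            .ToArray();

        return new ProbeResult(
            xt.Success ? xt.Groups["text"].Value : null,
            attributes,
            SentinelSeen: true);
    }
}
```

In a DA1 reply, attribute 4 conventionally advertises Sixel. That is only an advertisement, which is why the realized capability is reported separately from what the terminal claims.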

From the responses (plus TERM / TERM_PROGRAM and platform signals) the negotiator resolves a terminal family: Kitty, Ghostty, WezTerm, Alacritty, WindowsTerminal, AppleTerminal, Tmux, and a couple dozen more, with Unknown / GenericVt as safe fallbacks. Family identity is how Cursorial gates protocol opt-ins (don't push Kitty keyboard at a terminal that can't speak it) and routes around per-terminal quirks. It is best-effort: the raw TERM variable alone is unreliable, since most modern terminals report xterm-256color regardless of identity.
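To make the layering of signals concrete, here is a minimal sketch of a family resolver that prefers the self-reported XTVERSION string, then environment hints, then a generic fallback. Everything in it is an assumption made for illustration: FamilySketch and FamilyResolverSketch are hypothetical, the prefixes and TERM_PROGRAM values are commonly observed ones rather than Cursorial's actual table, and the rule that a DA1 answer separates GenericVt from Unknown is a guess at how the two fallbacks might differ.

```csharp
#nullable enable
using System;

// Hypothetical subset of families; the real enum is described as much larger.
public enum FamilySketch
{
    Unknown, GenericVt, Kitty, Ghostty, WezTerm, WindowsTerminal, AppleTerminal, Tmux
}

public static class FamilyResolverSketch
{
    public static FamilySketch Resolve(
        string? xtVersion,     // text of the XTVERSION reply, if one arrived
        string? termProgram,   // value of TERM_PROGRAM, if set
        bool hasWtSession,     // whether WT_SESSION is present in the environment
        bool answeredDa1)      // whether the DA1 sentinel came back
    {
        // 1. A self-reported name is the strongest signal. Under tmux this is
        //    tmux itself, which is what the application is actually talking to.
        if (!string.IsNullOrEmpty(xtVersion))
        {
            if (HasPrefix(xtVersion, "kitty")) return FamilySketch.Kitty;
            if (HasPrefix(xtVersion, "ghostty")) return FamilySketch.Ghostty;
            if (HasPrefix(xtVersion, "WezTerm")) return FamilySketch.WezTerm;
            if (HasPrefix(xtVersion, "tmux")) return FamilySketch.Tmux;
        }

        // 2. Environment hints. TERM is deliberately not consulted here.
        switch (termProgram)
        {
            case "Apple_Terminal": return FamilySketch.AppleTerminal;
            case "WezTerm": return FamilySketch.WezTerm;
            case "ghostty": return FamilySketch.Ghostty;
            case "tmux": return FamilySketch.Tmux;
        }

        if (hasWtSession) return FamilySketch.WindowsTerminal;

        // 3. Safe fallbacks.
        return answeredDa1 ? FamilySketch.GenericVt : FamilySketch.Unknown;
    }

    private static bool HasPrefix(string value, string prefix) =>
        value.StartsWith(prefix, StringComparison.OrdinalIgnoreCase);
}
```

Checking XTVERSION first matters because environment variables are inherited: a shell running inside tmux may still carry the outer terminal's TERM_PROGRAM or WT_SESSION.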

TerminalCapabilities — the shape

NegotiateAsync returns a single aggregate record with three parts:

public sealed record TerminalCapabilities(
    TerminalIdentification Terminal,   // family, name, version, multiplexer, Sixel advertisement
    InputCapabilities      Input,      // Mouse / Keyboard / Pointer / Protocol
    OutputCapabilities     Output);    // Color / Styling / TextSizing / Graphics / Cursor / Window / Protocol

  • Terminal: a TerminalIdentification carrying Family, Name, Version, the raw env strings, InsideMultiplexer (true under tmux / GNU Screen), and AdvertisesSixel.
  • Input: categorized input capabilities (MouseCapabilities, KeyboardCapabilities, PointerCapabilities, ProtocolCapabilities). For example, caps.Input.Keyboard.DistinguishesKeyUpDown tells you whether the terminal sends real key-up events or whether you'd need a key-release synthesizer to fabricate them.
  • Output: categorized output capabilities, including Color (with ColorDepth Depth and the verified-truecolor flag), Styling, TextSizing, Cursor, and so on. This is exactly the object the cell renderer takes to quantize styles to what the terminal can render.

Each sub-record exposes a None static for defaults, and TerminalCapabilities.None is the "nothing known, nothing supported" sentinel.

NegotiationOptions and OptInPolicy

NegotiationOptions is the knob bag. The defaults describe a "rich, modern terminal app" profile — every reasonable opt-in is on, timeouts tuned for interactive use — so most apps pass it unchanged.

var options = new NegotiationOptions
{
    EnableMouseButtons        = true,   // DECSET 1000
    EnableMouseButtonTracking = true,   // DECSET 1002
    EnableMouseTracking       = true,   // any-event motion (1003)
    EnableFocusEvents         = true,   // DECSET 1004
    EnableBracketedPaste      = true,   // DECSET 2004
    EnableKittyKeyboard       = true,   // pushes KittyKeyboardFlags
    EnableWin32InputMode      = true,   // DECSET 9001 under ConPTY
    EnableSynchronizedOutput  = true,   // DECSET 2026, tearing-free frames

    KittyKeyboardFlags = NegotiationOptions.DefaultKittyKeyboardFlags,
    ProbeTimeout       = NegotiationOptions.DefaultProbeTimeout,   // 500 ms
};

A few worth calling out:

  • EnableSgrPixelsMouse is off by default. It surfaces sub-cell pixel coordinates on mouse events but multiplies motion-event volume ~10–20×, so opt in only when you actually consume the pixel data (drag handles, custom cursors).
  • KittyKeyboardFlags defaults to a full-fidelity set. Compose against the exposed constant rather than reading a throwaway instance — e.g. KittyKeyboardFlags = NegotiationOptions.DefaultKittyKeyboardFlags | SomeOtherFlag.
  • ProbeTimeout — bump it for high-latency links: ProbeTimeout = NegotiationOptions.DefaultProbeTimeout + extra.

OptInPolicy — the master gate

The OptIns property, of type OptInPolicy, sits above every individual Enable… flag:

  • OptInPolicy.Allowed (default): honor the individual flags, apply the opt-ins, track them for restore.
  • OptInPolicy.Ignored: a passive probe. The negotiator identifies the terminal and reads its passive capabilities, but emits no enable sequences, requires no restore, and treats every Enable… flag as off. Use it for a read-only "what is this terminal?" introspection pass, or when embedding inside a host that already owns the protocol state.

// Detect only: change nothing.
var caps = await negotiator.NegotiateAsync(
    new NegotiationOptions { OptIns = OptInPolicy.Ignored });

Single-shot, with idempotent LIFO restore

Two lifetime rules keep "what to restore to" unambiguous:

  • Single-shot per instance. Call NegotiateAsync once. A second call on the same instance throws InvalidOperationException. To renegotiate, create a fresh negotiator (after the prior one restores).
  • Idempotent restore in LIFO order. RestoreAsync reverses every opt-in that was applied, in reverse order, and is safe to call more than once. It's best-effort — if the transport has died (terminal closed, broken pipe) it swallows the error rather than throwing. Disposing the negotiator (await using) runs restore automatically.
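The restore rule can be pictured as a small ledger: every enable sequence that is written pushes its matching disable sequence onto a stack, and restore pops the stack exactly once. The sketch below is a hypothetical stand-in, not Cursorial's type. OptInLedgerSketch and LedgerDemo are invented names, a StringBuilder plays the role of the output sink, and rejecting Apply after restore is an assumption. The sequences are the standard DECSET/DECRST pairs for modes 1004 and 2004 and the Kitty keyboard push/pop pair.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical ledger illustrating LIFO, idempotent restore.
public sealed class OptInLedgerSketch
{
    private readonly Stack<string> _disable = new();
    private readonly StringBuilder _written = new();   // stands in for the output sink
    private bool _restored;

    public string Written => _written.ToString();

    // Write the enable sequence and remember how to undo it.
    public void Apply(string enable, string disable)
    {
        // Assumption for this sketch: nothing may be applied after restore.
        if (_restored) throw new InvalidOperationException("Already restored.");
        _written.Append(enable);
        _disable.Push(disable);
    }

    // Undo every opt-in in reverse order; later calls return an empty string.
    public string Restore()
    {
        if (_restored) return string.Empty;
        _restored = true;

        var sequence = new StringBuilder();
        while (_disable.Count > 0) sequence.Append(_disable.Pop());

        _written.Append(sequence.ToString());
        return sequence.ToString();
    }
}

public static class LedgerDemo
{
    public static string Run()
    {
        var ledger = new OptInLedgerSketch();
        ledger.Apply("\u001b[?1004h", "\u001b[?1004l");   // focus events
        ledger.Apply("\u001b[?2004h", "\u001b[?2004l");   // bracketed paste
        ledger.Apply("\u001b[>1u", "\u001b[<u");          // Kitty keyboard push / pop

        var first = ledger.Restore();    // Kitty pop, then 2004 off, then 1004 off
        var second = ledger.Restore();   // empty: the ledger is already restored
        return first + second;
    }
}
```

Reversing the order matters most for stack-like protocols such as the Kitty keyboard push, where the pop has to match the most recent push.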

⚠️ You must restore (or dispose) before the process exits, or you'll leave the terminal in a non-default state — Kitty keyboard pushed, mouse tracking live, the cursor reshaped. Register a signal handler for SIGINT / SIGTERM / Ctrl-Break. For signal and process-exit paths, BuildRestoreSequence() returns the disable bytes synchronously (and marks the negotiator restored) so a handler can emit them via a direct syscall without risking a hang in the async pipeline.
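One possible shape for such a handler, sketched with .NET's PosixSignalRegistration (available from .NET 6): RestoreGuardSketch is a hypothetical helper, the buildRestore delegate stands in for a call such as negotiator.BuildRestoreSequence() and assumes it yields a byte[], and a plain synchronous Stream write stands in for the direct syscall mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical helper: emits the restore bytes at most once, from whichever
// exit path fires first.
public sealed class RestoreGuardSketch : IDisposable
{
    private readonly Func<byte[]> _buildRestore;   // assumed to return the disable bytes
    private readonly Stream _output;
    private readonly List<PosixSignalRegistration> _registrations = new();
    private int _emitted;                          // 0 = not yet, 1 = done

    public RestoreGuardSketch(Func<byte[]> buildRestore, Stream output)
    {
        _buildRestore = buildRestore;
        _output = output;
    }

    // Hook SIGINT, SIGTERM, SIGQUIT (which .NET maps to Ctrl-Break on Windows)
    // and normal process exit. The handlers do not cancel the signal, so
    // default termination still proceeds after the bytes are written.
    public void Install()
    {
        foreach (var signal in new[] { PosixSignal.SIGINT, PosixSignal.SIGTERM, PosixSignal.SIGQUIT })
            _registrations.Add(PosixSignalRegistration.Create(signal, _ => EmitOnce()));

        AppDomain.CurrentDomain.ProcessExit += (_, _) => EmitOnce();
    }

    public void EmitOnce()
    {
        if (Interlocked.Exchange(ref _emitted, 1) != 0) return;
        try
        {
            var bytes = _buildRestore();
            _output.Write(bytes, 0, bytes.Length);   // synchronous on purpose
            _output.Flush();
        }
        catch (IOException) { }               // best-effort: the terminal may be gone
        catch (ObjectDisposedException) { }
    }

    public void Dispose()
    {
        foreach (var registration in _registrations) registration.Dispose();
        _registrations.Clear();
        EmitOnce();
    }
}
```

The Interlocked guard is there because a signal handler and ProcessExit can both run during one shutdown, and the restore bytes should reach the terminal only once.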

Driving it by hand looks like this:

await using var negotiator = new VtTerminalNegotiator(source, sink);
var caps = await negotiator.NegotiateAsync(new NegotiationOptions());

// ... run your app against caps ...

// RestoreAsync runs automatically on dispose; call it explicitly to restore early.

You usually don't call this directly

In the common path, negotiation happens inside the session factory. TerminalSession.OpenAsync() opens raw-mode stdio, constructs and runs the negotiator, wires up the input device, and exposes the result — all signal-safe and restored on disposal:

await using var session = await TerminalSession.OpenAsync();
TerminalCapabilities caps = session.Capabilities;   // already negotiated

if (caps.Input.Keyboard.DistinguishesKeyUpDown)
    Console.WriteLine($"{caps.Terminal.Family} reports real key-up events");

To customize the negotiation, pass options through TerminalSessionOptions.Negotiation:

await using var session = await TerminalSession.OpenAsync(new TerminalSessionOptions
{
    Negotiation = new NegotiationOptions { EnableSgrPixelsMouse = true },
});

If you supply your own transports — the BYO OpenAsync(source, sink, …) overload — the session still runs the negotiator and restores its opt-ins on disposal, but leaves your transports (and terminal mode) under your control. Reach for VtTerminalNegotiator directly only when you want the capability handshake without a full session, such as a standalone probe tool.
