v0.5.12

@github-actions github-actions released this 02 Jun 22:20
· 42 commits to main since this release

Added

  • Web Router Dashboard: full set manager. The M4 read-only "Model Health"
    panel has been replaced with an interactive Active Set manager:
    • Add model picker with search + per-provider filter (147 routeable
      models from the new /api/router/catalog endpoint).
    • Remove button per row.
    • Move up / Move down buttons as a keyboard-accessible fallback.
    • HTML5 drag-and-drop to reorder priorities. Visual drop indicator
      (top/bottom border on the target row) shows where the row will land.
    • Active set switcher in the same header (when more than one set
      exists). Switches the daemon's activeSet via
      POST /sets/:name/activate.
    • Save indicator (saving… / ✓ saved / ⚠ error) shown next to the
      Add button so the user always knows whether the last drag actually
      persisted to disk.
  • "Sync best models" button. One-click rebuild of the active set
    using the probe-based sync-set pipeline. Re-pings every candidate
    with the user's actual API keys, picks only the models that come back
    2xx with a reasonable latency, and persists the result. The Web UI
    surfaces a "Synced N working models from M probes" toast. The button
    shows a "saving…" state while the probe runs (up to ~60s for 16
    candidates).
  • Daemon endpoints for granular set management.
    • POST /sets/:name/models — append a single model to a set.
    • DELETE /sets/:name/models — remove a single model by
      { provider, model }.
    • POST /sets/:name/reorder — accept a full priority order
      { order: ["provider/model", ...] } and re-number the set.
    • POST /sets/:name/sync — re-run sync-set against the named set
      with maxProbes: 16, targetCount: 5 so the Web UI never spends
      more than ~60s on a sync.
    • GET /api/router/catalog — lightweight catalog of routeable
      models for the Add picker.
  • Default router set is now probe-driven. buildDefaultRouterSet
    is async and accepts a probeFn that POSTs a 1-token chat completion
    to each candidate. Models that come back 2xx with low latency are
    pinned to the top of the set; failing models fall back to the static
    tier ordering. A new user with a half-broken key set now gets a
    default router set made of models that actually work, instead of a
    hardcoded NVIDIA-only list that 401s for everyone without a
    NIM-specific grant.
  • Web proxy for the new endpoints. /api/router/sets/:name/models
    (POST/DELETE), /api/router/sets/:name/reorder (POST),
    /api/router/sets/:name/activate (POST), /api/router/sets/:name/sync
    (POST, 180s proxy timeout), /api/router/catalog (GET, with a
    fallback path that synthesizes the catalog from sources.js when the
    daemon is offline).
  • Switch case routing fix. The handlers for parameterized routes
    (/api/router/sets/:name/...) were previously written as case
    statements inside the request switch, where they never matched
    because a JS switch compares with strict equality against the
    literal string. They are now if/regex checks above the switch, so
    they actually run. This is why the original reorder/add/remove
    proxies returned 404 in the v0.5.11 web build.
  • Auto-heal the active router set on startup. When the daemon starts,
    it waits for the first probe burst to populate health data, then runs
    autoHealActiveSet() to swap any broken model in the active set
    (AUTH_ERROR or STALE state) for a working alternative. The picker
    prefers the same provider first, then falls through to cross-provider
    candidates, and skips providers whose probes have all failed.
    Three passes run at 8s, 24s, and 40s after startup so a freshly-added
    replacement that turns out to be broken gets replaced again. The result:
    a new user with a half-broken key set lands on a usable default set
    by the time the Web Dashboard renders, and the Playground's fcm
    virtual model starts returning successful responses immediately.
  • router.userCustomized + router.autoHeal config flags. The first
    manual edit to the active set (add / remove / reorder / sync /
    activate / rename) flips userCustomized to true and autoHeal
    to false so the user's manual choices are never undone on the next
    daemon start. New users get userCustomized: false and
    autoHeal: true by default, which is what powers the M6 "default to
    working models" promise.
  • /api/router/status exposes autoHeal, userCustomized, and
    brokenModelCount. The Web Router Dashboard reads these to surface
    an amber banner when broken models remain in the active set, with a
    one-click "Fix now" button that re-runs the same sync-set probe
    pipeline the CLI uses.
  • Amber "models not responding" banner in the Web Router Dashboard.
    Shown when the daemon reports brokenModelCount > 0, with a
    dismissable X so the user can ignore it after acknowledging the
    problem. The "Fix now" button re-runs the probe-based heal and reloads
    the dashboard state.
  • Unusable row fade (TUI + Web + Desktop). Rows whose health is
    NO KEY (noauth) or AUTH FAIL (auth_error) are now rendered at
    80% opacity (20% less opaque) on every user-facing surface. The user
    can scan the table and instantly see which models they cannot
    actually use, even when the cursor is parked on a different model.
    • TUI: the new fadedRow() helper in src/tui/render-helpers.js
      multiplies every 24-bit RGB channel inside an ANSI-colored string by
      0.8, so the whole line reads as uniformly faded. This works on
      every terminal that supports truecolor and does not rely on the
      SGR 2 "faint" code, which is ignored by some terminals. The fade
      composes cleanly with the cursor highlight, the dark-red
      incompatible background, the green recommended background, and
      the gold favorite background, so no existing visual cue is lost
      — the "unusable" signal just rides on top of them.
    • Web / Desktop (Tauri): the ModelTable adds an
      .unusable { opacity: 0.8 } CSS class to rows whose m.status is
      noauth or auth_error. The class is held steady on hover so the
      "you cannot use this" signal never disappears while the user is
      inspecting the row.
  • fadedRow(input, factor = 0.8) helper. Pure function exported
    from src/tui/render-helpers.js, documented and unit-tested in
    isolation. Identity fast-path for factor >= 1, channels clamped to
    0–255, bold/dim/reset SGR codes pass through unchanged. Reusable for
    any future "fade a whole line" need (e.g. stale rows, soft-disabled
    providers).
  • 6 new unit tests for the granular set-management endpoints
    (add, duplicate, remove, reorder, reorder-with-missing-key,
    catalog), 2 new tests for buildDefaultRouterSet's probe path
    (probe-preference + sync fallback), 4 new unit tests for the
    auto-heal path (no-op when user-customized, no-op when auto-heal is
    disabled, no-op when no broken models, and the
    user-edit-flags-customization round-trip), and 12 new tests for
    the unusable row fade (7 for fadedRow + 5 for the renderTable
    integration). All 515 tests pass.
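
The switch case routing fix listed above is easier to see with a small example. The sketch below is illustrative only: routeBroken, routeFixed, SET_ROUTE, the string return values, and the 405 handling are all hypothetical and are not taken from the project's web server. It only shows why a case label containing ":name" can never match a real request path, and how a regex check placed before the switch avoids the problem.

```javascript
// Hypothetical sketch; none of these names come from the project source.

// Broken shape: switch uses strict equality, so the pattern string below
// only matches a request whose path is literally ".../:name/reorder".
function routeBroken(method, path) {
  switch (path) {
    case "/api/router/catalog":
      return "catalog";
    case "/api/router/sets/:name/reorder": // never equals a real path
      return "reorder";
    default:
      return "404";
  }
}

// Fixed shape: parameterized routes are tested with a regex first,
// and only exact-string routes are left inside the switch.
const SET_ROUTE =
  /^\/api\/router\/sets\/([^/]+)\/(models|reorder|activate|sync)$/;

function routeFixed(method, path) {
  const match = SET_ROUTE.exec(path);
  if (match) {
    const [, name, action] = match;
    // Method table follows the release notes: models is POST/DELETE,
    // the other actions are POST. Returning "405" is an assumption.
    const allowed = action === "models" ? ["POST", "DELETE"] : ["POST"];
    if (!allowed.includes(method)) return "405";
    return `${action}:${decodeURIComponent(name)}`;
  }
  switch (path) {
    case "/api/router/catalog":
      return method === "GET" ? "catalog" : "405";
    default:
      return "404";
  }
}

console.log(routeBroken("POST", "/api/router/sets/default/reorder")); // 404
console.log(routeFixed("POST", "/api/router/sets/default/reorder")); // reorder:default
```

Keeping exact-string routes in the switch and moving only the parameterized ones in front of it keeps the change small, which matches the "moved above the switch" description.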

Notes

  • The TUI's --sync-set flag is unchanged — the Web "Sync best" button
    is the same pipeline with a smaller maxProbes so the UI stays
    snappy. CLI users still get the full 50-probe budget.
  • The fcm virtual model that the Playground uses by default
    automatically picks up the new working-models set, so the
    Playground will start returning successful responses as soon as the
    Web user clicks "Sync best" once.
  • The auto-heal is best-effort: if the user has only one working
    provider, the healed set will shrink to that one provider's top
    model. That's still better than the previous behavior of showing
    three models that all 401.
  • The new broken-model banner is intentionally subtle (amber, not red)
    because the auto-heal already does its best to recover. It's there
    for the "user has 0 working keys" edge case so the user can click
    through to "Fix now" / "Sync best" and either get a working set or
    see the toast explaining the situation.
  • The cursor row is still faded if the model is unusable — the user's
    request was "the WHOLE line at 80% opacity" and we honor that
    literally. The cursor highlight (blue background) gets its colors
    multiplied by 0.8 too, so it remains visible but reads as "dimmed",
    consistent with the rest of the line.
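
To make the "multiply every channel by 0.8" behavior concrete, here is a minimal sketch of a fadedRow-style helper. It is not the code in src/tui/render-helpers.js: the name fadedRowSketch, the regexes, and the choice to round with Math.round are assumptions. It follows only the contract stated in these notes (truecolor channels scaled, identity fast-path for factor >= 1, channels clamped to 0–255, other SGR codes passed through).

```javascript
// Hypothetical sketch of a fadedRow-style helper (not the shipped code).
function fadedRowSketch(input, factor = 0.8) {
  if (factor >= 1) return input; // identity fast-path

  // Scale one channel and clamp to 0..255. Rounding is an assumption.
  const scale = (channel) =>
    Math.max(0, Math.min(255, Math.round(Number(channel) * factor)));

  // Visit each SGR sequence (ESC [ params m) and rewrite only the
  // truecolor parts: 38;2;R;G;B (foreground) and 48;2;R;G;B (background).
  return input.replace(/\x1b\[([0-9;]*)m/g, (sequence, params) => {
    const faded = params.replace(
      /(^|;)([34]8;2);(\d+);(\d+);(\d+)/g,
      (_, lead, kind, r, g, b) =>
        `${lead}${kind};${scale(r)};${scale(g)};${scale(b)}`
    );
    return `\x1b[${faded}m`;
  });
}

// Bold (SGR 1) and reset (SGR 0) pass through; only RGB values change.
const row = "\x1b[1m\x1b[38;2;255;100;0mgpt-x\x1b[0m";
console.log(JSON.stringify(fadedRowSketch(row)));
```

Because the background sequence (48;2;…) is scaled the same way as the foreground, a highlighted cursor row dims along with its text, which is the behavior the last note describes.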

How the auto-heal picker works (M6)

For each broken model in the active set:
  1. Same provider — pick a working model of the same provider
     (skipping models that the circuit breaker already knows are broken).
  2. Cross-provider — fall through to any working model across all
     providers, sorted by static tier + swe score.
  3. If neither yields a working model, leave the broken entry in
     place and log a warning (the Web UI surfaces this in the
     "models not responding" banner).
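
The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the daemon's autoHealActiveSet(): the function names, the entry shape { provider, model, state, tier, swe }, the "OK" state label, the numeric tier (lower is better), and the rule that a model already in the set is not picked again are all hypothetical. The notes only specify tier + swe sorting for the cross-provider step; applying the same order within a provider is also an assumption.

```javascript
// Hypothetical sketch of the picker; shapes and names are assumptions.
const BROKEN_STATES = new Set(["AUTH_ERROR", "STALE"]);
const keyOf = (m) => `${m.provider}/${m.model}`;

function pickReplacement(brokenEntry, candidates, activeKeys) {
  // Working candidates not already in the set, best tier first,
  // then higher swe score first.
  const usable = candidates
    .filter((c) => !BROKEN_STATES.has(c.state) && !activeKeys.has(keyOf(c)))
    .sort((a, b) => a.tier - b.tier || b.swe - a.swe);

  return (
    usable.find((c) => c.provider === brokenEntry.provider) || // 1. same provider
    usable[0] || // 2. cross-provider
    null // 3. nothing works
  );
}

function healSet(activeSet, candidates, warn = console.warn) {
  const activeKeys = new Set(activeSet.map(keyOf));
  return activeSet.map((entry) => {
    if (!BROKEN_STATES.has(entry.state)) return entry;
    const replacement = pickReplacement(entry, candidates, activeKeys);
    if (!replacement) {
      // Step 3: keep the broken entry in place and log a warning.
      warn(`no working replacement for ${keyOf(entry)}`);
      return entry;
    }
    activeKeys.add(keyOf(replacement));
    return replacement;
  });
}

const healed = healSet(
  [{ provider: "a", model: "a1", state: "AUTH_ERROR", tier: 1, swe: 60 }],
  [
    { provider: "b", model: "b1", state: "OK", tier: 1, swe: 70 },
    { provider: "a", model: "a2", state: "OK", tier: 2, swe: 50 },
  ]
);
console.log(healed.map(keyOf)); // [ 'a/a2' ]
```

In the usage example the same-provider model a2 wins even though b1 has a better tier, since step 1 is tried before step 2.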