GemX-v0.3.0

Avaneesh40585 released this 10 Jun 12:29

v0.3.0 Release Notes

The follow-up to v0.2.0 opens the model registry: you can now bring your own MLX model from HuggingFace through a new + Custom flow, with a three-layer safety net that prevents the silent crashes that runtime mismatches used to cause. The Settings panel collapses to a single Downloaded Models surface for both built-ins and customs, the assistant gets an animated avatar during thinking and inference, and a round of UX polish lands across the picker, modal, and composer.

Bring Your Own Model — Custom HuggingFace Models

  • + Custom button next to the model picker accepts any mlx-vlm or mlx-lm model from the mlx-community/* HuggingFace organisation. Paste a repo id, GemX probes HuggingFace for sane defaults, and the model joins your picker under a Custom subheader (label + runtime).
  • mlx-community/* gate. Both the modal (client-side) and the main-process probe reject other namespaces up front with a link to huggingface.co/mlx-community/collections, because models from other orgs often aren't MLX-quantised and would fail at load.
  • Auto-detect runtime, context window, thinking support. GemX fetches config.json + chat_template.jinja from huggingface.co/<repo>/resolve/main/… and derives three defaults from them. Runtime comes from model_type + architectures (a known multimodal allow-list + a VL/Vision/Image arch-string check); context max comes from max_position_embeddings, with model_max_length / n_positions as fallbacks; thinking support comes from the presence of enable_thinking in either chat-template source.
  • Lazy mlx-lm install — built-ins all run on mlx-vlm, so users who never add a text-only custom model don't pay the install cost. The first time you select an mlx-lm custom, GemX pip installs mlx-lm into the existing venv (~30 s) and then spawns mlx_lm.server instead of mlx_vlm.server. Subsequent boots skip the install.
  • HuggingFace token re-used — gated mlx-community repos pick up the token you've already saved in Settings; no per-model auth surface needed.
  • Default context clamps to 16 K — even if a repo reports 256 K, the default lands conservative so first-launch RAM stays modest on 8 GB Macs. You can raise it any time from Settings → Context Window.
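
The probe heuristics above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not GemX's actual code: the function names, the HfConfig shape, and the two allow-list entries are hypothetical, and only the field names (model_type, architectures, max_position_embeddings, model_max_length, n_positions, enable_thinking) and the 16 K default cap come from these notes.

```typescript
// Hypothetical sketch of the add-time probe defaults; all names are illustrative.
type Runtime = "mlx-vlm" | "mlx-lm";

// Subset of config.json fields that the release notes say the probe reads.
interface HfConfig {
  model_type?: string;
  architectures?: string[];
  max_position_embeddings?: number;
  model_max_length?: number;
  n_positions?: number;
}

const ALLOWED_PREFIX = "mlx-community/";
const DEFAULT_CONTEXT_CAP = 16 * 1024; // the 16 K default clamp

// Assumed example entries only; the real allow-list is not given in these notes.
const MULTIMODAL_MODEL_TYPES = new Set<string>(["llava", "qwen2_vl"]);

// Namespace gate: only repos under mlx-community/ pass.
function isAllowedRepo(repoId: string): boolean {
  return repoId.startsWith(ALLOWED_PREFIX) && repoId.length > ALLOWED_PREFIX.length;
}

// Allow-list on model_type first, then a VL/Vision/Image check on the arch strings.
function detectRuntime(cfg: HfConfig): Runtime {
  if (cfg.model_type !== undefined && MULTIMODAL_MODEL_TYPES.has(cfg.model_type)) {
    return "mlx-vlm";
  }
  const archs = cfg.architectures ?? [];
  return archs.some((a) => /VL|Vision|Image/.test(a)) ? "mlx-vlm" : "mlx-lm";
}

// max_position_embeddings, falling back to model_max_length, then n_positions.
function detectContextMax(cfg: HfConfig): number | undefined {
  return cfg.max_position_embeddings ?? cfg.model_max_length ?? cfg.n_positions;
}

// Thinking support is inferred from enable_thinking appearing in a chat template.
function detectThinking(chatTemplate: string): boolean {
  return chatTemplate.includes("enable_thinking");
}

// The default context never exceeds 16 K, whatever the repo reports.
function defaultContext(contextMax: number | undefined): number {
  return Math.min(contextMax ?? DEFAULT_CONTEXT_CAP, DEFAULT_CONTEXT_CAP);
}
```

Under these assumptions, a repo that reports a 256 K maximum still gets a 16 K default, while a repo with an 8 K maximum keeps 8 K.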

Locked Runtime + Immutable Settings at Add Time

  • Runtime is auto-detected and locked. The modal shows it as a readonly chip with an "Auto-detected from config.json — Locked at add time" subtitle. A wrong runtime is the single biggest source of silent server crashes, so the override toggle is gone. If auto-detect was wrong, you remove and re-add.
  • All settings become immutable after Add. A prominent amber banner at the top of the modal makes this clear: "These settings are locked after Add. To fix a mistake, remove the model from Settings → Downloaded Models and re-add. Double-check every field before clicking Add model." There is no edit UI for custom models — remove + re-add is the only path.
  • Other probed fields stay user-editable at add time — label, max context window, default context window, and thinking-support are all overrideable because config.json max_position_embeddings is wrong for several real repos and the chat-template heuristic can miss thinking on unusual templates.
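
One way to picture the locked-runtime rule together with the editable fields is a merge step that applies user overrides to the probed defaults and then freezes the result. The sketch below is an assumption-laden illustration: ProbeResult, Overrides, and finalizeCustomModel are hypothetical names, and the field names beyond contextWindowMax and thinkingSupported are guesses rather than the real CustomModel shape.

```typescript
// Hypothetical sketch: probed values act as advisory defaults, runtime stays locked.
type Runtime = "mlx-vlm" | "mlx-lm";

interface ProbeResult {
  repoId: string;
  runtime: Runtime;
  label: string;
  contextWindowMax: number;
  contextWindowDefault: number; // assumed field name
  thinkingSupported: boolean;
}

// Only these four fields are user-editable at add time.
type Overrides = Partial<
  Pick<ProbeResult, "label" | "contextWindowMax" | "contextWindowDefault" | "thinkingSupported">
>;

type CustomModel = Readonly<ProbeResult>;

function finalizeCustomModel(probe: ProbeResult, overrides: Overrides): CustomModel {
  // Re-assert repoId and runtime last so no override can replace them.
  const merged = { ...probe, ...overrides, repoId: probe.repoId, runtime: probe.runtime };
  // Keep the default inside the (possibly user-edited) maximum.
  merged.contextWindowDefault = Math.min(merged.contextWindowDefault, merged.contextWindowMax);
  // Freeze to mirror "no edit UI after Add": changes mean remove + re-add.
  return Object.freeze(merged);
}
```

Freezing the object is only a stand-in for the product rule. The release notes describe immutability as a UI and registry decision, not a specific language mechanism.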

Runtime Mismatch Safety Net

  • Composer blocks image attach for text-only models. When the active model's runtime is mlx-lm, drag-and-drop, paste, and the file picker all filter out image/* files and surface a "This model is text-only — images can't be attached." toast. PDFs, code, audio, and other context attachments continue to work.
  • Boot-time mismatch detection. A new RuntimeMismatchError (mirroring the existing MLXVersionMismatchError pattern) is thrown by waitForHealth when the server's stderr matches a vision-vs-text architecture failure. The regexes are direction-aware: vision_tower / image_token_index / preprocessor_config.json signal a text-only model started under mlx-vlm (vlm-on-text), while vision encoder / multimodal not supported signal a vision model started under mlx-lm (lm-on-vision). For custom models, the caller surfaces an action-oriented advisory: "{label} appears to be text-only, but was added as mlx-vlm. Remove from Settings → Downloaded Models and re-add."
  • Stream-time mismatch detection. chatStream wraps its !res.ok branch: if image_url parts were in the request body and the response body mentions image / multimodal rejection, the error is upgraded to RuntimeMismatchError and rendered as an error bubble on the offending message. Catches the case where a text-only server loaded fine but rejects images at request time.
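
The three layers above might look something like the following. Treat it as a sketch under assumptions: the regex alternatives are taken from the patterns named in these notes, but the helper names, the Attachment shape, the error fields, and the decision to key each regex on the configured runtime are hypothetical and may differ from what waitForHealth and chatStream actually do.

```typescript
// Hypothetical sketch of the three-layer runtime-mismatch safety net.
type Runtime = "mlx-vlm" | "mlx-lm";

class RuntimeMismatchError extends Error {
  configured: Runtime;
  constructor(configured: Runtime, detail: string) {
    super(`Runtime mismatch (${configured}): ${detail}`);
    this.name = "RuntimeMismatchError";
    this.configured = configured;
  }
}

// Layer 1: the composer drops image/* attachments when the runtime is mlx-lm.
interface Attachment {
  name: string;
  mimeType: string;
}

function filterAttachments(
  runtime: Runtime,
  files: Attachment[],
): { accepted: Attachment[]; blockedImages: number } {
  if (runtime !== "mlx-lm") return { accepted: files, blockedImages: 0 };
  const accepted = files.filter((f) => !f.mimeType.startsWith("image/"));
  return { accepted, blockedImages: files.length - accepted.length };
}

// Layer 2: direction-aware stderr checks at boot time.
const VLM_ON_TEXT = /vision_tower|image_token_index|preprocessor_config\.json/i;
const LM_ON_VISION = /vision encoder|multimodal not supported/i;

function detectBootMismatch(configured: Runtime, stderr: string): RuntimeMismatchError | null {
  if (configured === "mlx-vlm" && VLM_ON_TEXT.test(stderr)) {
    return new RuntimeMismatchError(configured, "model appears to be text-only");
  }
  if (configured === "mlx-lm" && LM_ON_VISION.test(stderr)) {
    return new RuntimeMismatchError(configured, "model appears to be multimodal");
  }
  return null;
}

// Layer 3: upgrade a failed request only if it carried image parts
// and the response body mentions an image / multimodal rejection.
function shouldUpgradeStreamError(hadImageParts: boolean, responseBody: string): boolean {
  return hadImageParts && /image|multimodal/i.test(responseBody);
}
```

A caller could show a toast when blockedImages is greater than zero, and turn a non-null detectBootMismatch result into the "remove and re-add" advisory for custom models.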

Unified Settings: One Downloaded Models Surface

  • Custom and cached built-in models merged into a single Downloaded Models list — no more separate "Custom Models" panel. Per-row layout is consistent for both: label first, custom / active / default tags after, repo id underneath for customs. One trash icon, one window.confirm confirmation, one IPC path.
  • Custom-model deletion removes both the registry entry and the cached weights, so the model vanishes from the picker and the disk in a single click.
  • Active model can't be deleted — the trash icon is disabled with the same "Switch to another model first" tooltip that built-ins use.
  • Thinking toggle gated on the active model. When the active custom has thinkingSupported === false, the Thinking buttons in Settings disable with a "This model doesn't expose a reasoning channel — toggle disabled." subtext. Built-ins (Gemma 4 family) all support thinking, so it stays enabled for them.
  • Context Window slider clamps to the active model's contextWindowMax — the steps filter respects the custom's declared ceiling, with a Math.min(effectiveCtx, maxCtx) belt-and-suspenders clamp for legacy contextWindowOverride values that exceeded a since-lowered Max.
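
The slider clamp described in the last bullet can be sketched as a step filter plus a final Math.min. This is an illustration, not the app's code: the function names are hypothetical, and the power-of-2 step list from 1 K to 1 M is assumed to match the sequence mentioned for the stepper elsewhere in these notes.

```typescript
// Hypothetical sketch: filter slider steps by the model's ceiling, then clamp.

// Build 1 K, 2 K, 4 K, ... up to 1 M (assumed step sequence).
function buildSteps(): number[] {
  const steps: number[] = [];
  for (let v = 1024; v <= 1024 * 1024; v *= 2) {
    steps.push(v);
  }
  return steps;
}

// Only offer steps at or below the active model's contextWindowMax.
function sliderSteps(contextWindowMax: number): number[] {
  return buildSteps().filter((s) => s <= contextWindowMax);
}

// A stored override may predate a lowered Max, so clamp once more at read time.
function resolveContext(
  defaultCtx: number,
  contextWindowOverride: number | undefined,
  maxCtx: number,
): number {
  const effectiveCtx = contextWindowOverride ?? defaultCtx;
  return Math.min(effectiveCtx, maxCtx);
}
```

For example, a legacy override of 128 K on a model whose Max was later lowered to 32 K resolves to 32 K instead of being passed to the server unchanged.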

Composer & Modal UX Polish

  • Animated assistant avatar during thinking and inference — a new AssistantAvatar component renders an animated logomark in the assistant bubble while the model is generating, replacing the static placeholder.
  • Mic icon visual parity with the attach icon — the mic SVG was visibly slimmer than the paperclip at the same h-4 w-4 box because its path was thinner. Stroke weight raised and the rect + arc footprint widened so both icons read at the same visual weight in the toolbar.
  • File-count badge removed entirely. The previous emerald-500 notification dot on the paperclip read like a phone-notification badge; a brief inline-chip replacement still felt redundant. Now the paperclip button stays clean and the attached files render as cards above the textarea (image thumbnails + file-icon-plus-name chips with ×) — same pattern Claude and Gemini use.
  • Custom-model context-window inputs match Settings. The modal's Max-context and Default-context fields used to be plain <input type="number"> controls with browser-native spinners. They now use the same display-box + vertical chevron stepper as Settings → Context Window, stepping through the power-of-2 sequence (1 K → 1 M) with disabled boundaries.
  • Custom subheader in the model picker no longer doubles as a per-row tag. The custom chip on each picker row was redundant and is gone — the Custom section header already labels them. Picker rows now show label + runtime, period.
  • Description field removed. The custom-model add modal collected a free-text description that nothing in the app ever rendered. Field dropped from the modal and from the CustomModel type. Legacy entries with stored descriptions still parse fine — the value is silently ignored.
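
The chevron stepper for the Max-context and Default-context fields could be modelled as below. This is a sketch under assumptions: the function names and the "16 K" label format are hypothetical, and the input value is assumed to already be a power of 2 between 1 K and 1 M.

```typescript
// Hypothetical sketch of a power-of-2 stepper with disabled boundaries.
const MIN_CONTEXT = 1024; // 1 K
const MAX_CONTEXT = 1024 * 1024; // 1 M

type Direction = "up" | "down";

// Double or halve the value, never leaving the 1 K .. 1 M range.
function stepContext(value: number, direction: Direction): number {
  const next = direction === "up" ? value * 2 : value / 2;
  return Math.min(Math.max(next, MIN_CONTEXT), MAX_CONTEXT);
}

// A chevron is disabled once the value sits on the matching boundary.
function chevronState(value: number): { upDisabled: boolean; downDisabled: boolean } {
  return { upDisabled: value >= MAX_CONTEXT, downDisabled: value <= MIN_CONTEXT };
}

// Display-box label such as "16 K" or "1 M".
function formatContext(value: number): string {
  return value >= MAX_CONTEXT ? `${value / MAX_CONTEXT} M` : `${value / 1024} K`;
}
```

Clamping inside stepContext as well as disabling the chevrons is a deliberate redundancy here, so a stray click or keyboard event at a boundary leaves the value unchanged.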

Documentation & Versioning

  • README documents the entire + Custom flow, the locked-runtime + immutable-settings model, image-block behaviour for mlx-lm, and where runtime-mismatch advisories surface.
  • docs/data-model.md documents the updated CustomModel shape, runtime-lock rationale, and the three-layer safety net.
  • docs/mlx-runtime.md adds a new "Runtime-mismatch detection" section walking through Composer block + boot-time RuntimeMismatchError + chatStream upgrade with the file references.
  • docs/ipc-contract.md documents the new probeCustomModel / addCustomModel / removeCustomModel(name, alsoDeleteCache?) IPC surface and the advisory-defaults model.
  • Inspiration credit. Ammaar Reshi's gemma-chat — a beautiful single-purpose Gemma desktop client — is now credited in the README as the visual / structural inspiration that started GemX.

Please see the README in the main repository for complete installation instructions, system requirements, updated model memory specifications, and the new Bring your own model walkthrough.