GemX-v0.3.0
v0.3.0 Release Notes
The follow-up to v0.2.0 opens the model registry: you can now bring your own MLX model from HuggingFace through a new + Custom flow, with a three-layer safety net that prevents the silent crashes that runtime mismatches used to cause. The Settings panel collapses to a single Downloaded Models surface for both built-ins and customs, the assistant gets an animated avatar during thinking and inference, and a round of UX polish lands across the picker, modal, and composer.
Bring Your Own Model — Custom HuggingFace Models
- + Custom button next to the model picker accepts any `mlx-vlm` or `mlx-lm` model from the `mlx-community/*` HuggingFace organisation. Paste a repo id, GemX probes HuggingFace for sane defaults, and the model joins your picker under a Custom subheader (label + runtime).
- `mlx-community/*` gate. Both the modal (client-side) and the main-process probe reject other namespaces up-front with a link to `huggingface.co/mlx-community/collections`. Models from other orgs often aren't MLX-quantised and would fail at load.
- Auto-detect runtime, context window, thinking support. GemX fetches `config.json` + `chat_template.jinja` from `huggingface.co/<repo>/resolve/main/…`. Runtime comes from `model_type` + `architectures` (a known multimodal allow-list + a VL/Vision/Image arch-string check); context max from `max_position_embeddings` with `model_max_length` / `n_positions` fallbacks; thinking from `enable_thinking` in either chat-template source.
- Lazy `mlx-lm` install. Built-ins all run on `mlx-vlm`, so users who never add a text-only custom model don't pay the install cost. The first time you select an `mlx-lm` custom, GemX `pip install`s `mlx-lm` into the existing venv (~30 s) and then spawns `mlx_lm.server` instead of `mlx_vlm.server`. Subsequent boots skip the install.
- HuggingFace token re-used. Gated mlx-community repos pick up the token you've already saved in Settings; no per-model auth surface needed.
- Default context clamps to 16 K — even if a repo reports 256 K, the default lands conservative so first-launch RAM stays modest on 8 GB Macs. You can raise it any time from Settings → Context Window.
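The probe described in this section can be approximated as a pure function over the two fetched files. This is a minimal sketch under stated assumptions, not GemX's actual code: `isAllowedRepo`, `probeDefaults`, `ProbeResult`, the entries in `MULTIMODAL_MODEL_TYPES`, and the 8 K fallback are hypothetical. Only the `config.json` field names, the `mlx-community/*` gate, and the 16 K default clamp come from these notes.

```typescript
// Hypothetical sketch: helper names and the allow-list contents are illustrative only.
type Runtime = "mlx-vlm" | "mlx-lm";

interface HfConfig {
  model_type?: string;
  architectures?: string[];
  max_position_embeddings?: number;
  model_max_length?: number;
  n_positions?: number;
}

interface ProbeResult {
  runtime: Runtime;
  contextWindowMax: number;
  contextWindowDefault: number;
  thinkingSupported: boolean;
}

// Assumed entries; the real allow-list is not given in these notes.
const MULTIMODAL_MODEL_TYPES = new Set<string>(["gemma3", "qwen2_vl", "llava"]);
const DEFAULT_CONTEXT_CAP = 16 * 1024; // the 16 K default clamp
const FALLBACK_CONTEXT_MAX = 8 * 1024; // assumption: used when no length field is present

// Namespace gate: only "mlx-community/<name>" repo ids pass.
function isAllowedRepo(repoId: string): boolean {
  return /^mlx-community\/[^/\s]+$/.test(repoId);
}

function probeDefaults(config: HfConfig, chatTemplate: string): ProbeResult {
  const archs = config.architectures ?? [];
  // Vision runtime if the model type is allow-listed or an architecture
  // name contains VL / Vision / Image.
  const looksVisual =
    MULTIMODAL_MODEL_TYPES.has(config.model_type ?? "") ||
    archs.some((a) => /VL|Vision|Image/.test(a));
  // Context max: first available of the three length fields.
  const contextWindowMax =
    config.max_position_embeddings ??
    config.model_max_length ??
    config.n_positions ??
    FALLBACK_CONTEXT_MAX;
  return {
    runtime: looksVisual ? "mlx-vlm" : "mlx-lm",
    contextWindowMax,
    contextWindowDefault: Math.min(contextWindowMax, DEFAULT_CONTEXT_CAP),
    thinkingSupported: chatTemplate.includes("enable_thinking"),
  };
}
```

Under these assumptions, a repo that reports a 256 K `max_position_embeddings` keeps 262144 as its max while the default lands at 16384.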
Locked Runtime + Immutable Settings at Add Time
- Runtime is auto-detected and locked. The modal shows it as a readonly chip with an "Auto-detected from `config.json` — Locked at add time" subtitle. A wrong runtime is the single biggest source of silent server crashes, so the override toggle is gone. If auto-detect was wrong, you remove and re-add.
- All settings become immutable after Add. A prominent amber banner at the top of the modal makes this clear: "These settings are locked after Add. To fix a mistake, remove the model from Settings → Downloaded Models and re-add. Double-check every field before clicking Add model." There is no edit UI for custom models; remove + re-add is the only path.
- Other probed fields stay user-editable at add time. Label, max context window, default context window, and thinking-support are all overrideable because `max_position_embeddings` in `config.json` is wrong for several real repos and the chat-template heuristic can miss thinking on unusual templates.
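One way to picture the "locked after Add" rule is a registry that exposes add and remove but no update. The sketch below is an assumption-laden illustration: `CustomModelRegistry` and the exact field set of `CustomModel` are hypothetical, and the real shape is the one documented in `docs/data-model.md`.

```typescript
// Hypothetical sketch: field names beyond those mentioned in the notes are assumptions.
type Runtime = "mlx-vlm" | "mlx-lm";

interface CustomModel {
  readonly name: string; // repo id, used here as the registry key
  readonly label: string;
  readonly runtime: Runtime;
  readonly contextWindowMax: number;
  readonly contextWindowDefault: number;
  readonly thinkingSupported: boolean;
}

class CustomModelRegistry {
  private readonly models = new Map<string, CustomModel>();

  // Add is the only write path; the stored entry is frozen.
  add(model: CustomModel): void {
    if (this.models.has(model.name)) {
      throw new Error(`${model.name} is already registered; remove it first`);
    }
    this.models.set(model.name, Object.freeze({ ...model }));
  }

  // Fixing a mistake means removing the entry and adding it again.
  remove(name: string): boolean {
    return this.models.delete(name);
  }

  get(name: string): CustomModel | undefined {
    return this.models.get(name);
  }
}
```

Leaving out an update method makes the remove + re-add path the only way to change a stored entry, which mirrors the behaviour the banner describes.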
Runtime Mismatch Safety Net
- Composer blocks image attach for text-only models. When the active model's runtime is `mlx-lm`, drag-and-drop, paste, and the file picker all filter out `image/*` files and surface a "This model is text-only — images can't be attached." toast. PDFs, code, audio, and other context attachments continue to work.
- Boot-time mismatch detection. A new `RuntimeMismatchError` (mirroring the existing `MLXVersionMismatchError` pattern) is thrown by `waitForHealth` when the server's stderr matches a vision-vs-text architecture failure: direction-aware regexes for `vision_tower` / `image_token_index` / `preprocessor_config.json` (vlm-on-text) and `vision encoder` / `multimodal not supported` (lm-on-vision). For custom models, the caller surfaces an action-oriented advisory: "{label} appears to be text-only, but was added as mlx-vlm. Remove from Settings → Downloaded Models and re-add."
- Stream-time mismatch detection. `chatStream` wraps its `!res.ok` branch: if `image_url` parts were in the request body and the response body mentions image / multimodal rejection, the error is upgraded to `RuntimeMismatchError` and rendered as an error bubble on the offending message. This catches the case where a text-only server loaded fine but rejects images at request time.
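The direction-aware stderr check in the boot-time layer could look roughly like the following. This is a sketch under assumptions: the regexes are built only from the markers listed in these notes, and `detectMismatch`, `MismatchDirection`, and the error's `direction` field are hypothetical names rather than GemX's real API.

```typescript
// Hypothetical sketch: regexes and helper names are assumptions for illustration.
type Runtime = "mlx-vlm" | "mlx-lm";
type MismatchDirection = "vlm-on-text" | "lm-on-vision";

class RuntimeMismatchError extends Error {
  readonly direction: MismatchDirection;
  constructor(direction: MismatchDirection, message: string) {
    super(message);
    this.name = "RuntimeMismatchError";
    this.direction = direction;
  }
}

// Markers that suggest a vision server was pointed at a text-only model.
const VLM_ON_TEXT = /vision_tower|image_token_index|preprocessor_config\.json/i;
// Markers that suggest a text server was pointed at a vision model.
const LM_ON_VISION = /vision encoder|multimodal not supported/i;

// Only the pattern matching the configured runtime is consulted, so a
// stray marker for the opposite direction is not reported as a mismatch.
function detectMismatch(stderr: string, configured: Runtime): MismatchDirection | null {
  if (configured === "mlx-vlm" && VLM_ON_TEXT.test(stderr)) return "vlm-on-text";
  if (configured === "mlx-lm" && LM_ON_VISION.test(stderr)) return "lm-on-vision";
  return null;
}

// Throws when the captured stderr indicates a runtime mismatch.
function assertRuntimeMatches(stderr: string, configured: Runtime): void {
  const direction = detectMismatch(stderr, configured);
  if (direction !== null) {
    throw new RuntimeMismatchError(direction, `runtime mismatch (${direction})`);
  }
}
```

Keying the check on the configured runtime is what makes it direction-aware: the same stderr line yields a different verdict depending on which server was spawned.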
Unified Settings: One Downloaded Models Surface
- Custom and cached built-in models are merged into a single Downloaded Models list; there is no longer a separate "Custom Models" panel. Per-row layout is consistent for both: label first, `custom` / `active` / `default` tags after, repo id underneath for customs. One trash icon, one `window.confirm` confirmation, one IPC path.
- Custom-model deletion removes both the registry entry and the cached weights, so the model vanishes from the picker and the disk in a single click.
- Active model can't be deleted — the trash icon is disabled with the same "Switch to another model first" tooltip that built-ins use.
- Thinking toggle gated on the active model. When the active custom has `thinkingSupported === false`, the Thinking buttons in Settings disable with a "This model doesn't expose a reasoning channel — toggle disabled." subtext. Built-ins (Gemma 4 family) all support thinking, so it stays enabled for them.
- Context Window slider clamps to the active model's `contextWindowMax`. The `steps` filter respects the custom's declared ceiling, with a `Math.min(effectiveCtx, maxCtx)` belt-and-suspenders clamp for legacy `contextWindowOverride` values that exceeded a since-lowered Max.
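The slider clamp can be read as two independent guards: filter the offered steps by the model's ceiling, then clamp whatever value is already stored. The sketch below assumes power-of-2 steps from 1 K to 1 M; `sliderSteps` and `effectiveContext` are hypothetical helper names, while `contextWindowOverride` and the `Math.min` clamp come from these notes.

```typescript
// Hypothetical sketch: helper names and the step table are assumptions.

// Power-of-2 steps from 1 K (1024) to 1 M (1048576) tokens.
const ALL_STEPS: number[] = Array.from({ length: 11 }, (_, i) => 1024 * 2 ** i);

// First guard: only offer steps at or below the model's declared ceiling.
function sliderSteps(maxCtx: number): number[] {
  return ALL_STEPS.filter((s) => s <= maxCtx);
}

// Second guard: a stored override may predate a lowered ceiling, so clamp it.
function effectiveContext(
  contextWindowOverride: number | undefined,
  defaultCtx: number,
  maxCtx: number
): number {
  const effectiveCtx = contextWindowOverride ?? defaultCtx;
  return Math.min(effectiveCtx, maxCtx);
}
```

The second guard matters only for legacy data: a value chosen through the filtered slider can never exceed the ceiling, but one saved before the ceiling was lowered can.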
Composer & Modal UX Polish
- Animated assistant avatar during thinking and inference. A new `AssistantAvatar` component renders an animated logomark in the assistant bubble while the model is generating, replacing the static placeholder.
- Mic icon visual parity with the attach icon. The mic SVG was visibly slimmer than the paperclip at the same `h-4 w-4` box because its path was thinner. The stroke weight was raised and the rect + arc footprint widened so both icons read at the same visual weight in the toolbar.
- File-count badge removed entirely. The previous emerald-500 notification dot on the paperclip read like a phone-notification badge; a brief inline-chip replacement still felt redundant. Now the paperclip button stays clean and the attached files render as cards above the textarea (image thumbnails + file-icon-plus-name chips with ×), the same pattern Claude and Gemini use.
- Custom-model context-window inputs match Settings. The modal's Max-context and Default-context fields used to be plain `<input type="number">` controls with browser-native spinners. They now use the same display-box + vertical chevron stepper as Settings → Context Window, stepping through the power-of-2 sequence (1 K → 1 M) with disabled boundaries.
- Custom subheader in the model picker no longer doubles as a per-row tag. The `custom` chip on each picker row was redundant and is gone, since the Custom section header already labels them. Picker rows now show label + runtime, period.
- Description field removed. The custom-model add modal collected a free-text description that nothing in the app ever rendered. The field is dropped from the modal and from the `CustomModel` type. Legacy entries with stored descriptions still parse fine; the value is silently ignored.
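The chevron stepper's behaviour at its boundaries can be sketched as doubling and halving between fixed limits. This is an illustrative assumption about the logic, not the component's real code: `stepperState` and `step` are hypothetical names, and the sketch assumes the current value is already a power of 2 within the 1 K to 1 M range.

```typescript
// Hypothetical sketch: function names are assumptions; bounds follow the 1 K to 1 M range.
const MIN_CTX = 1024; // 1 K
const MAX_CTX = 1024 * 1024; // 1 M

interface StepperState {
  value: number;
  canDecrement: boolean; // false at the lower boundary (chevron disabled)
  canIncrement: boolean; // false at the upper boundary (chevron disabled)
}

function stepperState(value: number): StepperState {
  return {
    value,
    canDecrement: value > MIN_CTX,
    canIncrement: value < MAX_CTX,
  };
}

// Moves one position along the power-of-2 sequence; a disabled
// direction leaves the value unchanged.
function step(value: number, direction: "up" | "down"): number {
  const state = stepperState(value);
  if (direction === "up") return state.canIncrement ? value * 2 : value;
  return state.canDecrement ? value / 2 : value;
}
```

Deriving the disabled flags from the value keeps the rendering and the click handler in agreement about where the sequence ends.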
Documentation & Versioning
- README documents the entire + Custom flow, the locked-runtime + immutable-settings model, image-block behaviour for `mlx-lm`, and where runtime-mismatch advisories surface.
- `docs/data-model.md` documents the updated `CustomModel` shape, runtime-lock rationale, and the three-layer safety net.
- `docs/mlx-runtime.md` adds a new "Runtime-mismatch detection" section walking through Composer block + boot-time `RuntimeMismatchError` + `chatStream` upgrade with the file references.
- `docs/ipc-contract.md` documents the new `probeCustomModel` / `addCustomModel` / `removeCustomModel(name, alsoDeleteCache?)` IPC surface and the advisory-defaults model.
- Inspiration credit. Ammaar Reshi's gemma-chat, a beautiful single-purpose Gemma desktop client, is now credited in the README as the visual / structural inspiration that started GemX.
Please see the README in the main repository for complete installation instructions, system requirements, updated model memory specifications, and the new Bring your own model walkthrough.