Releases · gianlucamazza/xllama


v0.3.0 — UX/UI completion (Latest)

gianlucamazza released this 22 May 23:53

v0.3.0

6593735

What's new

Bug fixes

Newline rendering: AppendOutput inserts LineBreak inlines instead of collapsing \n to space in RichTextBlock.
Prompt cleared after Run: TextBox emptied as soon as Run is pressed.
NewChat clears prompt: NewChat() also resets the TextBox.
Double FocusEngagement removed: removed from ScrollViewer; one A-press now opens the keyboard on Xbox.
Focus returns after generation: cursor moves back to TextBox when inference completes.
Smart autoscroll: suppressed while user scrolled up; resumes when back at bottom.
Status "Loading model…" at startup: Run disabled until model is confirmed loaded.
Context trim aligned to n_ctx: threshold lowered 3500 → 1800; trim surfaces a status message.
Partial save on cancel: assistant message saved with partial=true when user presses Cancel.
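The newline fix can be pictured as splitting each output chunk into text runs and explicit line breaks, so that no `\n` is collapsed into a space. The following is a minimal sketch in plain C++ without any XAML types. `Inline`, `InlineKind`, and `SplitIntoInlines` are hypothetical names used for illustration and are not taken from the xllama source. In the app, the two kinds would presumably map to Run and LineBreak inlines.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-ins for XAML Run / LineBreak inlines.
enum class InlineKind { Text, LineBreak };

struct Inline {
    InlineKind kind;
    std::string text;  // empty for LineBreak
};

// Split `chunk` on '\n', emitting one LineBreak per newline so that
// none of them is silently turned into a space.
std::vector<Inline> SplitIntoInlines(std::string_view chunk) {
    std::vector<Inline> out;
    std::size_t start = 0;
    while (start <= chunk.size()) {
        const std::size_t nl = chunk.find('\n', start);
        const std::size_t end = (nl == std::string_view::npos) ? chunk.size() : nl;
        if (end > start) {
            out.push_back({InlineKind::Text, std::string(chunk.substr(start, end - start))});
        }
        if (nl == std::string_view::npos) break;  // no more newlines
        out.push_back({InlineKind::LineBreak, {}});
        start = nl + 1;
    }
    return out;
}
```

Consecutive newlines produce consecutive line breaks, which keeps blank lines in the model output visible.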

Features

Settings — sampling params: temperature, top_p, top_k, repetition_penalty, n_predict in Settings dialog, persisted in settings.json.
Settings — model selection: ComboBox to switch SmolLM2-360M (bundled) / SmolLM2-1.7B (USB) / SmolLM2-360M (HF).
History — per-item delete: ✕ button with confirmation dialog.
History — Clear all: Secondary button with confirmation.
History — current indicator: ● prefix on active conversation.
History — relative timestamps: "today", "yesterday", or date.
History — empty state: dialog with placeholder instead of silent no-op.
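The relative timestamps in the history list come down to comparing calendar days. A rough sketch under stated assumptions: the `Date` struct, `DaysFromCivil`, and `RelativeLabel` are hypothetical names, and the ISO `YYYY-MM-DD` fallback is an assumption, since the notes only say "or date".

```cpp
#include <cstdio>
#include <string>

// Hypothetical calendar date; the app's real timestamp type is not shown in the notes.
struct Date {
    int year;
    unsigned month;  // 1..12
    unsigned day;    // 1..31
};

// Days since 1970-01-01 in the proleptic Gregorian calendar
// (the widely used "days from civil" arithmetic).
long DaysFromCivil(Date d) {
    const int y = d.year - (d.month <= 2 ? 1 : 0);
    const long era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);
    const unsigned mp = d.month > 2 ? d.month - 3 : d.month + 9;  // March = 0
    const unsigned doy = (153 * mp + 2) / 5 + d.day - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + static_cast<long>(doe) - 719468;
}

// "today", "yesterday", or the date itself (format is an assumption).
std::string RelativeLabel(Date saved, Date today) {
    const long age = DaysFromCivil(today) - DaysFromCivil(saved);
    if (age == 0) return "today";
    if (age == 1) return "yesterday";
    char buf[32];
    std::snprintf(buf, sizeof buf, "%04d-%02u-%02u", saved.year, saved.month, saved.day);
    return buf;
}
```

Comparing day numbers rather than elapsed seconds means a chat saved at 23:59 is labelled "yesterday" one minute later, which matches how people read dates.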

Tests

tests/test_chat_history.cpp: TitleFrom smoke tests on Linux CI; Save/Load/Delete/Clear suite on UWP.

Bench baseline unchanged: SmolLM2-360M 64.5 tok/s, n=990, peak=705 MB (Zen 2 Xbox Series S).


xllama 0.2.1

gianlucamazza released this 22 May 22:47

v0.2.1

e0debf4

What's Changed

Fixed

ChatML stop sequence: add <|im_end|> as stop sequence in UI inference path. SmolLM2-360M does not always emit EOS naturally (bench reaches n=990 with max_length=1024); without this fix the model could generate filler or hallucinate the next user turn up to n_predict=512 tokens beyond end-of-turn. Bench path (run_inference) unchanged.
CHANGELOG structure: collapsed duplicate ### Added blocks in 0.2.0 section; recovered missing OrtModelPtr → OgaModelPtr fix entry.
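To illustrate the ChatML stop-sequence fix, here is a minimal sketch of cutting generated text at the first `<|im_end|>`. `StopResult` and `ApplyStopSequence` are hypothetical names. The real UI path may check the stop sequence token by token during generation rather than on a finished string.

```cpp
#include <string>
#include <string_view>

struct StopResult {
    std::string text;  // output up to, but not including, the stop sequence
    bool stopped;      // true if the stop sequence was found
};

// Truncate `generated` at the first occurrence of `stop`.
StopResult ApplyStopSequence(std::string_view generated,
                             std::string_view stop = "<|im_end|>") {
    if (stop.empty()) return {std::string(generated), false};
    const auto pos = generated.find(stop);
    if (pos == std::string_view::npos) return {std::string(generated), false};
    return {std::string(generated.substr(0, pos)), true};
}
```

With this in place, anything the model emits after its own end-of-turn marker, such as an invented next user turn, never reaches the transcript.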

Tests

Add tests/test_session.cpp: two smoke tests for Session::create error paths (non-existent path, empty path). Covers the Linux/llama.cpp LlamaSession constructor branch previously unexercised by CI.

Full Changelog: v0.2.0...v0.2.1


xllama 0.2.0

gianlucamazza released this 22 May 21:50

v0.2.0

9319922

xllama 0.2.0 — 2026-05-22

Highlights

Persistent inference session (no more per-turn model reload)
xllama::Session keeps the model and tokenizer loaded across chat turns. After the first message (~1–2 s cold load), subsequent turns jump straight to generation — no "Loading model..." pause. The session is transparently rebuilt if the model changes via Settings.
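The reuse-or-rebuild behaviour can be sketched as a small holder that reloads only when the selected model changes. `FakeSession` and `SessionHolder` are hypothetical stand-ins for illustration. They do not reflect the actual xllama::Session API, which these notes do not show.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for a loaded model + tokenizer.
struct FakeSession {
    std::string model_path;
};

class SessionHolder {
public:
    // Return the cached session; reload only when the model path changed.
    FakeSession& Get(const std::string& model_path) {
        if (!session_ || session_->model_path != model_path) {
            session_ = std::make_unique<FakeSession>(FakeSession{model_path});
            ++loads_;  // counts the expensive cold loads
        }
        return *session_;
    }

    int loads() const { return loads_; }

private:
    std::unique_ptr<FakeSession> session_;
    int loads_ = 0;
};
```

Repeated calls with the same path pay the load cost once. Switching models in Settings would trigger exactly one more load.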

Multi-turn chat + persistent history
Full ChatML prompt construction with context trimming, conversation persistence to LocalState/chats/ (JSON), history browser overlay, and system prompt editor.
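As a rough illustration of ChatML prompt construction with trimming, the sketch below renders each turn as `<|im_start|>role ... <|im_end|>` and drops the oldest turns until an estimated size fits a budget. `Message`, `EstimateTokens`, `RenderTurn`, and `BuildPrompt` are hypothetical names. The characters-per-token estimate is an assumption standing in for a real tokenizer count, and the app's actual trimming policy may differ.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Message {
    std::string role;     // "user" or "assistant"
    std::string content;
};

// Crude size estimate (assumption): roughly 4 characters per token.
std::size_t EstimateTokens(const std::string& s) { return s.size() / 4 + 1; }

std::string RenderTurn(const Message& m) {
    return "<|im_start|>" + m.role + "\n" + m.content + "<|im_end|>\n";
}

// Build a ChatML prompt, dropping the oldest turns while over `budget`.
// The system prompt and the most recent message are always kept.
std::string BuildPrompt(const std::string& system_prompt,
                        std::vector<Message> history,
                        std::size_t budget) {
    const std::string head = RenderTurn({"system", system_prompt});
    const std::string tail = "<|im_start|>assistant\n";  // cue for the reply

    auto total = [&] {
        std::size_t n = EstimateTokens(head) + EstimateTokens(tail);
        for (const auto& m : history) n += EstimateTokens(RenderTurn(m));
        return n;
    };
    while (history.size() > 1 && total() > budget) {
        history.erase(history.begin());
    }

    std::string prompt = head;
    for (const auto& m : history) prompt += RenderTurn(m);
    return prompt + tail;
}
```

Trimming from the front keeps the latest question intact at the cost of older context.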

Bundled SmolLM2-360M-Instruct INT4 CPU (~403 MB)
Ships inside the MSIX; model is copied to LocalState on first launch. No external download required for the base model.

USB and in-app HF download fallbacks
Model bootstrap checks four locations in order: LocalState → InstalledPath bundle → USB stick (E:\xllama\models\<name>) → Hugging Face HttpClient download. This enables larger models (SmolLM2-1.7B, ~1.4 GB) via USB without an MSIX rebuild.

Bench diagnostics
Each bench run now logs prompt=N tok, max_length=M (new≤K) so n-token counts in CSV are self-explanatory. Bench cap raised to 512 new tokens (was 128).

Performance baseline (Xbox Series S, CPU EP)

Model	tok/s	RAM peak
SmolLM2-360M INT4	69–73	680 MB
SmolLM2-1.7B INT4	23.6	2195 MB

Fixed

weakly_canonical: Access is denied crash: merge ONNX external data into monolithic model.onnx to bypass ORT AppContainer path-walking.
OgaModelPtr typo in OrtSession UWP build (MSVC C2065).
ASCII-safe status strings (removed em-dash / ellipsis that caused MSVC C4566).
