Add two more downloadable models (four defaults) #245

Merged: FuJacob merged 4 commits into main from add-four-default-models on May 25, 2026

@FuJacob (Owner) commented May 25, 2026

Summary

Activates the two "newer models" that were sitting commented-out on the inference-engine branch, so the built-in downloadable catalog now offers four GGUF models across fast/balanced/quality tiers:

| Model | File | Size |
| --- | --- | --- |
| Cotabby-fast-1 | Qwen3-0.6B-Q4_K_M.gguf | ~0.4 GB (existing) |
| Cotabby-fast-2 | Qwen3.5-0.8B-Q4_K_M.gguf | ~0.5 GB (new) |
| Cotabby-balanced-1 | gemma-3-1b-it-Q4_K_M.gguf | ~0.8 GB (existing) |
| Cotabby-quality-1 | gemma-4-E2B-it-Q4_K_M.gguf | ~3.1 GB (new) |

Display names route through displayName(for:) for consistency; expectedSizeBytes/sha256 carried over from the captured HuggingFace CDN headers.
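As a rough illustration of routing display names through one function, the sketch below maps each GGUF filename to the name listed in the table above. The field names `filename`, `displayName`, `downloadURL`, `expectedSizeBytes`, and `sha256` come from this PR; the `RuntimeModelCatalog` method body, the `makeEntry` helper, and the `example.invalid` URL are assumptions made for illustration, not the repository's actual code.

```swift
import Foundation

// Hypothetical shape of a catalog entry; only the field names are taken from the PR.
struct DownloadableRuntimeModel {
    let filename: String
    let displayName: String
    let downloadURL: URL
    let expectedSizeBytes: Int64
    let sha256: String
}

enum RuntimeModelCatalog {
    // Single place where a filename becomes a user-facing name.
    // Unknown filenames fall back to the raw filename (an assumed policy).
    static func displayName(for filename: String) -> String {
        switch filename {
        case "Qwen3-0.6B-Q4_K_M.gguf": return "Cotabby-fast-1"
        case "Qwen3.5-0.8B-Q4_K_M.gguf": return "Cotabby-fast-2"
        case "gemma-3-1b-it-Q4_K_M.gguf": return "Cotabby-balanced-1"
        case "gemma-4-E2B-it-Q4_K_M.gguf": return "Cotabby-quality-1"
        default: return filename
        }
    }

    // Hypothetical helper: builds an entry so the display name is never typed twice.
    static func makeEntry(filename: String, url: URL,
                          expectedSizeBytes: Int64, sha256: String) -> DownloadableRuntimeModel {
        DownloadableRuntimeModel(filename: filename,
                                 displayName: displayName(for: filename),
                                 downloadURL: url,
                                 expectedSizeBytes: expectedSizeBytes,
                                 sha256: sha256)
    }
}

// Placeholder URL; the real download URL is not reproduced here.
let sampleEntry = RuntimeModelCatalog.makeEntry(
    filename: "Qwen3.5-0.8B-Q4_K_M.gguf",
    url: URL(string: "https://example.invalid/Qwen3.5-0.8B-Q4_K_M.gguf")!,
    expectedSizeBytes: 532_517_120,
    sha256: "bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517")
```

Keeping the mapping in one function means a later rename touches a single switch statement rather than every catalog entry.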

Validation

xcodebuild -project Cotabby.xcodeproj -scheme Cotabby -destination 'platform=macOS' build   # ** BUILD SUCCEEDED **

Both new download URLs verified live (HTTP 200) via curl -sIL. No test asserts the catalog count, so none needed updating. README model table updated to match.

Linked issues

Risk / rollout notes

  • Additive only — existing two models unchanged; download/validation pipeline already handles arbitrary catalog entries (size + SHA-256 gate).
  • Cotabby-quality-1 is ~3.1 GB; it's opt-in to download like the others.
  • LlamaRuntimeConfiguration.preferredModelNames (runtime auto-load priority) left as-is; the new models are downloadable but not forced as the auto-load default.
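The size + SHA-256 gate mentioned in the first bullet can be pictured roughly as follows. This is a minimal sketch rather than the project's validator: `StagedFileVerdict` and `validateStagedFile` are hypothetical names, and the function assumes the caller has already measured the staged file and computed its SHA-256 as a hex string (on macOS, CryptoKit's `SHA256` is one way to do that).

```swift
// Hypothetical outcome type for the install-time check.
enum StagedFileVerdict: Equatable {
    case accept
    case rejectSizeMismatch(expected: Int64, actual: Int64)
    case rejectDigestMismatch
}

// Compares a staged download against the catalog's expected size and digest.
// The cheap size check runs first; the digest comparison ignores hex letter case.
func validateStagedFile(actualSizeBytes: Int64,
                        actualSHA256Hex: String,
                        expectedSizeBytes: Int64,
                        expectedSHA256Hex: String) -> StagedFileVerdict {
    guard actualSizeBytes == expectedSizeBytes else {
        return .rejectSizeMismatch(expected: expectedSizeBytes, actual: actualSizeBytes)
    }
    guard actualSHA256Hex.lowercased() == expectedSHA256Hex.lowercased() else {
        return .rejectDigestMismatch
    }
    return .accept
}

// Usage with the size and digest quoted in this PR for Qwen3.5-0.8B-Q4_K_M.gguf.
let expectedSize: Int64 = 532_517_120
let expectedDigest = "bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517"
let truncatedDownload = validateStagedFile(actualSizeBytes: 1_024,
                                           actualSHA256Hex: expectedDigest,
                                           expectedSizeBytes: expectedSize,
                                           expectedSHA256Hex: expectedDigest)
```

Because the check only reads the entry's own expected values, adding catalog entries does not require touching the gate, which is the sense in which the change is additive.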

Greptile Summary

This PR expands the built-in downloadable model catalog from two entries to four by activating two previously commented-out models (Qwen3.5-0.8B-Q4_K_M.gguf and gemma-4-E2B-it-Q4_K_M.gguf), and renames all display names to a consistent lowercase cotabby-* scheme.

  • New catalog entries: Both new DownloadableRuntimeModel structs include expectedSizeBytes and sha256 fields populated from HuggingFace CDN headers, which the existing size + SHA-256 validation gate will enforce on download.
  • Display name rename: All four display names are now lowercase (e.g. cotabby-swift-1, cotabby-balanced-1); since model identity uses filename as id, not the display string, existing user selections are unaffected.
  • LlamaRuntimeConfiguration.preferredModelNames is intentionally left with only the original two entries — the new models are opt-in downloads, not auto-load defaults.

Confidence Score: 5/5

Additive catalog change only — no existing behavior altered, download validation gate unchanged, and model identity relies on filenames not display strings.

The two new catalog entries follow the exact same pattern as the two existing ones, with expectedSizeBytes and sha256 populated so the validation gate will reject corrupt or wrong downloads. The display name rename from title-case to lowercase is cosmetic and safe because id is always the raw filename. Tests are updated to match, and preferredModelNames is left intentionally narrow.
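To make the "id is always the raw filename" argument concrete, here is a small sketch under assumed names (`CatalogEntry`, `resolveSelection`). It shows a selection saved by filename still resolving after the display name changes from title case to lowercase.

```swift
// Hypothetical entry type: identity is the filename, never the display string.
struct CatalogEntry: Identifiable {
    let filename: String
    var displayName: String
    var id: String { filename }
}

// Looks up a previously saved selection by its id (the filename).
func resolveSelection(_ selectedID: String, in catalog: [CatalogEntry]) -> CatalogEntry? {
    catalog.first { $0.id == selectedID }
}

// Catalog before and after the rename; only displayName differs.
let catalogBeforeRename = [CatalogEntry(filename: "Qwen3-0.6B-Q4_K_M.gguf",
                                        displayName: "Cotabby-Swift-1")]
let catalogAfterRename = [CatalogEntry(filename: "Qwen3-0.6B-Q4_K_M.gguf",
                                       displayName: "cotabby-swift-1")]

// A selection persisted before the rename is just the filename.
let savedSelection = catalogBeforeRename[0].id
let resolved = resolveSelection(savedSelection, in: catalogAfterRename)
```

Had the selection been stored as the display string instead, the same lookup would return `nil` after the rename.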

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| Cotabby/Models/LlamaRuntimeModels.swift | Adds two new DownloadableRuntimeModel entries (Qwen3.5-0.8B and gemma-4-E2B) and renames all four display names to lowercase cotabby-* aliases; LlamaRuntimeConfiguration.preferredModelNames intentionally left unchanged. |
| CotabbyTests/ModelAndPresentationValueTests.swift | Adds two new XCTAssertEqual assertions for the new filenames and updates existing assertions to match the renamed lowercase display names. |
| README.md | Model table updated to list all four models with their new lowercase display names and correct sizes. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[RuntimeModelCatalog.downloadableModels] --> B[Qwen3-0.6B-Q4_K_M.gguf\ncotabby-swift-1 ~0.4 GB]
    A --> C[Qwen3.5-0.8B-Q4_K_M.gguf\ncotabby-swift-pro-1 ~0.5 GB\nNEW]
    A --> D[gemma-3-1b-it-Q4_K_M.gguf\ncotabby-balanced-1 ~0.8 GB]
    A --> E[gemma-4-E2B-it-Q4_K_M.gguf\ncotabby-careful-1 ~3.1 GB\nNEW]
    B & C & D & E --> F[displayName for filename]
    F --> G[DownloadableRuntimeModel\nfilename · displayName · downloadURL\nexpectedSizeBytes · sha256]
    G --> H[Download Manager]
    H --> I{Size + SHA-256 gate}
    I -->|pass| J[Install to models folder]
    I -->|fail| K[Reject staged file]
    L[LlamaRuntimeConfiguration.preferredModelNames] --> M[gemma-3-1b-it-Q4_K_M.gguf\nQwen3-0.6B-Q4_K_M.gguf]
    M --> N[Auto-load priority\nnew models excluded intentionally]
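The right-hand branch of the flowchart (auto-load priority) could work along these lines. This is a sketch under an assumption: that `preferredModelNames` is an ordered list and the first installed match wins. The function name `autoLoadCandidate` is hypothetical.

```swift
// Preferred filenames in the order shown in the flowchart (assumed to be priority order).
let preferredModelNames = ["gemma-3-1b-it-Q4_K_M.gguf", "Qwen3-0.6B-Q4_K_M.gguf"]

// Returns the first preferred model that is actually installed, if any.
// Models outside the preferred list are never auto-loaded, even when installed.
func autoLoadCandidate(installed: Set<String>, preferred: [String]) -> String? {
    preferred.first { installed.contains($0) }
}

// A user who has downloaded only the two new models gets no auto-load candidate.
let onlyNewModels: Set<String> = ["Qwen3.5-0.8B-Q4_K_M.gguf", "gemma-4-E2B-it-Q4_K_M.gguf"]
let candidateForNewOnly = autoLoadCandidate(installed: onlyNewModels,
                                            preferred: preferredModelNames)

// A user with an original model plus a new one still auto-loads the original.
let mixedInstall: Set<String> = ["Qwen3-0.6B-Q4_K_M.gguf", "gemma-4-E2B-it-Q4_K_M.gguf"]
let candidateForMixed = autoLoadCandidate(installed: mixedInstall,
                                          preferred: preferredModelNames)
```

Under this reading, leaving the preferred list unchanged keeps the new models strictly opt-in: they can be downloaded and selected, but nothing picks them automatically.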

Comments Outside Diff (1)

  1. Cotabby/Models/LlamaRuntimeModels.swift, lines 88-99

    P2 PR description lists different display names than the implementation

    The PR summary table uses Cotabby-fast-1, Cotabby-fast-2, Cotabby-balanced-1, and Cotabby-quality-1, but the code (and README) now uses Cotabby-Swift-1, Cotabby-Swift+-1, Cotabby-Balanced-1, and Cotabby-Careful-1. This is only a description inconsistency and won't affect runtime behaviour, but it may confuse reviewers or anyone searching for the names later.

Reviews (2): Last reviewed commit: "Rename model tiers to lowercase; swift+ ..."

FuJacob added 3 commits May 25, 2026 04:51
Activate the 'newer models' that were commented out on the inference-engine
branch, giving four built-in downloadable GGUF models across fast/balanced/
quality tiers:

- Cotabby-fast-1     Qwen3-0.6B      (~0.4 GB, existing)
- Cotabby-fast-2     Qwen3.5-0.8B    (~0.5 GB, new)
- Cotabby-balanced-1 gemma-3-1b      (~0.8 GB, existing)
- Cotabby-quality-1  gemma-4-E2B     (~3.1 GB, new)

Display names route through displayName(for:) for consistency; expected size
and SHA-256 carried over from the captured HuggingFace CDN headers (both URLs
verified live, HTTP 200). README model table updated to match.
Comment thread Cotabby/Models/LlamaRuntimeModels.swift
Display names: cotabby-swift-1, cotabby-swift-pro-1, cotabby-balanced-1,
cotabby-careful-1. All four download URLs verified end to end (GGUF magic,
served size matches expectedSizeBytes, and x-linked-etag matches the committed
SHA-256), so downloads initiate and pass the install-time validator.
@FuJacob (Owner, Author) commented May 25, 2026

Confirmed the unsloth/Qwen3.5-0.8B-GGUF source and metadata against the live HuggingFace CDN (not just trusted from the catalog):

  • Qwen3.5-0.8B-Q4_K_M.gguf resolves (HTTP 200) and the bytes start with the GGUF magic 47 47 55 46 (ASCII "GGUF").
  • Served Content-Length = 532517120 = the committed expectedSizeBytes.
  • x-linked-etag (HuggingFace's SHA-256 for the LFS object) = bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517 = the committed sha256.

Same end-to-end check passed for all four models (magic + size + SHA-256), so the download/validation gate will accept them. The 'Qwen3.5' naming is Unsloth's repo name for this quant; the metadata is captured from that exact file.
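The three checks described above (magic bytes, served size, and digest header) can be sketched as small pure functions. The names `hasGGUFMagic` and `headersMatchCatalog` are hypothetical, and the quote stripping assumes the `x-linked-etag` value may arrive wrapped in double quotes, as ETag-style headers often are.

```swift
import Foundation

// "GGUF" in ASCII: 0x47 0x47 0x55 0x46, as quoted in the comment above.
let ggufMagic: [UInt8] = [0x47, 0x47, 0x55, 0x46]

// True when the first four bytes of the file are the GGUF magic.
func hasGGUFMagic(_ data: Data) -> Bool {
    data.starts(with: ggufMagic)
}

// Compares the served Content-Length and x-linked-etag against the committed values.
// Surrounding double quotes on the ETag are removed before comparing (an assumption).
func headersMatchCatalog(contentLength: Int64,
                         linkedETag: String,
                         expectedSizeBytes: Int64,
                         expectedSHA256Hex: String) -> Bool {
    let digest = linkedETag.trimmingCharacters(in: CharacterSet(charactersIn: "\""))
    return contentLength == expectedSizeBytes
        && digest.lowercased() == expectedSHA256Hex.lowercased()
}

// Usage with the values reported for Qwen3.5-0.8B-Q4_K_M.gguf.
let committedDigest = "bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517"
let headersOK = headersMatchCatalog(contentLength: 532_517_120,
                                    linkedETag: "\"\(committedDigest)\"",
                                    expectedSizeBytes: 532_517_120,
                                    expectedSHA256Hex: committedDigest)
```

Checking the magic bytes separately catches the case where a URL resolves with HTTP 200 but serves something other than a GGUF file, such as an HTML error page.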

@FuJacob FuJacob merged commit be64f09 into main May 25, 2026
3 checks passed