feat: add model download cache manager #129
Add structured ModelSource, ModelLoadOptions, resolver targets, and download/cache value models for future package-managed GGUF download flows. Wire LlamaEngine.loadModelSource to preserve existing local loading and route remote sources through URL-capable backends while explicitly rejecting unsupported foundation options. Document the additive API surface and add regression coverage for resolver validation, URL redaction, and placeholder download managers.
Pull request overview
This PR introduces a first-party, package-managed model download + cache system for remote GGUF model sources (HTTP(S) and hf://), and wires it into LlamaEngine.loadModelSource(...) so native/file-backed backends download to a local cached file before loading while URL-capable web backends continue to load directly (with option restrictions). It also migrates example/testing tooling and updates docs/changelog to reflect the new structured model source workflow.
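The routing described above can be sketched as a small decision function. This is an illustrative sketch only: every name here (SourceKind, LoadPlan, SketchOptions, planLoad) is hypothetical and not part of the package's API, and the choice of which options count as native-cache-only (checksum and resume) is an assumption made for the example.

```dart
// Hypothetical names throughout; none of these are the package's real types.

/// How a model source is described in this sketch.
enum SourceKind { localPath, remoteUrl }

/// What the engine ends up doing with a source.
enum LoadPlan { loadLocalFile, downloadThenLoad, loadUrlDirectly }

/// Options that only make sense with a native download cache.
/// Which options fall into this group is an assumption of the sketch.
class SketchOptions {
  const SketchOptions({this.sha256, this.resume = false});

  final String? sha256;
  final bool resume;

  bool get usesNativeCacheOnlyFeatures => sha256 != null || resume;
}

/// Picks a load plan from the source kind and the backend capability.
LoadPlan planLoad({
  required SourceKind kind,
  required bool backendLoadsUrls,
  SketchOptions options = const SketchOptions(),
}) {
  // Local paths keep the existing local loading behavior.
  if (kind == SourceKind.localPath) return LoadPlan.loadLocalFile;

  // File-backed backends need a cached local file before loading.
  if (!backendLoadsUrls) return LoadPlan.downloadThenLoad;

  // URL-capable backends load directly, so cache-only options are rejected
  // instead of being silently ignored.
  if (options.usesNativeCacheOnlyFeatures) {
    throw UnsupportedError('Option requires a native download cache.');
  }
  return LoadPlan.loadUrlDirectly;
}
```

Rejecting rather than ignoring the cache-only options keeps a caller from assuming, for example, that a checksum was verified when the backend never saw the file on disk.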
Changes:
- Added new structured model source/value-model APIs (ModelSource, ModelLoadOptions, ModelResolver targets) plus a cross-platform ModelDownloadManager with native IO implementation and browser stub.
- Updated LlamaEngine to support loadModelSource(...), including native download/cache integration, URL-backend option rejection, and URL redaction for logs/metadata.
- Migrated server and verification tools to use the new download manager and added comprehensive unit coverage for source parsing, cache behavior, resume/retry, and cleanup.
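URL redaction for logs and metadata could look roughly like the following. This is a minimal sketch using only dart:core; the function name redactUrl and the policy of dropping user info, query, and fragment are assumptions for illustration, not the redaction rules this PR actually implements.

```dart
/// Returns [url] with user info, query string, and fragment removed,
/// so tokens embedded in the URL do not reach logs or metadata.
/// Hypothetical helper; the PR's real redaction rules may differ.
String redactUrl(String url) {
  final uri = Uri.tryParse(url);
  // Anything that is not an absolute URL is hidden entirely.
  if (uri == null || !uri.hasScheme) return '<redacted>';
  return Uri(
    scheme: uri.scheme,
    host: uri.host,
    // Keep an explicit port, but do not add one that was not there.
    port: uri.hasPort ? uri.port : null,
    path: uri.path,
  ).toString();
}
```

For example, under these assumptions a signed download link keeps its host and path but loses its credentials and query parameters.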
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| website/docs/guides/model-lifecycle.md | Documents loadModelSource(...), cache policies, manager APIs, and mobile download guidance. |
| website/docs/changelog/recent-releases.md | Adds 0.6.12 release notes for the new model download/cache manager and engine wiring. |
| tool/testing/verify_recommended_models.dart | Replaces ad-hoc HTTP download logic with DefaultModelDownloadManager.ensureModel(...). |
| tool/testing/verify_models.dart | Switches model download path to ensureModel(...) while keeping legacy local-file behavior. |
| test/unit/core/models/model_source_test.dart | Adds unit tests for ModelSource parsing, redaction, and deterministic keying. |
| test/unit/core/models/model_resolver_test.dart | Adds tests for resolver targets, defaults, cancellation handling, and remote passthrough. |
| test/unit/core/models/model_load_options_test.dart | Adds tests for option storage, header immutability, and validation. |
| test/unit/core/models/download/model_download_manager_test.dart | Validates public export surface of the manager + IO implementation. |
| test/unit/core/models/download/model_download_manager_stub_test.dart | Ensures browser stub throws the expected unsupported exception. |
| test/unit/core/models/download/model_download_manager_io_test.dart | Adds end-to-end IO manager tests (cache policies, retry, resume, checksum, cleanup, prune APIs). |
| test/unit/core/models/download/model_download_manager_base_test.dart | Tests the throwing base manager behavior for unsupported operations. |
| test/unit/core/models/download/model_cache_entry_test.dart | Adds tests for progress math, cache entry JSON/redaction, and validations. |
| test/unit/core/engine/engine_test.dart | Adds engine tests for loadModelSource(...) routing, progress forwarding, and URL redaction behavior. |
| test/integration/engine_integration_test.dart | Updates native URL-loading expectation to LlamaUnsupportedException. |
| README.md | Adds usage docs for downloading/caching remote GGUFs via structured sources. |
| pubspec.yaml | Adds crypto dependency for SHA-256 keying/checksum support. |
| lib/src/core/models/model_source.dart | Introduces ModelSource (path/http/hf) parsing, validation, cache keying, and redaction. |
| lib/src/core/models/model_resolver.dart | Adds resolver interfaces + default resolver and load target value types. |
| lib/src/core/models/model_load_options.dart | Adds cache policy + load options (headers, bearer token, sha256, resume, retries, cancel). |
| lib/src/core/models/download/model_download_manager.dart | Adds conditional export for stub vs IO manager implementation. |
| lib/src/core/models/download/model_download_manager_stub.dart | Adds non-IO stub that throws LlamaUnsupportedException. |
| lib/src/core/models/download/model_download_manager_io.dart | Implements the native download/cache manager (streaming, .part, resume/retry, metadata, prune). |
| lib/src/core/models/download/model_download_manager_base.dart | Adds base API types: progress, cache entry metadata, and throwing manager base. |
| lib/src/core/engine/engine.dart | Wires structured model loading into the engine and adds URL redaction + option rejection. |
| lib/llamadart.dart | Exports new public APIs (sources/options/resolver/manager). |
| example/llamadart_server/lib/src/features/model_management/infrastructure/model_service.dart | Migrates server model acquisition to ModelSource + download manager. |
| CHANGELOG.md | Adds 0.6.12 changelog notes for model source/download/cache manager feature set. |
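The IO manager row above mentions .part files and resume. A resume request built on standard HTTP range semantics might be prepared as sketched below. The helper names (resumeHeaders, canAppend) are hypothetical, and the policy of restarting from scratch when no strong validator is available is an assumption of this sketch rather than a statement about the PR's implementation.

```dart
/// Builds request headers for resuming a partial download.
/// [partialBytes] is the current size of the .part file and [validator]
/// is an ETag or Last-Modified value saved from the first response.
/// Hypothetical helper, not the package's API.
Map<String, String> resumeHeaders({
  required int partialBytes,
  String? validator,
}) {
  final hasStrongValidator = validator != null &&
      validator.isNotEmpty &&
      // Weak ETags must not be sent in If-Range.
      !validator.startsWith('W/');
  // Without bytes on disk or a usable validator, request the whole file.
  if (partialBytes <= 0 || !hasStrongValidator) return const {};
  return {
    'Range': 'bytes=$partialBytes-',
    // The server sends 206 only if the resource is unchanged.
    'If-Range': validator,
  };
}

/// True when the response body should be appended to the .part file.
/// A 200 response means the server sent the full file, so the partial
/// data must be discarded instead.
bool canAppend({required int statusCode, required int partialBytes}) =>
    statusCode == 206 && partialBytes > 0;
```

Pairing Range with If-Range avoids stitching together bytes from two different versions of a file when the remote model was replaced between attempts.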
Codecov Report
Coverage diff, main vs. #129:
| Metric | main | #129 | +/- |
|---|---|---|---|
| Coverage | 76.73% | 77.74% | +1.00% |
| Files | 70 | 75 | +5 |
| Lines | 8734 | 9512 | +778 |
| Hits | 6702 | 7395 | +693 |
| Misses | 2032 | 2117 | +85 |
Summary
- [...] LlamaEngine.loadModelSource so file-backed/native backends download to a local cached path before loading, while URL-capable backends reject native-cache-only options.
- [...] mmproj handling into independent asset sources so model/projector cache, download, delete, and activation behavior no longer assumes both files come from the same origin.

Closes #125
Production-readiness scope
This PR is intended to be merge-ready for the declared feature scope:
- [...] loadModel(...).
- [...] loadModelSource(...) API.
- mmproj assets are resolved independently for mixed local/remote source combinations where the platform can actually load them.
- loadModel(...) and loadModelFromUrl(...) callers are unchanged.

Intentionally deferred follow-ups
These are not required for the current feature to work and are tracked separately to avoid merging incomplete scope into main:
- [...] ModelSource option semantics.

Test Plan
- dart analyze lib test tool/testing/verify_recommended_models.dart tool/testing/verify_models.dart
- dart test -p vm -j 1 --exclude-tags local-only --reporter compact
- dart test -p chrome test/unit/core/models/download/model_download_manager_stub_test.dart --reporter compact
- cd example/llamadart_server && dart test --reporter compact
- cd example/chat_app && flutter analyze
- cd example/chat_app && flutter test
- cd example/chat_app && flutter test integration_test/model_cache_mmproj_e2e_test.dart -d macos (confirms local-only E2E is skipped by default)
- cd example/chat_app && flutter test --run-skipped -t local-only integration_test/model_cache_mmproj_e2e_test.dart -d macos
- dart format --output=none --set-exit-if-changed .
- dart analyze
- git diff --check

Review Notes
- [...] PASS for the declared scope. Non-blocking follow-ups were either already tracked or have been filed as #137 (docs(models): clarify local ModelSource option semantics) and #138 (feat(models): serialize concurrent downloads for the same cache key), and added to #132 (feat(chat-app): improve model/mmproj asset-level cache UX).
- [...] noCache cleanup, refresh preservation, retry policy, checksum mismatch cleanup, partial resume validators/If-Range, cancellation cleanup, cache list/get/remove/clear/prune, browser stub behavior, engine remote-source wiring, independent model/mmproj asset handling, and local-only real model+projector loading.

Current review-comment status
All Copilot review threads have been resolved after verifying that the comments were addressed by follow-up commits or updated documentation. No unaddressed merge-blocking review comment is known.