
Evaluate Doc-to-LoRA for offline/edge model knowledge internalization #15

@RickCogley

Description


Context

Doc-to-LoRA (Sakana AI, Feb 2026) converts documents into LoRA adapter weights that are merged into open-weight models, so the model "knows" the content without needing it in the context window. Inference stays sub-second, and the adapter costs roughly 50 MB of memory versus 12 GB+ for carrying the full content in context.

Not applicable to Claude (hosted API, no weight access), but could be relevant for future self-hosted model use cases.
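The memory gap above follows directly from the LoRA math: instead of storing a full weight update for each adapted layer, LoRA stores two low-rank factors. A minimal sketch in NumPy (dimensions and rank are illustrative assumptions, not Doc-to-LoRA's actual configuration):

```python
import numpy as np

# For one d_out x d_in layer, a full fine-tuned update costs d_out * d_in
# parameters; a rank-r LoRA adapter stores factors B (d_out x r) and
# A (r x d_in) instead. Dimensions below are hypothetical examples.
d_out, d_in, r = 4096, 4096, 16

full_params = d_out * d_in          # 16,777,216
lora_params = d_out * r + r * d_in  # 131,072 (~128x smaller)

# At inference the effective weight is W' = W + B @ A, applied on top of
# the frozen base weights.
W = np.zeros((d_out, d_in), dtype=np.float32)
B = np.random.randn(d_out, r).astype(np.float32) * 0.01
A = np.random.randn(r, d_in).astype(np.float32) * 0.01
W_adapted = W + B @ A

print(full_params // lora_params)  # 128
```

Summed across layers, this low-rank structure is what keeps a document-derived adapter in the tens-of-megabytes range.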

Potential use cases

  1. Miko AI Search — If we ever run a smaller self-hosted model for query understanding or answer generation, Doc-to-LoRA could internalize the content corpus into adapters instead of RAG retrieval (lower latency, no vector search needed)
  2. Offline code review — A local Gemma/Llama model with standards baked in via LoRA for offline linting/suggestions without MCP or network access
  3. Text-to-LoRA (sibling project) — Describe a task in natural language ("enforce eSolia SvelteKit conventions") and it generates a task-specific adapter
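If use case 2 ever materializes, attaching a document-derived adapter to a local open-weight model would look roughly like the standard Hugging Face PEFT pattern below. This is a sketch under assumptions: the base model name and adapter path are placeholders, and Doc-to-LoRA's actual adapter packaging may differ from the PEFT format.

```python
# Assumptions: transformers + peft installed; a Doc-to-LoRA-style adapter
# saved in standard PEFT format at "adapters/esolia-standards" (a
# placeholder path, not a real artifact).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "google/gemma-2-2b-it"  # any locally cached open-weight model

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(BASE)

# The small adapter rides on top of the frozen base weights; once both
# are cached locally, inference needs no network or MCP access.
model = PeftModel.from_pretrained(base_model, "adapters/esolia-standards")

prompt = "Review this component against our coding standards:\n<script>...</script>"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```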

Current approach (MCP) is still better for Claude workflows

| Factor | MCP (current) | Doc-to-LoRA |
| --- | --- | --- |
| Works with Claude | Yes | No (open-weight models only) |
| Always up-to-date | Yes (R2 source of truth) | Must re-generate adapters on edit |
| Accuracy | 100% (verbatim) | ~83.5% relative quality |
| Setup cost | Done | Days of GPU meta-training |

References

Action

No immediate action. Revisit if we adopt a self-hosted open-weight model for any part of the pipeline (Miko, edge inference, offline tooling).
