Skip to content

refactor(model-platform): replace provider dialect configuration#1697

Merged
earayu merged 8 commits intomainfrom
refactor/model-system
Apr 25, 2026
Merged

refactor(model-platform): replace provider dialect configuration#1697
earayu merged 8 commits intomainfrom
refactor/model-system

Conversation

@earayu
Copy link
Copy Markdown
Collaborator

@earayu earayu commented Apr 25, 2026

Summary

  • Replace the old provider/dialect/custom_llm_provider configuration model with ModelProvider, ModelAccount, Model, ModelUse, and ModelRunner abstractions.
  • Add /api/v3/model-* APIs, a shared model runtime resolver/runner layer, and update collection/retrieval/agent/chat/title paths to reference model_id.
  • Remove legacy provider v2 UI/service code and the root models/ generated catalog assets.

Test plan

  • uv run ruff check aperag tests/unit_test/test_model_platform_v3_contract.py tests/unit_test/llm/test_model_runtime_resolver.py
  • uv run python -m pytest tests/unit_test/test_model_platform_v3_contract.py tests/unit_test/llm/test_model_runtime_resolver.py tests/unit_test/test_openapi_spec.py tests/unit_test/chat/test_chat_completion_service.py tests/unit_test/service/test_collection_service_embedding_lock.py tests/unit_test/test_modularization_boundaries.py tests/unit_test/test_web_typed_api_contract.py -q
  • uv run python scripts/export_openapi.py
  • cd web && yarn api:v2:types

Notes

  • cd web && yarn tsc --noEmit is still blocked by existing admin/quota/i18n type issues unrelated to this model-platform path.

Made with Cursor

earayu and others added 7 commits April 25, 2026 22:15
Introduce a first-class model account/model/use abstraction so users no longer configure LiteLLM dialects or custom provider routing names.

Made-with: Cursor
The model_provider table introduced by b4f2d91c8e3a declared
provider_type with a separate UniqueConstraint plus a non-unique index,
but the ORM declares the column as unique=True, index=True (which
SQLAlchemy renders as a single unique index). alembic check flagged the
drift and broke lint-and-unit on PR #1697. This revision drops the
redundant unique constraint and promotes the index to unique=True so
the autogenerate diff is clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The model-platform refactor deleted /api/v2/providers/* and
db.ops.query_provider_api_key without migrating the call sites that
still rely on them. Three small follow-ups to make CI green:

* Add AsyncLlmProviderRepositoryMixin.query_provider_api_key as a thin
  shim over ModelAccount so document_service.fetch_url_documents and
  web_access routes that look up the user's Jina key keep working.
  Falls back to public ACTIVE accounts when ``need_public`` is set,
  matching the old llm_provider.api_key semantics.
* Rewrite tests/e2e_http/hurl/full/10_provider_llm.hurl to drive the
  new /api/v3/model-providers, /model-accounts, /models, /model-uses
  surface plus /api/v1/embeddings + /api/v1/rerank with model_id. Uses
  provider_type=dashscope for the alibabacloud account and
  provider_type=openai_compatible for the openrouter account, both of
  which are seeded by 7c4e9e1f8b21.
* Update tests/unit_test/chat/test_chat_title_service.py to reflect the
  new chat-title flow: it no longer reaches into
  default_model_service; instead db_ops.query_model_uses returns the
  background-task ModelUse and the assertion now guards against
  model_invocation_service.chat being awoken on an empty history.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ice surface

The temporary ``query_provider_api_key`` shim on the LLM provider
repository is removed. Cross-domain callers that need a raw provider
API key (web_access JINA reader, knowledge_base fetch-url) now go
through ``ModelPlatformService.get_user_provider_api_key``, the
canonical surface for non-model-platform domains. The repository
keeps the SQL primitive as ``query_model_account_api_key`` — a
properly-named consumer of the new ``ModelAccount`` row, no longer
framed as a backward-compat shim.

This closes the design completeness gap PM flagged for #1697 (msg
=8ac3e7d9): the model-platform refactor must not leak a v2-shape api
key lookup into other domains' DB-ops surface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The model-system refactor cuts ``ModelSpec.{model,
model_service_provider, custom_llm_provider}`` and only keeps
``model_id``. Pydantic silently drops extras, so any hurl file still
sending the old triple-shape parses as ``model_id=None`` and the
downstream code-path goes silently broken (collection vector index,
bot completion).

Following the template established by ``10_provider_llm.hurl``, each
file that exercises a real provider path (11, 13, 17) now seeds its
own ``ModelAccount`` + ``Model`` via the v3 routes and references the
captured ``model_id`` in the ``embedding`` / ``completion`` blobs.
``19_retrieval_http.hurl`` is a deterministic 4xx-shape test that
must not depend on any provider seed; the optional embedding /
completion config is dropped entirely.

Closes the second design completeness gap PM flagged for #1697
(msg=8ac3e7d9). Provider keys are still required to actually run
these against a live stack — the shape itself is now correct.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… fix

Three Weston blockers (msg=80e873c1) on the model-platform refactor:

* Blocker A — ``/api/v1/embeddings`` and ``/api/v1/rerank`` are
  permanent OpenAI-compat allowlist routes. The PR's first cut required
  ``model_id`` and broke pre-#1697 callers (provider hurl + external
  clients) with 422. ``EmbeddingRequest`` / ``RerankRequest`` now accept
  either the new ``{model_id}`` shape *or* the legacy
  ``{model, model_service_provider, custom_llm_provider}`` triple. The
  triple is resolved server-side via the new
  ``ModelPlatformService.resolve_legacy_model_id`` (provider_type +
  provider_model_id → ``Model.id``). ``/api/v3/model-*`` is untouched
  and still ``model_id``-only.

* Blocker B — alembic multi-head with #74. Both ``b4f2d91c8e3a`` and
  ``d8e2c4a17b91`` (which landed on main while #1697 was open) had
  ``down_revision=7c4e9e1f8b21``. Rebased onto current main and
  re-chained ``b4f2d91c8e3a → d8e2c4a17b91`` so ``alembic heads``
  reports a single head (``84fac9e3d8c2``).

* Blocker C — pre-#1697 collection / bot configs in the DB hold the
  legacy triple in JSON. Pydantic silently dropped extras after the
  schema cut, so existing rows would parse with ``model_id=None`` and
  the runtime resolver would 404. Two halves:

  - ``ModelSpec`` now stashes the legacy triple onto private
    ``legacy_*`` slots at parse time and exposes ``has_legacy_triple()``.
    Sync code paths (``base_embedding`` / ``base_completion``) and the
    async agent-runtime path (``_resolve_request``) call the new
    sync/async resolvers to fill ``model_id`` lazily before the
    runtime lookup runs. ``model_dump`` does not leak the legacy fields
    — only the canonical ``{model_id}`` shape goes back out the wire.

  - The ``b4f2d91c8e3a`` migration captures any user-supplied
    ``model_service_provider`` API keys + ``llm_provider_models`` rows
    and replays them as ``ModelAccount`` (``user_id="public"``) +
    ``Model`` rows in the new schema *before* dropping the legacy
    tables. Best-effort, not strict — rows that don't fit any new
    provider type are skipped (see migration docstring for the
    mapping). Idempotent w.r.t. repeated ``alembic upgrade`` (the
    legacy-table drops are now ``inspect``-guarded after Phase-7
    teardown rebuilt the baseline migrations from scratch).

Regression coverage in ``tests/unit_test/test_model_platform_v1_compat.py``:
new + legacy parses for both ``EmbeddingRequest`` / ``RerankRequest``
and ``ModelSpec``; ``model_id`` precedence over the legacy triple;
private legacy fields stay out of ``model_dump``. The existing v3
contract test (``test_model_platform_v3_contract.py``) still asserts
the new ``/api/v3/...`` schemas never expose the legacy field names.

Local gates: ``make db-check`` (single head, no autogen drift),
``make lint``, ``pytest tests/unit_test/`` (722 passed, 29 skipped).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ec shape

Closes Blocker D (Option A-extended, PM lock-in msg=e551e144 +
msg=06e8f718). PR #1697 collapses the legacy
``{model, model_service_provider, custom_llm_provider}`` triple and
keeps only ``model_id`` in ``ModelSpec``. Pydantic silently drops
extras, so any hurl / bash file still sending the old triple parses
as ``model_id=None`` and the downstream code-path goes silently
broken (collection vector index, bot completion, graph index).

Following the template established in ``10_provider_llm.hurl`` (and
the previous migration of 11/13/17 in ad69af9), each remaining file
that exercises a real provider path now seeds its own ``ModelAccount``
+ ``Model`` via the v3 routes and references the captured
``model_id`` in the ``embedding`` / ``completion`` blobs.

Files migrated:
* tests/e2e_http/hurl/full/12_bot.hurl
* tests/e2e_http/hurl/full/14_graph_http.hurl
* tests/e2e_http/hurl/full/15_agent_runtime_v3.hurl
* tests/e2e_http/hurl/full/20_knowledge_graph_http.hurl
* tests/e2e_http/scripts/run_chat_collection_flow.sh
* tests/e2e_http/scripts/run_graph_index_flow.sh

The bash scripts now require ``E2E_ALIBABACLOUD_API_KEY`` +
``E2E_OPENROUTER_API_KEY`` so they can seed the v3 model rows up
front, mirroring the hurl variable convention. No semantic changes
beyond the shape rewrite.

Final grep across ``aperag/ tests/ web/src/ docs/zh-CN/`` confirms
no live caller / hurl / bash / FE config payload still sends the
legacy triple — the only remaining matches are:

* the new Blocker A compat parser in ``aperag/domains/model_platform``
  + ``aperag/schema`` (intended — it accepts both shapes),
* litellm SDK ``custom_llm_provider`` keyword args inside the LLM
  invocation runners (a different namespace — that field is the
  Python kwarg name on ``litellm.completion``),
* the ghost-guard / regression-guard tests in
  ``tests/unit_test/test_model_platform_v3_contract.py`` and
  ``tests/unit_test/tasks/test_collection_init_skip.py``,
* the legacy migration files (which by definition had to ``CREATE
  TABLE`` the old schema before the refactor migration ``DROP``s it),
* and the FE i18n bundles + audit-log docs (string labels, not JSON
  contract field names — out of scope).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@earayu earayu force-pushed the refactor/model-system branch from e119437 to 9d03ac0 Compare April 25, 2026 14:32
query_model_account_api_key documents fallback semantics as "fall back to
public when the user has no personal account", but ORDER BY only sorted
by gmt_updated DESC — a freshly-edited shared "public" row silently
shadowed the caller's own credential when fallback_to_public=True. Both
production callers (fetch_url_documents, _get_user_jina_api_key) hit
this path, so it was a real correctness bug.

Add an ownership-first ORDER BY (CASE WHEN user_id = $caller THEN 0
ELSE 1) before the timestamp ordering so a user-owned row always
wins over public, regardless of update timestamps. Public is still
considered when (and only when) the caller has no personal row.

Regression tests in test_model_platform_v1_compat.py:
- test_user_personal_key_wins_over_newer_public_key: seed user row
  (1h ago) + public row (now), expect "user_key"; verified red on
  the un-fixed query
- test_public_key_returned_when_user_has_no_personal_account: sanity
  that the actual fallback path is unaffected

Weston blocker msg=fcefbaf7. No schema change (db-check clean).
@earayu earayu merged commit 384dd92 into main Apr 25, 2026
9 checks passed
@earayu earayu deleted the refactor/model-system branch April 25, 2026 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant