feat(plugins): add unified provider system with capabilities #52995
kesor wants to merge 14 commits into
Greptile Summary
This PR introduces a unified
Key areas of the change:
One concrete bug found:
Confidence Score: 3/5
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1c6db4db64
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
456f76d to 64d14b7
Add plugin support for media providers (TTS, STT, image, video) and embedding providers through a unified registerProvider API with capabilities.

Key changes:
- Add capabilities array to ProviderPlugin: chat, embedding, audio, image, video, tts
- Add embed/embedBatch methods for embedding providers
- Add transcribeAudio, describeImage, describeVideo, textToSpeech for media
- Remove separate registerMediaProvider (now unified)
- Simplify by removing complex caching system

This allows plugins to provide any combination of AI capabilities through a single registration API.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 64d14b7cc3
…fallback
- Fix N1: auto-mode crash when plugin exists but has no API key
- Remove dead code in TTS telephony (identical if/else branches)
- Remove complex plugin TTS integration tests that require deep mocking
- Add tests for resolveTtsApiKey custom provider lookup
- Regenerate schema.base.generated.ts

Refs: PR-52995
64d14b7 to b133670
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b1336704f4
- G8: Respect disableFallback when assembling plugin TTS provider order
  Only add custom plugin providers to fallback list when fallback is allowed
- G10: Preserve MIME type when checking plugin voice-note compatibility
  Pass contentType to isVoiceCompatibleAudio instead of relying on filename
- G5/G6: Exclude legacy media providers from plugin-image dispatch
  Only search providers array (new unified system) for plugin-style handlers
  Legacy providers in mediaUnderstandingProviders now correctly use cfg-based request

Refs: PR-52995
- A1: Verify cache invalidation works via version check (no changes needed)
  The version-based cache invalidation in buildMediaUnderstandingRegistry correctly handles registry changes via setActivePluginRegistry
- A3: Export ProviderCapability and embedding types from plugin-sdk
  Added exports for ProviderCapability, ProviderEmbedRequest, ProviderEmbedResult, ProviderEmbedBatchRequest, ProviderEmbedBatchResult
- A5: Deduplicate capabilities when merging providers
  Use Set to remove duplicate capabilities in merged providers

Refs: PR-52995
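The A5 item can be illustrated with a small sketch. The `mergeCapabilities` helper and the `Capability` type below are hypothetical names for illustration; only the use of `Set` to drop duplicates comes from the commit message.

```typescript
type Capability = "chat" | "embedding" | "audio" | "image" | "video" | "tts";

// Hypothetical helper: merge two capability lists and drop duplicates.
// A Set keeps first-insertion order, so the merged order is stable.
function mergeCapabilities(a: Capability[] = [], b: Capability[] = []): Capability[] {
  return Array.from(new Set([...a, ...b]));
}

// "embedding" is declared by both sides but appears once in the result.
const merged = mergeCapabilities(["chat", "embedding"], ["embedding", "tts"]);
```

Here `merged` holds `chat`, `embedding`, and `tts`, each once, in first-seen order.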
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dba5c17a65
… plugin viability in auto mode
- N1: Remove loaded plugin shortcut from validation
  The shortcut trusted in-memory plugin registry state which could be stale. Now uses manifest registry which validates against the config being validated.
- N2: Check custom plugin API keys before returning in auto mode
  Instead of returning first custom plugin, iterate and check API key viability. Skip plugins without valid API key and try next one.

Refs: PR-52995
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 80a4f6ed8f
- N3: Check apiKey !== undefined instead of truthy to allow keyless plugins
  Empty string is a valid API key for local/self-hosted plugins
- N4: Pass agentDir when probing plugin API keys in auto mode
  This ensures agent-scoped credentials are considered during provider selection

Refs: PR-52995
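The difference N3 describes is easy to miss, so here is a minimal sketch of it. The helper names and the provider IDs `cloud-tts` and `local-tts` are made up for illustration; the only detail taken from the commit is that an empty string counts as a valid key while `undefined` does not.

```typescript
// Hypothetical probe: undefined means "no credential resolved",
// while "" is treated as a valid key for local/self-hosted plugins.
function hasUsableApiKey(apiKey: string | undefined): boolean {
  return apiKey !== undefined;
}

// Walk candidates in order and pick the first one whose key probe succeeds.
function pickFirstViable(
  candidates: string[],
  resolveKey: (providerId: string) => string | undefined,
): string | undefined {
  return candidates.find((id) => hasUsableApiKey(resolveKey(id)));
}

// Made-up key table: the cloud provider has no key, the local one is keyless.
const keys: Record<string, string | undefined> = { "cloud-tts": undefined, "local-tts": "" };

// A truthiness check would skip "local-tts"; the undefined check selects it.
const chosen = pickFirstViable(["cloud-tts", "local-tts"], (id) => keys[id]);
```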
…d return empty on no viable providers
- P1: web-search - return empty string when no viable provider found instead of returning first without credential check
- P2: runner.ts - add checkAllRegistryProviders() to iterate through all registry providers for capability match before returning null

This ensures custom plugin providers are considered as fallbacks and prevents returning providers without valid credentials.

Refs: PR-52995
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bf3100affe
- N1: validation.ts - Remove manifest-based embedding validation
  Cannot validate embedding capability from manifest since it only lists provider IDs without capability info. Fall through to runtime validation.
- N2: provider-registry.ts - Bypass cache when cfg is provided
  Include config in cache bypass logic to prevent stale providers.
- N4: tts.ts - Restore provider normalization in resolveTtsApiKey
  Add normalizeSpeechProviderId at function start to handle mixed-case inputs.
- N5: plugins/types.ts - Remove duplicate type definition
  Make PluginTextToSpeechRequest a re-export alias of TextToSpeechRequest.

Refs: PR-52995
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 594c186a52
Let plugin providers try without upfront API key validation. This allows keyless plugins to work and delegates error handling to runtime. The existing fallback mechanism will handle failures appropriately.

Refs: PR-52995
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 64ee3ef49f
- Q1: Add warning for unknown memorySearch providers/fallbacks
  Warn instead of silently accepting unknown providers to catch typos
- Q2: Gate image dispatch on declared image capability
  Check routingCapabilities includes 'image' before treating as plugin handler
- Q3: Assign default model when auto-selecting image provider from registry
  Use DEFAULT_IMAGE_MODELS[providerId] for image capability fallback
- Q4: Gate audio plugin detection on declared audio capability
  Check routingCapabilities includes 'audio' before treating as plugin handler

Refs: PR-52995
Also check routingCapabilities.includes('video') before treating as plugin video handler.
This was missed in the previous fix - video was not covered by Q4.
Refs: PR-52995
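The capability gating in Q2, Q4, and the video follow-up share one idea, sketched below. The `MediaProviderSketch` shape, the `HANDLER_BY_CAPABILITY` table, and `isPluginHandler` are assumptions for illustration; only `routingCapabilities` and the three method names appear in this thread.

```typescript
type RoutingCapability = "audio" | "image" | "video";

// Hypothetical provider shape; the real registry entry type is not shown here.
interface MediaProviderSketch {
  id: string;
  routingCapabilities?: RoutingCapability[];
  transcribeAudio?: unknown;
  describeImage?: unknown;
  describeVideo?: unknown;
}

const HANDLER_BY_CAPABILITY = {
  audio: "transcribeAudio",
  image: "describeImage",
  video: "describeVideo",
} as const;

// Treat a provider as a plugin handler only when it both declares the
// capability and implements the matching method.
function isPluginHandler(provider: MediaProviderSketch, cap: RoutingCapability): boolean {
  const declared = provider.routingCapabilities?.includes(cap) ?? false;
  const handler = provider[HANDLER_BY_CAPABILITY[cap]];
  return declared && typeof handler === "function";
}

// describeVideo exists, but "video" is not declared, so video dispatch is refused.
const sampleProvider: MediaProviderSketch = {
  id: "my-plugin",
  routingCapabilities: ["audio"],
  transcribeAudio: async () => ({ text: "" }),
  describeVideo: async () => ({ text: "" }),
};
const audioOk = isPluginHandler(sampleProvider, "audio");
const videoOk = isPluginHandler(sampleProvider, "video");
```

Checking the declaration as well as the method means a method that happens to be present cannot enable dispatch for a capability the plugin never declared.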
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2e3f83f960
…lookup
- R1: Remove 'microsoft' from TTS_PROVIDERS since 'edge' is canonical
  Both would normalize to 'microsoft' causing duplicate provider in fallback
- R2: Use findNormalizedProviderValue for plugin embedding config lookup
  This handles alias normalization (e.g., 'z.ai' -> 'zai') correctly

Refs: PR-52995
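Both R1 and R2 come down to comparing provider IDs after normalization. The sketch below assumes a simple alias table. `normalizeId`, `buildProviderOrder`, and `findByNormalizedKey` are hypothetical stand-ins, not the project's `normalizeProviderId` or `findNormalizedProviderValue`, and `openai` and `elevenlabs` are placeholder provider names.

```typescript
// Hypothetical alias table; only these two mappings are mentioned in the thread.
const ALIASES: Record<string, string> = { edge: "microsoft", "z.ai": "zai" };

// Stand-in normalizer: lower-case, trim, then resolve aliases.
function normalizeId(id: string): string {
  const lowered = id.trim().toLowerCase();
  return ALIASES[lowered] ?? lowered;
}

// Build a fallback order with duplicates removed after normalization (R1).
function buildProviderOrder(primary: string, candidates: string[]): string[] {
  const seen = new Set<string>();
  const order: string[] = [];
  for (const raw of [primary, ...candidates]) {
    const id = normalizeId(raw);
    if (!seen.has(id)) {
      seen.add(id);
      order.push(id);
    }
  }
  return order;
}

// Look up a config value by normalized key so aliases and case differences match (R2).
function findByNormalizedKey<T>(entries: Record<string, T>, providerId: string): T | undefined {
  const wanted = normalizeId(providerId);
  for (const [key, value] of Object.entries(entries)) {
    if (normalizeId(key) === wanted) return value;
  }
  return undefined;
}

// "Edge" and "microsoft" collapse into one entry.
const ttsOrder = buildProviderOrder("Edge", ["openai", "microsoft", "elevenlabs"]);
// A config keyed by "z.ai" is found when looking up "ZAI".
const zaiConfig = findByNormalizedKey({ "z.ai": { model: "placeholder" } }, "ZAI");
```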
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1256528a01
- S1: Fix missing backtick in docs/tools/plugin.md
- S2: Skip auth check for keyless plugin media providers
  Registry providers may be keyless - return entry directly without auth precheck
- S3: Add microsoft to builtinSet for plugin TTS fallback
  Edge normalizes to microsoft, so microsoft must be in builtin set
- S4: Push to issues not warnings for unknown memory providers
  This catches typos like 'olama' and enforces validation

Refs: PR-52995
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 426b6e009c
    const model = cap === "image" ? DEFAULT_IMAGE_MODELS[providerId] : undefined;
    return { type: "provider" as const, provider: providerId, model };
Iterate auto media candidates instead of returning first match
checkAllRegistryProviders returns on the first provider that exposes the capability, without any viability check. Because buildMediaUnderstandingRegistry inserts built-ins first (groq/deepgram), auto audio with no credentials will pick a built-in provider and never consider later keyless plugin providers, then fail in runProviderEntry when API key enforcement kicks in. Fresh evidence in this patch is the unconditional early return inside the new helper, which makes provider order—not capability/readiness—determine selection.
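One possible shape for the suggested fix is sketched below. The `RegistryEntrySketch` type, the `selectAutoProvider` function, the `isViable` callback, and the `local-whisper` ID are assumptions for illustration. The thread names only `checkAllRegistryProviders` and `runProviderEntry` and does not show their real signatures.

```typescript
type MediaCapability = "audio" | "image" | "video";

// Hypothetical registry entry; the real entry type is not shown in this thread.
interface RegistryEntrySketch {
  id: string;
  capabilities: MediaCapability[];
  isPlugin: boolean;
}

// Scan every provider that declares the capability and return the first one
// that passes a readiness check, rather than the first capability match.
function selectAutoProvider(
  providers: RegistryEntrySketch[],
  cap: MediaCapability,
  isViable: (entry: RegistryEntrySketch) => boolean,
): RegistryEntrySketch | null {
  for (const entry of providers) {
    if (!entry.capabilities.includes(cap)) continue;
    if (isViable(entry)) return entry;
  }
  return null;
}

// Built-ins are listed first, as in the scenario from the review comment.
const autoCandidates: RegistryEntrySketch[] = [
  { id: "groq", capabilities: ["audio"], isPlugin: false },
  { id: "deepgram", capabilities: ["audio"], isPlugin: false },
  { id: "local-whisper", capabilities: ["audio"], isPlugin: true },
];

// Assumed readiness rule for this sketch: with no credentials configured,
// only keyless plugin providers are usable.
const picked = selectAutoProvider(autoCandidates, "audio", (entry) => entry.isPlugin);
```

With the readiness check inside the loop, registry order still breaks ties but no longer decides whether a usable provider is found.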
- T1: provider-registry.ts - cfg contaminates shared cache
  Add !cfg check to cache-write guard
- T2: runner.ts - iterate plugin providers not built-ins for auto
  Only use plugin providers as fallback, skip built-ins
- T3: runner.entries.ts - use findNormalizedProviderValue for auth fallback
  Normalized lookup handles mixed-case provider keys

Refs: PR-52995
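The cache contamination in T1 can be shown with a toy registry. Everything below (`ConfigSketch`, `buildRegistry`, `getRegistry`, and the string-array registry) is a hypothetical simplification; only the idea of guarding both cache reads and cache writes with a `!cfg` check comes from the commit messages.

```typescript
interface ConfigSketch {
  providers?: string[];
}

let sharedCache: string[] | undefined;
let buildCount = 0;

// Toy builder: the registry is a list of provider IDs.
function buildRegistry(cfg?: ConfigSketch): string[] {
  buildCount += 1;
  return ["builtin", ...(cfg?.providers ?? [])];
}

// Reads and writes of the shared cache are both skipped when cfg is given,
// so a config-specific result is never served to a later default call.
function getRegistry(cfg?: ConfigSketch): string[] {
  if (!cfg && sharedCache) return sharedCache;
  const built = buildRegistry(cfg);
  if (!cfg) sharedCache = built;
  return built;
}

const scoped = getRegistry({ providers: ["my-plugin"] }); // built fresh, not cached
const shared = getRegistry(); // built once, then cached
```

Without the `!cfg` guard on the write, the `scoped` result would have been stored and returned by the default call.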
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e28aa0ae3f
    if (knownMemoryProviders.has(provider)) {
      return;
    }
    // Reject unknown providers - catches typos and enforces validation
    issues.push({
Allow plugin embedding IDs in memorySearch validation
This validator hard-codes a built-in provider allowlist and unconditionally flags any other memorySearch.provider as unknown, which blocks the new plugin embedding provider flow added in this commit. With provider/fallback now typed as free-form strings and createEmbeddingProvider supporting plugin IDs, configs like agents.defaults.memorySearch.provider: "my-plugin" fail validation before plugin loading, so plugin-backed memory search cannot be configured at all.
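One way to reconcile typo detection with plugin IDs is sketched here. It assumes the validator can be handed the set of provider IDs that plugins declare, which this thread does not confirm. The function name, the `ValidationIssue` shape, and the built-in list are placeholders.

```typescript
interface ValidationIssue {
  path: string;
  message: string;
}

// Placeholder allowlist; the real built-in set is not shown in this thread.
const BUILTIN_MEMORY_PROVIDERS = new Set(["openai", "ollama"]);

// Accept built-in IDs and plugin-declared IDs; flag everything else.
function validateMemoryProvider(
  provider: string,
  pluginProviderIds: ReadonlySet<string>,
  issues: ValidationIssue[],
): void {
  if (BUILTIN_MEMORY_PROVIDERS.has(provider) || pluginProviderIds.has(provider)) {
    return;
  }
  issues.push({
    path: "agents.defaults.memorySearch.provider",
    message: `Unknown memorySearch provider: ${provider}`,
  });
}

const foundIssues: ValidationIssue[] = [];
const pluginIds = new Set(["my-plugin"]);
validateMemoryProvider("my-plugin", pluginIds, foundIssues); // accepted: plugin-declared ID
validateMemoryProvider("olama", pluginIds, foundIssues); // rejected: typo of a built-in
```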
For some reason #41496 was accidentally closed by the bot :(
Summary
This PR adds a unified `registerProvider` API with capabilities to the plugin system, allowing plugins to provide custom providers for:
- Embeddings (`embed`, `embedBatch`, `embedBatchInputs` for multimodal)
- Audio transcription (`transcribeAudio`)
- Image understanding (`describeImage`)
- Video understanding (`describeVideo`)
- Text-to-speech (`textToSpeech`)

Key Changes
- Unified Provider API: Single `registerProvider()` method with optional `capabilities` array: `["chat", "embedding", "audio", "image", "video", "tts"]`
- Embedding Support: Plugins can provide embedding providers with `embed`, `embedBatch`, and `embedBatchInputs` (multimodal) methods. Supports custom provider IDs beyond built-ins.
- Media Provider Support: Plugins can provide audio transcription, image/video understanding, and TTS with full access to API keys, baseUrl, headers, and proxy-aware fetchFn.
- Type Definitions: Added capability types and method signatures to plugin-sdk. Includes plugin-specific request types (`PluginImageDescriptionRequest`, `PluginAudioTranscriptionRequest`, `PluginVideoDescriptionRequest`).
- Plugin SDK Exports: New types exported for plugin authors: `PluginImageDescriptionRequest`, `PluginAudioTranscriptionRequest`, `PluginVideoDescriptionRequest`, `EmbeddingProviderRequest` / `EmbeddingProviderFallback` (now supports custom string IDs)
- Provider ID Normalization: Uses `normalizeProviderId()` consistently for provider lookups to handle aliases (e.g., "z.ai" → "zai")
- Telephony TTS: Plugins can override built-in TTS providers for telephony with automatic fallback to built-in on failure
- Configuration Fallthrough: TTS plugins receive configured model/voice/headers from config when not overridden by directives
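A rough sketch of registering a multi-capability provider follows. The capability names and the `embed`, `embedBatch`, and `textToSpeech` method names come from the description above. The `ProviderPluginSketch` interface, the `SketchRegistry` class, and all request/result shapes are assumptions for illustration, not the actual plugin-sdk types.

```typescript
type ProviderCapability = "chat" | "embedding" | "audio" | "image" | "video" | "tts";

// Hypothetical provider shape; the real plugin-sdk types may differ.
interface ProviderPluginSketch {
  id: string;
  // Optional, matching the "optional capabilities array" in the description.
  capabilities?: ProviderCapability[];
  embed?: (req: { text: string }) => Promise<{ embedding: number[] }>;
  embedBatch?: (req: { texts: string[] }) => Promise<{ embeddings: number[][] }>;
  textToSpeech?: (req: { text: string }) => Promise<{ audio: Uint8Array }>;
}

// Stand-in for the plugin API object; only registerProvider() is named in the PR.
class SketchRegistry {
  private providers = new Map<string, ProviderPluginSketch>();

  registerProvider(provider: ProviderPluginSketch): void {
    this.providers.set(provider.id, provider);
  }

  // Plays the role the PR assigns to getPluginProvidersByCapability().
  byCapability(cap: ProviderCapability): ProviderPluginSketch[] {
    return Array.from(this.providers.values()).filter(
      (p) => p.capabilities?.includes(cap) ?? false,
    );
  }
}

// One registration covers both embedding and TTS (dummy implementations).
const registry = new SketchRegistry();
registry.registerProvider({
  id: "my-plugin",
  capabilities: ["embedding", "tts"],
  embed: async ({ text }) => ({ embedding: [text.length] }),
  embedBatch: async ({ texts }) => ({ embeddings: texts.map((t) => [t.length]) }),
  textToSpeech: async ({ text }) => ({ audio: new Uint8Array(text.length) }),
});
```

Declaring capabilities as data lets callers filter providers without probing for methods. A provider that omits the array matches no capability in this sketch.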
Architecture
- `getPluginProvidersByCapability()` filters plugins by capability

Risk Assessment
Low Risk:
Change Type
Scope
Security Impact
- No
- No
- No

Compatibility
- Yes - existing built-in providers unchanged
- None
- No