Conversation
adds spec for backend guidance
WalkthroughThis PR updates default configuration values across docs and runtime, adds config validation and endpoint defaulting, changes EndpointConfig.Priority to a pointer, fixes proxy path stripping, and adds a new backend implementation spec and related tests. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/content/configuration/practices/performance.md (1)
165-165:⚠️ Potential issue | 🟡 MinorPre-existing inconsistency: "5MB instead of 50MB" should read "5MB instead of 100MB".
This line wasn't changed in this PR, but
DefaultConfigsetsMaxBodySizeto100 * 1024 * 1024(100MB), and the FAQ was updated in this PR to correctly reference "default 100MB" (line 120 infaq.md). This creates an inconsistency within the documentation.Suggested fix
- max_body_size: 5242880 # 5MB instead of 50MB + max_body_size: 5242880 # 5MB instead of default 100MB🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/content/configuration/practices/performance.md` at line 165, Update the inline comment next to the max_body_size setting in practice docs so it correctly contrasts 5MB with the actual default of 100MB: change the text "5MB instead of 50MB" to "5MB instead of 100MB" to match DefaultConfig.MaxBodySize (100 * 1024 * 1024) and the FAQ update referencing the default 100MB.
🧹 Nitpick comments (3)
internal/app/handlers/handler_proxy_path_test.go (1)
74-74:mockDiscoveryServiceForTranslationis a repurposed mock — naming nit.The mock name implies it was authored for translation-specific tests. For this proxy path test, consider using a plainer name (e.g.,
mockDiscoveryService) so the test's intent is self-evident at a glance.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/app/handlers/handler_proxy_path_test.go` at line 74, Rename the repurposed mock type and its usages from mockDiscoveryServiceForTranslation to a neutral name like mockDiscoveryService to better reflect its reuse; update the instantiation in the test (the occurrence where discoveryService: &mockDiscoveryServiceForTranslation{} is used) and rename the underlying mock type declaration (mockDiscoveryServiceForTranslation) to mockDiscoveryService, then adjust any other references in tests or helpers to the new identifier so all references stay consistent.internal/adapter/discovery/service.go (1)
201-210: Fix inaccurateerrgroup.SetLimitcomment and prevent log spamming on every discovery cycle.Two issues:
Inaccurate comment:
SetLimit(0)deadlocks all goroutines rather than panic, andSetLimit(n < 0)removes the limit entirely. The panic only occurs whenSetLimitis called whilst goroutines are still active.Guard doesn't persist: Unlike the
Intervalguard inStart(which permanently setss.config.Interval), this guard only mutates the localworkerCountvariable. SincediscoverConcurrentlyis called every ticker cycle, theWarnlog fires on every discovery cycle ifConcurrentWorkersis misconfigured—potentially flooding logs.Move the guard to
Startinstead to normalise the config once:♻️ Proposed fix — normalise in `Start`, simplify `discoverConcurrently`
// in Start() if s.config.Interval <= 0 { s.logger.Warn("Discovery interval is zero or negative, defaulting to 5 minutes", "configured", s.config.Interval) s.config.Interval = 5 * time.Minute } + +// Guard against zero/negative workers — SetLimit(0) deadlocks all goroutines; +// negative unexpectedly removes the concurrency limit entirely. +if s.config.ConcurrentWorkers <= 0 { + s.logger.Warn("ConcurrentWorkers is zero or negative, defaulting to 1", "configured", s.config.ConcurrentWorkers) + s.config.ConcurrentWorkers = 1 +}func (s *ModelDiscoveryService) discoverConcurrently(ctx context.Context, endpoints []*domain.Endpoint) error { workerCount := s.config.ConcurrentWorkers - // Guard against zero from user config overrides — errgroup.SetLimit panics on <= 0. - if workerCount <= 0 { - s.logger.Warn("ConcurrentWorkers is zero or negative, defaulting to 1", "configured", workerCount) - workerCount = 1 - } if workerCount > len(endpoints) { workerCount = len(endpoints) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/adapter/discovery/service.go` around lines 201 - 210, Update the misleading comment about errgroup.SetLimit to state that SetLimit(0) deadlocks goroutines, SetLimit(n<0) removes the limit, and SetLimit only panics if called while goroutines are already running; then move the lower-bound guard for s.config.ConcurrentWorkers out of discoverConcurrently into Start so the config is normalized once (set s.config.ConcurrentWorkers = 1 when <= 0 and log a single warning there), and simplify discoverConcurrently to read the already-normalized workerCount (still cap to len(endpoints) without emitting the repeated warning).internal/config/config.go (1)
146-187: Solid runtime validation — catches the most dangerous zero/empty values early.The validation logic correctly prevents ticker panics (zero interval), immediate context expiry (zero timeout), and empty routing configuration. The conditional check on
ModelDiscovery.Enabledavoids false positives when discovery is intentionally disabled.Two observations:
Port upper bound not validated:
Server.Portonly checks> 0, butreference.mdline 814 documents "Ports must be in valid range (1-65535)". Consider adding|| c.Server.Port > 65535to catch this early rather than atnet.Listentime.
log.Printfvsslog: Line 179 uses stdliblog.Printfwhile the rest of the codebase usesslog. This is likely intentional (structured logger may not be configured yet at validation time), but a brief comment explaining the choice would help future readers.Optional: add port upper-bound check
- if c.Server.Port <= 0 { - return fmt.Errorf("server.port must be > 0, got %d", c.Server.Port) + if c.Server.Port <= 0 || c.Server.Port > 65535 { + return fmt.Errorf("server.port must be between 1 and 65535, got %d", c.Server.Port) }As per coding guidelines,
**/*.go: "Comment on why rather than what in Go code" — a brief comment on thelog.Printfchoice would satisfy this.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/config/config.go` around lines 146 - 187, The Validate method on Config omits an upper bound check for Server.Port and uses log.Printf instead of the project’s slog without explanation; update Config.Validate to reject ports outside 1–65535 (i.e., add a check for c.Server.Port > 65535 and return a clear fmt.Errorf) and add a short comment above the server.rate_limits.burst_size warning explaining why stdlib log.Printf is used there (or replace it with slog if you confirm slog is always initialized at this call site). Refer to the Config.Validate function, Server.Port, and the Server.RateLimits.BurstSize log call when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/spec/new-backend.md`:
- Around line 23-38: The two Markdown tables in the diff (the File/Purpose table
and the Modified Files table) are missing the required blank lines before and
after per MD058; update the document by inserting a blank line both immediately
before the first table header line starting with "| File | Purpose |" and after
the end of that table, and likewise insert a blank line before the "| File |
Change |" header and after its final row so both tables are separated from
surrounding text by blank lines.
- Around line 143-149: The example's path_indices mapping is inconsistent with
the api.paths ordering shown; update the path_indices values to match the
api.paths array order (so models, chat_completions, completions, embeddings,
health use indices corresponding to their positions in api.paths) or add a clear
note that path_indices must be customised per-backend to map each key (health,
models, chat_completions, completions, embeddings) to the correct index in that
backend's api.paths array; reference the path_indices block and the api.paths
ordering in the docs to make the required mapping explicit.
In `@internal/adapter/discovery/repository.go`:
- Around line 251-263: The applyEndpointDefaults function currently treats
Priority==0 as "unset" and overwrites an explicit zero, which conflicts with
validateEndpointConfig accepting Priority >= 0; update the behavior so explicit
0 is preserved: either change EndpointConfig.Priority to a nullable type (e.g.,
*int) and set the default only when Priority is nil, or use a sentinel like -1
and only apply DefaultPriority when Priority < 0; adjust applyEndpointDefaults
(and any callers/constructors) accordingly and ensure validateEndpointConfig
still accepts 0 and treats the sentinel/nil as "use default".
---
Outside diff comments:
In `@docs/content/configuration/practices/performance.md`:
- Line 165: Update the inline comment next to the max_body_size setting in
practice docs so it correctly contrasts 5MB with the actual default of 100MB:
change the text "5MB instead of 50MB" to "5MB instead of 100MB" to match
DefaultConfig.MaxBodySize (100 * 1024 * 1024) and the FAQ update referencing the
default 100MB.
---
Nitpick comments:
In `@internal/adapter/discovery/service.go`:
- Around line 201-210: Update the misleading comment about errgroup.SetLimit to
state that SetLimit(0) deadlocks goroutines, SetLimit(n<0) removes the limit,
and SetLimit only panics if called while goroutines are already running; then
move the lower-bound guard for s.config.ConcurrentWorkers out of
discoverConcurrently into Start so the config is normalized once (set
s.config.ConcurrentWorkers = 1 when <= 0 and log a single warning there), and
simplify discoverConcurrently to read the already-normalized workerCount (still
cap to len(endpoints) without emitting the repeated warning).
In `@internal/app/handlers/handler_proxy_path_test.go`:
- Line 74: Rename the repurposed mock type and its usages from
mockDiscoveryServiceForTranslation to a neutral name like mockDiscoveryService
to better reflect its reuse; update the instantiation in the test (the
occurrence where discoveryService: &mockDiscoveryServiceForTranslation{} is
used) and rename the underlying mock type declaration
(mockDiscoveryServiceForTranslation) to mockDiscoveryService, then adjust any
other references in tests or helpers to the new identifier so all references
stay consistent.
In `@internal/config/config.go`:
- Around line 146-187: The Validate method on Config omits an upper bound check
for Server.Port and uses log.Printf instead of the project’s slog without
explanation; update Config.Validate to reject ports outside 1–65535 (i.e., add a
check for c.Server.Port > 65535 and return a clear fmt.Errorf) and add a short
comment above the server.rate_limits.burst_size warning explaining why stdlib
log.Printf is used there (or replace it with slog if you confirm slog is always
initialized at this call site). Refer to the Config.Validate function,
Server.Port, and the Server.RateLimits.BurstSize log call when making the
change.
| | File | Purpose | | ||
| |------|---------| | ||
| | `config/profiles/{backend}.yaml` | Profile configuration | | ||
| | `internal/adapter/registry/profile/{backend}.go` | Response type definitions | | ||
| | `internal/adapter/registry/profile/{backend}_parser.go` | Parser implementation | | ||
| | `internal/adapter/registry/profile/{backend}_parser_test.go` | Parser tests | | ||
| | `internal/adapter/converter/{backend}_converter.go` | Converter implementation | | ||
| | `internal/adapter/converter/{backend}_converter_test.go` | Converter tests | | ||
|
|
||
| ### Modified Files | ||
| | File | Change | | ||
| |------|--------| | ||
| | `internal/core/constants/providers.go` | Add provider type constant | | ||
| | `internal/core/domain/profile.go` | Add profile constant | | ||
| | `internal/adapter/registry/profile/parsers.go` | Register parser | | ||
| | `internal/adapter/converter/factory.go` | Register converter | |
There was a problem hiding this comment.
Add blank lines around Markdown tables to satisfy MD058.
Both tables lack the required blank line before and after, as reported by markdownlint.
📝 Proposed fix
### New Files
+
| File | Purpose |
|------|---------|
| `config/profiles/{backend}.yaml` | Profile configuration |
| `internal/adapter/registry/profile/{backend}.go` | Response type definitions |
| `internal/adapter/registry/profile/{backend}_parser.go` | Parser implementation |
| `internal/adapter/registry/profile/{backend}_parser_test.go` | Parser tests |
| `internal/adapter/converter/{backend}_converter.go` | Converter implementation |
| `internal/adapter/converter/{backend}_converter_test.go` | Converter tests |
+
### Modified Files
+
| File | Change |
|------|--------|
| `internal/core/constants/providers.go` | Add provider type constant |
| `internal/core/domain/profile.go` | Add profile constant |
| `internal/adapter/registry/profile/parsers.go` | Register parser |
| `internal/adapter/converter/factory.go` | Register converter |
+🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 23-23: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
[warning] 33-33: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/spec/new-backend.md` around lines 23 - 38, The two Markdown tables in
the diff (the File/Purpose table and the Modified Files table) are missing the
required blank lines before and after per MD058; update the document by
inserting a blank line both immediately before the first table header line
starting with "| File | Purpose |" and after the end of that table, and likewise
insert a blank line before the "| File | Change |" header and after its final
row so both tables are separated from surrounding text by blank lines.
There was a problem hiding this comment.
🧹 Nitpick comments (3)
internal/config/config.go (1)
180-184: Useslog.Warninstead oflog.Printffor consistency with the rest of the codebase.The rest of this file (and codebase) uses
log/slogfor structured logging (e.g.slog.Errorat line 260). Mixing in the unstructuredlog.Printfhere means this warning bypasses the configured log level, format (JSON), and output destination.♻️ Suggested fix
- if c.Server.RateLimits.BurstSize == 0 { - log.Printf("WARNING: server.rate_limits.burst_size is 0 — all requests will be throttled immediately") - } + if c.Server.RateLimits.BurstSize == 0 { + slog.Warn("server.rate_limits.burst_size is 0 — all requests will be throttled immediately") + }This also lets you drop the
"log"import on line 5 if it's no longer used elsewhere.As per coding guidelines, "Comment on why rather than what in Go code" — the existing inline comment on lines 180–181 already explains why this is a warning rather than an error, which is good. The concern here is purely about the logging mechanism.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/config/config.go` around lines 180 - 184, Replace the unstructured log.Printf warning with the structured slog.Warn call so the message respects configured log level/format/output: when checking c.Server.RateLimits.BurstSize == 0, call slog.Warn (include a descriptive message and any relevant attributes if desired) instead of log.Printf, and remove the "log" import if it becomes unused; modify the code that references c.Server.RateLimits.BurstSize accordingly to use slog.Warn for consistency with other uses like slog.Error.internal/adapter/discovery/repository_test.go (1)
141-158: Shallow copy ofEndpointConfigshares thePrioritypointer across subtests.Line 145 copies
tc.cfgby value, butPriority *intis a pointer — bothcfgandtc.cfgwill reference the sameint. This is safe today becauseapplyEndpointDefaultsonly writes a new pointer whenPriority == nil(never mutates through the existing pointer). However, ifapplyEndpointDefaultsever changes to modify*cfg.Priorityin-place, this will silently corrupt the table-driven test data.A defensive deep copy is trivial and prevents a future foot-gun:
🛡️ Suggested defensive copy
t.Run(tc.name, func(t *testing.T) { t.Parallel() - cfg := tc.cfg // copy to avoid mutation across subtests + cfg := tc.cfg // shallow copy + if cfg.Priority != nil { + p := *cfg.Priority + cfg.Priority = &p + } applyEndpointDefaults(&cfg)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/adapter/discovery/repository_test.go` around lines 141 - 158, The test copies tc.cfg by value but leaves the Priority *int pointer shared; update the subtest setup (before calling applyEndpointDefaults) to make a defensive deep copy of the Priority pointer so each subtest has its own int (e.g., copy the int value into a new variable and assign its address to cfg.Priority if tc.cfg.Priority != nil) to avoid mutating shared state when running applyEndpointDefaults or future changes to it; keep using cfg and applyEndpointDefaults(&cfg) as before.docs/spec/new-backend.md (1)
3-6: Minor: Clarify or future-proof the model version references.Line 6 references "Opus 4.5 | 4.6 / Sonnet 4.5 | 4.6" — these will become stale as newer models are released. Consider either removing the specific version list or noting when this was last verified (e.g. "Last verified: Feb 2026 with …").
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/spec/new-backend.md` around lines 3 - 6, Update the model-version mention "Opus 4.5 | 4.6 / Sonnet 4.5 | 4.6" to be future-proof by either removing the explicit versions or adding a validation timestamp; for example, replace that token with a version-agnostic phrase like "Verified on Opus / Sonnet" or append "Last verified: <YYYY-MM-DD> with Opus 4.6, Sonnet 4.6" so the `docs/spec/new-backend.md` text remains accurate over time.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@docs/spec/new-backend.md`:
- Around line 23-38: The two Markdown tables in docs/spec/new-backend.md (the
first listing File/Purpose block and the Modified Files block) violate MD058 by
lacking blank lines before and after each table; edit docs/spec/new-backend.md
to insert a blank line immediately before and immediately after each table so
both tables are separated from surrounding paragraphs/headings, preserving the
exact table content and alignment.
---
Nitpick comments:
In `@docs/spec/new-backend.md`:
- Around line 3-6: Update the model-version mention "Opus 4.5 | 4.6 / Sonnet 4.5
| 4.6" to be future-proof by either removing the explicit versions or adding a
validation timestamp; for example, replace that token with a version-agnostic
phrase like "Verified on Opus / Sonnet" or append "Last verified: <YYYY-MM-DD>
with Opus 4.6, Sonnet 4.6" so the `docs/spec/new-backend.md` text remains
accurate over time.
In `@internal/adapter/discovery/repository_test.go`:
- Around line 141-158: The test copies tc.cfg by value but leaves the Priority
*int pointer shared; update the subtest setup (before calling
applyEndpointDefaults) to make a defensive deep copy of the Priority pointer so
each subtest has its own int (e.g., copy the int value into a new variable and
assign its address to cfg.Priority if tc.cfg.Priority != nil) to avoid mutating
shared state when running applyEndpointDefaults or future changes to it; keep
using cfg and applyEndpointDefaults(&cfg) as before.
In `@internal/config/config.go`:
- Around line 180-184: Replace the unstructured log.Printf warning with the
structured slog.Warn call so the message respects configured log
level/format/output: when checking c.Server.RateLimits.BurstSize == 0, call
slog.Warn (include a descriptive message and any relevant attributes if desired)
instead of log.Printf, and remove the "log" import if it becomes unused; modify
the code that references c.Server.RateLimits.BurstSize accordingly to use
slog.Warn for consistency with other uses like slog.Error.
Address a few things:
/olla/proxypathing for remote nodesSummary by CodeRabbit