Skip to content

fix: Proxy Path issue & sensible defaults#107

Merged
thushan merged 5 commits intomainfrom
fix/proxy-path
Feb 20, 2026
Merged

fix: Proxy Path issue & sensible defaults#107
thushan merged 5 commits intomainfrom
fix/proxy-path

Conversation

@thushan
Copy link
Owner

@thushan thushan commented Feb 20, 2026

Address a few things:

  • Resolution of /olla/proxy pathing for remote nodes
  • Sensible defaults
  • Documentation + spec for backend for AI usage

Summary by CodeRabbit

  • Documentation
    • Updated defaults and guidance: health-check interval 5s, timeout 2s; endpoint priority default 100; model discovery now documented as enabled by default; retry/backoff and cleanup defaults clarified; improved performance optimisation and quickstart notes; added a new guide for adding backends.
  • Bug Fixes
    • Fixed proxy path handling so requests forward with the correct target path.
    • Safer runtime validation and sensible defaults for discovery, timeouts and priorities to avoid misconfiguration panics.

@coderabbitai
Copy link

coderabbitai bot commented Feb 20, 2026

Walkthrough

This PR updates default configuration values across docs and runtime, adds config validation and endpoint defaulting, changes EndpointConfig.Priority to a pointer, fixes proxy path stripping, and adds a new backend implementation spec and related tests.

Changes

Cohort / File(s) Summary
Documentation – Defaults & Configuration
docs/content/concepts/health-checking.md, docs/content/concepts/model-unification.md, docs/content/configuration/overview.md, docs/content/configuration/practices/performance.md, docs/content/configuration/reference.md, docs/content/faq.md, docs/content/getting-started/quickstart.md, docs/content/integrations/backend/ollama.md
Updated docs to reflect new defaults (e.g., check_interval 5s, check_timeout 2s, concurrent_workers 5, priority 100, retry_backoff 1s, cleanup_interval 5m) and clarified which fields are optional.
New Backend Implementation Guide
docs/spec/new-backend.md
New comprehensive guide detailing how to add a new backend (profiles, parsers, converters, constants, repo layout, examples, tests).
Endpoint Defaults & Priority pointer
internal/adapter/discovery/repository.go, internal/adapter/discovery/repository_test.go, internal/adapter/discovery/repository_benchmark_test.go, internal/config/types.go
Added exported defaults (DefaultCheckInterval, DefaultCheckTimeout, DefaultPriority) and applyEndpointDefaults to populate zero-value fields before validation. Changed EndpointConfig.Priority from int to *int and updated tests/benchmarks accordingly.
Configuration Validation & ModelDiscovery
internal/config/config.go, internal/config/config_test.go, internal/config/config_test.go
Added ModelDiscoveryConfig, Config.Validate() and TranslatorsConfig.Validate(); integrated validation into load flow. Extended tests for validation and defaults. Introduced helper ptrInt(int) *int and updated DefaultConfig to use pointer priorities.
Discovery & Routing Safeguards
internal/adapter/discovery/service.go, internal/adapter/registry/routing/discovery_strategy.go
Added guards/defaulting for invalid discovery intervals, concurrent workers and discovery timeout with warning logs to avoid panics and ensure sane runtime values.
Proxy Handler Path Fix & Tests
internal/app/handlers/handler_proxy.go, internal/app/handlers/handler_proxy_path_test.go
Strip route prefix from r.URL.Path before forwarding in proxyHandler; added TestProxyPathStripping to assert correct forwarded path handling.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

bug, documentation, enhancement

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the two main changes: fixing a proxy path issue and introducing sensible default configuration values.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/proxy-path

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/content/configuration/practices/performance.md (1)

165-165: ⚠️ Potential issue | 🟡 Minor

Pre-existing inconsistency: "5MB instead of 50MB" should read "5MB instead of 100MB".

This line wasn't changed in this PR, but DefaultConfig sets MaxBodySize to 100 * 1024 * 1024 (100MB), and the FAQ was updated in this PR to correctly reference "default 100MB" (line 120 in faq.md). This creates an inconsistency within the documentation.

Suggested fix
-    max_body_size: 5242880    # 5MB instead of 50MB
+    max_body_size: 5242880    # 5MB instead of default 100MB
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/content/configuration/practices/performance.md` at line 165, Update the
inline comment next to the max_body_size setting in practice docs so it
correctly contrasts 5MB with the actual default of 100MB: change the text "5MB
instead of 50MB" to "5MB instead of 100MB" to match DefaultConfig.MaxBodySize
(100 * 1024 * 1024) and the FAQ update referencing the default 100MB.
🧹 Nitpick comments (3)
internal/app/handlers/handler_proxy_path_test.go (1)

74-74: mockDiscoveryServiceForTranslation is a repurposed mock — naming nit.

The mock name implies it was authored for translation-specific tests. For this proxy path test, consider using a plainer name (e.g., mockDiscoveryService) so the test's intent is self-evident at a glance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/app/handlers/handler_proxy_path_test.go` at line 74, Rename the
repurposed mock type and its usages from mockDiscoveryServiceForTranslation to a
neutral name like mockDiscoveryService to better reflect its reuse; update the
instantiation in the test (the occurrence where discoveryService:
&mockDiscoveryServiceForTranslation{} is used) and rename the underlying mock
type declaration (mockDiscoveryServiceForTranslation) to mockDiscoveryService,
then adjust any other references in tests or helpers to the new identifier so
all references stay consistent.
internal/adapter/discovery/service.go (1)

201-210: Fix inaccurate errgroup.SetLimit comment and prevent log spamming on every discovery cycle.

Two issues:

  1. Inaccurate comment: SetLimit(0) deadlocks all goroutines rather than panic, and SetLimit(n < 0) removes the limit entirely. The panic only occurs when SetLimit is called whilst goroutines are still active.

  2. Guard doesn't persist: Unlike the Interval guard in Start (which permanently sets s.config.Interval), this guard only mutates the local workerCount variable. Since discoverConcurrently is called every ticker cycle, the Warn log fires on every discovery cycle if ConcurrentWorkers is misconfigured—potentially flooding logs.

Move the guard to Start instead to normalise the config once:

♻️ Proposed fix — normalise in `Start`, simplify `discoverConcurrently`
 // in Start()
 if s.config.Interval <= 0 {
     s.logger.Warn("Discovery interval is zero or negative, defaulting to 5 minutes", "configured", s.config.Interval)
     s.config.Interval = 5 * time.Minute
 }
+
+// Guard against zero/negative workers — SetLimit(0) deadlocks all goroutines;
+// negative unexpectedly removes the concurrency limit entirely.
+if s.config.ConcurrentWorkers <= 0 {
+    s.logger.Warn("ConcurrentWorkers is zero or negative, defaulting to 1", "configured", s.config.ConcurrentWorkers)
+    s.config.ConcurrentWorkers = 1
+}
 func (s *ModelDiscoveryService) discoverConcurrently(ctx context.Context, endpoints []*domain.Endpoint) error {
     workerCount := s.config.ConcurrentWorkers
-    // Guard against zero from user config overrides — errgroup.SetLimit panics on <= 0.
-    if workerCount <= 0 {
-        s.logger.Warn("ConcurrentWorkers is zero or negative, defaulting to 1", "configured", workerCount)
-        workerCount = 1
-    }
     if workerCount > len(endpoints) {
         workerCount = len(endpoints)
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/adapter/discovery/service.go` around lines 201 - 210, Update the
misleading comment about errgroup.SetLimit to state that SetLimit(0) deadlocks
goroutines, SetLimit(n<0) removes the limit, and SetLimit only panics if called
while goroutines are already running; then move the lower-bound guard for
s.config.ConcurrentWorkers out of discoverConcurrently into Start so the config
is normalized once (set s.config.ConcurrentWorkers = 1 when <= 0 and log a
single warning there), and simplify discoverConcurrently to read the
already-normalized workerCount (still cap to len(endpoints) without emitting the
repeated warning).
internal/config/config.go (1)

146-187: Solid runtime validation — catches the most dangerous zero/empty values early.

The validation logic correctly prevents ticker panics (zero interval), immediate context expiry (zero timeout), and empty routing configuration. The conditional check on ModelDiscovery.Enabled avoids false positives when discovery is intentionally disabled.

Two observations:

  1. Port upper bound not validated: Server.Port only checks > 0, but reference.md line 814 documents "Ports must be in valid range (1-65535)". Consider adding || c.Server.Port > 65535 to catch this early rather than at net.Listen time.

  2. log.Printf vs slog: Line 179 uses stdlib log.Printf while the rest of the codebase uses slog. This is likely intentional (structured logger may not be configured yet at validation time), but a brief comment explaining the choice would help future readers.

Optional: add port upper-bound check
-	if c.Server.Port <= 0 {
-		return fmt.Errorf("server.port must be > 0, got %d", c.Server.Port)
+	if c.Server.Port <= 0 || c.Server.Port > 65535 {
+		return fmt.Errorf("server.port must be between 1 and 65535, got %d", c.Server.Port)
 	}

As per coding guidelines, **/*.go: "Comment on why rather than what in Go code" — a brief comment on the log.Printf choice would satisfy this.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/config/config.go` around lines 146 - 187, The Validate method on
Config omits an upper bound check for Server.Port and uses log.Printf instead of
the project’s slog without explanation; update Config.Validate to reject ports
outside 1–65535 (i.e., add a check for c.Server.Port > 65535 and return a clear
fmt.Errorf) and add a short comment above the server.rate_limits.burst_size
warning explaining why stdlib log.Printf is used there (or replace it with slog
if you confirm slog is always initialized at this call site). Refer to the
Config.Validate function, Server.Port, and the Server.RateLimits.BurstSize log
call when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/spec/new-backend.md`:
- Around line 23-38: The two Markdown tables in the diff (the File/Purpose table
and the Modified Files table) are missing the required blank lines before and
after per MD058; update the document by inserting a blank line both immediately
before the first table header line starting with "| File | Purpose |" and after
the end of that table, and likewise insert a blank line before the "| File |
Change |" header and after its final row so both tables are separated from
surrounding text by blank lines.
- Around line 143-149: The example's path_indices mapping is inconsistent with
the api.paths ordering shown; update the path_indices values to match the
api.paths array order (so models, chat_completions, completions, embeddings,
health use indices corresponding to their positions in api.paths) or add a clear
note that path_indices must be customised per-backend to map each key (health,
models, chat_completions, completions, embeddings) to the correct index in that
backend's api.paths array; reference the path_indices block and the api.paths
ordering in the docs to make the required mapping explicit.

In `@internal/adapter/discovery/repository.go`:
- Around line 251-263: The applyEndpointDefaults function currently treats
Priority==0 as "unset" and overwrites an explicit zero, which conflicts with
validateEndpointConfig accepting Priority >= 0; update the behavior so explicit
0 is preserved: either change EndpointConfig.Priority to a nullable type (e.g.,
*int) and set the default only when Priority is nil, or use a sentinel like -1
and only apply DefaultPriority when Priority < 0; adjust applyEndpointDefaults
(and any callers/constructors) accordingly and ensure validateEndpointConfig
still accepts 0 and treats the sentinel/nil as "use default".

---

Outside diff comments:
In `@docs/content/configuration/practices/performance.md`:
- Line 165: Update the inline comment next to the max_body_size setting in
practice docs so it correctly contrasts 5MB with the actual default of 100MB:
change the text "5MB instead of 50MB" to "5MB instead of 100MB" to match
DefaultConfig.MaxBodySize (100 * 1024 * 1024) and the FAQ update referencing the
default 100MB.

---

Nitpick comments:
In `@internal/adapter/discovery/service.go`:
- Around line 201-210: Update the misleading comment about errgroup.SetLimit to
state that SetLimit(0) deadlocks goroutines, SetLimit(n<0) removes the limit,
and SetLimit only panics if called while goroutines are already running; then
move the lower-bound guard for s.config.ConcurrentWorkers out of
discoverConcurrently into Start so the config is normalized once (set
s.config.ConcurrentWorkers = 1 when <= 0 and log a single warning there), and
simplify discoverConcurrently to read the already-normalized workerCount (still
cap to len(endpoints) without emitting the repeated warning).

In `@internal/app/handlers/handler_proxy_path_test.go`:
- Line 74: Rename the repurposed mock type and its usages from
mockDiscoveryServiceForTranslation to a neutral name like mockDiscoveryService
to better reflect its reuse; update the instantiation in the test (the
occurrence where discoveryService: &mockDiscoveryServiceForTranslation{} is
used) and rename the underlying mock type declaration
(mockDiscoveryServiceForTranslation) to mockDiscoveryService, then adjust any
other references in tests or helpers to the new identifier so all references
stay consistent.

In `@internal/config/config.go`:
- Around line 146-187: The Validate method on Config omits an upper bound check
for Server.Port and uses log.Printf instead of the project’s slog without
explanation; update Config.Validate to reject ports outside 1–65535 (i.e., add a
check for c.Server.Port > 65535 and return a clear fmt.Errorf) and add a short
comment above the server.rate_limits.burst_size warning explaining why stdlib
log.Printf is used there (or replace it with slog if you confirm slog is always
initialized at this call site). Refer to the Config.Validate function,
Server.Port, and the Server.RateLimits.BurstSize log call when making the
change.

Comment on lines +23 to +38
| File | Purpose |
|------|---------|
| `config/profiles/{backend}.yaml` | Profile configuration |
| `internal/adapter/registry/profile/{backend}.go` | Response type definitions |
| `internal/adapter/registry/profile/{backend}_parser.go` | Parser implementation |
| `internal/adapter/registry/profile/{backend}_parser_test.go` | Parser tests |
| `internal/adapter/converter/{backend}_converter.go` | Converter implementation |
| `internal/adapter/converter/{backend}_converter_test.go` | Converter tests |

### Modified Files
| File | Change |
|------|--------|
| `internal/core/constants/providers.go` | Add provider type constant |
| `internal/core/domain/profile.go` | Add profile constant |
| `internal/adapter/registry/profile/parsers.go` | Register parser |
| `internal/adapter/converter/factory.go` | Register converter |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add blank lines around Markdown tables to satisfy MD058.

Both tables lack the required blank line before and after, as reported by markdownlint.

📝 Proposed fix
 ### New Files
+
 | File | Purpose |
 |------|---------|
 | `config/profiles/{backend}.yaml` | Profile configuration |
 | `internal/adapter/registry/profile/{backend}.go` | Response type definitions |
 | `internal/adapter/registry/profile/{backend}_parser.go` | Parser implementation |
 | `internal/adapter/registry/profile/{backend}_parser_test.go` | Parser tests |
 | `internal/adapter/converter/{backend}_converter.go` | Converter implementation |
 | `internal/adapter/converter/{backend}_converter_test.go` | Converter tests |
+
 ### Modified Files
+
 | File | Change |
 |------|--------|
 | `internal/core/constants/providers.go` | Add provider type constant |
 | `internal/core/domain/profile.go` | Add profile constant |
 | `internal/adapter/registry/profile/parsers.go` | Register parser |
 | `internal/adapter/converter/factory.go` | Register converter |
+
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 23-23: Tables should be surrounded by blank lines

(MD058, blanks-around-tables)


[warning] 33-33: Tables should be surrounded by blank lines

(MD058, blanks-around-tables)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/spec/new-backend.md` around lines 23 - 38, The two Markdown tables in
the diff (the File/Purpose table and the Modified Files table) are missing the
required blank lines before and after per MD058; update the document by
inserting a blank line both immediately before the first table header line
starting with "| File | Purpose |" and after the end of that table, and likewise
insert a blank line before the "| File | Change |" header and after its final
row so both tables are separated from surrounding text by blank lines.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
internal/config/config.go (1)

180-184: Use slog.Warn instead of log.Printf for consistency with the rest of the codebase.

The rest of this file (and codebase) uses log/slog for structured logging (e.g. slog.Error at line 260). Mixing in the unstructured log.Printf here means this warning bypasses the configured log level, format (JSON), and output destination.

♻️ Suggested fix
-	if c.Server.RateLimits.BurstSize == 0 {
-		log.Printf("WARNING: server.rate_limits.burst_size is 0 — all requests will be throttled immediately")
-	}
+	if c.Server.RateLimits.BurstSize == 0 {
+		slog.Warn("server.rate_limits.burst_size is 0 — all requests will be throttled immediately")
+	}

This also lets you drop the "log" import on line 5 if it's no longer used elsewhere.

As per coding guidelines, "Comment on why rather than what in Go code" — the existing inline comment on lines 180–181 already explains why this is a warning rather than an error, which is good. The concern here is purely about the logging mechanism.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/config/config.go` around lines 180 - 184, Replace the unstructured
log.Printf warning with the structured slog.Warn call so the message respects
configured log level/format/output: when checking c.Server.RateLimits.BurstSize
== 0, call slog.Warn (include a descriptive message and any relevant attributes
if desired) instead of log.Printf, and remove the "log" import if it becomes
unused; modify the code that references c.Server.RateLimits.BurstSize
accordingly to use slog.Warn for consistency with other uses like slog.Error.
internal/adapter/discovery/repository_test.go (1)

141-158: Shallow copy of EndpointConfig shares the Priority pointer across subtests.

Line 145 copies tc.cfg by value, but Priority *int is a pointer — both cfg and tc.cfg will reference the same int. This is safe today because applyEndpointDefaults only writes a new pointer when Priority == nil (never mutates through the existing pointer). However, if applyEndpointDefaults ever changes to modify *cfg.Priority in-place, this will silently corrupt the table-driven test data.

A defensive deep copy is trivial and prevents a future foot-gun:

🛡️ Suggested defensive copy
 		t.Run(tc.name, func(t *testing.T) {
 			t.Parallel()
 
-			cfg := tc.cfg // copy to avoid mutation across subtests
+			cfg := tc.cfg // shallow copy
+			if cfg.Priority != nil {
+				p := *cfg.Priority
+				cfg.Priority = &p
+			}
 			applyEndpointDefaults(&cfg)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/adapter/discovery/repository_test.go` around lines 141 - 158, The
test copies tc.cfg by value but leaves the Priority *int pointer shared; update
the subtest setup (before calling applyEndpointDefaults) to make a defensive
deep copy of the Priority pointer so each subtest has its own int (e.g., copy
the int value into a new variable and assign its address to cfg.Priority if
tc.cfg.Priority != nil) to avoid mutating shared state when running
applyEndpointDefaults or future changes to it; keep using cfg and
applyEndpointDefaults(&cfg) as before.
docs/spec/new-backend.md (1)

3-6: Minor: Clarify or future-proof the model version references.

Line 6 references "Opus 4.5 | 4.6 / Sonnet 4.5 | 4.6" — these will become stale as newer models are released. Consider either removing the specific version list or noting when this was last verified (e.g. "Last verified: Feb 2026 with …").

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/spec/new-backend.md` around lines 3 - 6, Update the model-version
mention "Opus 4.5 | 4.6 / Sonnet 4.5 | 4.6" to be future-proof by either
removing the explicit versions or adding a validation timestamp; for example,
replace that token with a version-agnostic phrase like "Verified on Opus /
Sonnet" or append "Last verified: <YYYY-MM-DD> with Opus 4.6, Sonnet 4.6" so the
`docs/spec/new-backend.md` text remains accurate over time.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@docs/spec/new-backend.md`:
- Around line 23-38: The two Markdown tables in docs/spec/new-backend.md (the
first listing File/Purpose block and the Modified Files block) violate MD058 by
lacking blank lines before and after each table; edit docs/spec/new-backend.md
to insert a blank line immediately before and immediately after each table so
both tables are separated from surrounding paragraphs/headings, preserving the
exact table content and alignment.

---

Nitpick comments:
In `@docs/spec/new-backend.md`:
- Around line 3-6: Update the model-version mention "Opus 4.5 | 4.6 / Sonnet 4.5
| 4.6" to be future-proof by either removing the explicit versions or adding a
validation timestamp; for example, replace that token with a version-agnostic
phrase like "Verified on Opus / Sonnet" or append "Last verified: <YYYY-MM-DD>
with Opus 4.6, Sonnet 4.6" so the `docs/spec/new-backend.md` text remains
accurate over time.

In `@internal/adapter/discovery/repository_test.go`:
- Around line 141-158: The test copies tc.cfg by value but leaves the Priority
*int pointer shared; update the subtest setup (before calling
applyEndpointDefaults) to make a defensive deep copy of the Priority pointer so
each subtest has its own int (e.g., copy the int value into a new variable and
assign its address to cfg.Priority if tc.cfg.Priority != nil) to avoid mutating
shared state when running applyEndpointDefaults or future changes to it; keep
using cfg and applyEndpointDefaults(&cfg) as before.

In `@internal/config/config.go`:
- Around line 180-184: Replace the unstructured log.Printf warning with the
structured slog.Warn call so the message respects configured log
level/format/output: when checking c.Server.RateLimits.BurstSize == 0, call
slog.Warn (include a descriptive message and any relevant attributes if desired)
instead of log.Printf, and remove the "log" import if it becomes unused; modify
the code that references c.Server.RateLimits.BurstSize accordingly to use
slog.Warn for consistency with other uses like slog.Error.

@thushan thushan merged commit e4cea91 into main Feb 20, 2026
8 checks passed
@thushan thushan deleted the fix/proxy-path branch February 20, 2026 09:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant