Skip to content

fix(integrations): strip inference path from backend URL (#2075)#2101

Merged
M3gA-Mind merged 3 commits into
tinyhumansai:mainfrom
obchain:fix/2075-composio-url-normalization
May 19, 2026
Merged

fix(integrations): strip inference path from backend URL (#2075)#2101
M3gA-Mind merged 3 commits into
tinyhumansai:mainfrom
obchain:fix/2075-composio-url-normalization

Conversation

@obchain
Copy link
Copy Markdown
Contributor

@obchain obchain commented May 18, 2026

Summary

  • Strip inference-style paths (/openai/v1/chat/completions) from BACKEND_URL env / compile-time fallback before it becomes any integration call's prefix.
  • Recover scheme-less inputs (api.tinyhumans.ai/...) in normalize_backend_api_base_url so a misbaked env still normalizes.
  • Re-sanitize backend_url inside IntegrationClient::new and warn! on first fix-up — defense-in-depth + runtime detector.

Problem

Sentry OPENHUMAN-TAURI-H6 (103 events) and OPENHUMAN-TAURI-HN (50 events) — 153 affected users could never load or manage Composio integrations:

GET https://api.tinyhumans.ai/openai/v1/chat/completions/agent-integrations/composio/connections → 404
GET https://openrouter.ai/api/v1/chat/completions/agent-integrations/composio/connections       → 404

effective_backend_api_url's override branch already normalized via normalize_backend_api_base_url, but the env / compile-time branch returned the raw value, so a misconfigured BACKEND_URL=https://api.tinyhumans.ai/openai/v1/chat/completions (the cited prod misbake) flowed straight into every integration request as the prefix. list_connections, delete_connection, and every other IntegrationClient operation 404'd.

Solution

Three-layer fix:

  1. src/api/config.rseffective_backend_api_url: route api_base_from_env() through normalize_backend_api_base_url before returning, closing the env-fallback bypass.
  2. normalize_backend_api_base_url: promoted to pub(crate); on parse failure (scheme-less input) retries with https:// prefix so the path can still be stripped. Empty input round-trips unchanged so callers' "no backend" sentinel is not overwritten.
  3. src/openhuman/integrations/client.rsIntegrationClient::new: re-runs normalize_backend_api_base_url on the constructor argument and warn!s once when input had to be fixed. Holds the local invariant ("backend_url has no inference path") even if a future caller forgets the resolver, and emits the diagnostic the issue asked for.

The Sentry HN case (openrouter.ai/...) is additionally caught by the pre-existing looks_like_local_ai_endpoint heuristic (path-end match works on any host); the new sanitize layer guarantees correct behavior even if the heuristic is bypassed.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — every new branch in the three touched functions has a dedicated unit test (8 new tests cover env-bake, scheme-less, idempotent-clean, empty-input, sanitize variants, and IntegrationClient::new invariant).
  • Coverage matrix updated — N/A: behaviour-only change (bug fix, no new/removed/renamed feature row).
  • All affected feature IDs from the matrix are listed in the PR description under ## RelatedN/A: behaviour-only change.
  • No new external network dependencies introduced (mock backend used per Testing Strategy).
  • Manual smoke checklist updated if this touches release-cut surfaces — N/A: bug fix in URL normalization, no release-cut surface touched.
  • Linked issue closed via Closes #NNN in the ## Related section.

Impact

  • Runtime / platform: Rust core only. Affects every desktop platform that ships the core. No frontend / Tauri shell changes.
  • Performance: O(1) URL parse on IntegrationClient::new (once per construction). Negligible.
  • Security: none — strictly defensive URL hygiene.
  • Migration / compatibility: backward-compatible. Clean URLs round-trip unchanged; misconfigured URLs are silently fixed up with a warn! so the regression is observable.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Keep this section for AI-authored PRs. For human-only PRs, mark each field N/A.

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: N/A
  • Commit SHA: N/A

Validation Run

  • pnpm --filter openhuman-app format:check — N/A: no app/ changes
  • pnpm typecheck — N/A: no app/ changes
  • Focused tests: N/A
  • Rust fmt/check (if changed): N/A
  • Tauri fmt/check (if changed): N/A

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: N/A
  • User-visible effect: N/A

Parity Contract

  • Legacy behavior preserved: N/A
  • Guard/fallback/dispatch parity checks: N/A

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: N/A
  • Resolution (closed/superseded/updated): N/A

Summary by CodeRabbit

  • Bug Fixes
    • Backend URL handling now robustly normalizes and sanitizes misconfigured values (including schemeless or malformed overrides), strips any inference/path fragments, and ensures client-backed URLs use the clean root. Emits a redacted warning when an input URL is altered.
  • Tests
    • Added tests covering sanitization, idempotency, empty-input behavior, and client-side enforcement.

Review Change Stack

obchain added 2 commits May 18, 2026 17:32
`effective_backend_api_url` returned `api_base_from_env()` verbatim,
so a misconfigured `BACKEND_URL=https://.../openai/v1/chat/completions`
baked into the build flowed straight into every integration call as a
prefix, yielding 404 URLs like
`…/openai/v1/chat/completions/agent-integrations/composio/connections`
(Sentry OPENHUMAN-TAURI-H6, -HN). Route the env value through
`normalize_backend_api_base_url` and recover a scheme-less override by
retrying with `https://` before path stripping.

Refs tinyhumansai#2075
Defense-in-depth for issue tinyhumansai#2075 / Sentry OPENHUMAN-TAURI-H6, -HN.
Every prod call site already routes `backend_url` through
`effective_backend_api_url`, but a future caller that forgets that
step would silently re-introduce the 404 cascade. Re-run
`normalize_backend_api_base_url` inside the constructor so the field
invariant (no inference path) holds locally, and `warn!` once when the
input had to be fixed so the regression is observable.

Refs tinyhumansai#2075
@obchain obchain requested a review from a team May 18, 2026 12:09
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0c99c397-5bc6-45d2-abe8-4fd935433b0d

📥 Commits

Reviewing files that changed from the base of the PR and between 3621415 and 88cddb2.

📒 Files selected for processing (2)
  • src/api/config.rs
  • src/openhuman/integrations/client.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/openhuman/integrations/client.rs
  • src/api/config.rs

📝 Walkthrough

Walkthrough

Config-level normalization now strips inference-style paths from env-derived backend URLs and is exposed crate-wide; integration client construction also sanitizes backend URLs to ensure stored bases are host-rooted. Tests added for both normalization and client sanitization behaviors.

Changes

Backend URL Sanitization for Integration Clients

Layer / File(s) Summary
Backend URL normalization in config module
src/api/config.rs
effective_backend_api_url now returns normalize_backend_api_base_url(&env_url) for env fallbacks. normalize_backend_api_base_url was made pub(crate) and now handles schemeless/malformed inputs by attempting parse with and without https://, then strips path/query/fragment to a host-root string. redact_url_for_log visibility widened. Tests verify env-path stripping, schemeless normalization, idempotency, and clean-root pass-through.
Integration client backend URL sanitization
src/openhuman/integrations/client.rs, src/openhuman/integrations/client_tests.rs
Adds sanitize_backend_url (trim, remove trailing slashes, normalize via config cleaner, log redacted warning when changed) and applies it in IntegrationClient::new. Tests assert sanitizer strips inference paths, is idempotent on clean roots, preserves empty input, and that IntegrationClient::new stores the cleaned backend URL.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 I nudged a URL, trimmed and neat,
Stripped the path where /completions meet.
A warning whispered, userinfo redacted,
Client stores a root that's clean and acted.
Now requests stroll home, no 404 bleat.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely describes the main change: fixing a bug where inference paths are stripped from backend URLs in integrations.
Linked Issues check ✅ Passed The PR comprehensively addresses all coding objectives from #2075: normalizes backend URLs to prevent inference paths [effective_backend_api_url, normalize_backend_api_base_url], adds sanitization in IntegrationClient::new, includes runtime detection/logging, and provides 8 unit tests covering varied backend_url shapes.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing #2075: three files modified for URL normalization logic and tests, no extraneous features or unrelated modifications present.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label May 18, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/integrations/client.rs`:
- Around line 54-59: The warning currently logs raw URL values via
tracing::warn!(input = %trimmed, cleaned = %cleaned, ...) which can leak
credentials or tokens; instead sanitize or omit those values before logging
(e.g., build sanitized_trimmed and sanitized_cleaned by parsing the URL and
removing userinfo, query, and fragment or replacing them with "[REDACTED]" using
the Url crate), then call tracing::warn! with the sanitized variables (or log
only host/path or a boolean flag) so the tracing::warn invocation no longer
contains raw sensitive URL strings.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dbff64ff-f88c-40aa-8292-00d7312a3bc4

📥 Commits

Reviewing files that changed from the base of the PR and between 70fdedc and 3621415.

📒 Files selected for processing (3)
  • src/api/config.rs
  • src/openhuman/integrations/client.rs
  • src/openhuman/integrations/client_tests.rs

Comment thread src/openhuman/integrations/client.rs
CodeRabbit flagged a credential-leak risk in the `warn!` emitted by
`sanitize_backend_url`: a misconfigured `BACKEND_URL` carrying
`https://user:secret@host/...` would leak the userinfo into logs. Route
both `input` and `cleaned` through the existing `redact_url_for_log`
helper (promoted to `pub(crate)`) so host/path survive for debugging
while username/password are scrubbed.
Copy link
Copy Markdown
Contributor

@M3gA-Mind M3gA-Mind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix is correct and complete. Three-layer defense is the right approach: normalize the env-fallback branch (root cause), harden the normalizer for scheme-less inputs, and add a constructor guard with a redacted warn log. All Sentry scenarios are covered by the regression tests. CI green, CodeRabbit's redaction concern addressed. Merging.

@M3gA-Mind M3gA-Mind merged commit 4384cd1 into tinyhumansai:main May 19, 2026
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(composio): connections URL built from LLM base URL instead of backend URL

2 participants