
configure: "AI Gateway V2 not available" is misleading on auth failures (returns HTTP 400 "Invalid Token") #84

@dhruv0811

Summary

When ucode configure runs against a workspace where the user is authenticated but the resolved token doesn't match the target workspace, ensure_ai_gateway_v2 raises:

ERROR Databricks AI Gateway V2 is required but not available on this workspace (HTTP 400 Bad Request).

…even though AI Gateway V2 is enabled on the workspace. The actual HTTP 400 response body is Invalid Token (an auth error), not a gateway availability error.

Repro

  1. Have a shell with DATABRICKS_CONFIG_PROFILE=other-workspace exported (e.g. from a previous omniagents run --profile other-workspace, or directly in shell rc).
  2. Run ucode configure --workspaces https://target-workspace.cloud.databricks.com --agents claude,codex,pi.
  3. Complete the databricks auth login prompt successfully.
  4. ucode configure immediately fails with "AI Gateway V2 is required but not available."

Under the hood: databricks auth token --host <target> in get_databricks_token() (databricks.py:466) honors DATABRICKS_CONFIG_PROFILE over the --host arg (it's the CLI's documented resolution order). So the returned token is the one cached for other-workspace, not for the target. The token is technically valid, just for the wrong workspace, so the gateway probe returns HTTP 400 Invalid Token instead of 401/403.
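One way to keep --host from being shadowed is to drop the profile variable from the environment handed to the CLI subprocess. The following is a minimal sketch, not the actual ucode code: the helper name cli_env_without_profile is hypothetical, and the real build_databricks_cli_env may be structured differently.

```python
import os
from typing import Dict, Mapping, Optional


def cli_env_without_profile(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with DATABRICKS_CONFIG_PROFILE removed.

    Hypothetical helper for illustration. `base` defaults to the current
    process environment; the input mapping is never mutated.
    """
    env = dict(os.environ if base is None else base)
    # Without this variable, the profile can no longer take precedence over --host.
    env.pop("DATABRICKS_CONFIG_PROFILE", None)
    return env
```

The returned dict could then be passed as env= to the subprocess call that runs databricks auth token --host <target>, leaving the caller's shell environment untouched.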

ensure_ai_gateway_v2 (databricks.py:822) collapses every non-200 into the "v2 not available" error, hiding the real cause.

Debug log evidence

With UCODE_DEBUG=1:

get_databricks_token.env: set=DATABRICKS_CODEX_TOKEN,DATABRICKS_CONFIG_FILE,DATABRICKS_CONFIG_PROFILE,DATABRICKS_HOST
auth token: rc=0 stderr=''
GET https://target-workspace.cloud.databricks.com/api/ai-gateway/v2/endpoints?page_size=1: HTTP 400 Bad Request
body: Invalid Token

The token fetch succeeds (rc=0), then the gateway returns 400 with Invalid Token. Notice DATABRICKS_CONFIG_PROFILE in the env list: that is the variable causing the wrong token to be resolved.

Suggested fix

In ensure_ai_gateway_v2, branch on the response code/body before raising:

  • 400 with body containing Invalid Token (or 401 / 403): "Auth resolved a token that this workspace rejected. Likely a stale DATABRICKS_CONFIG_PROFILE or token cache. Try databricks auth logout --host <X> and re-run." (Ideally build_databricks_cli_env should also env.pop("DATABRICKS_CONFIG_PROFILE", None) so that --host isn't shadowed in the first place. Codex already does this in its auth.command, so the precedent exists.)
  • 404: "AI Gateway V2 not enabled. See [docs link]."
  • other: surface the response body verbatim.
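The branching above could look roughly like the following. This is a sketch under assumptions, not the existing ensure_ai_gateway_v2 code: the function name explain_gateway_probe_failure and its (status, body, host) signature are hypothetical, and the message wording is only indicative.

```python
from typing import Optional


def explain_gateway_probe_failure(status: int, body: str, host: str) -> Optional[str]:
    """Map a gateway probe response to an error message, or None on HTTP 200.

    Hypothetical helper: the real ensure_ai_gateway_v2 may receive a response
    object rather than a bare status code and body string.
    """
    if status == 200:
        return None
    text = (body or "").strip()
    # Auth failures: 401/403, or the 400 "Invalid Token" case from this issue.
    if status in (401, 403) or (status == 400 and "invalid token" in text.lower()):
        return (
            f"Auth resolved a token that {host} rejected (HTTP {status}: {text}). "
            "Likely a stale DATABRICKS_CONFIG_PROFILE or token cache. "
            f"Try `databricks auth logout --host {host}` and re-run."
        )
    if status == 404:
        return f"AI Gateway V2 is not enabled on {host} (HTTP 404)."
    # Anything else: surface the response body verbatim.
    return f"AI Gateway V2 probe failed on {host} (HTTP {status}); body={text!r}"
```

Keeping the classification in a pure function like this makes each branch easy to unit-test without a live workspace.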

Even just appending body=<resp> to the existing error message would have saved several hours of misdiagnosis on our end.

Context

We hit this twice in different shapes while integrating ucode into the omniagents setup flow (databricks-eng/agent-framework#1332, #1340). The first time it was duplicate same-host profile sections causing databricks auth token to refuse with "multiple profiles match"; the second time it was DATABRICKS_CONFIG_PROFILE shadowing --host. Both surfaced as the same misleading "v2 not available" error.

This issue was drafted by Isaac.

Labels: bug (Something isn't working)
