Skip to content

Extract SPOG org-id from cluster httpPath for non-Thrift requests#366

Open
msrathore-db wants to merge 1 commit into
databricks:mainfrom
msrathore-db:fix-spog-org-id-extract-cluster-path
Open

Extract SPOG org-id from cluster httpPath for non-Thrift requests#366
msrathore-db wants to merge 1 commit into
databricks:mainfrom
msrathore-db:fix-spog-org-id-extract-cluster-path

Conversation

@msrathore-db
Copy link
Copy Markdown
Contributor

Summary

  • Fixes silent telemetry loss on SPOG (custom-URL) hosts when connecting to an all-purpose cluster via an httpPath like sql/protocolv1/o/<workspace-id>/<cluster-id>.
  • extractSpogHeaders now extracts the workspace ID from the /o/<wsid>/ path segment as a fallback when ?o=<wsid> is not present, and emits an x-databricks-org-id header so the wrapped telemetryClient / feature-flag client can route correctly on SPOG.
  • Priority order preserved: ?o= in httpPath/sql/protocolv1/o/<wsid>/ path segment.

Why

On a SPOG host the workspace identity has to be in either the URL or the x-databricks-org-id header for PoPP to route a request to the correct workspace. For all-purpose cluster Thrift this is free — the workspace ID is in the /o/<wsid>/ segment of httpPath, so PoPP routes Thrift via routing_reason=workspace-id and the session opens fine without an explicit ?o=.

The telemetry and feature-flag transports built by connector.Connect wrap c.client with withSpogHeaders only when extractSpogHeaders(c.cfg.HTTPPath) returns a non-nil map. Before this PR that only happened when ?o= was present, so cluster URLs without ?o= produced no x-databricks-org-id header, PoPP fell back to default (account) routing on /telemetry-ext, and the responses were 303 redirects to /login — silently dropping telemetry on SPOG.

What changes

connector.go

  • New clusterPathOrgIDPattern regex ((?:^|/)sql/protocolv1/o/(\d+)/[^/?]+) compiled once at package init.
  • extractSpogHeaders checks ?o= first (existing behavior), then falls back to the cluster path segment, then returns nil. The malformed-query-string path now also falls through to path inspection instead of returning early. Log messages indicate which source produced the workspace ID.
  • regexp added to imports.

connector_spog_test.go

  • Four new table entries under TestExtractSpogHeaders:
    • Cluster path without ?o= → header extracted from /o/<wsid>/
    • Cluster path with leading / → same
    • Cluster path with ?o= → query-param value wins
    • Warehouse path without ?o= → still nil (regression guard: the new regex must not match warehouse paths)

Test plan

  • go test -run TestExtractSpogHeaders -v . — 12 subtests pass (8 existing + 4 new).
  • go test -short . — root package suite passes (no regressions).
  • go vet ./... clean.
  • Behavior validated on the OSS JDBC equivalent fix against Prod SPOG (peco.azuredatabricks.net) all-purpose cluster: telemetry POST /telemetry-ext flipped from HTTP 303 → /login to HTTP 200 OK once the path-segment extraction populated x-databricks-org-id. Same shape of fix here.

Out of scope

This pull request and its description were written with assistance from Claude Code.

For all-purpose-compute Thrift connections on SPOG (custom-URL) hosts
httpPath is /sql/protocolv1/o/<workspace-id>/<cluster-id> and the
workspace ID is encoded in the path itself. PoPP routes the Thrift
request correctly off the /o/<wsid>/ segment, so the connection succeeds
without an explicit ?o= query parameter.

Other clients on the same driver (telemetry pushes to /telemetry-ext,
feature-flag fetches, ...) hit different paths that don't carry the
workspace ID. Previously extractSpogHeaders only looked at ?o= in
httpPath, so the x-databricks-org-id header was never set for cluster
URLs without ?o=. On SPOG hosts PoPP then had no workspace context for
these requests and redirected them to /login, silently dropping telemetry.

Extend extractSpogHeaders to also extract the workspace ID from the
cluster path segment via clusterPathOrgIDPattern as a fallback when
?o= is absent. Priority order is preserved: ?o= query param wins over
the path-segment match. Adds four new test cases — cluster path
without ?o=, leading-slash variant, ?o= wins precedence, and a
warehouse-path regression guard so the new regex does not match
warehouse paths.

Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant