Extract SPOG org-id from cluster httpPath for non-Thrift requests#366
Open
msrathore-db wants to merge 1 commit into
Open
Extract SPOG org-id from cluster httpPath for non-Thrift requests#366msrathore-db wants to merge 1 commit into
msrathore-db wants to merge 1 commit into
Conversation
For all-purpose-compute Thrift connections on SPOG (custom-URL) hosts httpPath is /sql/protocolv1/o/<workspace-id>/<cluster-id> and the workspace ID is encoded in the path itself. PoPP routes the Thrift request correctly off the /o/<wsid>/ segment, so the connection succeeds without an explicit ?o= query parameter. Other clients on the same driver (telemetry pushes to /telemetry-ext, feature-flag fetches, ...) hit different paths that don't carry the workspace ID. Previously extractSpogHeaders only looked at ?o= in httpPath, so the x-databricks-org-id header was never set for cluster URLs without ?o=. On SPOG hosts PoPP then had no workspace context for these requests and redirected them to /login, silently dropping telemetry. Extend extractSpogHeaders to also extract the workspace ID from the cluster path segment via clusterPathOrgIDPattern as a fallback when ?o= is absent. Priority order is preserved: ?o= query param wins over the path-segment match. Adds four new test cases — cluster path without ?o=, leading-slash variant, ?o= wins precedence, and a warehouse-path regression guard so the new regex does not match warehouse paths. Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
sql/protocolv1/o/<workspace-id>/<cluster-id>.extractSpogHeadersnow extracts the workspace ID from the/o/<wsid>/path segment as a fallback when?o=<wsid>is not present, and emits anx-databricks-org-idheader so the wrappedtelemetryClient/ feature-flag client can route correctly on SPOG.?o=inhttpPath▶/sql/protocolv1/o/<wsid>/path segment.Why
On a SPOG host the workspace identity has to be in either the URL or the
x-databricks-org-idheader for PoPP to route a request to the correct workspace. For all-purpose cluster Thrift this is free — the workspace ID is in the/o/<wsid>/segment of httpPath, so PoPP routes Thrift viarouting_reason=workspace-idand the session opens fine without an explicit?o=.The telemetry and feature-flag transports built by
connector.Connectwrapc.clientwithwithSpogHeadersonly whenextractSpogHeaders(c.cfg.HTTPPath)returns a non-nil map. Before this PR that only happened when?o=was present, so cluster URLs without?o=produced nox-databricks-org-idheader, PoPP fell back to default (account) routing on/telemetry-ext, and the responses were 303 redirects to/login— silently dropping telemetry on SPOG.What changes
connector.goclusterPathOrgIDPatternregex ((?:^|/)sql/protocolv1/o/(\d+)/[^/?]+) compiled once at package init.extractSpogHeaderschecks?o=first (existing behavior), then falls back to the cluster path segment, then returns nil. The malformed-query-string path now also falls through to path inspection instead of returning early. Log messages indicate which source produced the workspace ID.regexpadded to imports.connector_spog_test.goTestExtractSpogHeaders:?o=→ header extracted from/o/<wsid>//→ same?o=→ query-param value wins?o=→ still nil (regression guard: the new regex must not match warehouse paths)Test plan
go test -run TestExtractSpogHeaders -v .— 12 subtests pass (8 existing + 4 new).go test -short .— root package suite passes (no regressions).go vet ./...clean.peco.azuredatabricks.net) all-purpose cluster: telemetry POST/telemetry-extflipped fromHTTP 303 → /logintoHTTP 200 OKonce the path-segment extraction populatedx-databricks-org-id. Same shape of fix here.Out of scope
databricks/databricks-sql-python) has a sibling PR for the same fix.databricks/databricks-sql-nodejs) already extracts org ID from both query param and path segment (seeextractWorkspaceIdinlib/DBSQLClient.ts).This pull request and its description were written with assistance from Claude Code.