Databricks Operator with Azure Service Principal OAuth tries to authenticate to host without scheme #47437

@SaremS

Description

Apache Airflow Provider(s)

databricks

Versions of Apache Airflow Providers

apache-airflow-providers-databricks==6.2.0

Apache Airflow version

2.10.4

Operating System

Azure Kubernetes Service / Ubuntu 22.04

Deployment

Official Apache Airflow Helm Chart

Deployment details

Using the official Apache Airflow Helm chart (v. 2.10.4, Python 3.9), but the issue should be independent of the exact deployment.

What happened

Using a Databricks operator with an AIRFLOW_CONN_ environment variable of the form databricks://{SERVICE_PRINCIPAL_ID}:{SERVICE_PRINCIPAL_OAUTH_TOKEN}@{DATABRICKS_HOST}?service_principal_oauth=true results in the following error:

"Invalid URL '{DATABRICKS_HOST}/oidc/v1/token': No scheme supplied. Perhaps you meant 'https://{DATABRICKS_HOST}/oidc/v1/token'?"

What you think should happen instead

The OAuth token request should be sent to a URL that includes a scheme (e.g. https://{DATABRICKS_HOST}/oidc/v1/token).
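
To illustrate the expected behavior, here is a minimal sketch of how a host without a scheme could be normalized before the token URL is built. It is not the provider's actual code: `ensure_https_scheme`, `build_token_url`, and the host `example.azuredatabricks.net` are hypothetical names used for illustration only; the `/oidc/v1/token` path is taken from the error message above.

```python
def ensure_https_scheme(host: str) -> str:
    """Hypothetical helper: prefix the host with https:// if no scheme is present."""
    host = host.strip().rstrip("/")
    if "://" in host:
        # A scheme was already supplied; leave it untouched.
        return host
    return f"https://{host}"


def build_token_url(host: str) -> str:
    """Hypothetical helper: build the OAuth token endpoint for a workspace host."""
    return f"{ensure_https_scheme(host)}/oidc/v1/token"


if __name__ == "__main__":
    # Placeholder host, not a real workspace.
    print(build_token_url("example.azuredatabricks.net"))
    print(build_token_url("https://example.azuredatabricks.net/"))
```

With a normalization step like this, a bare host and a host that already carries `https://` would both resolve to the same token URL.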

How to reproduce

  • Create an Azure Service Principal and import it into Azure Databricks (this should also work with Databricks-managed service principals)
  • Create an OAuth secret in Databricks
  • Provide the respective AIRFLOW_CONN_ string as described above (e.g. AIRFLOW_CONN_DATABRICKS_DEFAULT=databricks://{SERVICE_PRINCIPAL_ID}:{SERVICE_PRINCIPAL_OAUTH_TOKEN}@{DATABRICKS_HOST}?service_principal_oauth=true)
  • Use any Databricks operator that needs to authenticate, e.g. DatabricksRunNowOperator, to get the error
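
The connection string from the steps above can be inspected with the Python standard library to see why the host ends up without a scheme. This is only a sketch: it uses `urllib.parse` rather than Airflow's own connection parsing (which may differ in details), `parse_conn_uri` is a hypothetical helper, and the credentials and host are placeholders.

```python
from urllib.parse import parse_qs, urlsplit


def parse_conn_uri(uri: str) -> dict:
    """Hypothetical helper: split a connection URI into its components."""
    parts = urlsplit(uri)
    return {
        "conn_type": parts.scheme,  # "databricks", not "https"
        "login": parts.username,
        "password": parts.password,
        "host": parts.hostname,  # bare host name, no scheme attached
        "extra": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


if __name__ == "__main__":
    # Placeholder values standing in for the templated fields of the URI.
    uri = (
        "databricks://my-sp-id:my-oauth-secret@example.azuredatabricks.net"
        "?service_principal_oauth=true"
    )
    print(parse_conn_uri(uri))
```

In this form the URI scheme slot carries the connection type, so the parsed host is a bare host name; any code that concatenates it directly with `/oidc/v1/token` would produce a URL without a scheme, as in the reported error.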

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
