# Different ways to connect to storage in Azure Databricks

The following is a summary of the various ways to connect to Azure Blob Storage and Azure Data Lake Storage Gen2 (ADLS Gen2) from Azure Databricks.

To download all sample notebooks, here is the DBC archive that you can import into your workspace.

| How to connect | Scope of connection | Authentication | Authorization Requirements | Code Sample | Docs/Supported Storage |
| --- | --- | --- | --- | --- | --- |
| Direct connect | Typically SparkSession* | Storage Key | All rights | Python, SQL | Blob |
| Direct connect | Typically SparkSession* | OAuth via Service Principal (SP) | **SP has correct RBAC role assigned OR ACL permissions to files/folders in ADLS Gen2 | Python, SQL | ADLS Gen2 |
| Direct connect | Typically SparkSession* | AD Passthrough | **User has correct RBAC role assigned OR ACL permissions to files/folders in ADLS Gen2 | Python, SQL | ADLS Gen2 |
| Mount on DBFS | Databricks Workspace | Storage Key | All rights | Python | Blob, ADLS Gen2 |
| Mount on DBFS | Databricks Workspace | OAuth via Service Principal (SP) | **SP has correct RBAC role assigned OR ACL permissions to files/folders in ADLS Gen2 | Python | ADLS Gen2 |

*This depends on where the Spark configuration is set. It is typically set on the SparkSession of the running notebook and is therefore scoped only to that SparkSession.
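For illustration, a session-scoped direct connection to Blob Storage with a storage key might look like the following minimal sketch. All account, container, and key values are placeholders; the linked notebooks are the authoritative samples.

```python
# Minimal sketch: direct connect to Blob Storage with a storage key.
# <storage-account>, <container>, and the key value are placeholders.

# Setting the key on the running SparkSession scopes it to this session only.
spark.conf.set(
    "fs.azure.account.key.<storage-account>.blob.core.windows.net",
    "<storage-account-access-key>",
)

# Read directly using the wasbs:// scheme.
df = spark.read.csv(
    "wasbs://<container>@<storage-account>.blob.core.windows.net/path/to/data.csv",
    header=True,
)
display(df)
```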

**IMPORTANT NOTE on Authorization requirements

You need to assign one of the following RBAC roles specifically to the Service Principal or User. See here for more information.

- Storage Blob Data Owner
- Storage Blob Data Contributor
- Storage Blob Data Reader

NOTE: The standard Owner/Contributor roles are insufficient; one of the Storage Blob Data roles above is required.

For more granular access control, you can use ACLs on folders/files in the ADLS Gen2 filesystem.
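As a rough sketch, the OAuth-via-Service-Principal direct connection to ADLS Gen2 sets the following Spark configuration. All IDs, names, and the secret below are placeholders, and the SP must satisfy the authorization requirements above.

```python
# Minimal sketch: direct connect to ADLS Gen2 with OAuth via a Service
# Principal. All IDs and names are placeholders.
account = "<storage-account>.dfs.core.windows.net"

spark.conf.set(f"fs.azure.account.auth.type.{account}", "OAuth")
spark.conf.set(
    f"fs.azure.account.oauth.provider.type.{account}",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
)
spark.conf.set(f"fs.azure.account.oauth2.client.id.{account}", "<application-id>")
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{account}", "<sp-client-secret>")
spark.conf.set(
    f"fs.azure.account.oauth2.client.endpoint.{account}",
    "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
)

# Read using the abfss:// scheme.
df = spark.read.parquet(
    "abfss://<filesystem>@<storage-account>.dfs.core.windows.net/path/to/data"
)
```

In a real notebook the client secret should come from Azure Databricks Secrets, described below, rather than being hard-coded.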

## Azure Databricks Secrets

For simplicity, the examples do not make use of Azure Databricks Secrets.

Azure Databricks Secrets is the recommended way to store sensitive information in Azure Databricks. Essentially, you create Secret Scopes in which to store secrets. Permissions are managed at the Secret Scope level: users with the correct permission on a particular scope can retrieve the secrets within it.
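For example, here is a sketch of fetching a storage key from a hypothetical scope named my-scope and using it to mount a Blob Storage container on DBFS; scope, key, and storage names are placeholders.

```python
# Minimal sketch: fetch a storage key from a secret scope and use it to
# mount a Blob Storage container on DBFS. Scope, key, and storage names
# are hypothetical placeholders.
storage_key = dbutils.secrets.get(scope="my-scope", key="storage-account-key")

dbutils.fs.mount(
    source="wasbs://<container>@<storage-account>.blob.core.windows.net",
    mount_point="/mnt/<mount-name>",
    extra_configs={
        "fs.azure.account.key.<storage-account>.blob.core.windows.net": storage_key
    },
)

# The mount is workspace-scoped: any notebook in the workspace can read it.
df = spark.read.csv("/mnt/<mount-name>/path/to/data.csv", header=True)
```

Secret values retrieved this way are redacted if printed in notebook output.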

There are two types of Secret Scopes:

- Azure Key Vault-backed scopes, which reference secrets stored in an Azure Key Vault
- Databricks-backed scopes, which are stored in an encrypted database managed by Azure Databricks
