PowerShell Module for Databricks

This repository contains the source code for the PowerShell module "DatabricksPS". The module can also be found in the public PowerShell Gallery: https://www.powershellgallery.com/packages/DatabricksPS/

It works for Databricks on both Azure and AWS. The APIs are almost identical, so I decided to bundle them in a single module. The official API documentation can be found here:

Azure Databricks - https://docs.azuredatabricks.net/api/latest/index.html

Databricks on AWS - https://docs.databricks.com/api/latest/index.html

Release History

v1.6.2.0:

  • Fix issue with Cluster cmdlets to properly support pipelining
  • Added support for Instance Pools in Cluster cmdlets

v1.6.0.0:

  • Add support for Project APIs (experimental, link)
  • Added Workspace Config settings

v1.5.0.0:

  • Add support for SQL Analytics APIs (experimental, link)

v1.3.1.0:

  • Add support for Workspace configs (get/set)

v1.3.0.0:

  • Add support for Global Init Scripts

v1.2.2.0:

  • Add -Entitlements parameter to Add-DatabricksSCIMGroup
  • Some fixes for proper pipelining when working with Groups and SCIM APIs
  • Add test-case for Security (SCIM, Groups, memberships, ...)

v1.2.1.0:

  • Fixed issue with Import of already existing files and folders

v1.2.0.1:

  • Add support for Azure backed Secret Scopes for non-standard Azure environments like AzureChinaCloud or AzureUSGovernment

v1.2.0.0:

  • Add support for AAD authentication in non-standard Azure environments like AzureChinaCloud or AzureUSGovernment

v1.1.4.0:

  • Fix Secrets API when creating Azure KeyVault Backed Secret Scopes.

v1.1.3.0:

  • Minor fix for Secrets API making -InitialManagePrincipal optional.

v1.1.2.0:

  • Changed -ApiRootUrl parameter to support any URL and not just a fixed list.
  • Added Get-DatabricksApiRootUrl cmdlet to be able to get a list of predefined API Root URLs

v1.1.1.0:

  • Added new cmdlet Add-DatabricksClusterLocalLibrary to add a local library (.jar, .whl, ...) to a cluster with a single command

v1.0.0.0:

  • Added Azure Active Directory (AAD) Authentication for Service Principals and Users

Setup and Installation

The easiest way to install the PowerShell module is to use the PowerShell built-in Install-Module cmdlet:

Install-Module -Name DatabricksPS

Alternatively, you can download this repository, copy the folder \Modules\DatabricksPS to a local path, and load it from there using the Import-Module cmdlet:

Import-Module "C:\MyPSModules\Modules\DatabricksPS"
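
Whichever installation path you choose, it can help to confirm that PowerShell can actually locate the module before you use it. The following is a minimal sketch: `Test-ModuleInstalled` is a hypothetical helper name and not part of DatabricksPS, while `Get-Module`, `Import-Module`, and `Get-Command` are standard PowerShell cmdlets.

```powershell
# Hypothetical helper (not part of DatabricksPS): returns $true when a module
# with the given name is visible on the current PSModulePath.
function Test-ModuleInstalled {
    param([Parameter(Mandatory)][string]$Name)
    $null -ne (Get-Module -ListAvailable -Name $Name | Select-Object -First 1)
}

if (Test-ModuleInstalled -Name 'DatabricksPS') {
    Import-Module -Name DatabricksPS
    # Show the first few cmdlets the module exports
    Get-Command -Module DatabricksPS | Select-Object -First 10 -ExpandProperty Name
} else {
    Write-Warning 'DatabricksPS not found. Run: Install-Module -Name DatabricksPS'
}
```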

Usage

The module is designed so that you set the connection-relevant properties once; all other cmdlets then use them. You can update this information at any time during your PowerShell session to connect to different Databricks environments in the same session.

$accessToken = "dapi123456789e672c4007052d4694a7c51"
$apiUrl = "https://westeurope.azuredatabricks.net"

Set-DatabricksEnvironment -AccessToken $accessToken -ApiRootUrl $apiUrl

Once the environment is set up, you can use the other cmdlets:

Get-DatabricksWorkspaceItem -Path "/"
Export-DatabricksWorkspaceItem -Path "/TestNotebook1" -LocalPath "C:\TestNotebook1_Export.ipynb" -Format JUPYTER

Start-DatabricksJob -JobID 123 -NotebookParams @{myParameter = "test"}

Using pipelined cmdlets:

# stop all clusters
Get-DatabricksCluster | Stop-DatabricksCluster

# create multiple directories
"/test1","/test2" | Add-DatabricksWorkspaceDirectory

# get all run outputs for a given job
Get-DatabricksJobRun -JobID 123 | Get-DatabricksJobRunOutput
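
Because the connection settings can be changed at any point in a session, one script can visit several workspaces in turn. The sketch below illustrates the idea under stated assumptions: the workspace names, URLs, and tokens are placeholders, and the loop only runs if the DatabricksPS cmdlets are available in the session.

```powershell
# Placeholder workspaces: names, URLs, and tokens are illustrative only.
$workspaces = @(
    @{ Name = 'dev';  ApiRootUrl = 'https://westeurope.azuredatabricks.net';  AccessToken = 'dapi-dev-placeholder' },
    @{ Name = 'test'; ApiRootUrl = 'https://northeurope.azuredatabricks.net'; AccessToken = 'dapi-test-placeholder' }
)

# Only run when the module's cmdlets are available in this session.
if (Get-Command -Name Set-DatabricksEnvironment -ErrorAction SilentlyContinue) {
    foreach ($workspace in $workspaces) {
        # Point all subsequent cmdlets at this workspace
        Set-DatabricksEnvironment -AccessToken $workspace.AccessToken -ApiRootUrl $workspace.ApiRootUrl

        # @(...) forces an array so .Count works for zero or one result
        $clusterCount = @(Get-DatabricksCluster).Count
        Write-Output ("{0}: {1} cluster(s)" -f $workspace.Name, $clusterCount)
    }
}
```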

Authentication

There are three ways to authenticate against the Databricks REST API, two of which are specific to Azure:

  • Personal Access Token
  • Azure Active Directory (AAD) Username/Password (Azure only!)
  • Azure Active Directory (AAD) Service Principal (Azure only!)

Personal Access Token

This is the most straightforward authentication method and works for both Azure and AWS. The official documentation can be found here (Azure) or here (AWS) and is also persisted in this repository here.

$accessToken = "dapi123456789e672c4007052d4694a7c51"
$apiUrl = "https://westeurope.azuredatabricks.net"

Set-DatabricksEnvironment -AccessToken $accessToken -ApiRootUrl $apiUrl

Azure Active Directory (AAD) Username/Password

This authentication method closely resembles the interactive login to the Databricks web UI. You provide the Databricks workspace you want to connect to, the username, and the password. The official documentation can be found here and is also persisted in this repository here.

$credUser = Get-Credential
$tenantId = '93519689-1234-1234-1234-e4b9f59d1963'
$subscriptionId = '30373b46-5678-5678-5678-d5560532fc32'
$resourceGroupName = 'myResourceGroup'
$workspaceName = 'myDatabricksWorkspace'
$azureResourceId = "/subscriptions/$subscriptionId/resourceGroups/$resourceGroupName/providers/Microsoft.Databricks/workspaces/$workspaceName"
$clientId = 'db00e35e-1111-2222-3333-c8cc85e6f524'

$apiUrl = "https://westeurope.azuredatabricks.net"

Set-DatabricksEnvironment -ClientID $clientId -Credential $credUser -AzureResourceID $azureResourceId -TenantID $tenantId -ApiRootUrl $apiUrl
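
The Azure resource ID used above follows a fixed pattern built from the subscription ID, resource group name, and workspace name. If you connect to several workspaces, a small helper can keep that pattern in one place. This is only a sketch: `New-WorkspaceResourceId` is a hypothetical function name and not part of DatabricksPS.

```powershell
# Hypothetical helper (not part of DatabricksPS): build the Azure resource ID
# of a Databricks workspace from its three components.
function New-WorkspaceResourceId {
    param(
        [Parameter(Mandatory)][string]$SubscriptionId,
        [Parameter(Mandatory)][string]$ResourceGroupName,
        [Parameter(Mandatory)][string]$WorkspaceName
    )
    "/subscriptions/$SubscriptionId/resourceGroups/$ResourceGroupName/providers/Microsoft.Databricks/workspaces/$WorkspaceName"
}

# Same placeholder values as in the example above, passed via splatting
$idParts = @{
    SubscriptionId    = '30373b46-5678-5678-5678-d5560532fc32'
    ResourceGroupName = 'myResourceGroup'
    WorkspaceName     = 'myDatabricksWorkspace'
}
$azureResourceId = New-WorkspaceResourceId @idParts
```

The resulting `$azureResourceId` can be passed to the -AzureResourceID parameter exactly as in the example above.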

Azure Active Directory (AAD) Service Principal

Service principals are special accounts in Azure Active Directory that can be used for automated tasks such as CI/CD pipelines. You provide the Databricks workspace you want to connect to, the ClientID, and a ClientSecret/ClientKey. ClientID and ClientSecret must be wrapped in a PSCredential in which the ClientID is the username and the ClientSecret/ClientKey is the password. The rest is very similar to Username/Password authentication, except that you also need to specify the -ServicePrincipal flag. The official documentation can be found here and is also persisted in this repository here.

$clientId = '12345678-6789-6789-6789-6e44bf2f5d11' # = Application ID
$clientSecret = 'tN4Lrez.=12345AgRx6w6kJ@6C.ap7Y'
$secureClientSecret = ConvertTo-SecureString $clientSecret -AsPlainText -Force
$credSP = New-Object System.Management.Automation.PSCredential($clientId, $secureClientSecret)
$tenantId = '93519689-1234-1234-1234-e4b9f59d1963'
$subscriptionId = '30373b46-5678-5678-5678-d5560532fc32'
$resourceGroupName = 'myResourceGroup'
$workspaceName = 'myDatabricksWorkspace'
$azureResourceId = "/subscriptions/$subscriptionId/resourceGroups/$resourceGroupName/providers/Microsoft.Databricks/workspaces/$workspaceName"

$apiUrl = "https://westeurope.azuredatabricks.net"

Set-DatabricksEnvironment -ClientID $clientId -Credential $credSP -AzureResourceID $azureResourceId -TenantID $tenantId -ApiRootUrl $apiUrl -ServicePrincipal
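
In automated scenarios such as CI/CD pipelines, you may prefer not to hard-code the client secret in the script. The following sketch wraps the PSCredential construction shown above in a function and reads the values from environment variables. `New-ServicePrincipalCredential` is a hypothetical helper, and the variable names `DATABRICKS_CLIENT_ID` and `DATABRICKS_CLIENT_SECRET` are assumptions chosen for illustration; neither is defined by DatabricksPS.

```powershell
# Hypothetical helper (not part of DatabricksPS): wrap a service principal's
# ClientID and ClientSecret into a PSCredential (ClientID = username,
# ClientSecret = password).
function New-ServicePrincipalCredential {
    param(
        [Parameter(Mandatory)][string]$ClientId,
        [Parameter(Mandatory)][string]$ClientSecret
    )
    $secureSecret = ConvertTo-SecureString $ClientSecret -AsPlainText -Force
    New-Object System.Management.Automation.PSCredential($ClientId, $secureSecret)
}

# Assumed environment variable names, e.g. set by the CI/CD system
if ($env:DATABRICKS_CLIENT_ID -and $env:DATABRICKS_CLIENT_SECRET) {
    $credSP = New-ServicePrincipalCredential -ClientId $env:DATABRICKS_CLIENT_ID -ClientSecret $env:DATABRICKS_CLIENT_SECRET
} else {
    Write-Warning 'Set DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET first.'
}
```

The resulting `$credSP` can then be passed to Set-DatabricksEnvironment via -Credential, together with -ServicePrincipal, as in the example above.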

Supported APIs and endpoints
