In [1]:
import datetime

#### Get secrets from Secret Scope

How to set up an Azure Key Vault (AKV)-backed Databricks secret scope: https://docs.azuredatabricks.net/user-guide/secrets/secret-scopes.html#create-an-azure-key-vault-backed-secret-scope

In [3]:
# ADLS gen2 storage account key
storageAccountKey = dbutils.secrets.get(scope = "secrets", key = "StorageAccountKey")

#### Variables

Nothing sensitive should be here. This cell holds only values that are safe to leave visible.

In [5]:
storageAccountName = ""

# Note: we are using the ADLS gen2 ABFS driver (abfss://), not WASB/WASBS.
adls2uri_raw = "abfss://raw@" + storageAccountName + ".dfs.core.windows.net/"
adls2uri_staging1 = "abfss://staging1@" + storageAccountName + ".dfs.core.windows.net/"
adls2uri_curated = "abfss://curated@" + storageAccountName + ".dfs.core.windows.net/"
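
The three URIs above follow one pattern, so they could also come from a small helper. This is a minimal sketch, not part of the original notebook: the function name `build_abfss_uri` and the account name `mystorageacct` are hypothetical, and the check for an empty account name is an added assumption.

```python
def build_abfss_uri(container, account_name, path=""):
    """Return an abfss:// URI for a container (file system) in an ADLS gen2 account.

    Hypothetical helper: it mirrors the string concatenation used above.
    """
    if not account_name:
        # An empty account name would produce an unusable URI such as
        # "abfss://raw@.dfs.core.windows.net/", so fail early instead.
        raise ValueError("account_name must not be empty")
    return "abfss://{}@{}.dfs.core.windows.net/{}".format(
        container, account_name, path.lstrip("/")
    )


# "mystorageacct" is a placeholder, not a real account.
print(build_abfss_uri("raw", "mystorageacct"))
# prints abfss://raw@mystorageacct.dfs.core.windows.net/
```

Failing early on an empty name is a design choice; otherwise the problem only shows up later, when Spark tries to resolve the path.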

#### Connect to ADLS gen 2 file system

We access the ADLS gen2 file system directly, which does not require a mount point (see the note below).
See:
- https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake-gen2.html#access-an-azure-data-lake-storage-gen2-account-directly

Note: we could instead create a mount point (e.g. /mnt/azure/), but that requires a Service Principal, and creating one requires Azure AD access.
See:
- https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake-gen2.html#mount-an-azure-data-lake-storage-gen2-filesystem-with-dbfs
- https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake-gen2.html#requirements-azure-data-lake
- https://docs.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal
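
For comparison, the mount-point route might look roughly like the sketch below. This notebook does not use it. The helper name `build_oauth_mount_configs`, its argument values, and the mount point `/mnt/azure/raw` are all hypothetical. The config keys are the OAuth settings for the ABFS driver as described in the Databricks documentation linked above; check them against the current docs before relying on them.

```python
def build_oauth_mount_configs(client_id, client_secret, tenant_id):
    """Build the extra_configs dict for mounting ADLS gen2 with a Service Principal.

    Hypothetical helper. In practice the client secret should come from a
    secret scope, not from a literal in the notebook.
    """
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            "https://login.microsoftonline.com/" + tenant_id + "/oauth2/token",
    }


# Placeholder values only, for illustration.
configs = build_oauth_mount_configs("my-client-id", "my-client-secret", "my-tenant-id")

# Inside a Databricks notebook the mount itself would then be (not run here):
# dbutils.fs.mount(
#     source="abfss://raw@" + storageAccountName + ".dfs.core.windows.net/",
#     mount_point="/mnt/azure/raw",
#     extra_configs=configs)
```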

In [7]:
# Set the storage account key in the Spark config so Spark can authenticate to the ADLS gen2 storage account
spark.conf.set("fs.azure.account.key." + storageAccountName + ".dfs.core.windows.net", storageAccountKey)

In [8]:
# Prepare date variables we'll need

# ASSUMPTION: the data is in a folder named for YESTERDAY'S date. Adjust the day offset below if that is not the case.

# Start with the current date and subtract a day (so `now` actually holds yesterday's date)
now = datetime.datetime.now() - datetime.timedelta(days=1)

# Get year, month, and day into separate vars
int_year = now.year
int_month = now.month
int_day = now.day

# Get string formatted versions of these with leading zeroes for month and day (for eventual output paths)
str_year = str(int_year)
str_month = "{:02d}".format(int_month)
str_day = "{:02d}".format(int_day)

# Prepare path chunk for yyyy/mm/dd so we don't have to keep doing this below
path_chunk_date = str_year + "/" + str_month + "/" + str_day
print(path_chunk_date)
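
The same yyyy/mm/dd chunk can also be produced in one step with `strftime`. This is a minimal sketch, not the notebook's own method. The function name `date_path_chunk` and its `days_back` parameter are hypothetical. Passing the reference date in explicitly, rather than calling `datetime.datetime.now()` inside the function, makes the result reproducible.

```python
import datetime


def date_path_chunk(reference_date, days_back=1):
    """Return a zero-padded yyyy/mm/dd path chunk for reference_date minus days_back days."""
    target = reference_date - datetime.timedelta(days=days_back)
    # %m and %d are zero-padded, matching the "{:02d}" formatting used above.
    return target.strftime("%Y/%m/%d")


# Month rollover: the day before 1 March 2019 is 28 February 2019.
print(date_path_chunk(datetime.date(2019, 3, 1)))
# prints 2019/02/28
```

In the notebook, `date_path_chunk(datetime.date.today())` would give yesterday's folder under the same assumption as above.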