#### Mount Azure Data Lake using a service principal
- 1. Register an Azure AD application (service principal).
- 2. Generate a client secret (password) for the application.
- 3. Configure Databricks to access the storage account via the service principal by setting Spark config parameters.
- 4. Assign the Storage Blob Data Contributor role on the Data Lake to the service principal.
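
Step 3 can also be applied directly to the Spark session, scoped to a single storage account, instead of being passed to a mount. The following is a minimal sketch under that assumption: the helper name `set_oauth_spark_conf` and its parameters are hypothetical, `spark` is the session object a Databricks notebook provides, and the property names are the account-suffixed forms of the OAuth settings used later in this notebook.

```python
def set_oauth_spark_conf(spark, storage_account, client_id, client_secret, tenant_id):
    """Apply service principal (OAuth) settings for one storage account.

    Hypothetical helper: `spark` is expected to expose `conf.set(key, value)`.
    Returns the settings that were applied, keyed by property name.
    """
    # Each property is suffixed with the account's DFS endpoint so that it
    # only applies to this storage account.
    suffix = f"{storage_account}.dfs.core.windows.net"
    settings = {
        f"fs.azure.account.auth.type.{suffix}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{suffix}": client_id,
        f"fs.azure.account.oauth2.client.secret.{suffix}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{suffix}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }
    for key, value in settings.items():
        spark.conf.set(key, value)
    return settings
```

With these settings in place, paths of the form abfss://<container>@<storage_account>.dfs.core.windows.net/ should be readable in the current session without a mount point.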

- Call the file system utility mount (dbutils.fs.mount) to mount the storage.
- Explore the other file system utilities related to mounts (list all mounts, unmount).


The Storage Blob Data Contributor role grants read, write, and delete access to the containers and blobs in the storage account.

In [0]:
def mount_containers(storage_account, container_name):
    # Read the service principal credentials from the Databricks secret scope
    client_id = dbutils.secrets.get('formula1dataScope', 'formula1-app-client-id')
    tenant_id = dbutils.secrets.get('formula1dataScope', 'formula1-app-tenant-id')
    client_secret = dbutils.secrets.get('formula1dataScope', 'formula1-app-sp-secret-value')

    # OAuth (client credentials) settings for the ABFS driver
    configs = {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint": f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }

    # Mount the container at /mnt/<storage_account>/<container_name>
    dbutils.fs.mount(
        source=f"abfss://{container_name}@{storage_account}.dfs.core.windows.net",
        mount_point=f"/mnt/{storage_account}/{container_name}",
        extra_configs=configs,
    )


    

In [0]:
mount_containers('formularacedata', 'raw')
mount_containers('formularacedata', 'processed')
mount_containers('formularacedata', 'presentation')
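
Re-running the cell above typically fails once the containers are mounted, because dbutils.fs.mount raises an error for a mount point that is already in use. One way around this is to check dbutils.fs.mounts() first. The sketch below assumes that approach; `mount_if_needed` is a hypothetical helper, and `dbutils` is passed in explicitly so the function can be exercised outside a notebook.

```python
def mount_if_needed(dbutils, source, mount_point, extra_configs):
    """Mount `source` at `mount_point` unless that mount point already exists.

    Hypothetical helper. Returns True if a new mount was created and
    False if the mount point was already present.
    """
    # Each entry returned by dbutils.fs.mounts() exposes a `mountPoint` attribute
    already_mounted = any(m.mountPoint == mount_point for m in dbutils.fs.mounts())
    if already_mounted:
        return False
    dbutils.fs.mount(
        source=source,
        mount_point=mount_point,
        extra_configs=extra_configs,
    )
    return True
```

Inside mount_containers, the direct dbutils.fs.mount call could be swapped for mount_if_needed(dbutils, source, mount_point, configs) to make the cell safe to re-run.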

In [0]:
display(dbutils.fs.mounts())

mountPoint,source,encryptionType
/databricks-datasets,databricks-datasets,
/mnt/formularacedata/raw,abfss://raw@formularacedata.dfs.core.windows.net,
/databricks/mlflow-tracking,databricks/mlflow-tracking,
/mnt/formularacedata/presentation,abfss://presentation@formularacedata.dfs.core.windows.net,
/databricks-results,databricks-results,
/mnt/formularacedata/processed,abfss://processed@formularacedata.dfs.core.windows.net,
/databricks/mlflow-registry,databricks/mlflow-registry,
/,DatabricksRoot,


In [0]:
display(dbutils.fs.ls("/mnt/formularacedata/raw"))

path,name,size,modificationTime
dbfs:/mnt/formularacedata/raw/circuits.csv,circuits.csv,10044,1686570285000
dbfs:/mnt/formularacedata/raw/constructors.json,constructors.json,30415,1686570285000
dbfs:/mnt/formularacedata/raw/drivers.json,drivers.json,180812,1686570285000
dbfs:/mnt/formularacedata/raw/lap_times/,lap_times/,0,1686570362000
dbfs:/mnt/formularacedata/raw/pit_stops.json,pit_stops.json,1369387,1686570285000
dbfs:/mnt/formularacedata/raw/qualifying/,qualifying/,0,1686570383000
dbfs:/mnt/formularacedata/raw/races.csv,races.csv,116847,1686570285000
dbfs:/mnt/formularacedata/raw/results.json,results.json,7165641,1686570287000


In [0]:
display(dbutils.fs.ls("/databricks-datasets"))

path,name,size,modificationTime
dbfs:/databricks-datasets/COVID/,COVID/,0,1686563968174
dbfs:/databricks-datasets/README.md,README.md,976,1532502320000
dbfs:/databricks-datasets/Rdatasets/,Rdatasets/,0,1686563968174
dbfs:/databricks-datasets/SPARK_README.md,SPARK_README.md,3359,1516124912000
dbfs:/databricks-datasets/adult/,adult/,0,1686563968174
dbfs:/databricks-datasets/airlines/,airlines/,0,1686563968174
dbfs:/databricks-datasets/amazon/,amazon/,0,1686563968174
dbfs:/databricks-datasets/asa/,asa/,0,1686563968174
dbfs:/databricks-datasets/atlas_higgs/,atlas_higgs/,0,1686563968174
dbfs:/databricks-datasets/bikeSharing/,bikeSharing/,0,1686563968174


To unmount a mount point:

In [0]:
dbutils.fs.unmount('/mnt/formula1racedata/demo')

/mnt/formula1racedata/demo has been unmounted.
Out[27]: True
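
To remove several mounts at once, for example every container mounted for one storage account, the listing and unmount utilities can be combined. This is a minimal sketch under that assumption; `unmount_under` is a hypothetical helper, and `dbutils` is passed in explicitly so the function can be exercised outside a notebook.

```python
def unmount_under(dbutils, prefix):
    """Unmount every mount point located under `prefix`.

    Hypothetical helper. Returns the list of mount points that were unmounted.
    """
    # Normalise the prefix so "/mnt/acct" does not also match "/mnt/acct2/..."
    normalized = prefix.rstrip("/") + "/"
    removed = []
    # dbutils.fs.mounts() returns a list, so unmounting inside the loop is safe
    for mount in dbutils.fs.mounts():
        if mount.mountPoint.startswith(normalized):
            dbutils.fs.unmount(mount.mountPoint)
            removed.append(mount.mountPoint)
    return removed
```

For example, unmount_under(dbutils, '/mnt/formularacedata') would target the raw, processed, and presentation mounts created earlier and leave the built-in /databricks-datasets mount untouched.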