Defining the variables and reading secrets from the secret scope for authentication:

In [0]:
val filesystemname = "datasets-adls"
val app_id = dbutils.secrets.get(scope="my-scope", key="my-app-id")
val storage_account_name = "sastudyadls"
val app_secret = dbutils.secrets.get(scope="my-scope", key="app-secret")
val tenantID = dbutils.secrets.get(scope="my-scope", key="tenant-id")



Building the Spark Config Map:

In [0]:
val configs = Map(
  s"fs.azure.account.auth.type.${storage_account_name}.dfs.core.windows.net" -> "OAuth",
  s"fs.azure.account.oauth.provider.type.${storage_account_name}.dfs.core.windows.net" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  s"fs.azure.account.oauth2.client.id.${storage_account_name}.dfs.core.windows.net" -> app_id,
  s"fs.azure.account.oauth2.client.secret.${storage_account_name}.dfs.core.windows.net" -> app_secret,
  s"fs.azure.account.oauth2.client.endpoint.${storage_account_name}.dfs.core.windows.net" -> s"https://login.microsoftonline.com/$tenantID/oauth2/token"
)

configs.foreach { case (key, value) => spark.conf.set(key, value)}
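The map above repeats the storage account suffix in every key. As a minimal sketch, the same settings could be produced by a small helper that takes the account name and credentials as plain strings. The name `oauthConfigs` and its parameter names are hypothetical and not part of any Databricks API.

```scala
// Hypothetical helper (an assumption, not a Databricks API): builds the same
// account-scoped OAuth settings as the cell above from plain strings.
def oauthConfigs(
    account: String,
    clientId: String,
    clientSecret: String,
    tenantId: String
): Map[String, String] = {
  // Every key ends with the account's ADLS Gen2 endpoint host name.
  val suffix = s"$account.dfs.core.windows.net"
  Map(
    s"fs.azure.account.auth.type.$suffix" -> "OAuth",
    s"fs.azure.account.oauth.provider.type.$suffix" ->
      "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    s"fs.azure.account.oauth2.client.id.$suffix" -> clientId,
    s"fs.azure.account.oauth2.client.secret.$suffix" -> clientSecret,
    s"fs.azure.account.oauth2.client.endpoint.$suffix" ->
      s"https://login.microsoftonline.com/$tenantId/oauth2/token"
  )
}

// In the notebook this would be called with the values defined earlier:
// val configs = oauthConfigs(storage_account_name, app_id, app_secret, tenantID)
// configs.foreach { case (key, value) => spark.conf.set(key, value) }
```

Keeping the key construction in one place makes it harder for a single key to drift out of sync when the storage account changes.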

In [0]:
val df = spark.read
  .format("csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load(s"abfss://$filesystemname@$storage_account_name.dfs.core.windows.net/cars.csv")

display(df)

Make,Model,Type,Origin,DriveTrain,Length,Mileage
Acura,MDX,SUV,Asia,All,4451,11
Acura,RSX Type S 2dr,Sedan,Asia,Front,2778,13
Acura,TSX 4dr,Sedan,Asia,Front,3230,10
Acura,TL 4dr,Sedan,Asia,Front,3575,14
Acura,3.5 RL 4dr,Sedan,Asia,Front,3880,14
Acura,3.5 RL w/Navigation 4dr,Sedan,Asia,Front,3893,12
Acura,NSX coupe 2dr manual S,Sports,Asia,Rear,3153,14
Audi,A4 1.8T 4dr,Sedan,Europe,Front,3252,14
Audi,A41.8T convertible 2dr,Sedan,Europe,Front,3638,12
Audi,A4 3.0 4dr,Sedan,Europe,Front,3462,13
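The read above addresses the file through an `abfss://` URI made of the container (file system) name, the storage account, and a relative path. A small sketch of a helper that assembles that URI is shown here; `abfssUri` is a hypothetical name introduced only for illustration.

```scala
// Hypothetical helper (an assumption, not a library function): assembles an
// abfss:// URI of the form used in the read cell above.
def abfssUri(container: String, account: String, path: String = ""): String = {
  // Drop a leading slash so the result never contains "//" after the host.
  val relative = path.stripPrefix("/")
  s"abfss://$container@$account.dfs.core.windows.net/$relative"
}

// Notebook usage with the variables defined earlier:
// val carsPath = abfssUri(filesystemname, storage_account_name, "cars.csv")
```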


In [0]:
df.createOrReplaceTempView("cars")

In [0]:
spark.sql("select * from cars")

In [0]:
%sql
select make,avg(mileage) from cars group by make order by avg(mileage) desc

make,avg(mileage)
Jeep,17.333333333333332
Buick,17.11111111111111
Saab,17.0
Pontiac,16.181818181818183
Lexus,16.09090909090909
Kia,16.0
Dodge,16.0
Mitsubishi,15.923076923076923
Volkswagen,15.666666666666666
Mercury,15.555555555555555


Mounts

In [0]:
val additionalconfigs = configs + ("fs.azure.createRemoteFileSystemDuringInitialization" -> "true")

In [0]:
println(additionalconfigs.toString)

In [0]:
val additionalconfigs = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id" -> app_id,
  "fs.azure.account.oauth2.client.secret" -> app_secret,
  "fs.azure.account.oauth2.client.endpoint" -> s"https://login.microsoftonline.com/$tenantID/oauth2/token",
  "fs.azure.createRemoteFileSystemDuringInitialization" -> "true"
)
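The map in this cell holds the same settings as `configs`, but with keys that do not carry the storage account suffix. As a sketch, such a map could be derived from the account-scoped one instead of being typed out again. The helper name `toMountConfigs` is hypothetical.

```scala
// Hypothetical helper (an assumption, not a Databricks API): strips the
// ".<account>.dfs.core.windows.net" suffix from each key and adds the
// createRemoteFileSystemDuringInitialization flag used by the mount cell.
def toMountConfigs(
    accountConfigs: Map[String, String],
    account: String
): Map[String, String] = {
  val suffix = s".$account.dfs.core.windows.net"
  val generic = accountConfigs.map { case (key, value) =>
    key.stripSuffix(suffix) -> value
  }
  generic + ("fs.azure.createRemoteFileSystemDuringInitialization" -> "true")
}

// Notebook usage with the values defined earlier:
// val additionalconfigs = toMountConfigs(configs, storage_account_name)
```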


Mounting commands

In [0]:
dbutils.fs.mount(
  source = s"abfss://$filesystemname@$storage_account_name.dfs.core.windows.net/",
  mountPoint ="/mnt/datasets",
  extraConfigs = additionalconfigs
)
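Re-running the mount cell when `/mnt/datasets` is already mounted raises an error, so it can help to check the existing mount points first. The sketch below keeps the check as a pure function; `isMounted` is a hypothetical helper, and the commented usage assumes the `dbutils` object available in a Databricks notebook.

```scala
// Hypothetical helper (an assumption, not part of dbutils): returns true when
// `target` is already among the given mount points. Trailing slashes are
// ignored so "/mnt/datasets/" matches "/mnt/datasets".
def isMounted(mountPoints: Seq[String], target: String): Boolean = {
  def normalize(p: String): String = if (p.length > 1) p.stripSuffix("/") else p
  mountPoints.map(normalize).contains(normalize(target))
}

// Notebook usage (needs dbutils, so it is left as a comment here):
// if (!isMounted(dbutils.fs.mounts().map(_.mountPoint), "/mnt/datasets")) {
//   dbutils.fs.mount(
//     source = s"abfss://$filesystemname@$storage_account_name.dfs.core.windows.net/",
//     mountPoint = "/mnt/datasets",
//     extraConfigs = additionalconfigs
//   )
// }
```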

In [0]:
%fs ls /mnt/datasets

path,name,size,modificationTime
dbfs:/mnt/datasets/books.csv,books.csv,808,1752685047000
dbfs:/mnt/datasets/cars.csv,cars.csv,20751,1752685047000


In [0]:
dbutils.fs.ls("/mnt/datasets").foreach(println)

In [0]:
dbutils.fs.mounts()

In [0]:
dbutils.fs.mounts().foreach(println)

In [0]:
dbutils.fs.mounts().foreach(m => println(s"${m.mountPoint} -> ${m.source}"))

In [0]:
display(dbutils.fs.ls("/databricks-datasets"))

path,name,size,modificationTime
dbfs:/databricks-datasets/COVID/,COVID/,0,1753304019108
dbfs:/databricks-datasets/README.md,README.md,976,1532502324000
dbfs:/databricks-datasets/Rdatasets/,Rdatasets/,0,1753304019108
dbfs:/databricks-datasets/SPARK_README.md,SPARK_README.md,3359,1455505834000
dbfs:/databricks-datasets/adult/,adult/,0,1753304019108
dbfs:/databricks-datasets/airlines/,airlines/,0,1753304019108
dbfs:/databricks-datasets/amazon/,amazon/,0,1753304019108
dbfs:/databricks-datasets/asa/,asa/,0,1753304019108
dbfs:/databricks-datasets/atlas_higgs/,atlas_higgs/,0,1753304019108
dbfs:/databricks-datasets/bikeSharing/,bikeSharing/,0,1753304019108


In [0]:
display(dbutils.fs.ls("/mnt/datasets"))

path,name,size,modificationTime
dbfs:/mnt/datasets/books.csv,books.csv,808,1752685047000
dbfs:/mnt/datasets/cars.csv,cars.csv,20751,1752685047000


To Unmount:

In [0]:
dbutils.fs.unmount("/mnt/datasets")