**Methods of Azure Data Lake Storage (ADLS) Integration with Databricks**
1. Using ADLS access key directly
2. Creating a mount point using ADLS access key

**Method 1**

**Using ADLS access key directly**

We need to provide the storage account name, which in our case is adlsowaisde.

We also need to copy the access key from the ADLS account on Azure.

We then set up the Spark configuration with the parameters shown below.

In [0]:
spark.conf.set(
"fs.azure.account.key.adlsowaisde.dfs.core.windows.net",
"xW2tf0opSZwBp7NtjHTJupO4ZtYEop9hD6CNOMRdELFvSRIfvNHOdYMYwJgELD1nRDdqR1A5zWIr+ASty2gjiw==")

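Pasting the access key into a notebook cell exposes it to anyone who can read the notebook. As a minimal sketch of one alternative, assuming the key has already been stored in a Databricks secret scope, the helper below reads it with `dbutils.secrets.get` and registers it with `spark.conf.set`. The function names, the scope name `adls-scope` and the secret name `adls-access-key` are hypothetical and not part of the original notebook. `spark` and `dbutils` are passed in as arguments so the function can also be exercised outside a notebook.

```python
def account_key_conf_name(storage_account):
    """Return the Spark config name that holds the access key for an ADLS Gen2 account."""
    return f"fs.azure.account.key.{storage_account}.dfs.core.windows.net"


def configure_account_key(spark, dbutils, storage_account, scope, secret_name):
    """Read the access key from a secret scope and register it with Spark.

    `scope` and `secret_name` are placeholders; use the names defined in your workspace.
    """
    access_key = dbutils.secrets.get(scope=scope, key=secret_name)
    conf_name = account_key_conf_name(storage_account)
    spark.conf.set(conf_name, access_key)
    return conf_name
```

In a notebook this could be called as `configure_account_key(spark, dbutils, "adlsowaisde", "adls-scope", "adls-access-key")`, which sets the same configuration entry as the cell above without the key appearing in the code.
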
With the access key configured, we can now reach the ADLS container. We will list the files in the container using dbutils.fs.ls(). The path needs two parameters: **container name (container-owaisde) and storage account name (adlsowaisde)**. The full syntax is shown below.

In [0]:
dbutils.fs.ls("abfss://container-owaisde@adlsowaisde.dfs.core.windows.net/")

Out[11]: [FileInfo(path='abfss://container-owaisde@adlsowaisde.dfs.core.windows.net/Housing.csv', name='Housing.csv', size=29981, modificationTime=1675512982000)]
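
The path passed to dbutils.fs.ls() follows the pattern `abfss://<container>@<storage account>.dfs.core.windows.net/<path>`. As a small illustration, the helper below (the name `abfss_uri` is hypothetical, not part of the notebook) assembles such a path from its parts, which helps avoid typos when several containers or files are involved.

```python
def abfss_uri(container, storage_account, path=""):
    """Build an abfss:// URI for a path inside an ADLS Gen2 container."""
    base = f"abfss://{container}@{storage_account}.dfs.core.windows.net/"
    # Strip a leading slash so the result never contains a double slash after the host
    return base + path.lstrip("/")


root = abfss_uri("container-owaisde", "adlsowaisde")
housing = abfss_uri("container-owaisde", "adlsowaisde", "Housing.csv")
print(root)
print(housing)
```

In a notebook, `dbutils.fs.ls(root)` would then be equivalent to the call above.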

In [0]:
# set the data lake file location
file_location = "abfss://container-owaisde@adlsowaisde.dfs.core.windows.net/"

# read in the data to the dataframe df
df = (
    spark.read.format("csv")
    .option("inferSchema", "true")
    .option("header", "true")
    .option("delimiter", ",")
    .load(file_location)
)

# display the dataframe
display(df)

price,area,bedrooms,bathrooms,stories,mainroad,guestroom,basement,hotwaterheating,airconditioning,parking,prefarea,furnishingstatus
13300000,7420,4,2,3,yes,no,no,no,yes,2,yes,furnished
12250000,8960,4,4,4,yes,no,no,no,yes,3,no,furnished
12250000,9960,3,2,2,yes,no,yes,no,no,2,yes,semi-furnished
12215000,7500,4,2,2,yes,no,yes,no,yes,3,yes,furnished
11410000,7420,4,1,2,yes,yes,yes,no,yes,2,no,furnished
10850000,7500,3,3,1,yes,no,yes,no,yes,2,yes,semi-furnished
10150000,8580,4,3,4,yes,no,no,no,yes,2,yes,semi-furnished
10150000,16200,5,3,2,yes,no,no,no,no,0,no,unfurnished
9870000,8100,4,1,2,yes,yes,yes,no,yes,2,yes,furnished
9800000,5750,3,2,4,yes,yes,no,no,yes,1,yes,unfurnished


**Method 2**

**Creating a mount point using ADLS access key**

Creating a mount point (MP) is a more standardized way of integrating. Once the MP is created, Databricks treats ADLS as if it were part of its local file system, even though it is not. The MP strategy is the one most projects use.

We need three parameters: **source** is the container URL, which can be found under the ADLS endpoints and copied here as shown; **mount_point** is a path, where adls_test is an arbitrary name that we can change to anything we like; **extra_configs** holds the configuration setting and the access key we copied into the first cell of this notebook.

In [0]:
dbutils.fs.mount(
    source = "wasbs://container-owaisde@adlsowaisde.blob.core.windows.net/",
    mount_point = "/mnt/adls_test",
    extra_configs = {"fs.azure.account.key.adlsowaisde.blob.core.windows.net":
                    "xW2tf0opSZwBp7NtjHTJupO4ZtYEop9hD6CNOMRdELFvSRIfvNHOdYMYwJgELD1nRDdqR1A5zWIr+ASty2gjiw=="})

Out[14]: True
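
The three mount parameters are all derived from the container name, the storage account name and the access key. As a sketch only, the hypothetical helper `wasbs_mount_args` below builds them in one place, using the same Blob endpoint (`blob.core.windows.net`) as the cell above. The `<access-key>` string is a placeholder, not a real key.

```python
def wasbs_mount_args(container, storage_account, mount_name, access_key):
    """Assemble the keyword arguments for dbutils.fs.mount using the Blob (wasbs) endpoint."""
    host = f"{storage_account}.blob.core.windows.net"
    return {
        "source": f"wasbs://{container}@{host}/",
        "mount_point": f"/mnt/{mount_name}",
        # The config name must use the same host as the source URL
        "extra_configs": {f"fs.azure.account.key.{host}": access_key},
    }


args = wasbs_mount_args("container-owaisde", "adlsowaisde", "adls_test", "<access-key>")
# In a Databricks notebook: dbutils.fs.mount(**args)
```

Keeping the host in one variable ensures that the source URL and the configuration key always refer to the same endpoint.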

In [0]:
# To list the files in ADLS using newly created mount point
dbutils.fs.ls("/mnt/adls_test")

Out[15]: [FileInfo(path='dbfs:/mnt/adls_test/Housing.csv', name='Housing.csv', size=29981, modificationTime=1675512982000)]

In [0]:
# In big projects we have many resources in ADLS hence many mount points. To list them we will do the following
dbutils.fs.mounts()

Out[17]: [MountInfo(mountPoint='/databricks-datasets', source='databricks-datasets', encryptionType='sse-s3'),
 MountInfo(mountPoint='/databricks/mlflow-tracking', source='databricks/mlflow-tracking', encryptionType='sse-s3'),
 MountInfo(mountPoint='/databricks-results', source='databricks-results', encryptionType='sse-s3'),
 MountInfo(mountPoint='/mnt/adls_test', source='wasbs://container-owaisde@adlsowaisde.blob.core.windows.net/', encryptionType=''),
 MountInfo(mountPoint='/databricks/mlflow-registry', source='databricks/mlflow-registry', encryptionType='sse-s3'),
 MountInfo(mountPoint='/', source='DatabricksRoot', encryptionType='sse-s3')]
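
Calling dbutils.fs.mount() on a mount point that is already in use typically fails, so the list returned by dbutils.fs.mounts() is useful for checking first. The sketch below assumes each entry exposes a `mountPoint` attribute, as in the output above. The helper names `is_mounted` and `mount_if_absent` are hypothetical, and `dbutils` is passed in as an argument.

```python
def is_mounted(mounts, mount_point):
    """Return True if `mount_point` appears in a list of MountInfo-like objects."""
    return any(m.mountPoint == mount_point for m in mounts)


def mount_if_absent(dbutils, mount_args):
    """Mount only when the mount point is not present; return True if a mount was created."""
    if is_mounted(dbutils.fs.mounts(), mount_args["mount_point"]):
        return False
    dbutils.fs.mount(**mount_args)
    return True
```

Here `mount_args` is a dictionary with the keys `source`, `mount_point` and `extra_configs`, matching the arguments of dbutils.fs.mount().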

In [0]:
# To remove the mount point we created, uncomment the line below
# dbutils.fs.unmount("/mnt/adls_test")

In [0]:
# set the data lake file location using the mount point
file_location = "dbfs:/mnt/adls_test/"

# read in the data to the dataframe df
df = (
    spark.read.format("csv")
    .option("inferSchema", "true")
    .option("header", "true")
    .option("delimiter", ",")
    .load(file_location)
)

# display the dataframe
display(df)

price,area,bedrooms,bathrooms,stories,mainroad,guestroom,basement,hotwaterheating,airconditioning,parking,prefarea,furnishingstatus
13300000,7420,4,2,3,yes,no,no,no,yes,2,yes,furnished
12250000,8960,4,4,4,yes,no,no,no,yes,3,no,furnished
12250000,9960,3,2,2,yes,no,yes,no,no,2,yes,semi-furnished
12215000,7500,4,2,2,yes,no,yes,no,yes,3,yes,furnished
11410000,7420,4,1,2,yes,yes,yes,no,yes,2,no,furnished
10850000,7500,3,3,1,yes,no,yes,no,yes,2,yes,semi-furnished
10150000,8580,4,3,4,yes,no,no,no,yes,2,yes,semi-furnished
10150000,16200,5,3,2,yes,no,no,no,no,0,no,unfurnished
9870000,8100,4,1,2,yes,yes,yes,no,yes,2,yes,furnished
9800000,5750,3,2,4,yes,yes,no,no,yes,1,yes,unfurnished
