# 02_ADLS_Library_Usage

## Importing and Using the Custom Python Library from ADLS Gen2

This notebook:
✅ Adds the mounted ADLS `libs` path to Python’s `sys.path`  
✅ Imports the custom library (`mylib`)  
✅ Calls and tests its functions

⚠ Make sure you’ve run the `01_ADLS_Library_Setup` notebook before this.

## Local Setup Before Running the Notebook

Before using this notebook, prepare the following on your local machine:

✅ Create a simple Python library file named `mylib.py` with this content:

```python
# mylib.py

def greet(name):
    return f"Hello, {name}! Welcome to Databricks."

def add(a, b):
    return a + b

def multiply(a, b):
    return a * b
```

- ✅ Save this file to a local path (for example, `/tmp/mylib.py` or `C:\temp\mylib.py`).

- ✅ Ensure the Databricks secret (scope: `adls-secrets`, key: `adls-access-key`) is set up and that your ADLS container has a `libs/` folder for library uploads.

- ✅ If you plan to upload files via the Databricks CLI (optional), make sure it is configured.

## Upload the Library File to Databricks

You have two options:

✅ Option 1 → Use the Databricks Data UI

- Go to the Databricks workspace UI (left sidebar → Data → Upload Data).
- Choose Target Location → `/tmp/`.
- Upload your `mylib.py` file.
- This places the file at `/dbfs/tmp/mylib.py` inside the Databricks environment.

✅ Option 2 → Use the Cell Below to Write the Sample File Directly

- If you can’t upload through the UI, run the notebook cell below, which writes the sample file programmatically to `dbfs:/tmp/`.

✅ Once the file is uploaded (by either method), this notebook copies it to the ADLS `libs/` folder for shared library access.

```python
# Write mylib.py into DBFS /tmp using dbutils.fs.put

dbutils.fs.put("dbfs:/tmp/mylib.py", """
def greet(name):
    return f"Hello, {name}! Welcome to Databricks."

def add(a, b):
    return a + b

def multiply(a, b):
    return a * b
""", overwrite=True)

print("✅ Successfully wrote mylib.py to dbfs:/tmp/")
```

```text
Wrote 141 bytes.
✅ Successfully wrote mylib.py to dbfs:/tmp/
```


### Step üöÄ: Upload `mylib.py` from DBFS `/tmp/` to ADLS `/libs/` Folder

This cell performs the following:

‚úÖ **Configures Spark** with the storage account key to securely access the ADLS Gen2 container.

‚úÖ **Checks if `mylib.py` already exists** in the ADLS `/libs/` folder:
- If it exists, it **deletes** the old file to avoid duplicate conflicts.
- If it doesn‚Äôt exist, it simply proceeds.

‚úÖ **Copies the new `mylib.py` file** from:

## 📌 Prerequisite: Run 00_Setup Notebook First

Before running this notebook, make sure you have completed the setup steps in:

`00_Setup and Access-ADLS-Container from the Notebook.ipynb`

That notebook walks you through:
✅ Creating the Databricks secret scope  
✅ Adding the ADLS storage account access key as a secret  
✅ Verifying access to the ADLS container

Once the setup is done, this notebook will correctly retrieve the secret using:

```python
dbutils.secrets.get(scope="adls-secrets", key="adls-access-key")
```

```python
# --- CONFIG ---
storage_account = "araostorage"
container_name = "araolibraryloadtest"
storage_account_key = dbutils.secrets.get(scope="adls-secrets", key="adls-access-key").strip()
adls_libs_path = f"abfss://{container_name}@{storage_account}.dfs.core.usgovcloudapi.net/libs/mylib.py"

# --- Set Spark Config ---
spark.conf.set(
    f"fs.azure.account.key.{storage_account}.dfs.core.usgovcloudapi.net",
    storage_account_key
)
print("✅ Spark config set for ADLS access")

# --- Check if target file exists ---
try:
    dbutils.fs.ls(adls_libs_path)
    dbutils.fs.rm(adls_libs_path)
    print(f"♻️ Existing file {adls_libs_path} removed")
except Exception:
    print(f"✅ No existing file at {adls_libs_path}, ready to copy")

# --- Copy file from DBFS /tmp/ to ADLS libs folder ---
dbutils.fs.cp("dbfs:/tmp/mylib.py", adls_libs_path)

print(f"✅ Copied mylib.py to {adls_libs_path}")
```

```text
✅ Spark config set for ADLS access
♻️ Existing file abfss://araolibraryloadtest@araostorage.dfs.core.usgovcloudapi.net/libs/mylib.py removed
✅ Copied mylib.py to abfss://araolibraryloadtest@araostorage.dfs.core.usgovcloudapi.net/libs/mylib.py
```
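
The existence check in this cell treats every exception as "file not found", so an authentication or network failure would also print the "No existing file" message. A stricter variant might look like the sketch below. It assumes that `dbutils.fs.ls` reports a missing path with an error whose message contains `FileNotFoundException`, which is worth verifying on your runtime. The helper name `replace_file` and the in-memory `FakeFS` class are hypothetical; `FakeFS` only stands in for `dbutils.fs` so the logic can be exercised outside Databricks.

```python
class FakeFS:
    """Hypothetical in-memory stand-in for dbutils.fs (illustration only)."""

    def __init__(self, files=None):
        self.files = dict(files or {})

    def ls(self, path):
        if path not in self.files:
            raise Exception(f"java.io.FileNotFoundException: {path}")
        return [path]

    def rm(self, path):
        return self.files.pop(path, None) is not None

    def cp(self, src, dst):
        self.files[dst] = self.files[src]
        return True


def replace_file(fs, src, dst):
    """Copy src to dst, removing any existing dst first.

    Returns True if an old copy was removed. Errors other than
    "file not found" are re-raised instead of being swallowed.
    """
    try:
        fs.ls(dst)
        existed = True
    except Exception as exc:
        # Assumption: a missing path surfaces as FileNotFoundException.
        if "FileNotFoundException" not in str(exc):
            raise
        existed = False
    if existed:
        fs.rm(dst)
    fs.cp(src, dst)
    return existed


# Placeholder paths used as plain dictionary keys by FakeFS.
source = "dbfs:/tmp/mylib.py"
target = "abfss://container@account/libs/mylib.py"

fs = FakeFS({source: "def add(a, b): return a + b"})
first = replace_file(fs, source, target)   # nothing to remove yet
second = replace_file(fs, source, target)  # removes the earlier copy
print(first, second)
```

In a notebook, the equivalent call would be `replace_file(dbutils.fs, "dbfs:/tmp/mylib.py", adls_libs_path)`, relying on the same `ls`, `rm`, and `cp` calls used in the cell above.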


### Step üîß: Add ADLS `/libs/` Folder to Python sys.path and Import `mylib`

This cell:

‚úÖ Maps the ADLS path `/libs/` to a local `/dbfs/` path  
‚úÖ Adds that local path to Python‚Äôs `sys.path` so it can import the `mylib` module  
‚úÖ Imports `mylib` and runs example function calls to verify everything works

‚ö† **Reminder:** Only files inside the `/dbfs/` path are accessible to Python directly.

```python
# Copy mylib.py from ADLS to local DBFS path for Python import
dbutils.fs.mkdirs("dbfs:/tmp/libs")

dbutils.fs.cp(
    "abfss://araolibraryloadtest@araostorage.dfs.core.usgovcloudapi.net/libs/mylib.py",
    "dbfs:/tmp/libs/mylib.py"
)

print("✅ Copied mylib.py to local DBFS /dbfs/tmp/libs/")
```

```text
✅ Copied mylib.py to local DBFS /dbfs/tmp/libs/
```


```python
import sys

local_libs_path = "/dbfs/tmp/libs"
if local_libs_path not in sys.path:
    sys.path.append(local_libs_path)
    print(f"✅ Added {local_libs_path} to sys.path")
else:
    print(f"✅ sys.path already includes {local_libs_path}")
```

```text
✅ sys.path already includes /dbfs/tmp/libs
```
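
The same mechanism can be tried outside Databricks. The sketch below uses a temporary directory in place of `/dbfs/tmp/libs` and a hypothetical module name, `demo_lib`, that is not part of this notebook. Once the directory is on `sys.path`, a plain `import` finds the file.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A temporary directory stands in for /dbfs/tmp/libs.
libs_dir = tempfile.mkdtemp()
Path(libs_dir, "demo_lib.py").write_text("def add(a, b):\n    return a + b\n")

# Same idempotent pattern as the cell above: append only if missing.
if libs_dir not in sys.path:
    sys.path.append(libs_dir)

# Make sure the import system notices the newly written file.
importlib.invalidate_caches()

import demo_lib

result = demo_lib.add(5, 7)
print(result)  # 12
```

Because `append` places the directory last, a package with the same name found earlier on the path (for example, one installed with `%pip install`) would be imported instead. Using `sys.path.insert(0, libs_dir)` gives the directory priority.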


```python
import mylib

# Test the functions
print(mylib.greet("Anand"))               # Should print: Hello, Anand! Welcome to Databricks.
print(f"5 + 7 = {mylib.add(5, 7)}")       # Should print: 5 + 7 = 12
print(f"3 * 4 = {mylib.multiply(3, 4)}")  # Should print: 3 * 4 = 12
```

```text
Hello, Anand! Welcome to Databricks.
5 + 7 = 12
3 * 4 = 12
```


- ✔️ You successfully copied your custom `mylib.py`
- ✔️ Added it to the Python path
- ✔️ Imported it
- ✔️ And ran its functions smoothly inside your Databricks notebook!
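
If `mylib.py` is later changed and copied again, a notebook that has already imported it keeps using the cached module. One way to pick up the change without restarting Python is `importlib.reload`. The sketch below illustrates this with a temporary directory and a hypothetical module name, `reload_demo`, rather than the real `mylib`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

libs_dir = tempfile.mkdtemp()
module_file = Path(libs_dir, "reload_demo.py")
module_file.write_text("def greet(name):\n    return 'Hello, ' + name\n")

sys.path.append(libs_dir)
importlib.invalidate_caches()

import reload_demo

before = reload_demo.greet("Anand")

# Simulate copying an updated version of the file over the old one.
module_file.write_text("def greet(name):\n    return 'Hi, ' + name + '!'\n")

# Re-execute the module's source in place so the new code takes effect.
importlib.reload(reload_demo)
after = reload_demo.greet("Anand")

print(before)
print(after)
```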

## Option 2: Package and Install a Custom Python Library (`mylib`) on Databricks

This section describes how to take a custom Python library, package it as a `.whl`, and install it cleanly inside Databricks notebooks or clusters.

---

### 📁 Folder Structure

In this repo, under the `/mylib_package/` subfolder (from the provided zip), you will find:
```text
mylib_package/
├── mylib/
│   ├── __init__.py
│   └── core.py
├── setup.py
└── README.md
```

- `core.py` → Contains the main functions (`greet`, `add`, `multiply`)  
- `__init__.py` → Makes the `mylib` folder importable as a package  
- `setup.py` → Defines the package metadata and build instructions
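
The contents of `__init__.py` are not shown here. For `mylib.greet(...)` to work after a plain `import mylib`, it presumably re-exports the functions defined in `core.py`. The sketch below rebuilds that layout in a temporary directory under a hypothetical package name, `mylib_sketch`, to show the pattern; the actual files in `mylib_package/` may differ.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "mylib_sketch"
pkg.mkdir()

# core.py holds the implementation, mirroring the functions in this notebook.
(pkg / "core.py").write_text(
    "def greet(name):\n"
    "    return f'Hello, {name}! Welcome to Databricks.'\n"
    "\n"
    "def add(a, b):\n"
    "    return a + b\n"
    "\n"
    "def multiply(a, b):\n"
    "    return a * b\n"
)

# Assumed content of __init__.py: re-export the public functions.
(pkg / "__init__.py").write_text("from .core import greet, add, multiply\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import mylib_sketch

print(mylib_sketch.greet("Anand"))
print(mylib_sketch.add(5, 7))
print(mylib_sketch.multiply(3, 4))
```

Without the re-export line, callers would need `from mylib_sketch.core import greet` instead.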

---

### 🚀 Steps to Build and Use

1️⃣ **On Your Local Machine**
- Navigate into the `mylib_package/` folder.
- Run:
    ```bash
    pip install --upgrade setuptools wheel
    python setup.py bdist_wheel
    ```
- This creates a `.whl` file under `dist/`:
    ```
    dist/mylib-0.1-py3-none-any.whl
    ```
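
The wheel filename encodes the package metadata. As a small illustration, the name splits into distribution, version, Python tag, ABI tag, and platform tag. The helper `parse_wheel_name` is hypothetical and ignores the optional build tag that the wheel format allows.

```python
def parse_wheel_name(filename):
    """Split a wheel filename into its five standard components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    stem = filename[: -len(".whl")]
    parts = stem.split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel name layout: {filename}")
    keys = ("distribution", "version", "python_tag", "abi_tag", "platform_tag")
    return dict(zip(keys, parts))


info = parse_wheel_name("mylib-0.1-py3-none-any.whl")
for key, value in info.items():
    print(f"{key}: {value}")
```

Here `py3-none-any` marks a pure-Python wheel: it targets Python 3, has no ABI requirement, and installs on any platform.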

---

2️⃣ **Upload to Databricks**
- Use the Databricks CLI or UI to upload:
    ```
    dist/mylib-0.1-py3-none-any.whl → dbfs:/tmp/mylib-0.1-py3-none-any.whl
    ```

---

3️⃣ **Install in Notebook**
- In a Databricks notebook, run:
    ```python
    %pip install /dbfs/tmp/mylib-0.1-py3-none-any.whl
    ```

---

4️⃣ **Restart the Python Kernel**
- After installing, restart the notebook kernel:
    ```python
    %restart_python
    ```

---

5️⃣ **Import and Use**
- In the notebook, import and test the library:
    ```python
    import mylib

    print(mylib.greet("Anand"))
    print(f"5 + 7 = {mylib.add(5, 7)}")
    print(f"3 * 4 = {mylib.multiply(3, 4)}")
    ```

✅ This ensures your custom code runs as a proper, installable Python package inside Databricks.

---

### 💡 Notes

- To make the library **available cluster-wide**, you can upload the `.whl` through **Workspace → Libraries → Install New → Upload** and attach it at the cluster level.
- For versioning, update the `version=` field in `setup.py` before rebuilding.
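
When a rebuilt wheel keeps the same version number, `pip` may report the package as already installed and skip it, so it can help to check which version is currently installed. A minimal sketch using the standard library's `importlib.metadata` (Python 3.8+) follows; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# A name that is very unlikely to be installed, to show the None case.
missing = installed_version("surely-not-installed-package-xyz")
print(missing)  # None

# Prints the installed version of mylib, or None if it is not installed.
print(installed_version("mylib"))
```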

```python
%pip install /dbfs/tmp/mylib-0.1-py3-none-any.whl
%restart_python
```

```text
Processing /dbfs/tmp/mylib-0.1-py3-none-any.whl
mylib is already installed with the same version as the provided wheel. Use --force-reinstall to force an installation of the wheel.
Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.
```


```python
import mylib

print(mylib.greet("Anand"))
print(f"5 + 7 = {mylib.add(5, 7)}")
print(f"3 * 4 = {mylib.multiply(3, 4)}")
```

```text
Hello, Anand! Welcome to Databricks.
5 + 7 = 12
3 * 4 = 12
```
