# 02_ADLS_Library_Usage

## Importing and Using the Custom Python Library from ADLS Gen2

This notebook:
✅ Copies the library from the ADLS `libs` folder to DBFS and adds that location to Python’s `sys.path`  
✅ Imports the custom library (`mylib`)  
✅ Calls and tests its functions

⚠ Make sure you’ve run the `01_ADLS_Library_Setup` notebook before this.

## Local Setup Before Running the Notebook

Before using this notebook, prepare the following on your local machine:

✅ Create a simple Python library file named `mylib.py` with this content:

```python
# mylib.py

def greet(name):
    return f"Hello, {name}! Welcome to Databricks."

def add(a, b):
    return a + b

def multiply(a, b):
    return a * b
```

- ✅ Save this file to a local path (for example, `/tmp/mylib.py` or `C:\temp\mylib.py`).

- ✅ Ensure the Databricks Secrets setup (scope: `adls-secrets`, key: `adls-access-key`) is ready and that your ADLS container has a `libs/` folder for library uploads.

- ✅ Make sure the Databricks CLI is configured if you plan to upload files via the CLI (optional).

### Upload the Library File to Databricks

You have two options:

✅ Option 1 → Use the Databricks Data UI
- Go to the Databricks workspace UI (left sidebar → Data → Upload Data).
- Choose Target Location → `/tmp/`.
- Upload your `mylib.py` file.
- This places the file at `/dbfs/tmp/mylib.py` inside the Databricks environment.

✅ Option 2 → Use the Cell Below to Write the Sample File Directly
- If you can’t upload through the UI, run the notebook cell below, which writes the sample file into `dbfs:/tmp/` programmatically.

✅ Once the file is in place (by either method), this notebook copies it to the ADLS `libs/` folder for shared library access.

In [0]:
# Write mylib.py into DBFS /tmp using dbutils.fs.put

dbutils.fs.put("dbfs:/tmp/mylib.py", """
def greet(name):
    return f"Hello, {name}! Welcome to Databricks."

def add(a, b):
    return a + b

def multiply(a, b):
    return a * b
""", overwrite=True)

print("✅ Successfully wrote mylib.py to dbfs:/tmp/")

Wrote 141 bytes.
✅ Successfully wrote mylib.py to dbfs:/tmp/


### Step 🚀: Upload `mylib.py` from DBFS `/tmp/` to ADLS `/libs/` Folder

This cell performs the following:

✅ **Configures Spark** with the storage account key to securely access the ADLS Gen2 container.

✅ **Checks if `mylib.py` already exists** in the ADLS `/libs/` folder:
- If it exists, it **deletes** the old file to avoid duplicate conflicts.
- If it doesn’t exist, it simply proceeds.

✅ **Copies the new `mylib.py` file** from `dbfs:/tmp/mylib.py` to the `libs/` folder in the ADLS container.

## 📌 Prerequisite: Run 00_Setup Notebook First

Before running this notebook, make sure you have completed the setup steps in:

`00_Setup and Access-ADLS-Container from the Notebook.ipynb`

That notebook walks you through:
✅ Creating the Databricks secret scope  
✅ Adding the ADLS storage account access key as a secret  
✅ Verifying access to the ADLS container

Once the setup is done, this notebook will correctly retrieve the secret using:

```python
dbutils.secrets.get(scope="adls-secrets", key="adls-access-key")
```

In [0]:
# --- CONFIG ---
storage_account = "araostorage"
container_name = "araolibraryloadtest"
storage_account_key = dbutils.secrets.get(scope="adls-secrets", key="adls-access-key").strip()
adls_libs_path = f"abfss://{container_name}@{storage_account}.dfs.core.usgovcloudapi.net/libs/mylib.py"

# --- Set Spark Config ---
spark.conf.set(
    f"fs.azure.account.key.{storage_account}.dfs.core.usgovcloudapi.net",
    storage_account_key
)
print("✅ Spark config set for ADLS access")

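# --- Check if target file exists ---
try:
    dbutils.fs.ls(adls_libs_path)
    file_exists = True
except Exception:
    # ls raises when the path is missing
    file_exists = False

# --- Remove the old copy outside the try block so a failed delete is not hidden ---
if file_exists:
    dbutils.fs.rm(adls_libs_path)
    print(f"♻️ Existing file {adls_libs_path} removed")
else:
    print(f"✅ No existing file at {adls_libs_path}, ready to copy")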

# --- Copy file from DBFS /tmp/ to ADLS libs folder ---
dbutils.fs.cp("dbfs:/tmp/mylib.py", adls_libs_path)

print(f"✅ Copied mylib.py to {adls_libs_path}")

✅ Spark config set for ADLS access
♻️ Existing file abfss://araolibraryloadtest@araostorage.dfs.core.usgovcloudapi.net/libs/mylib.py removed
✅ Copied mylib.py to abfss://araolibraryloadtest@araostorage.dfs.core.usgovcloudapi.net/libs/mylib.py
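
The `abfss://` URI and the Spark config key in the cell above are built from the same account name and endpoint, and the URI is typed out again by hand in a later cell. A minimal sketch of deriving both from one place follows. The helper names `abfss_uri` and `account_key_conf_name` are hypothetical (they are not part of Databricks or of this notebook), and the endpoint suffix is the one used above:

```python
# Hypothetical helpers (not a Databricks API): build the ADLS Gen2 URI and the
# Spark config key from the same inputs so the two strings stay in sync.

DFS_SUFFIX = "dfs.core.usgovcloudapi.net"  # endpoint suffix used in this notebook


def abfss_uri(container: str, account: str, path: str, suffix: str = DFS_SUFFIX) -> str:
    """Return an abfss:// URI for a path inside an ADLS Gen2 container."""
    return f"abfss://{container}@{account}.{suffix}/{path.lstrip('/')}"


def account_key_conf_name(account: str, suffix: str = DFS_SUFFIX) -> str:
    """Return the Spark config name under which the storage account key is set."""
    return f"fs.azure.account.key.{account}.{suffix}"


adls_libs_path = abfss_uri("araolibraryloadtest", "araostorage", "/libs/mylib.py")
conf_name = account_key_conf_name("araostorage")
print(adls_libs_path)
print(conf_name)
```

In the notebook, `conf_name` would be passed to `spark.conf.set(...)` and `adls_libs_path` to both `dbutils.fs.cp(...)` calls.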


### Step 🔧: Add ADLS `/libs/` Folder to Python sys.path and Import `mylib`

This cell:

✅ Copies `mylib.py` from the ADLS `/libs/` folder to a DBFS location that is visible locally under `/dbfs/`  
✅ Adds that local path to Python’s `sys.path` so it can import the `mylib` module  
✅ Imports `mylib` and runs example function calls to verify everything works

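⚠ **Reminder:** Python’s import system reads local file paths only and cannot import from an `abfss://` URI, which is why the file is first copied to DBFS and read through the `/dbfs/` path.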
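
The `sys.path` mechanism can be tried in isolation, outside Databricks. In this sketch a temporary directory stands in for `/dbfs/tmp/libs`, and `demo_lib` is a hypothetical module name chosen so it does not clash with `mylib`:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A temporary directory stands in for /dbfs/tmp/libs (assumption for this sketch).
libs_dir = tempfile.mkdtemp()

# demo_lib is a hypothetical module name, used to avoid clashing with mylib.
Path(libs_dir, "demo_lib.py").write_text(
    "def add(a, b):\n"
    "    return a + b\n"
)

# Python only searches the directories listed in sys.path, so register the folder once.
if libs_dir not in sys.path:
    sys.path.append(libs_dir)

# Clear cached directory listings so the newly written file is visible to the importer.
importlib.invalidate_caches()

import demo_lib

result = demo_lib.add(5, 7)
print(result)
```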

In [0]:
# Copy mylib.py from ADLS to local DBFS path for Python import
dbutils.fs.mkdirs("dbfs:/tmp/libs")

dbutils.fs.cp(
    "abfss://araolibraryloadtest@araostorage.dfs.core.usgovcloudapi.net/libs/mylib.py",
    "dbfs:/tmp/libs/mylib.py"
)

print("✅ Copied mylib.py to local DBFS /dbfs/tmp/libs/")

✅ Copied mylib.py to local DBFS /dbfs/tmp/libs/


In [0]:
import sys

local_libs_path = "/dbfs/tmp/libs"
if local_libs_path not in sys.path:
    sys.path.append(local_libs_path)
    print(f"✅ Added {local_libs_path} to sys.path")
else:
    print(f"✅ sys.path already includes {local_libs_path}")

✅ sys.path already includes /dbfs/tmp/libs


In [0]:
import mylib

# Test the functions
print(mylib.greet("Anand"))          # Should print: Hello, Anand! Welcome to Databricks.
print(f"5 + 7 = {mylib.add(5, 7)}") # Should print: 5 + 7 = 12
print(f"3 * 4 = {mylib.multiply(3, 4)}") # Should print: 3 * 4 = 12

Hello, Anand! Welcome to Databricks.
5 + 7 = 12
3 * 4 = 12


- ✔️ You successfully copied your custom mylib.py
- ✔️ Added it to the Python path
- ✔️ Imported it
- ✔️ And ran its functions smoothly inside your Databricks notebook!
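
One caveat with this approach: Python caches imported modules, so re-copying a changed `mylib.py` and running `import mylib` again in the same session still returns the old code. A minimal sketch of picking up the change with `importlib.reload` follows, again with a temporary directory in place of `/dbfs/tmp/libs` and a hypothetical module name, `demo_reload_lib`, in place of `mylib`:

```python
import importlib
import sys
import tempfile
from pathlib import Path

libs_dir = tempfile.mkdtemp()  # stand-in for /dbfs/tmp/libs
module_file = Path(libs_dir, "demo_reload_lib.py")  # hypothetical module name

module_file.write_text("def version():\n    return 'v1'\n")
sys.path.append(libs_dir)
importlib.invalidate_caches()

import demo_reload_lib

before = demo_reload_lib.version()

# Simulate copying an updated file over the old one. The new source has a
# different length, so the stale bytecode cache is not reused.
module_file.write_text("def version():\n    return 'version 2'\n")

# A plain `import` would return the cached module; reload re-executes the file.
after = importlib.reload(demo_reload_lib).version()
print(before, after)
```

In a Databricks notebook the equivalent would be `importlib.reload(mylib)` after re-copying the file, or restarting the Python process with `dbutils.library.restartPython()`.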

## Option 2: Package and Install a Custom Python Library (`mylib`) on Databricks

This section describes how to take a custom Python library, package it as a `.whl`, and install it cleanly inside Databricks notebooks or clusters.

---

### 📁 Folder Structure

In this repo, under the `/mylib_package/` subfolder (from the provided zip), you will find:
```text
mylib_package/
├── mylib/
│   ├── __init__.py
│   └── core.py
├── setup.py
└── README.md
```

- `core.py` → Contains the main functions (`greet`, `add`, `multiply`)  
- `__init__.py` → Marks `mylib/` as an importable package  
- `setup.py` → Defines the package metadata and build instructions
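
The notebook later calls `mylib.greet(...)` directly even though the functions live in `core.py`, which only works if `__init__.py` re-exports them. The repo's actual files are not shown here, so the following is a sketch under that assumption. It builds a throwaway package named `demo_pkg` (a hypothetical name) in a temporary directory with the same layout:

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"  # hypothetical package name, mirrors the mylib/ folder
pkg.mkdir()

# core.py holds the implementation.
(pkg / "core.py").write_text(
    "def greet(name):\n"
    "    return f'Hello, {name}! Welcome to Databricks.'\n"
    "\n"
    "def add(a, b):\n"
    "    return a + b\n"
)

# __init__.py re-exports the functions so callers can write demo_pkg.greet(...).
(pkg / "__init__.py").write_text("from .core import add, greet\n")

sys.path.append(str(root))
importlib.invalidate_caches()

import demo_pkg

greeting = demo_pkg.greet("Anand")
total = demo_pkg.add(5, 7)
print(greeting, total)
```

Without the re-export line, callers would need `from demo_pkg.core import greet` instead.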

---

### 🚀 Steps to Build and Use

1️⃣ **On Your Local Machine**
- Navigate into the `mylib_package/` folder.
- Run:
    ```bash
    pip install --upgrade setuptools wheel
    python setup.py bdist_wheel
    ```
- This creates a `.whl` file under `dist/`:
    ```
    dist/mylib-0.1-py3-none-any.whl
    ```
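
The wheel filename encodes the package name, version, and compatibility tags, following the wheel naming convention from PEP 427. Here `py3` means any Python 3 interpreter, `none` means no ABI requirement, and `any` means platform independent. A small sketch of reading those parts back out, using a hypothetical helper `parse_wheel_filename` written only for illustration:

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its components (hypothetical helper)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts normally; six when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_filename("mylib-0.1-py3-none-any.whl")
print(info)
```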

---

2️⃣ **Upload to Databricks**
- Use the Databricks CLI or UI to upload:
    ```
    dist/mylib-0.1-py3-none-any.whl → dbfs:/tmp/mylib-0.1-py3-none-any.whl
    ```

---

3️⃣ **Install in Notebook**
- In a Databricks notebook, run:
    ```python
    %pip install /dbfs/tmp/mylib-0.1-py3-none-any.whl
    ```

---

4️⃣ **Restart the Python Kernel**
- After installing, restart the notebook kernel:
    ```python
    %restart_python
    ```

---

5️⃣ **Import and Use**
- In the notebook, import and test the library:
    ```python
    import mylib

    print(mylib.greet("Anand"))
    print(f"5 + 7 = {mylib.add(5, 7)}")
    print(f"3 * 4 = {mylib.multiply(3, 4)}")
    ```

✅ This ensures your custom code runs as a proper, installable Python package inside Databricks.

---

### 💡 Notes

- To make the library **available cluster-wide**, you can upload the `.whl` through **Workspace → Libraries → Install New → Upload** and attach it at the cluster level.
- For versioning, update the `version=` field in `setup.py` before rebuilding.

In [0]:
%pip install /dbfs/tmp/mylib-0.1-py3-none-any.whl
%restart_python

Processing /dbfs/tmp/mylib-0.1-py3-none-any.whl
mylib is already installed with the same version as the provided wheel. Use --force-reinstall to force an installation of the wheel.
Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages.


In [0]:
import mylib

print(mylib.greet("Anand"))
print(f"5 + 7 = {mylib.add(5, 7)}")
print(f"3 * 4 = {mylib.multiply(3, 4)}")

Hello, Anand! Welcome to Databricks.
5 + 7 = 12
3 * 4 = 12
