# DBFS

The `dbutils.fs` commands below are used to interact with the Databricks File System (DBFS) in Azure Databricks.

---
| **Command**                                      | **Description**                                            | **Example**                                                        |
| ------------------------------------------------ | ---------------------------------------------------------- | ------------------------------------------------------------------ |
| `dbutils.fs.ls(path)`                            | **Lists files and directories at the given path.**    | `dbutils.fs.ls("/mnt/raw-data/")`                                  |
| `dbutils.fs.cp(src, dst)`                        | **Copies a file from `src` to `dst`.**                         | `dbutils.fs.cp("/mnt/src/file.txt", "/mnt/dest/file.txt")`         |
| `dbutils.fs.cp(src, dst, recurse=True)`          | **Recursively copies a folder.** Copies the folder itself, all of its files, and all of its subfolders and their contents, however deeply nested. | `dbutils.fs.cp("/mnt/src/", "/mnt/dest/", recurse=True)`           |
| `dbutils.fs.mv(src, dst)`                        | **Moves or renames a file or directory.**                      | `dbutils.fs.mv("/mnt/file1.txt", "/mnt/file2.txt")`                |
| `dbutils.fs.rm(path)`                            | **Deletes a file.**                                            | `dbutils.fs.rm("/mnt/file.txt")`                                   |
| `dbutils.fs.rm(path, recurse=True)`              | **Deletes a directory and its contents recursively.** If the path is a directory, it is deleted along with all of its subdirectories and files. | `dbutils.fs.rm("/mnt/folder/", recurse=True)`                      |
| `dbutils.fs.mkdirs(path)`                        | **Creates the directory structure specified.**                 | `dbutils.fs.mkdirs("/mnt/new-folder/")`                            |
| `dbutils.fs.put(path, contents)`                 | **Creates a new file at path and writes text contents to it.** | `dbutils.fs.put("/mnt/sample.txt", "Hello World!")`                |
| `dbutils.fs.put(path, contents, overwrite=True)` | **Overwrites the file if it exists.**                          | `dbutils.fs.put("/mnt/sample.txt", "New content", overwrite=True)` |
| `dbutils.fs.head(path)`                          | **Returns the beginning of the file as a string (up to 65536 bytes by default).** | `dbutils.fs.head("/mnt/sample.txt")`                               |
| `dbutils.fs.mounts()`                            | **Lists all mounted storage containers.**                      | `dbutils.fs.mounts()`                                              |
| `dbutils.fs.unmount(path)`                       | **Unmounts the given mount point.**                            | `dbutils.fs.unmount("/mnt/raw-data")`                              |
| `dbutils.secrets.get(scope, key)`                | **Gets the value of a secret from a secret scope.**            | `dbutils.secrets.get(scope="<SCOPE_NAME>", key="<KEY_NAME>")`      |
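
`dbutils.fs.ls(path)` lists a single level only. As a minimal sketch, a recursive walk built on top of it could look like the following. `walk_files`, `FakeEntry`, `FAKE_TREE`, and `fake_ls` are hypothetical names introduced here for illustration, and the sketch assumes that each entry returned by `dbutils.fs.ls` exposes `.path` and `.isDir()`. The fake listing only exists so the example runs outside Databricks.

```python
from typing import Callable, Iterable, Iterator, NamedTuple


def walk_files(ls: Callable[[str], Iterable], path: str) -> Iterator[str]:
    """Yield the path of every file under `path`, descending into subdirectories.

    `ls` is any callable shaped like dbutils.fs.ls: it takes a path and returns
    entries that expose `.path` and `.isDir()` (an assumption of this sketch).
    """
    for entry in ls(path):
        if entry.isDir():
            yield from walk_files(ls, entry.path)
        else:
            yield entry.path


class FakeEntry(NamedTuple):
    """Hypothetical stand-in for the entries returned by dbutils.fs.ls."""
    path: str
    is_dir: bool

    def isDir(self) -> bool:
        return self.is_dir


# Hypothetical directory tree used only for this demonstration.
FAKE_TREE = {
    "/mnt/raw-data/": [
        FakeEntry("/mnt/raw-data/a.csv", False),
        FakeEntry("/mnt/raw-data/2024/", True),
    ],
    "/mnt/raw-data/2024/": [FakeEntry("/mnt/raw-data/2024/b.csv", False)],
}


def fake_ls(path: str):
    return FAKE_TREE[path]


files = list(walk_files(fake_ls, "/mnt/raw-data/"))
print(files)  # ['/mnt/raw-data/a.csv', '/mnt/raw-data/2024/b.csv']
```

In a notebook the call would be `walk_files(dbutils.fs.ls, "/mnt/raw-data/")`.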

## DBFS Secrets and Mount Points

### How to create a Databricks backed secret scope:
- https://www.youtube.com/watch?v=vsJvriTpMYU&list=PLzU8IF5r8skukp9lOHgChlOESJLjGNBww&index=5

### How to access Databricks secret scopes
- https://www.youtube.com/watch?v=PtJCLWbP2EU&list=PLzU8IF5r8skukp9lOHgChlOESJLjGNBww&index=14

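### Bash command 
Run these with the Databricks CLI, replacing `<scope_name>` with the name of your secret scope. The first command creates the scope; the other two store the `username` and `password` secrets in it.

- `databricks secrets create-scope <scope_name>`
- `databricks secrets put-secret <scope_name> username`
- `databricks secrets put-secret <scope_name> password`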

### Python command 
```python
# Storage account details
storage_account_name = "mystorageaccount"
container_name = "mycontainer"
mount_point = "/mnt/mydata"

# Read the storage account key from Databricks secrets
storage_key = dbutils.secrets.get(scope="adls-creds", key="adls-key")

# Build the config dictionary
configs = {
    f"fs.azure.account.key.{storage_account_name}.dfs.core.windows.net": storage_key
}

# Source path
source_uri = f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/"

# Mount if not already mounted
if not any(mount.mountPoint == mount_point for mount in dbutils.fs.mounts()):
    dbutils.fs.mount(
        source=source_uri,
        mount_point=mount_point,
        extra_configs=configs,
    )
    print(f"Mounted {source_uri} to {mount_point}")
else:
    print(f"{mount_point} is already mounted.")
```

# 📘 Databricks Widgets 

Widgets in **Databricks** allow you to **parameterize notebooks**. They create interactive input controls that you can use to dynamically set values when running notebooks.

---

## 🔹 Types of Widgets

## 1. Text Widget
- Used for string inputs.
```python

dbutils.widgets.text("param1", "default_value", "Parameter 1")

```

## 2. Dropdown Widget
- User selects from a fixed list of values.
```python

dbutils.widgets.dropdown("param2", "A", ["A", "B", "C"], "Parameter 2")

```

## 3. Combobox Widget
- Like dropdown, but allows users to type custom values too.
```python

dbutils.widgets.combobox("param3", "X", ["X", "Y", "Z"], "Parameter 3")

```

## 4. Multiselect Widget
- User can choose multiple values.
```python

dbutils.widgets.multiselect("param4", "X", ["X", "Y", "Z"], "Parameter 4")

```

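## 📥 Accessing Widget Values
- `dbutils.widgets.get()` returns the current value of a widget as a string. A multiselect widget returns its selections as one comma-separated string.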
```python

param_value = dbutils.widgets.get("param1")

```
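
Because a multiselect widget hands back a single comma-separated string, it usually needs splitting before use. The helper below is a small sketch of that step; `parse_multiselect` is a hypothetical name, and the approach assumes none of the choice values contains a comma.

```python
from typing import List


def parse_multiselect(raw: str) -> List[str]:
    """Split the comma-separated string of a multiselect widget into a list."""
    return [item.strip() for item in raw.split(",") if item.strip()]


# In a notebook: selected = parse_multiselect(dbutils.widgets.get("param4"))
selected = parse_multiselect("X,Z")
print(selected)  # ['X', 'Z']
```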

## 🧹 Removing Widgets
```python

dbutils.widgets.removeAll()

```

## Passing widget values between notebooks

### CHILD NOTEBOOK
```python

# Declare widgets (same names used in parent arguments)
dbutils.widgets.text("state", "", "State")
dbutils.widgets.text("file_format", "", "File Format")

# Get widget values
state = dbutils.widgets.get("state")
file_format = dbutils.widgets.get("file_format")

# Do something with them
print(f"✅ Received in child -> State: {state}, File Format: {file_format}")

# Return value back to parent (optional)
dbutils.notebook.exit(f"Received state={state}, file_format={file_format}")

```


### PARENT NOTEBOOK
```python

# Example metadata table
metadata_df = spark.read.table("your_catalog.your_schema.metadata_table")

# Get parameters for a specific pipeline
pipeline_name = "pipeline_1"
row = metadata_df.filter(f"pipeline_name = '{pipeline_name}'").first()
if row is None:
    raise ValueError(f"No metadata found for pipeline '{pipeline_name}'")

# Extract values (notebook arguments are passed as strings)
state = str(row["state"])
file_format = str(row["file_format"])

# Call child notebook with values
result = dbutils.notebook.run(
    "/Workspace/ChildNotebook",
    timeout_seconds=60,
    arguments={"state": state, "file_format": file_format},
)

print(f"✅ Parent got response: {result}")

```


### Widget examples from a notebook run

Create one widget of each type:

    dbutils.widgets.text("text", "default_value", "Text Widget Example")
    dbutils.widgets.dropdown("DropDown", "A", ["A", "B", "C"], "DropDown Widget Example")
    dbutils.widgets.combobox("ComboBox", "X", ["X", "Y", "Z"], "ComboBox Widget Example")
    dbutils.widgets.multiselect("MultiSelect", "X", ["X", "Y", "Z"], "MultiSelect Widget Example")

Read the values back:

    # Widget names are case-sensitive
    print("Text widget value :", dbutils.widgets.get("text"))
    print("DropDown widget value :", dbutils.widgets.get("DropDown"))
    print("Combobox widget value :", dbutils.widgets.get("ComboBox"))
    print("multiselect widget value :", dbutils.widgets.get("MultiSelect"))

Output:

    Text widget value : default_value
    DropDown widget value : A
    Combobox widget value : X
    multiselect widget value : X

Use a widget value in a SQL query:

    # Get the value of the text widget
    selected_table = dbutils.widgets.get("text")

    # Use it in SQL
    spark.sql(f"SELECT '{selected_table}' AS col1").show()

Output:

    +-------------+
    |         col1|
    +-------------+
    |default_value|
    +-------------+
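
Widget values arrive as free text, so a value that is interpolated into a SQL string as a table name is worth checking first. The following is only a sketch under stated assumptions: `safe_table_name` is a hypothetical helper, and it accepts one to three dot-separated parts made of letters, digits, and underscores, which is stricter than what Spark itself allows.

```python
import re

# One to three dot-separated identifiers, e.g. catalog.schema.table
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){0,2}")


def safe_table_name(value: str) -> str:
    """Return `value` unchanged if it looks like a plain table name, else raise."""
    if not _TABLE_NAME.fullmatch(value):
        raise ValueError(f"Not a valid table name: {value!r}")
    return value


# In a notebook: table = safe_table_name(dbutils.widgets.get("text"))
table = safe_table_name("your_catalog.your_schema.metadata_table")
query = f"SELECT * FROM {table}"
print(query)  # SELECT * FROM your_catalog.your_schema.metadata_table
```

The resulting string can then be passed to `spark.sql(query)`.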



Create a dropdown widget from a SQL cell:

    %sql
    CREATE WIDGET DROPDOWN state DEFAULT "CA" CHOICES SELECT * FROM (VALUES ("CA"), ("IL"), ("MI"), ("NY"), ("OR"), ("VA"))

Reference the widget in a later SQL cell:

    %sql
    SELECT :state

Note: in the author's Community Edition account, `SELECT :state` failed with an `AnalysisException` (the traceback was truncated in the notebook export), and the cause was not determined. One possible explanation is that the `:name` parameter syntax for widgets needs a newer Databricks Runtime than the one used; older runtimes reference widgets in SQL as `${state}`.

# Databricks `dbutils.notebook` Command Guide

The `dbutils.notebook` utilities in Databricks are used to call one notebook from another, allowing modular pipeline design and parameter passing between notebooks.

---

## 🧩 1. `dbutils.notebook.run()`
- Runs a child notebook and optionally passes parameters. Execution is **synchronous**.

```python

dbutils.notebook.run(notebook_path: str, timeout_seconds: int, arguments: Dict[str, str]) -> str

```
---
## Parameters

| Parameter         | Type   | Description                                                         |
| ----------------- | ------ | ------------------------------------------------------------------- |
| `notebook_path`   | string | Path to the child notebook (relative or absolute).                  |
| `timeout_seconds` | int    | Maximum wait time in seconds for the notebook to finish execution.  |
| `arguments`       | dict   | Dictionary of string key-value pairs to pass to the child notebook. |
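
Since `dbutils.notebook.run()` raises an exception when the child notebook fails or exceeds `timeout_seconds`, a common pattern is to wrap the call in a retry loop. The sketch below shows one way to do that. `run_with_retry` and `flaky_run` are hypothetical names, and the `run` callable is injected so that the example can be executed outside Databricks; in a notebook you would pass `dbutils.notebook.run`.

```python
from typing import Callable, Dict


def run_with_retry(
    run: Callable[[str, int, Dict[str, str]], str],
    notebook_path: str,
    timeout_seconds: int,
    arguments: Dict[str, str],
    max_retries: int = 3,
) -> str:
    """Call `run`, retrying up to `max_retries` times after a failure."""
    attempt = 0
    while True:
        try:
            return run(notebook_path, timeout_seconds, arguments)
        except Exception:
            if attempt >= max_retries:
                raise  # out of retries: surface the last error to the caller
            attempt += 1
            print(f"Attempt {attempt} failed, retrying")


# Hypothetical stand-in for dbutils.notebook.run: fails twice, then succeeds.
calls = {"count": 0}


def flaky_run(path: str, timeout: int, args: Dict[str, str]) -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = run_with_retry(flaky_run, "/Workspace/ChildNotebook", 60, {"state": "CA"})
print(result, calls["count"])  # ok 3
```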

---
## 🧩  2. %run Command in Databricks
- `%run` includes and executes another notebook inline in the current notebook's context. Unlike `dbutils.notebook.run()`, it is meant for reusing functions, variables, and shared code rather than for orchestrating separate notebook runs.

### Example
- helper_notebook (Path: /Shared/helper_notebook)
```python

# Define a reusable function
def greet_user(name):
    return f"Hello, {name} 👋"

# Define a shared variable
default_name = "Ayush"
```
- main_notebook

```python

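# Cell 1: include the helper_notebook.
# Note: %run must be in a cell by itself; these comments are only for this guide.
%run /Shared/helper_notebook

# Cell 2: greet_user and default_name are now available directly
print(greet_user(default_name))  # Output: Hello, Ayush 👋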

```

| Feature                | `%run`                    | `dbutils.notebook.run()`             |
| ---------------------- | ------------------------- | ------------------------------------ |
| Purpose                | Share variables/functions | Run a notebook like a subprocess     |
| Parameter Passing      | ⚠️ Widget values only, e.g. `%run /Shared/helper_notebook $name="value"` | ✅ Yes, via `arguments`               |
| Return Value           | ❌ None                    | ✅ Return string from child           |
| Variable Scope Sharing | ✅ Shared with caller      | ❌ Isolated between caller and callee |
| Execution Type         | Inline execution          | Separate, blocking call              |


---
## 🧩 3. dbutils.notebook.exit()
- Used in the child notebook to stop execution and, optionally, return a string result to the parent notebook.
```python
# Signature: dbutils.notebook.exit(value: str)

# Example 1: return a status message
result_message = "Completed loading data successfully."
dbutils.notebook.exit(result_message)

# Example 2: pass a parameter to the child and read its return value

# Code in the parent notebook
response = dbutils.notebook.run("/Users/ayush/child_job", 120, {"param1": "val1"})
print("Child notebook returned:", response)

# Code in the child notebook
dbutils.widgets.text("param1", "")
param1_val = dbutils.widgets.get("param1")
dbutils.notebook.exit(f"Received param1 = {param1_val}")
```
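
`dbutils.notebook.exit()` accepts a single string, so returning several values usually means serializing them. A minimal sketch using JSON follows; `build_exit_value` and `parse_exit_value` are hypothetical helper names, and `rows_loaded=1250` is made-up sample data.

```python
import json
from typing import Any, Dict


def build_exit_value(status: str, **details: Any) -> str:
    """Serialize a result dictionary into the single string that exit() accepts."""
    return json.dumps({"status": status, **details})


def parse_exit_value(raw: str) -> Dict[str, Any]:
    """Decode the string returned by dbutils.notebook.run() in the parent."""
    return json.loads(raw)


# Child notebook:  dbutils.notebook.exit(build_exit_value("ok", rows_loaded=1250))
# Parent notebook: result = parse_exit_value(dbutils.notebook.run(...))
payload = build_exit_value("ok", rows_loaded=1250)
result = parse_exit_value(payload)
print(result["status"], result["rows_loaded"])  # ok 1250
```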


