
# Databricks Widgets: Interactive Notebook Parameters

Databricks widgets let you add input controls to notebooks for dynamic, parameterized workflows.  
Common types include **Text**, **Dropdown**, **Combobox**, and **Multi-select**.

Create widgets with the `dbutils.widgets` API:

```python
dbutils.widgets.text("input_text", "default_value", "Input Text")
dbutils.widgets.dropdown("color", "red", ["red", "green", "blue"], "Choose Color")
dbutils.widgets.multiselect("fruits", "apple", ["apple", "banana", "cherry"], "Select Fruits")
```

Get widget values in Python:

```python
selected_color: str = dbutils.widgets.get("color")
```

Or reference widget values in SQL as named parameter markers (`:name`). Leave the marker unquoted; inside quotes it is treated as a literal string:

```sql
SELECT * FROM my_catalog.my_schema.my_table WHERE color = :color
```

Remove widgets with `dbutils.widgets.remove("input_text")` or `dbutils.widgets.removeAll()`.  
See the [Databricks Widgets Documentation](https://docs.databricks.com/en/notebooks/widgets.html) for details.
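
A multiselect widget returns its selections as one comma-separated string, so the value usually needs splitting before use. A minimal sketch, assuming a hypothetical helper named `parse_multiselect` (not part of `dbutils`) and choices that contain no commas themselves:

```python
def parse_multiselect(raw_value: str) -> list[str]:
    """Split a comma-separated widget value into a list of selections."""
    # An empty string means nothing is selected; avoid returning [""].
    return [item.strip() for item in raw_value.split(",") if item.strip()]


# In a notebook the argument would come from dbutils.widgets.get("fruits").
selected_fruits = parse_multiselect("apple,cherry")
print(selected_fruits)
```

Filtering out empty items keeps an empty selection from turning into a list with one blank entry.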

In [0]:
from typing import Final

# Widget for volume type: managed or external
dbutils.widgets.dropdown(
    name="volume_type",
    defaultValue="managed",
    choices=["managed", "external"],
    label="Volume Type",
)

# Widget for catalog name (Unity Catalog) as dropdown based on existing catalogs
catalogs_df = spark.sql("SHOW CATALOGS")
catalog_names: list[str] = [row.catalog for row in catalogs_df.collect()]
dbutils.widgets.dropdown(
    name="catalog_name",
    defaultValue=catalog_names[0] if catalog_names else "",
    choices=catalog_names if catalog_names else [""],
    label="Catalog Name",
)

# Widget for schema name
dbutils.widgets.text(
    name="schema_name",
    defaultValue="",
    label="Schema Name",
)

# Widget for volume name
dbutils.widgets.text(
    name="volume_name",
    defaultValue="",
    label="Volume Name",
)

# Widget for external location (only needed for external volumes)
dbutils.widgets.text(
    name="external_location",
    defaultValue="",
    label="External Location (required for external volume)",
)

In [0]:
VOLUME_TYPE: Final[str] = dbutils.widgets.get("volume_type")
CATALOG_NAME: Final[str] = dbutils.widgets.get("catalog_name")
SCHEMA_NAME: Final[str] = dbutils.widgets.get("schema_name")
VOLUME_NAME: Final[str] = dbutils.widgets.get("volume_name")
EXTERNAL_LOCATION: Final[str] = dbutils.widgets.get("external_location")

# Create Unity Catalog Volume (If Not Exists)

The next cell creates a Unity Catalog volume in the selected catalog and schema,
using the widget values loaded above. Volumes govern access to non-tabular data
(files, logs, checkpoints, etc.) in cloud storage with centralized governance.

- **Managed volumes**: Databricks manages the storage location.
- **External volumes**: You specify the cloud storage path.

Because the statement uses `IF NOT EXISTS`, an existing volume is left unchanged.  
Refer to [Databricks Volumes Documentation](https://docs.databricks.com/en/volumes/index.html) for details.
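
Widget values are interpolated directly into the SQL statement, so it may be worth validating them first. The sketch below is one possible approach, not the notebook's own method: `quote_identifier` and `build_create_volume_sql` are hypothetical helpers, and the rule that names contain only letters, digits, and underscores is an assumption that would reject otherwise valid names such as those containing hyphens.

```python
import re

# Assumption: only plain identifiers (letters, digits, underscores) are accepted.
_IDENTIFIER_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_identifier(name: str) -> str:
    """Validate a name and wrap it in backticks for use in SQL."""
    if not _IDENTIFIER_PATTERN.fullmatch(name):
        raise ValueError(f"Invalid identifier: {name!r}")
    return f"`{name}`"


def build_create_volume_sql(
    catalog: str,
    schema: str,
    volume: str,
    volume_type: str = "managed",
    external_location: str = "",
) -> str:
    """Build a CREATE VOLUME statement from validated inputs."""
    identifier = ".".join(quote_identifier(part) for part in (catalog, schema, volume))
    if volume_type == "managed":
        return f"CREATE VOLUME IF NOT EXISTS {identifier}"
    if volume_type != "external":
        raise ValueError(f"Unknown volume type: {volume_type!r}")
    # External volumes need a storage path; reject quotes that would end the literal.
    if not external_location or "'" in external_location:
        raise ValueError("A valid external location is required.")
    return f"CREATE EXTERNAL VOLUME IF NOT EXISTS {identifier} LOCATION '{external_location}'"


print(build_create_volume_sql("main", "default", "raw_files"))
```

The returned string can then be passed to `spark.sql(...)` in place of a hand-built f-string.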

In [0]:
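# Create volume only if it does not exist, using loaded arguments
# The text widgets default to "", which would produce an invalid identifier.
if not (CATALOG_NAME and SCHEMA_NAME and VOLUME_NAME):
    raise ValueError("Catalog, schema, and volume names must all be set.")

volume_identifier: str = f"{CATALOG_NAME}.{SCHEMA_NAME}.{VOLUME_NAME}"

print(
    f"Preparing to create volume: {volume_identifier}\n"
    f"Type: {VOLUME_TYPE}"
)

sql_statement: str
if VOLUME_TYPE == "external":
    # An external volume cannot be created without a storage path.
    if not EXTERNAL_LOCATION:
        raise ValueError("External Location is required for an external volume.")
    print(f"External location provided: {EXTERNAL_LOCATION}")
    sql_statement = (
        f"CREATE EXTERNAL VOLUME IF NOT EXISTS {volume_identifier} "
        f"LOCATION '{EXTERNAL_LOCATION}'"
    )
else:
    sql_statement = f"CREATE VOLUME IF NOT EXISTS {volume_identifier}"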

print(f"Executing SQL statement:\n'{sql_statement}'")
spark.sql(sql_statement)
print("✅ Volume creation command executed.")

# Exporting Table Data to XML

To export data from a Unity Catalog table to XML:

1. **Read the table into a DataFrame**  
   Use Spark to load your table:  
   ```python
   data_frame = spark.table("catalog_name.schema_name.table_name")
   ```

2. **Write the DataFrame as XML**  
   Use the XML data source:  
   ```python
   data_frame.write \
       .format("xml") \
       .options(rootTag="trips", rowTag="trip") \
       .mode("overwrite") \
       .save("/Volumes/catalog_name/schema_name/volume_name/output.xml")
   ```

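- `rootTag` names the root element and `rowTag` names the element that wraps each row; adjust both for your data.
- Spark writes the output as a directory at the given path containing one or more part files, not as a single XML file, even when the path ends in `.xml`.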

See [Spark XML Documentation](https://spark.apache.org/docs/latest/sql-data-sources-xml.html) for details.
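
To illustrate what `rootTag` and `rowTag` control, the sketch below builds the same document shape with Python's standard library. It does not use Spark, and the two rows are hypothetical stand-ins for DataFrame records:

```python
import xml.etree.ElementTree as ET

# Hypothetical rows standing in for DataFrame records.
rows = [
    {"trip_distance": "1.4", "fare_amount": "8.0"},
    {"trip_distance": "0.9", "fare_amount": "6.5"},
]

root = ET.Element("trips")  # plays the role of rootTag
for row in rows:
    trip = ET.SubElement(root, "trip")  # plays the role of rowTag
    for column, value in row.items():
        # Each column becomes a child element of the row element.
        ET.SubElement(trip, column).text = value

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Each row becomes one `<trip>` element nested under a single `<trips>` root, with one child element per column.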

In [0]:
from typing import Final

output_filename: Final[str] = "nyctaxi_trips.xml"

print("Selecting data from samples.nyctaxi.trips table...")
nyctaxi_trips_df: Final = spark.sql("SELECT * FROM samples.nyctaxi.trips")
print("✅ Data selected.")

volume_path: Final[str] = f"/Volumes/{CATALOG_NAME}/{SCHEMA_NAME}/{VOLUME_NAME}/{output_filename}"
print(f"Writing XML data to volume: {volume_path} ...")
nyctaxi_trips_df.write \
    .format("xml") \
    .options(rootTag="trips", rowTag="trip") \
    .mode("overwrite") \
    .save(volume_path)
print("✅ XML export complete.")