# Secret Management in Databricks
## Azure Key Vault & Databricks Backed Scopes

**Objective:** Learn how to secure sensitive information like passwords, connection strings, and keys in Databricks using **Secret Scopes**, avoiding plain-text credentials in notebooks.

### What is a Secret Scope?
A **Secret Scope** is a secure logical container in Databricks where secrets (key-value pairs) are stored.
There are two types of Secret Scopes:
1.  **Azure Key Vault (AKV) Backed:** Secrets are managed in Azure Key Vault and referenced by Databricks. The scope is read-only from the Databricks side, so secrets are added and rotated in Key Vault itself.
2.  **Databricks Backed:** Secrets are stored in an encrypted database managed internally by Databricks.

## 1. Azure Key Vault (AKV) Backed Scope
This is the recommended approach for Azure Databricks users as it centralizes secret management.

### Setup Steps (Azure Portal):
1.  **Create Key Vault:** Go to Azure Portal -> Search "Key Vault" -> Create.
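2.  **Access Policies/RBAC:** (the steps below apply when the vault uses the Azure RBAC permission model)
    *   Go to **Access control (IAM)** in your Key Vault.
    *   Add Role Assignment -> **Key Vault Secrets User**. This read-only role is enough for Databricks to get and list secrets; **Key Vault Administrator** grants far more than is needed.
    *   Assign access to the **AzureDatabricks** application (Service Principal).
3.  **Networking:** If the vault restricts network access, ensure "Allow trusted Microsoft services to bypass this firewall" is checked in Networking settings.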
4.  **Create Secret:** Go to **Secrets** -> Generate/Import -> Name: `db-password`, Value: `YourSecretPassword`.

### Link Databricks to Key Vault:
1.  Copy the **DNS Name** (Vault URI) and **Resource ID** from the Key Vault Properties tab.
2.  Open the scope creation page directly by URL in your Databricks workspace (the path is case-sensitive, so keep the capital `S` in `createScope`):
    *   `https://<your-databricks-instance>#secrets/createScope`
3.  **Scope Name:** e.g., `akv-scope`
4.  **DNS Name:** Paste the Vault URI.
5.  **Resource ID:** Paste the Resource ID.
6.  Click **Create**.
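
A mistyped Vault URI or Resource ID is an easy way for scope creation to fail. The following is a minimal sketch of a sanity check to run before pasting the two values into the form. The function name `check_key_vault_inputs` and both regular expressions are illustrative assumptions based on the usual shape of these values in the public Azure cloud; sovereign clouds use different vault domains.

```python
import re
from typing import List

# Assumed shapes (public Azure cloud only); adjust for sovereign clouds.
VAULT_URI_RE = re.compile(
    r"^https://(?P<name>[A-Za-z0-9-]{3,24})\.vault\.azure\.net/?$"
)
RESOURCE_ID_RE = re.compile(
    r"^/subscriptions/[0-9a-f-]{36}"
    r"/resourceGroups/[^/]+"
    r"/providers/Microsoft\.KeyVault/vaults/(?P<name>[A-Za-z0-9-]{3,24})$",
    re.IGNORECASE,
)


def check_key_vault_inputs(dns_name: str, resource_id: str) -> List[str]:
    """Return a list of problems; an empty list means both values look plausible."""
    problems = []
    uri_match = VAULT_URI_RE.match(dns_name.strip())
    id_match = RESOURCE_ID_RE.match(resource_id.strip())
    if uri_match is None:
        problems.append("DNS Name should look like https://<vault-name>.vault.azure.net/")
    if id_match is None:
        problems.append(
            "Resource ID should look like /subscriptions/<id>/resourceGroups/<rg>"
            "/providers/Microsoft.KeyVault/vaults/<vault-name>"
        )
    # Only compare vault names when both values parsed successfully.
    if uri_match and id_match:
        if uri_match.group("name").lower() != id_match.group("name").lower():
            problems.append("DNS Name and Resource ID refer to different vault names")
    return problems


# Made-up example values, not a real vault.
problems = check_key_vault_inputs(
    "https://my-demo-vault.vault.azure.net/",
    "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/demo-rg"
    "/providers/Microsoft.KeyVault/vaults/my-demo-vault",
)
print(problems or "Both values look well-formed.")
```

The check only validates format; it cannot confirm that the vault exists or that Databricks has been granted access to it.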

In [None]:
# Step 1: List all available scopes to verify creation
# Note: dbutils.secrets.listScopes() returns a list of scopes configured in the workspace.

scopes = dbutils.secrets.listScopes()

print("Available Secret Scopes:")
for scope in scopes:
    print(f"- {scope.name}")

In [None]:
# Step 2: List secrets inside the Azure Key Vault backed scope
# Note: This might throw an error even if the scope exists, for example when the
# AzureDatabricks application lacks 'List' permission on the Key Vault secrets
# or your user has no READ permission on the scope.

scope_name = "akv-scope" # Replace with your scope name created in step above

try:
    secrets = dbutils.secrets.list(scope_name)
    print(f"Secrets in '{scope_name}':")
    for secret in secrets:
        print(f"- Key: {secret.key}")
except Exception as e:
    print(f"Error listing secrets: {e}")

In [None]:
# Step 3: Accessing the Secret
# The value will be printed as [REDACTED] in the notebook output for security.

secret_key = "db-password" # The name of the secret created in Azure Key Vault

my_secret = dbutils.secrets.get(scope=scope_name, key=secret_key)

print(f"The value of the secret is: {my_secret}")

# Verification: Use simple logic to check if it retrieved something
if my_secret:
    print("Secret retrieved successfully (but hidden).")
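
Steps 2 and 3 above can be combined into a pre-flight check that reports which expected keys are absent before any `get` call fails. This is a sketch only: `find_missing_keys` and the `FakeSecrets` stand-in are hypothetical names introduced here so the example runs outside Databricks. In a notebook you would pass `dbutils.secrets` as the first argument.

```python
from collections import namedtuple


def find_missing_keys(secrets, scope, required_keys):
    """Return the required keys that are not present in the given scope.

    `secrets` is any object with a `list(scope)` method whose items expose
    a `.key` attribute, which is how `dbutils.secrets` is used above.
    """
    available = {item.key for item in secrets.list(scope)}
    return [key for key in required_keys if key not in available]


# Hypothetical stand-in for dbutils.secrets, used only to make the sketch runnable.
SecretMetadata = namedtuple("SecretMetadata", ["key"])


class FakeSecrets:
    def __init__(self, scopes):
        self._scopes = scopes

    def list(self, scope):
        return [SecretMetadata(key) for key in self._scopes[scope]]


fake = FakeSecrets({"akv-scope": ["db-password"]})
missing = find_missing_keys(fake, "akv-scope", ["db-password", "db-user"])
print(missing)  # ['db-user']
```

In a notebook the equivalent call would be `find_missing_keys(dbutils.secrets, scope_name, ["db-password"])`.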

## 2. Databricks Backed Scope
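If you don't have an Azure Key Vault, you can store secrets directly in Databricks. Databricks-backed scopes cannot be created from the workspace UI; **the steps below use the Databricks CLI** (the Secrets REST API is an alternative).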

### CLI Setup & Commands:
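1.  **Install CLI:** `winget install Databricks.DatabricksCLI` (Windows) or `brew tap databricks/tap && brew install databricks` (macOS/Linux). Note that the `pip` package `databricks-cli` installs the legacy CLI, which uses a different command syntax from the one shown here.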
2.  **Authenticate:**
    ```bash
    databricks auth login --host <workspace-url>
    ```
3.  **Create Scope:**
    ```bash
    databricks secrets create-scope <scope-name>
    # Example: databricks secrets create-scope db-scope
    ```
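4.  **Add Secret:**
    ```bash
    databricks secrets put-secret <scope-name> <key-name> --string-value <secret-value>
    # Example: databricks secrets put-secret db-scope db-host --string-value "xyz.database.windows.net"
    # For real passwords, omit --string-value so the CLI prompts for the value
    # and it does not end up in your shell history.
    ```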
5.  **List Scopes:** `databricks secrets list-scopes`
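
The same create-scope and put-secret steps can also be scripted from Python. The sketch below assumes a client that behaves like `WorkspaceClient` from the Databricks SDK for Python, specifically its `secrets.list_scopes`, `secrets.create_scope`, and `secrets.put_secret` methods. The helper name `provision_secret` is hypothetical, and the SDK is not a requirement stated in this notebook.

```python
def provision_secret(client, scope, key, value):
    """Create `scope` if it does not exist yet, then store `value` under `key`.

    `client` is assumed to expose `secrets.list_scopes()`,
    `secrets.create_scope(scope=...)` and
    `secrets.put_secret(scope=..., key=..., string_value=...)`,
    as `databricks.sdk.WorkspaceClient` is expected to.
    """
    existing = {s.name for s in client.secrets.list_scopes()}
    if scope not in existing:
        client.secrets.create_scope(scope=scope)
    client.secrets.put_secret(scope=scope, key=key, string_value=value)


# Usage sketch (requires `pip install databricks-sdk` and configured authentication):
# import getpass
# from databricks.sdk import WorkspaceClient
# provision_secret(WorkspaceClient(), "db-scope", "db-host", getpass.getpass("Secret value: "))
```

Checking for the scope first makes the helper safe to re-run, and reading the value with `getpass` keeps it out of the script and the terminal history.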

In [None]:
# Accessing secrets from the Databricks Backed Scope
# Assuming you ran the CLI commands mentioned above to create 'db-scope' and 'db-host'

db_scope_name = "db-scope"
db_key_name = "db-host"

try:
    host_secret = dbutils.secrets.get(scope=db_scope_name, key=db_key_name)
    print(f"Retrieved secret from Databricks backed scope: {host_secret}")
except Exception as e:
    print(f"Could not retrieve secret. Ensure the scope '{db_scope_name}' was created via CLI. Details: {e}")

## 3. Practical Use Case: JDBC Connections
A common use case is building database connections without exposing passwords in notebook code.

In [None]:
# Example: Connecting to a Database using Secrets

# Retrieve credentials securely
jdbc_hostname = dbutils.secrets.get(scope="db-scope", key="db-host")
jdbc_username = "admin_user"
jdbc_password = dbutils.secrets.get(scope="akv-scope", key="db-password")
jdbc_database = "employees_db"
jdbc_port = 1433

# Construct the JDBC URL from the retrieved hostname
jdbc_url = f"jdbc:sqlserver://{jdbc_hostname}:{jdbc_port};database={jdbc_database}"

connection_properties = {
    "user": jdbc_username,
    "password": jdbc_password, # The driver receives the real value; only notebook output is redacted
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}

# The hostname came from a secret, so it should appear as [REDACTED] in the printed URL
print(f"Connecting to: {jdbc_url}")
print("Connection properties prepared with the password taken from the secret scope.")

# spark.read.jdbc(url=jdbc_url, table="table_name", properties=connection_properties)
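
To reuse this pattern across notebooks, the URL and properties can be built by a small function that receives the secret lookup as a parameter. This is a sketch under assumptions: `build_jdbc_config` is a hypothetical helper, the scope and key names are the ones used in this notebook, and the lookup is injected so the function can be tried without a Databricks runtime.

```python
def build_jdbc_config(get_secret, user, database, port=1433):
    """Return (url, properties) for a SQL Server JDBC read.

    `get_secret(scope, key)` is injected; in a notebook it would wrap
    `dbutils.secrets.get`.
    """
    host = get_secret("db-scope", "db-host")
    password = get_secret("akv-scope", "db-password")
    url = f"jdbc:sqlserver://{host}:{port};database={database}"
    properties = {
        "user": user,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }
    return url, properties


# Made-up values standing in for the secret scopes.
fake_store = {
    ("db-scope", "db-host"): "xyz.database.windows.net",
    ("akv-scope", "db-password"): "not-a-real-password",
}
url, props = build_jdbc_config(
    lambda scope, key: fake_store[(scope, key)], "admin_user", "employees_db"
)
print(url)  # jdbc:sqlserver://xyz.database.windows.net:1433;database=employees_db
```

In a notebook, the first argument would be `lambda scope, key: dbutils.secrets.get(scope=scope, key=key)`, and the result can be passed to `spark.read.jdbc(url=url, table="table_name", properties=props)`.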

## 4. Helper Utilities
Databricks provides a help command to explore available secret utilities.

In [None]:
# Explore all secret commands
dbutils.secrets.help()

## Summary
*   **Never hardcode** passwords in notebooks.
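*   Use **Azure Key Vault** backed scopes for enterprise-level management.
*   Use **Databricks Backed** scopes for quick, internal secret storage using the CLI.
*   Secret values read with `dbutils.secrets.get()` are automatically **redacted** (shown as `[REDACTED]`) in notebook outputs. Redaction only matches the literal value, so it is not a substitute for restricting who can read a scope.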