# Cluster Policies & Instance Pools
## Databricks Zero to Hero - Part 21

**Objective:** Learn how to manage compute resources effectively using **Cluster Policies** to enforce governance and cost controls, and **Instance Pools** to reduce cluster start-up times.

### Topics Covered:
1.  **Cluster Policies:** Creating custom policies to restrict user access to compute configurations (e.g., enforcing auto-termination, restricting instance types).
2.  **Enforcing Policies:** How to update policies and remediate non-compliant clusters.
3.  **Instance Pools:** Understanding "Warm" vs "Cold" instances to speed up job execution.

## 1. Cluster Policies

Cluster policies let administrators limit which compute configurations users are allowed to create. This is crucial for:
*   **Cost Control:** Prevent users from creating massive, expensive clusters.
*   **Governance:** Enforce tags, runtime versions, or auto-termination rules.

### Use Case Scenario
We want to create a **"Custom Shared Compute"** policy with the following strict rules:
1.  **Fixed Auto-Termination:** 10 minutes (Users cannot change this).
2.  **Fixed Size:** Exactly 1 Worker (No Autoscaling).
3.  **Fixed Runtime:** Specific Databricks Runtime (e.g., 14.3 LTS).
4.  **Restricted Node Types:** Users can only select specific instance types (e.g., Standard_DS4_v2).

### Policy JSON Structure
Policies are defined in JSON. The configuration below, taken from the video, implements the rules above. It also pins one hidden Spark configuration entry and applies the same instance-type allowlist to the driver node.

In [None]:
// Paste the JSON object below into the Databricks Policy Editor (Compute -> Policies -> Create/Edit).
// Leave these comment lines out: standard JSON has no comment syntax.

{
  "spark_conf.spark.databricks.cluster.profile": {
    "type": "fixed",
    "value": "serverless",
    "hidden": true
  },
  "autotermination_minutes": {
    "type": "fixed",
    "value": 10,
    "hidden": false
  },
  "num_workers": {
    "type": "fixed",
    "value": 1,
    "hidden": false
  },
  "autoscale.min_workers": {
    "type": "forbidden"
  },
  "autoscale.max_workers": {
    "type": "forbidden"
  },
  "spark_version": {
    "type": "fixed",
    "value": "14.3.x-scala2.12",
    "hidden": false
  },
  "node_type_id": {
    "type": "allowlist",
    "values": [
      "Standard_DS4_v2",
      "Standard_DS3_v2"
    ],
    "defaultValue": "Standard_DS4_v2"
  },
  "driver_node_type_id": {
    "type": "allowlist",
    "values": [
      "Standard_DS4_v2",
      "Standard_DS3_v2"
    ],
    "defaultValue": "Standard_DS4_v2"
  }
}
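
The same policy can also be created programmatically rather than pasted into the UI. The following is a minimal sketch, assuming the Databricks SDK for Python is installed and authentication is already configured; the helper name `create_policy` is hypothetical, and only a subset of the rules is shown to keep it short. The policy definition is passed as a JSON string, so the dict is serialized first.

```python
import json

# Subset of the policy above, written as a Python dict (note False, not false).
policy = {
    "autotermination_minutes": {"type": "fixed", "value": 10, "hidden": False},
    "num_workers": {"type": "fixed", "value": 1, "hidden": False},
    "autoscale.min_workers": {"type": "forbidden"},
    "autoscale.max_workers": {"type": "forbidden"},
}

# The policy definition is sent as a JSON string, not as a dict.
definition = json.dumps(policy, indent=2)


def create_policy(name: str, definition: str):
    """Hypothetical helper: create the policy in the workspace.

    Requires the databricks-sdk package and working authentication.
    """
    from databricks.sdk import WorkspaceClient  # imported lazily on purpose

    w = WorkspaceClient()
    return w.cluster_policies.create(name=name, definition=definition)


# Example call (not executed here):
# create_policy("Custom Shared Compute", definition)
```

Keeping the policy in code like this makes it easy to review changes in version control before they reach the workspace.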

### Key Policy Attributes Explained:
*   `"type": "fixed"`: The value is locked. The user sees it (unless it is hidden) but cannot change it.
*   `"type": "forbidden"`: The attribute cannot be set at all. Here it blocks autoscaling, so the option is unavailable to the user.
*   `"type": "allowlist"`: The value must be one of the listed `values`, shown as a dropdown. `defaultValue` preselects one of them.
*   `"hidden": true`: The setting is not shown in the UI. It is typically combined with `fixed`.
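
To make the three rule types concrete, here is a small sketch of how a requested cluster configuration could be checked against them. It is an illustration only, not how Databricks evaluates policies internally. The function name `check_cluster` and the flat, dotted-key config format are assumptions made for this example.

```python
# Trimmed copy of the policy above (only the rules needed for the demo).
POLICY = {
    "autotermination_minutes": {"type": "fixed", "value": 10},
    "num_workers": {"type": "fixed", "value": 1},
    "autoscale.min_workers": {"type": "forbidden"},
    "autoscale.max_workers": {"type": "forbidden"},
    "spark_version": {"type": "fixed", "value": "14.3.x-scala2.12"},
    "node_type_id": {
        "type": "allowlist",
        "values": ["Standard_DS4_v2", "Standard_DS3_v2"],
    },
}


def check_cluster(config: dict, policy: dict) -> list:
    """Return human-readable violations; an empty list means compliant.

    Attributes missing from `config` are not reported, on the assumption
    that the policy would supply fixed or default values for them.
    """
    violations = []
    for attr, rule in policy.items():
        if attr not in config:
            continue
        value = config[attr]
        if rule["type"] == "forbidden":
            violations.append(f"{attr}: not allowed by policy")
        elif rule["type"] == "fixed" and value != rule["value"]:
            violations.append(f"{attr}: must be {rule['value']!r}, got {value!r}")
        elif rule["type"] == "allowlist" and value not in rule["values"]:
            violations.append(f"{attr}: {value!r} is not in the allowlist")
    return violations


good = {"num_workers": 1, "node_type_id": "Standard_DS3_v2"}
bad = {
    "autotermination_minutes": 60,
    "autoscale.max_workers": 8,
    "node_type_id": "Standard_E64s_v3",
}

print(check_cluster(good, POLICY))
for problem in check_cluster(bad, POLICY):
    print(problem)
```

The `bad` request breaks one rule of each type: a changed fixed value, a forbidden autoscaling setting, and an instance type outside the allowlist.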

### Policy Enforcement & Compliance
If you edit an existing policy (e.g., change the fixed `spark_version` to 15.4 LTS), clusters that were created under the old rules no longer match the policy and are flagged as **Non-Compliant**.
*   **Action:** In the Policies UI, you can click **"Fix"** or **"Fix All"** to automatically update the existing clusters to match the new policy rules (this requires a restart of the clusters).
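
The idea behind "non-compliant" and "Fix" can be sketched in a few lines. This is a conceptual model only, assuming that compliance is judged on `fixed` values; the function names `drifted_fields` and `apply_fix` are hypothetical, and the version key `15.4.x-scala2.12` is assumed to be the runtime key for 15.4 LTS.

```python
def drifted_fields(cluster: dict, policy: dict) -> dict:
    """Map each 'fixed' attribute that differs to (current, required)."""
    return {
        attr: (cluster.get(attr), rule["value"])
        for attr, rule in policy.items()
        if rule.get("type") == "fixed" and cluster.get(attr) != rule["value"]
    }


def apply_fix(cluster: dict, policy: dict) -> dict:
    """Return a copy of the cluster config with every 'fixed' value enforced."""
    fixed = {a: r["value"] for a, r in policy.items() if r.get("type") == "fixed"}
    return {**cluster, **fixed}


old_policy = {
    "spark_version": {"type": "fixed", "value": "14.3.x-scala2.12"},
    "autotermination_minutes": {"type": "fixed", "value": 10},
}
# The administrator edits the policy: only the runtime version changes.
new_policy = {
    **old_policy,
    "spark_version": {"type": "fixed", "value": "15.4.x-scala2.12"},
}

cluster = {
    "cluster_name": "demo",
    "spark_version": "14.3.x-scala2.12",
    "autotermination_minutes": 10,
    "num_workers": 1,
}

print(drifted_fields(cluster, old_policy))   # compliant under the old policy
print(drifted_fields(cluster, new_policy))   # flagged after the edit
print(drifted_fields(apply_fix(cluster, new_policy), new_policy))
```

The cluster itself never changed; it became non-compliant because the policy moved. In the real workspace the new values only take effect once the cluster restarts.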

## 2. Instance Pools

An Instance Pool is a set of idle, ready-to-use instances managed by Databricks. Because a cluster attached to a pool does not have to wait for the cloud provider to provision new VMs, pools can reduce cluster start-up time from minutes to seconds.

### Concepts:
1.  **Min Idle (Warm Pool):** The minimum number of instances the pool keeps running *even when no jobs are using them*.
    *   *Benefit:* Immediate availability.
    *   *Cost:* You pay for these instances while they are idle (Databricks does not charge DBUs for idle pool instances, but the cloud provider's VM cost still applies).
2.  **Max Capacity:** The hard limit on the number of instances the pool can provision.
3.  **Idle Instance Auto Termination:** How long an instance beyond the Min Idle count can sit unused in the pool after a cluster releases it before the pool terminates it.
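
A little arithmetic shows how these settings interact. The sketch below is a simplified model, not the actual pool scheduler: `allocate` and `idle_vm_cost` are hypothetical helpers, and the hourly VM price is a made-up figure used only for illustration.

```python
def allocate(requested: int, idle: int, in_use: int, max_capacity: int) -> dict:
    """Split a request into warm (already idle) and cold (new) instances."""
    warm = min(requested, idle)
    # New instances can only be provisioned up to the pool's hard limit.
    headroom = max(max_capacity - (idle + in_use), 0)
    cold = min(requested - warm, headroom)
    return {"warm": warm, "cold": cold, "unfulfilled": requested - warm - cold}


def idle_vm_cost(min_idle: int, hourly_vm_price: float, hours: float) -> float:
    """Cloud VM cost of keeping `min_idle` instances warm (no DBUs while idle)."""
    return min_idle * hourly_vm_price * hours


# A driver plus one worker, with one warm instance waiting in the pool:
print(allocate(requested=2, idle=1, in_use=0, max_capacity=10))

# A request larger than the pool can ever hold:
print(allocate(requested=12, idle=1, in_use=0, max_capacity=10))

# One warm instance for a 30-day month at a hypothetical 0.50 per hour:
print(idle_vm_cost(min_idle=1, hourly_vm_price=0.50, hours=24 * 30))
```

With Min Idle set to 1, only one node of a two-node cluster starts warm; the other still has to be provisioned. Raising Min Idle removes that wait at the price of more idle VM hours.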

### Example Configuration (from Video)
*   **Pool Name:** `Demo Pool`
*   **Min Idle:** `1` (Keeps 1 node running 24/7 - enables "Warm" starts)
*   **Max Capacity:** `10`
*   **Idle Auto Termination:** `10 minutes`
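
For reference, the same pool could be created with the Databricks SDK for Python. This is a sketch under assumptions: the SDK is installed and authenticated, the helper names are hypothetical, and the node type `Standard_DS3_v2` is a guess because the example configuration does not state one.

```python
def demo_pool_spec(node_type_id: str = "Standard_DS3_v2") -> dict:
    """Keyword arguments for the pool described above.

    node_type_id is an assumption; the example configuration does not name one.
    """
    return {
        "instance_pool_name": "Demo Pool",
        "node_type_id": node_type_id,
        "min_idle_instances": 1,
        "max_capacity": 10,
        "idle_instance_autotermination_minutes": 10,
    }


def create_demo_pool():
    """Hypothetical helper: create the pool (requires databricks-sdk and auth)."""
    from databricks.sdk import WorkspaceClient  # imported lazily on purpose

    w = WorkspaceClient()
    return w.instance_pools.create(**demo_pool_spec())


print(demo_pool_spec())
# Example call (not executed here): create_demo_pool()
```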

### How to use?
When creating a Cluster (Compute), instead of selecting a "Worker Type" (like Standard_DS3_v2), you select the **Instance Pool** you created. The cluster will grab resources from that pool.
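
In API terms, the difference shows up in the cluster specification: a pool-backed cluster carries an `instance_pool_id` and leaves out `node_type_id`, because the instance type comes from the pool. The sketch below builds such a payload; the helper name, the cluster name, and the pool ID are placeholders, while the field names follow the Clusters API.

```python
def pooled_cluster_spec(instance_pool_id: str) -> dict:
    """Build a cluster spec that draws its nodes from an instance pool."""
    return {
        "cluster_name": "pool-backed-demo",      # placeholder name
        "spark_version": "14.3.x-scala2.12",
        "instance_pool_id": instance_pool_id,    # replaces node_type_id
        "num_workers": 1,
        "autotermination_minutes": 10,
    }


# "your-pool-id" is a placeholder; use the ID shown for your pool.
spec = pooled_cluster_spec("your-pool-id")
print(spec)
```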

In [None]:
# Practical Tip: Using Python/API to list Instance Pools (Optional)
# This requires the Databricks SDK or proper authentication setup.

try:
    from databricks.sdk import WorkspaceClient
    w = WorkspaceClient()

    print("Listing available Instance Pools:")
    for pool in w.instance_pools.list():
        print(f"Pool Name: {pool.instance_pool_name}, ID: {pool.instance_pool_id}")
        print(f"  - Min Idle: {pool.min_idle_instances}")
        print(f"  - Max Capacity: {pool.max_capacity}")

except ImportError:
    print("Databricks SDK not installed. You can manage pools via the 'Compute' -> 'Pools' tab in the UI.")
except Exception as e:
    print(f"Note: This code block requires API configuration. Error: {e}")

## Summary

| Feature | Primary Goal | Key Action |
| :--- | :--- | :--- |
| **Cluster Policies** | Governance & Cost Control | Restrict settings via JSON (Fixed, Forbidden, Allowlist). |
| **Instance Pools** | Performance (Startup Time) | Maintain "Warm" instances (`Min Idle`) for rapid deployment. |

**Next Steps:** In the next session, we will start working with **Databricks Workflows (Jobs)** to orchestrate our pipelines.