# Set Up an Azure Databricks Workspace

**Objective:** Create a production-grade Azure Databricks environment. Instead of letting Azure Databricks create and manage the network automatically (the default), we will use **VNet Injection**: deploying the Databricks workspace into our own custom Azure Virtual Network (VNet) for better security and network control.

---

## 1. Prerequisites
Before starting, ensure you have:
1.  **Azure Account:** A valid account with access to the Azure Portal.
2.  **Subscription:** A "Pay-As-You-Go" subscription.
    *   *Note: Free Trial subscriptions may have quota limits that prevent cluster creation.*
3.  **Permissions:** You must have at least **Contributor** access to the subscription/resource group.

---

## 2. Understanding Costs & Plans
Databricks pricing consists of two components:
1.  **Databricks Units (DBUs):** A normalized measure of the processing capability consumed while your compute is running. This is the charge for the Databricks software.
2.  **Cloud Infrastructure:** The cost of Virtual Machines (VMs), Storage, and Networking charged by Azure.
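
As a rough illustration of how the two components add up, the sketch below estimates the cost of one hour of cluster runtime. The function name, the DBU-per-node figure, and both rates are made-up placeholders, not real Azure prices; look up the actual values for your region and VM size on the Azure pricing page.

```python
def estimate_hourly_cost(nodes, dbu_per_node_hour, dbu_rate, vm_rate):
    """Return (dbu_cost, vm_cost, total) for one hour of cluster runtime.

    All inputs are hypothetical; rates must use the same currency.
    """
    dbu_cost = nodes * dbu_per_node_hour * dbu_rate  # Databricks component
    vm_cost = nodes * vm_rate                        # Azure VM component
    return dbu_cost, vm_cost, dbu_cost + vm_cost


# Placeholder numbers for illustration only
dbu_cost, vm_cost, total = estimate_hourly_cost(
    nodes=3, dbu_per_node_hour=0.75, dbu_rate=0.40, vm_rate=0.20
)
print(f"DBU: {dbu_cost:.2f}  VM: {vm_cost:.2f}  Total: {total:.2f}")
```

Storage and networking charges are left out here to keep the sketch short; during a Premium trial the DBU component would be zero while the VM component still applies.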

### Pricing Tiers
*   **Standard:** Basic functionality.
*   **Premium:** Includes advanced security (Role Based Access Control), Unity Catalog, and Delta Live Tables. **(Recommended for this course)**.
    *   *Tip: New accounts often get a 14-day free trial of Premium DBUs (you still pay for Azure VMs).*

## 3. Network Architecture (VNet Injection)

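We will create a custom network topology. Databricks requires two dedicated subnets, and every cluster node takes one IP address from each:
1.  **Public Subnet** (host subnet): Supplies an IP address to each cluster node's host VM, which communicates with the Databricks Control Plane (Web App).
2.  **Private Subnet** (container subnet): Supplies an IP address to the Databricks Runtime container on each cluster node, used for communication inside the cluster.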

### CIDR Strategy
We will use a small network range for this demo.

| Resource | Name | CIDR Range | IP Count |
| :--- | :--- | :--- | :--- |
| **Virtual Network** | `adb-vnet` | `10.0.1.0/24` | 256 IPs |
| **Public Subnet** | `public-subnet` | `10.0.1.0/25` | 128 IPs |
| **Private Subnet** | `private-subnet` | `10.0.1.128/25` | 128 IPs |
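
The plan in the table can be sanity-checked with Python's standard `ipaddress` module before anything is created in Azure. This is a minimal sketch: it confirms that both subnets sit inside the VNet range and do not overlap, and it subtracts the 5 addresses Azure reserves in every subnet to show how many are actually usable. The constant name `AZURE_RESERVED_PER_SUBNET` is our own label for that figure.

```python
import ipaddress

vnet = ipaddress.ip_network("10.0.1.0/24")
subnets = {
    "public-subnet": ipaddress.ip_network("10.0.1.0/25"),
    "private-subnet": ipaddress.ip_network("10.0.1.128/25"),
}

# Azure reserves 5 addresses in every subnet
AZURE_RESERVED_PER_SUBNET = 5

for name, net in subnets.items():
    # Each subnet must fall entirely inside the VNet address space
    assert net.subnet_of(vnet), f"{name} is outside the VNet range"
    usable = net.num_addresses - AZURE_RESERVED_PER_SUBNET
    print(f"{name}: {net} -> {net.num_addresses} addresses, {usable} usable")

# The two subnets must not share any addresses
public, private = subnets.values()
assert not public.overlaps(private), "subnets overlap"
```

Because each cluster node needs one address in each subnet, the usable count per subnet is also a rough upper bound on the total number of nodes running in the workspace at once.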

## 4. Step-by-Step Setup Guide (UI Method)

Follow these steps in the [Azure Portal](https://portal.azure.com):

### Step A: Create Resource Group
1.  Search for **Resource groups**.
2.  Click **Create**.
3.  Name: `self-adb-rg`.
4.  Region: `Central India` (or your preferred region).

### Step B: Create Virtual Network
1.  Search for **Virtual networks**.
2.  Click **Create**.
3.  Resource Group: `self-adb-rg`.
4.  Name: `adb-vnet`.
5.  Go to **IP Addresses** tab:
    *   Edit default IPv4 address space to: `10.0.1.0/24`.
    *   Remove default subnet.
    *   **Add Subnet 1:** Name `public-subnet`, Range `10.0.1.0/25`.
    *   **Add Subnet 2:** Name `private-subnet`, Range `10.0.1.128/25`.
6.  Review & Create.

### Step C: Create Databricks Workspace
1.  Search for **Azure Databricks**.
2.  Click **Create**.
3.  Resource Group: `self-adb-rg`.
4.  Workspace Name: `self-adb`.
5.  Region: `Central India` (Must match VNet region).
6.  **Pricing Tier:** `Premium`, or the 14-day Premium trial option if it is offered for your account.
7.  **Networking Tab (Crucial):**
    *   Deploy with Secure Cluster Connectivity (No Public IP): **Yes**.
    *   Deploy workspace in your own Virtual Network: **Yes**.
    *   Virtual Network: Select `adb-vnet`.
    *   Public Subnet Name: `public-subnet`.
    *   Public Subnet CIDR: `10.0.1.0/25`.
    *   Private Subnet Name: `private-subnet`.
    *   Private Subnet CIDR: `10.0.1.128/25`.
8.  Review & Create.

```python
# Automation Script (Azure CLI)
# If the Azure CLI is installed, this Python script automates the network setup.
# Note: Log in with `az login` before running.

import subprocess

# Configuration variables
RG_NAME = "self-adb-rg"
LOCATION = "centralindia"
VNET_NAME = "adb-vnet"
WORKSPACE_NAME = "self-adb"

# Dry run by default: commands are only printed.
# Set to True to execute them once the Azure CLI is configured.
EXECUTE = False

# Azure CLI commands, in the order they must run
commands = [
    # 1. Create Resource Group
    f"az group create --name {RG_NAME} --location {LOCATION}",

    # 2. Create VNet with Address Prefix
    f"az network vnet create --resource-group {RG_NAME} --name {VNET_NAME} --address-prefix 10.0.1.0/24 --location {LOCATION}",

    # 3. Create Public Subnet
    f"az network vnet subnet create --resource-group {RG_NAME} --vnet-name {VNET_NAME} --name public-subnet --address-prefix 10.0.1.0/25",

    # 4. Create Private Subnet
    f"az network vnet subnet create --resource-group {RG_NAME} --vnet-name {VNET_NAME} --name private-subnet --address-prefix 10.0.1.128/25",
]

print("--- Azure Setup Commands ---")
for cmd in commands:
    print(f"{'Executing' if EXECUTE else 'Would run'}: {cmd}")
    if EXECUTE:
        # check=True stops the script at the first failing command
        subprocess.run(cmd, shell=True, check=True)

# 5. Create Databricks Workspace (VNet Injected)
# Note: This step is more complex because the subnets need network delegation.
# For simplicity, the UI method is preferred for beginners.
print(f"Next: create the '{WORKSPACE_NAME}' workspace in the Azure Portal UI, "
      "which handles network delegation automatically.")
```
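
For readers who still want to script the workspace step, the sketch below only builds and prints a candidate `az databricks workspace create` command; it does not run anything. It assumes the `databricks` Azure CLI extension and its `--vnet`, `--public-subnet`, `--private-subnet`, and `--enable-no-public-ip` options, so confirm them with `az databricks workspace create --help` for your CLI version. It also assumes both subnets have already been delegated to `Microsoft.Databricks/workspaces`, the step the Portal otherwise handles; the CLI may additionally expect a network security group on each subnet, which is worth verifying. The helper name `build_workspace_command` is hypothetical.

```python
import shlex


def build_workspace_command(rg, name, location, vnet, public_subnet, private_subnet):
    """Return the argument list for a VNet-injected workspace (not executed).

    Flag names are assumptions to verify against `az databricks workspace create --help`.
    """
    return [
        "az", "databricks", "workspace", "create",
        "--resource-group", rg,
        "--name", name,
        "--location", location,
        "--sku", "premium",
        "--vnet", vnet,
        "--public-subnet", public_subnet,
        "--private-subnet", private_subnet,
        "--enable-no-public-ip",  # Secure Cluster Connectivity (No Public IP)
    ]


args = build_workspace_command(
    rg="self-adb-rg",
    name="self-adb",
    location="centralindia",
    vnet="adb-vnet",
    public_subnet="public-subnet",
    private_subnet="private-subnet",
)
# Print the command for review instead of running it
print(shlex.join(args))
```

Building the command as a list keeps each value as a separate argument, so it can later be passed to `subprocess.run(args, check=True)` without shell quoting concerns.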

## 5. Launch and Verify

1.  Once the deployment is complete (it takes a few minutes), go to the resource.
2.  Click **Launch Workspace**.
3.  You will be redirected to your per-workspace URL, which has the form `https://adb-<workspace-id>.<random-number>.azuredatabricks.net`.
4.  **Verification:** Check the top right corner. You should see your user email.
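
If you want to record the workspace ID for later use, it can be pulled out of the URL in the browser's address bar. The sketch below is one way to do that with the standard library; the function name `parse_workspace_url` and the sample URL are invented for illustration, and the pattern simply follows the `adb-<workspace-id>.<...>.azuredatabricks.net` shape from step 3.

```python
import re
from urllib.parse import urlparse

# Host pattern: adb-<numeric workspace id>.<one label>.azuredatabricks.net
WORKSPACE_HOST = re.compile(
    r"^adb-(?P<workspace_id>\d+)\.(?P<suffix>[a-z0-9-]+)\.azuredatabricks\.net$"
)


def parse_workspace_url(url):
    """Return the workspace ID from a per-workspace URL, or None if it does not match."""
    host = urlparse(url).hostname or ""
    match = WORKSPACE_HOST.match(host)
    return match.group("workspace_id") if match else None


# Sample URL with a made-up workspace ID
print(parse_workspace_url("https://adb-1234567890123456.7.azuredatabricks.net/?o=1234567890123456"))
print(parse_workspace_url("https://example.com"))
```

The first call prints the numeric ID and the second prints `None`, which makes the function usable as a quick check that a pasted URL really points at a workspace.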

## 6. Databricks Account Console
While the workspace is where you work, the **Account Console** is where you manage multiple workspaces.
*   URL: `https://accounts.azuredatabricks.net`
*   If your user has account admin rights, you should see your newly created workspace listed here.

## Next Steps
In the next session, we will perform a **Walkthrough of the Workspace UI** to understand the different sections (Compute, Data, Workflows) and configure our user settings.