# User Management in Databricks
## Users, Groups & Service Principals

**Objective:**
In this session, we will learn how to manage access and identities within the Databricks Data Intelligence Platform. We will cover the creation of Users, Groups, and Service Principals, and understand the hierarchy of permissions from the Account level down to the Workspace level.

### Key Concepts
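1.  **Account Console:** The central management pane for all Databricks workspaces in an account. Identities and Unity Catalog are managed at this level.
2.  **Workspace Admin vs. Account Admin:** Account Admins manage identities, workspaces, and Unity Catalog metastores across the whole account; Workspace Admins manage a single workspace.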
3.  **Service Principals:** Non-human "robot" accounts used for automation (Jobs, CI/CD).
4.  **SCIM (System for Cross-domain Identity Management):** Auto-provisioning users and groups from an Identity Provider (such as Microsoft Entra ID).

### Prerequisites
To follow along with the administrative steps, you need:
*   **Cloud Admin Access** (e.g., a Microsoft Entra ID role, formerly Azure Active Directory, that is allowed to create users).
*   **Databricks Account Admin Access**.
*   **Workspace Admin Access**.

## 1. Creating Users (Microsoft Entra ID)

Databricks on Azure relies on Microsoft Entra ID (formerly Azure AD) for authentication.

**Steps to create a user:**
1.  Navigate to **Azure Portal** -> **Microsoft Entra ID**.
2.  Select **Users** -> **New User** -> **Create new user**.
3.  **Principal Name:** e.g., `da` (becomes `da@domain.com`).
4.  **Display Name:** e.g., `Data Analyst`.
5.  **Password:** Set an initial password and make sure **Account enabled** is checked.
6.  Click **Review + Create**.

> **Note:** In a corporate environment, this is usually handled by IT/Admin teams. You would typically sync these users to Databricks automatically using SCIM.

## 2. Managing Users in Databricks Account Console

Once a user exists in the identity provider (Microsoft Entra ID), they need to be added to the Databricks Account.

**Manual Method:**
1.  Go to **Manage Account** (accounts.azuredatabricks.net).
2.  Navigate to **User Management** -> **Users** tab.
3.  Click **Add User** and enter the email/UPN (User Principal Name) from Azure.
4.  *Role Assignment:* You can make them an **Account Admin** here, but typically for standard users (Analysts/Engineers), you leave roles blank.

**Automated Method (SCIM):**
*   Go to **Settings** -> **User Provisioning**.
*   Configure SCIM to automatically sync users and groups from Entra ID to Databricks.
*   *Reference:* [Configure SCIM provisioning using Microsoft Entra ID](https://learn.microsoft.com/en-us/azure/databricks/admin/users-groups/scim/aad)
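
Under the hood, SCIM provisioning sends standard SCIM 2.0 resources to the account. The sketch below only builds (and does not send) the kind of "create user" request a provisioning client might issue. The `ACCOUNT_ID` placeholder and the helper name `build_scim_user_request` are illustrative assumptions, and the endpoint path should be checked against the current Account SCIM API documentation before use.

```python
import json

# Hypothetical placeholder; substitute your real Databricks account ID.
ACCOUNT_ID = "<account-id>"
ACCOUNTS_HOST = "https://accounts.azuredatabricks.net"


def build_scim_user_request(user_name: str, display_name: str) -> dict:
    """Return the method, URL, and JSON body for a SCIM 'create user' call.

    Nothing is sent over the network; this only describes the request.
    """
    return {
        "method": "POST",
        # Assumed account-level SCIM endpoint; verify against the API docs.
        "url": f"{ACCOUNTS_HOST}/api/2.0/accounts/{ACCOUNT_ID}/scim/v2/Users",
        "body": {
            # Core SCIM 2.0 user schema (RFC 7643).
            "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
            "userName": user_name,
            "displayName": display_name,
        },
    }


request = build_scim_user_request("da@domain.com", "Data Analyst")
print(json.dumps(request, indent=2))
```

In practice the Entra ID provisioning connector generates these calls for you; the point of the sketch is only to show what "syncing a user" amounts to.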

## 3. The Power of Groups

**Best Practice:** Avoid assigning permissions to individual users. Always create Groups.

**Scenario:**
We have a Data Analyst team. Instead of adding `User A`, `User B`, etc., to a workspace one by one:
1.  Go to **User Management** -> **Groups**.
2.  Create a group: `da_grp` (Data Analyst Group).
3.  Add the `Data Analyst` user to this group.
4.  In the future, any new analyst just needs to be added to this **Group**, and they inherit all permissions automatically.
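
To make the inheritance idea concrete, here is a toy in-memory model (not a Databricks API) in which grants are attached only to groups. The user emails, the grant string format, and the helper `effective_grants` are all made up for illustration.

```python
# Toy model: permissions are granted to groups, never directly to users.
group_members = {"da_grp": {"user_a@domain.com", "user_b@domain.com"}}
group_grants = {"da_grp": {"workspace:self-adb:USER"}}


def effective_grants(user: str) -> set:
    """Return the union of the grants of every group the user belongs to."""
    grants = set()
    for group, members in group_members.items():
        if user in members:
            grants |= group_grants.get(group, set())
    return grants


# Onboarding a new analyst is a single membership change;
# no per-user permission has to be touched.
group_members["da_grp"].add("user_c@domain.com")
print(effective_grants("user_c@domain.com"))
```

Offboarding works the same way in reverse: removing the user from `da_grp` removes every grant they held through it.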

## 4. Granting Workspace Access

Adding a user to the Account Console **does not** give them access to a Workspace (e.g., `adb-xxxx`). You must explicitly assign them.

**Steps:**
1.  In Account Console, go to **Workspaces**.
2.  Select your workspace (e.g., `self-adb`).
3.  Click **Permissions** -> **Add Permissions**.
4.  Search for the Group `da_grp` (not the individual user).
5.  Assign Role: **User** (or Admin if required).
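
The same assignment can, in principle, be scripted. The following sketch only constructs the request that the account-level workspace assignment API appears to expect; it does not send anything. The IDs are hypothetical placeholders, `build_workspace_assignment` is an illustrative helper, and the endpoint path and body shape are assumptions to verify against the official API reference.

```python
import json

ACCOUNTS_HOST = "https://accounts.azuredatabricks.net"


def build_workspace_assignment(account_id: str, workspace_id: str,
                               principal_id: str, role: str = "USER") -> dict:
    """Describe (but do not send) a workspace permission assignment request."""
    if role not in {"USER", "ADMIN"}:
        raise ValueError(f"unsupported role: {role}")
    # Assumed endpoint layout; check the Account API docs before relying on it.
    url = (
        f"{ACCOUNTS_HOST}/api/2.0/accounts/{account_id}"
        f"/workspaces/{workspace_id}/permissionassignments/principals/{principal_id}"
    )
    return {"method": "PUT", "url": url, "body": {"permissions": [role]}}


# Hypothetical IDs: principal_id stands for the numeric ID of the group da_grp.
assignment = build_workspace_assignment("<account-id>", "1234567890", "987654321")
print(json.dumps(assignment, indent=2))
```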

**Entitlements (Personas):**
You can control *what* the user can do inside the workspace:
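*   **Workspace Access:** Access to the Data Science & Engineering and Machine Learning areas of the workspace (notebooks, jobs, clusters).
*   **Databricks SQL Access:** Access to SQL Warehouses, Queries, and Dashboards.
*   **Allow Cluster Creation:** Lets the user create unrestricted clusters (usually limited to Admins/Lead Engineers).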

*Example:* A Data Analyst might need **Databricks SQL Access** but does NOT need **Cluster Creation** rights.
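
As a rough sketch, entitlements can also be expressed as a SCIM PatchOp body applied to a group. The mapping from UI labels to API values below (`workspace-access`, `databricks-sql-access`, `allow-cluster-create`) reflects the values I believe the workspace SCIM API uses; treat them, and the helper `build_entitlement_patch`, as assumptions to confirm in the API reference.

```python
import json

# Assumed mapping from the UI labels used above to SCIM entitlement values.
ENTITLEMENT_VALUES = {
    "Workspace Access": "workspace-access",
    "Databricks SQL Access": "databricks-sql-access",
    "Allow Cluster Creation": "allow-cluster-create",
}


def build_entitlement_patch(entitlements: list) -> dict:
    """Build a SCIM PatchOp body that adds the given entitlements."""
    values = [{"value": ENTITLEMENT_VALUES[name]} for name in entitlements]
    return {
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
        "Operations": [{"op": "add", "path": "entitlements", "value": values}],
    }


# A Data Analyst group gets SQL access only, with no cluster creation.
analyst_patch = build_entitlement_patch(["Databricks SQL Access"])
print(json.dumps(analyst_patch, indent=2))
```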

```python
# Check the current user context.
# You can verify which user is running the current notebook using Spark SQL or DBUtils.
# `spark` and `dbutils` are predefined in Databricks notebooks.

# Method 1: Spark SQL
print("Current User (Spark):")
spark.sql("SELECT current_user()").show(truncate=False)  # avoid cutting off long emails

# Method 2: DBUtils (relies on the internal notebook-context API)
current_user = dbutils.notebook.entry_point.getDbutils().notebook().getContext().tags().get("user").get()
print(f"Current User (DBUtils): {current_user}")
```

## 5. Service Principals (Robot Accounts)

**Service Principals (SP)** are identities used by software, pipelines, or automated jobs, rather than humans.

### Why use Service Principals?
1.  **Production Stability:** If a human user (e.g., `John Doe`) leaves the company, their account is deleted, and any jobs running as `John Doe` will fail. SPs persist.
2.  **Security:** An SP can be granted only the permissions its job actually needs (least privilege), independent of any person's access.

### Creating a Microsoft Entra ID Managed Service Principal:
1.  **Azure Portal** -> **Microsoft Entra ID** -> **App registrations** -> **New registration**.
2.  Name: `self-azure-sp`.
3.  Note down: **Application (Client) ID** and **Directory (Tenant) ID**.
4.  **Certificates & secrets** -> Create a **New client secret** (copy the secret value immediately; it is shown only once).
5.  **Databricks Account Console** -> **User Management** -> **Service Principals**.
6.  **Add Service Principal** -> Select **Microsoft Entra ID Managed**.
7.  Input the Application ID and Name.
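
The Application (Client) ID, Directory (Tenant) ID, and client secret noted above are what an automated client would use to obtain a token. Below is a minimal sketch of the OAuth client-credentials request against the Microsoft identity platform; it builds the URL and form body but does not send them. The helper name and the placeholder credentials are hypothetical, and the scope uses the well-known Azure Databricks resource ID.

```python
from urllib.parse import urlencode

# Well-known application ID of the Azure Databricks resource.
AZURE_DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return (url, form_body) for a client-credentials token request.

    Nothing is sent; POST form_body to url with
    Content-Type: application/x-www-form-urlencoded to get a token.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form_body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": f"{AZURE_DATABRICKS_RESOURCE_ID}/.default",
    })
    return url, form_body


# Placeholder values only; never hard-code a real secret in a notebook.
url, body = build_token_request("<tenant-id>", "<client-id>", "<client-secret>")
print(url)
```

In a real pipeline the secret would come from a secret scope or key vault rather than from source code.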

### Assigning SP to Workspace:
Just like a human user, the SP needs to be added to the Workspace permissions to function.

## 6. Running Jobs as Service Principal

When scheduling a Workflow (Job):
1.  Go to **Workflows**.
2.  Select your Job (e.g., `dlt_pipeline_trigger`).
3.  Look for **"Run as"**.
4.  Change the user from your personal account to the **Service Principal**. To select it, you need the **Service Principal User** role on that SP.

*Outcome:* The job now runs independently of your personal identity.
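
The same change can be expressed as a Jobs API update. The sketch below builds the payload only; the job ID and application ID are hypothetical placeholders, `build_run_as_update` is an illustrative helper, and the payload shape assumes the Jobs API 2.1 `run_as` setting with a `service_principal_name` field.

```python
import json


def build_run_as_update(job_id: int, application_id: str) -> dict:
    """Build a partial job update that switches 'Run as' to a service principal.

    Intended as the JSON body of POST /api/2.1/jobs/update (not sent here).
    """
    return {
        "job_id": job_id,
        "new_settings": {
            # The SP is identified by its Application (Client) ID.
            "run_as": {"service_principal_name": application_id},
        },
    }


# Hypothetical job ID and application ID, for illustration only.
payload = build_run_as_update(123, "00000000-0000-0000-0000-000000000000")
print(json.dumps(payload, indent=2))
```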

## Summary & Best Practices

| Action | Best Practice |
| :--- | :--- |
| **User Creation** | Create in Cloud Identity Provider (Entra ID), sync via SCIM. |
| **Permissions** | Always assign permissions to **Groups**, not individual users. |
| **Workspace Access** | Restrict entitlements. Not everyone needs Cluster Creation access. |
| **Automation** | Always use **Service Principals** for Production Jobs and Pipelines. |
| **Administration** | Separate Account Admins (Governance) from Workspace Admins (Day-to-day operations). |