# Databricks User Group Assessment

This notebook checks whether a specified list of user groups exists in Databricks Unity Catalog by running `SHOW GROUPS` through PySpark.

**Author**: Jhonathan Pauca Joya  
**Role**: MLOps Engineer  
**University**: Universidad Nacional Mayor de San Marcos  
**Program**: Master's  
**Last Updated**: August 2025  
**Version**: 1.0  

## Overview

- **Goal**: Verify that all required user groups are provisioned in Unity Catalog.
- **Input**: List of expected group names (provided via widget).
- **Output**: A report listing any missing groups, or a confirmation that all expected groups exist.
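
The check outlined above reduces to a set difference between the expected and the existing group names. A minimal sketch in plain Python, assuming the existing names have already been collected into a set; the function name `find_missing_groups` and the sample data are hypothetical and used only for illustration:

```python
def find_missing_groups(expected, existing):
    """Return the expected group names that are absent from `existing`, sorted."""
    return sorted(set(expected) - set(existing))


# Hypothetical sample data; in the notebook, `existing` comes from SHOW GROUPS.
missing = find_missing_groups({"users", "admins", "user2"}, {"users", "admins"})
print(missing)
```

Because the comparison is an exact string match, `Admins` and `admins` would be treated as different groups.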

## Instructions

1. Set the expected group names using the widget below (comma-separated).
2. Run the assessment cell to check group existence.
3. Review the output for missing groups.
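
As a rough illustration of how the comma-separated value from step 1 is interpreted, the sketch below trims whitespace, drops empty entries, and removes duplicates. The helper name `parse_group_list` is hypothetical; the assessment cell performs the equivalent logic inline:

```python
def parse_group_list(raw):
    """Split a comma-separated string into a set of non-empty, trimmed names."""
    return {name.strip() for name in raw.split(",") if name.strip()}


# Extra spaces, empty entries, and duplicates are tolerated.
groups = parse_group_list(" users, admins ,,user2, users ")
print(sorted(groups))
```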

In [None]:
# Databricks User Group Assessment: Group Existence Check
"""
This cell uses Databricks widgets to accept a list of expected group names,
checks their existence in Unity Catalog, and reports any missing groups.
"""

# 1. Import required libraries
from pyspark.sql.utils import AnalysisException

# 2. Define Databricks widget for expected groups (comma-separated)
dbutils.widgets.text("expected_groups", "users,admins,user2", "Expected Groups (comma-separated)")
expected_groups = dbutils.widgets.get("expected_groups")
expected = {g.strip() for g in expected_groups.split(",") if g.strip()}

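# 3. Read groups from Unity Catalog
try:
    groups_df = spark.sql("SHOW GROUPS")
except AnalysisException as exc:
    # Re-raise with a clearer message if Spark rejects the statement
    raise RuntimeError("Could not list groups with SHOW GROUPS.") from exc
existing = {row.name for row in groups_df.collect()}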

# 4. Detect missing groups
missing = expected - existing
present = expected & existing

# 5. Report results
if missing:
    print("❌ The following groups are NOT present in Unity Catalog:")
    for g in sorted(missing):
        print(f" - {g}")
    print(f"{len(missing)} group(s) missing. Please provision them before proceeding.")
    # Only list present groups when at least one expected group was found
    if present:
        print("\n✅ The following expected groups DO exist in Unity Catalog:")
        for g in sorted(present):
            print(f" - {g}")
else:
    print("✅ All expected groups exist. You may proceed with grants.")
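
If this check gates a later grants step, it may be preferable to stop the run instead of only printing a message. A minimal sketch, assuming a hypothetical helper `assert_groups_exist` and a hypothetical exception class `MissingGroupsError`; neither is part of the original notebook:

```python
class MissingGroupsError(RuntimeError):
    """Hypothetical error raised when expected groups are not provisioned."""


def assert_groups_exist(expected, existing):
    """Raise MissingGroupsError if any expected group is absent from `existing`."""
    missing = sorted(set(expected) - set(existing))
    if missing:
        raise MissingGroupsError(f"Missing groups: {', '.join(missing)}")


# Hypothetical sample data; in the notebook, pass the `expected` and `existing` sets.
try:
    assert_groups_exist({"users", "admins", "user2"}, {"users", "admins"})
except MissingGroupsError as err:
    print(err)
```

Raising an exception causes the cell to fail, so a scheduled job running the notebook would not continue past the check.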