# Monitor and Govern Databricks Workspaces

Use system tables to monitor usage and costs, and implement governance with Unity Catalog.

## What You'll Learn

✅ Query system tables for observability  
✅ Analyze billing and cost allocation  
✅ Monitor workspace usage and performance  
✅ Implement Unity Catalog security  
✅ Create governance dashboards  

**Note**: Since students won't have access to actual system tables, we'll use synthetic data that matches the schema.

---

**References:**
- [System Tables](https://docs.databricks.com/aws/en/admin/system-tables/)
- [Billing Tables](https://docs.databricks.com/aws/en/admin/system-tables/billing)
- [Unity Catalog Governance](https://docs.databricks.com/aws/en/data-governance/unity-catalog/)
- [Observability Dashboards](https://github.com/CodyAustinDavis/dbsql_sme/tree/main/Observability%20Dashboards%20and%20DBA%20Resources)

## 1. System Tables Overview

### What are System Tables?

**System Tables** provide observability into:
- Billing and usage
- Query execution
- Warehouse performance
- Audit logs
- Lineage information

### Available Schemas

```
system.billing.*        - Cost and usage data
system.compute.*        - Cluster and warehouse metrics
system.query.*          - Query execution logs
system.access.*         - Audit logs and data lineage
```

### Access Requirements

**In Production:**
- Account admin privileges
- Unity Catalog enabled
- System tables schema access

**In This Training:**
- We'll use synthetic data with matching schemas
- Demonstrates real-world queries and patterns


## 2. Cost Analysis Queries

Let's analyze our synthetic billing data to understand cost patterns. In a real production environment, you would use `system.billing.*` tables.

**IMPORTANT:** The synthetic tables (`system_billing`, `query_history`, `audit_logs`, `user_permissions`) live in YOUR personal schema, not in a shared `training` schema. The Python cells below build table names from the `{CATALOG}.{WRITE_SCHEMA}` placeholders. The `%sql` cells hard-code `training.` as a stand-in, so replace it with your own catalog and schema before running them.

**For `%sql` cells below**: Use the `CATALOG` and `WRITE_SCHEMA` values from the configuration cell above.
- Example: If `WRITE_SCHEMA` is `jane_smith_company_com`, use `dwx_airops_insights_platform_dev_workspace.jane_smith_company_com`

**Or convert to Python**: Use `spark.sql(f"SELECT ... FROM {CATALOG}.{WRITE_SCHEMA}.system_billing ...")`
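
To avoid retyping the catalog and schema in every query, a small helper can assemble the three-level table name once. The sketch below is only an illustration: `qualified_table` is a hypothetical helper (not part of any Databricks API), and the catalog and schema values are the example values from the text above.

```python
def qualified_table(catalog: str, schema: str, table: str) -> str:
    """Return a backtick-quoted three-level name: catalog.schema.table."""
    parts = (catalog, schema, table)
    for part in parts:
        # Reject empty parts and embedded backticks rather than trying to escape them
        if not part or "`" in part:
            raise ValueError(f"invalid identifier part: {part!r}")
    return ".".join(f"`{part}`" for part in parts)


# Example values from the text above; substitute your own.
CATALOG = "dwx_airops_insights_platform_dev_workspace"
WRITE_SCHEMA = "jane_smith_company_com"

billing_table = qualified_table(CATALOG, WRITE_SCHEMA, "system_billing")
query = f"SELECT usage_date, usage_quantity, list_price FROM {billing_table} LIMIT 10"
print(query)
```

In a notebook, the resulting string could be passed to `spark.sql(query)`.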

### Total Cost by Day

# Total Cost by Day
spark.sql(f"""
SELECT 
  usage_date,
  SUM(usage_quantity * list_price) as total_cost,
  COUNT(*) as num_operations
FROM {CATALOG}.{WRITE_SCHEMA}.system_billing
GROUP BY usage_date
ORDER BY usage_date DESC
LIMIT 10
""").display()

In [0]:
%sql
-- Total Cost by Day
SELECT 
  usage_date,
  SUM(usage_quantity * list_price) as total_cost,
  COUNT(*) as num_operations
FROM training.system_billing
GROUP BY usage_date
ORDER BY usage_date DESC
LIMIT 10;
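
To see what the aggregation computes, here is the same logic in plain Python on three hypothetical billing rows. The row values are made up for illustration, and `cost_by_day` is an assumed helper name. Cost is `usage_quantity * list_price`, summed per `usage_date`.

```python
from collections import defaultdict

# Hypothetical rows: (usage_date, usage_quantity in DBUs, list_price per DBU)
rows = [
    ("2024-01-01", 10.0, 0.5),
    ("2024-01-01", 4.0, 0.25),
    ("2024-01-02", 8.0, 0.5),
]


def cost_by_day(rows):
    """Mirror SUM(usage_quantity * list_price) and COUNT(*) grouped by usage_date."""
    totals = defaultdict(lambda: {"total_cost": 0.0, "num_operations": 0})
    for usage_date, usage_quantity, list_price in rows:
        totals[usage_date]["total_cost"] += usage_quantity * list_price
        totals[usage_date]["num_operations"] += 1
    # Newest date first, like ORDER BY usage_date DESC
    return sorted(totals.items(), key=lambda item: item[0], reverse=True)


for usage_date, stats in cost_by_day(rows):
    print(usage_date, stats["total_cost"], stats["num_operations"])
```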


# Cost by Workspace
spark.sql(f"""
SELECT 
  workspace_id,
  SUM(usage_quantity * list_price) as total_cost,
  COUNT(*) as num_operations,
  AVG(usage_quantity * list_price) as avg_cost_per_operation
FROM {CATALOG}.{WRITE_SCHEMA}.system_billing
WHERE usage_date >= CURRENT_DATE - 30
GROUP BY workspace_id
ORDER BY total_cost DESC
""").display()

In [0]:
%sql
-- Cost by Workspace
SELECT 
  workspace_id,
  SUM(usage_quantity * list_price) as total_cost,
  COUNT(*) as num_operations,
  AVG(usage_quantity * list_price) as avg_cost_per_operation
FROM training.system_billing
WHERE usage_date >= CURRENT_DATE - 30
GROUP BY workspace_id
ORDER BY total_cost DESC;


# Cost by SKU Type
spark.sql(f"""
SELECT 
  sku_name,
  SUM(usage_quantity) as total_dbus,
  SUM(usage_quantity * list_price) as total_cost,
  AVG(usage_quantity * list_price) as avg_cost_per_operation,
  COUNT(*) as operations
FROM {CATALOG}.{WRITE_SCHEMA}.system_billing
WHERE usage_date >= CURRENT_DATE - 30
GROUP BY sku_name
ORDER BY total_cost DESC
""").display()

In [0]:
%sql
-- Cost by SKU Type
SELECT 
  sku_name,
  SUM(usage_quantity) as total_dbus,
  SUM(usage_quantity * list_price) as total_cost,
  AVG(usage_quantity * list_price) as avg_cost_per_operation,
  COUNT(*) as operations
FROM training.system_billing
WHERE usage_date >= CURRENT_DATE - 30
GROUP BY sku_name
ORDER BY total_cost DESC;


# Cost by User
spark.sql(f"""
SELECT 
  usage_metadata.user,
  COUNT(*) as operations,
  SUM(usage_quantity * list_price) as total_cost,
  AVG(usage_quantity * list_price) as avg_cost_per_operation
FROM {CATALOG}.{WRITE_SCHEMA}.system_billing
WHERE usage_date >= CURRENT_DATE - 30
GROUP BY usage_metadata.user
ORDER BY total_cost DESC
LIMIT 10
""").display()

In [0]:
%sql
-- Cost by User
SELECT 
  usage_metadata.user,
  COUNT(*) as operations,
  SUM(usage_quantity * list_price) as total_cost,
  AVG(usage_quantity * list_price) as avg_cost_per_operation
FROM training.system_billing
WHERE usage_date >= CURRENT_DATE - 30
GROUP BY usage_metadata.user
ORDER BY total_cost DESC
LIMIT 10;


# Consolidated query performance summary
spark.sql(f"""
SELECT 
  query_type,
  COUNT(*) as query_count,
  AVG(execution_time_ms) as avg_duration_ms,
  AVG(compute_cost) as avg_cost,
  SUM(compute_cost) as total_cost
FROM {CATALOG}.{WRITE_SCHEMA}.query_history
WHERE query_start_time >= CURRENT_DATE - 7
GROUP BY query_type
ORDER BY total_cost DESC
""").display()

In [0]:
%sql
-- Consolidated query performance summary
SELECT 
  query_type,
  COUNT(*) as query_count,
  AVG(execution_time_ms) as avg_duration_ms,
  AVG(compute_cost) as avg_cost,
  SUM(compute_cost) as total_cost
FROM training.query_history
WHERE query_start_time >= CURRENT_DATE - 7
GROUP BY query_type
ORDER BY total_cost DESC;


**Key Insights:**
- Identify which query types are driving costs
- Spot performance optimization opportunities
- Track usage patterns across your team

For more detailed analysis, explore the [Observability Dashboards](https://github.com/CodyAustinDavis/dbsql_sme/tree/main/Observability%20Dashboards%20and%20DBA%20Resources) examples.
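
One way to act on the first insight is to turn the per-type totals into percentage shares. The sketch below assumes you have already collected `total_cost` per `query_type` into a dictionary. The numbers are hypothetical, and `cost_share_by_type` is an illustrative helper name.

```python
def cost_share_by_type(summary):
    """summary maps query_type -> total_cost; returns query_type -> percent share."""
    grand_total = sum(summary.values())
    if grand_total == 0:
        return {query_type: 0.0 for query_type in summary}
    # Largest cost first, so the main cost drivers appear at the top
    ranked = sorted(summary.items(), key=lambda kv: kv[1], reverse=True)
    return {
        query_type: round(100.0 * cost / grand_total, 1)
        for query_type, cost in ranked
    }


# Hypothetical totals, e.g. collected from the query performance summary
example = {"INSERT": 5.0, "SELECT": 30.0, "MERGE": 15.0}
shares = cost_share_by_type(example)
print(shares)
```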


---

## 5. Unity Catalog Access Management

One of the most important aspects of data governance is controlling who can access what data. Unity Catalog provides fine-grained access control at multiple levels: catalogs, schemas, tables, columns, and even rows.

Let's walk through a practical example of setting up access controls for our IoT sensor data.


### Step 1: Create Groups

Groups make it easier to manage permissions at scale. Instead of granting access to individual users, you grant it to groups.

**Common groups for an IoT project:**
- `data_engineers` - Can read/write raw and processed data
- `data_analysts` - Can read processed data and create dashboards
- `ml_engineers` - Can read data and create/deploy ML models
- `executives` - Read-only access to dashboards and reports
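
Because permissions attach to groups, the whole policy can be kept as data and turned into `GRANT` statements. The sketch below is one possible layout. The privilege lists per group are assumptions chosen to match the descriptions above, and `grant_statements` is a hypothetical helper that only builds SQL strings (it does not execute them).

```python
# Hypothetical mapping of group -> privileges on a schema; adjust to your policy.
GROUP_PRIVILEGES = {
    "data_engineers": ["USE SCHEMA", "SELECT", "MODIFY", "CREATE TABLE"],
    "data_analysts": ["USE SCHEMA", "SELECT"],
    "ml_engineers": ["USE SCHEMA", "SELECT", "CREATE TABLE"],
    "executives": ["USE SCHEMA", "SELECT"],
}


def grant_statements(schema: str, group_privileges: dict) -> list:
    """Build one GRANT statement per (group, privilege) pair."""
    statements = []
    for group, privileges in group_privileges.items():
        for privilege in privileges:
            statements.append(f"GRANT {privilege} ON SCHEMA {schema} TO `{group}`;")
    return statements


for statement in grant_statements("training.default", GROUP_PRIVILEGES):
    print(statement)
```

Keeping the mapping in one place also gives you a record of the permission model that can be reviewed during audits.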


In [0]:
%sql
-- Create a group for data analysts
-- Note: In Databricks, groups are typically managed at the account level
-- These commands would be run by an account admin

CREATE GROUP IF NOT EXISTS `data_analysts`;
CREATE GROUP IF NOT EXISTS `ml_engineers`;
CREATE GROUP IF NOT EXISTS `data_engineers`;


### Step 2: Add Users to Groups

Once groups are created, you can add users to them. Users inherit all permissions granted to the groups they belong to.


In [None]:
%sql
-- Add users to groups
-- Replace with actual user emails from your organization

ALTER GROUP `data_analysts` ADD USER `jane.smith@company.com`;
ALTER GROUP `data_analysts` ADD USER `john.doe@company.com`;

ALTER GROUP `ml_engineers` ADD USER `ml.engineer@company.com`;

ALTER GROUP `data_engineers` ADD USER `data.engineer@company.com`;

-- You can also remove users from groups
-- ALTER GROUP `data_analysts` REMOVE USER `user@company.com`;


### Step 3: Grant Table Access

Now let's grant the appropriate permissions on our IoT tables. Unity Catalog uses a hierarchical permission model:

**Hierarchy:** Catalog → Schema → Table

**Permission Types:**
- `USE CATALOG` - Required to see/access a catalog
- `USE SCHEMA` - Required to see/access a schema
- `SELECT` - Read data from tables
- `MODIFY` - Update/delete data
- `CREATE TABLE` - Create new tables in a schema
- `ALL PRIVILEGES` - Full control
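
The hierarchy means `SELECT` alone is not enough: a principal also needs `USE CATALOG` and `USE SCHEMA` on the parents. The sketch below is a deliberately simplified model of that rule. It treats a privilege granted on a parent as inherited by its children and ignores ownership and `ALL PRIVILEGES`. `can_select` and the grant tuples are hypothetical, not a Databricks API.

```python
def can_select(grants: set, principal: str, catalog: str, schema: str, table: str) -> bool:
    """grants holds (principal, privilege, securable) tuples."""
    schema_name = f"{catalog}.{schema}"
    table_name = f"{schema_name}.{table}"
    has_use_catalog = (principal, "USE CATALOG", catalog) in grants
    # A privilege granted on a parent securable is inherited by its children
    has_use_schema = any(
        (principal, "USE SCHEMA", securable) in grants
        for securable in (schema_name, catalog)
    )
    has_select = any(
        (principal, "SELECT", securable) in grants
        for securable in (table_name, schema_name, catalog)
    )
    return has_use_catalog and has_use_schema and has_select


grants = {
    ("data_analysts", "USE CATALOG", "training"),
    ("data_analysts", "USE SCHEMA", "training.default"),
    ("data_analysts", "SELECT", "training.default.sensor_silver"),
}
print(can_select(grants, "data_analysts", "training", "default", "sensor_silver"))
print(can_select(grants, "data_analysts", "training", "default", "sensor_bronze"))
```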

Let's grant permissions for our sensor and inspection data:


In [None]:
%sql
-- First, grant catalog and schema level permissions
-- Data analysts need to USE the catalog and schema to access tables

GRANT USE CATALOG ON CATALOG training TO `data_analysts`;
GRANT USE SCHEMA ON SCHEMA training.default TO `data_analysts`;

-- Grant SELECT on sensor tables (from Day 1 transformation notebook)
GRANT SELECT ON TABLE training.sensor_bronze TO `data_analysts`;
GRANT SELECT ON TABLE training.sensor_silver TO `data_analysts`;

-- Grant SELECT on inspection data (used in semantic modeling)
GRANT SELECT ON TABLE training.inspection_bronze TO `data_analysts`;
GRANT SELECT ON TABLE training.inspection_gold TO `data_analysts`;

-- Grant SELECT on dimension tables
GRANT SELECT ON TABLE training.dim_factories TO `data_analysts`;
GRANT SELECT ON TABLE training.dim_devices TO `data_analysts`;
GRANT SELECT ON TABLE training.dim_models TO `data_analysts`;


### Step 4: Grant Permissions to ML Engineers

ML Engineers need broader access - they need to read data and create/modify tables for feature engineering and model storage:


In [None]:
%sql
-- Grant ML engineers read access to all data
GRANT USE CATALOG ON CATALOG training TO `ml_engineers`;
GRANT USE SCHEMA ON SCHEMA training.default TO `ml_engineers`;

GRANT SELECT ON TABLE training.sensor_silver TO `ml_engineers`;
GRANT SELECT ON TABLE training.inspection_gold TO `ml_engineers`;
GRANT SELECT ON TABLE training.dim_factories TO `ml_engineers`;
GRANT SELECT ON TABLE training.dim_devices TO `ml_engineers`;
GRANT SELECT ON TABLE training.dim_models TO `ml_engineers`;

-- Also allow them to create tables in a dedicated ML schema
-- GRANT CREATE TABLE ON SCHEMA training.ml_features TO `ml_engineers`;
-- GRANT MODIFY ON SCHEMA training.ml_features TO `ml_engineers`;


### Step 5: View Current Permissions

You can check what permissions have been granted using the `SHOW GRANTS` command:


In [None]:
%sql
-- View all grants on a specific table
SHOW GRANTS ON TABLE training.sensor_silver;

-- View all grants for a specific group
-- SHOW GRANTS TO `data_analysts`;

-- View grants on an entire schema
-- SHOW GRANTS ON SCHEMA training.default;


### Revoking Permissions

If you need to remove access, use the `REVOKE` command:

```sql
-- Revoke SELECT on a specific table
REVOKE SELECT ON TABLE training.sensor_bronze FROM `data_analysts`;

-- Revoke all privileges on a schema
REVOKE ALL PRIVILEGES ON SCHEMA training.default FROM `data_analysts`;
```

**Best Practices for Access Management:**

1. **Use groups, not individual users** - Makes management much easier
2. **Grant minimum necessary permissions** - Start with read-only, add write permissions only when needed
3. **Use separate schemas for different purposes** - e.g., `raw`, `processed`, `ml_features`, `production`
4. **Document your permission model** - Keep track of which groups have access to what
5. **Regular audits** - Review permissions quarterly to ensure they're still appropriate
6. **Use service principals for automation** - Don't use personal accounts for scheduled jobs


# View user permissions
spark.sql(f"""
SELECT * FROM {CATALOG}.{WRITE_SCHEMA}.user_permissions
""").display()

In [0]:
%sql
-- View user permissions
SELECT * FROM training.user_permissions;


### Grant Privileges (Examples)

**Note:** These are example commands. In production, you would grant privileges to actual user groups.

**Grant SELECT on schema:**
```sql
GRANT SELECT ON SCHEMA training TO `data_analysts`;
```

**Grant table access:**
```sql
GRANT SELECT ON TABLE training.system_billing TO `data_analysts`;
```

**Grant catalog access:**
```sql
GRANT USE CATALOG ON CATALOG <your_catalog> TO `data_analysts`;
```

### Row-Level Security (Conceptual Example)

Unity Catalog supports row filters to restrict data based on user permissions. Here's how it works:

**Step 1: Create a filter function**
```sql
CREATE FUNCTION training.filter_by_region(region STRING)
RETURN region IN (
  SELECT region FROM training.user_permissions 
  WHERE user_email = current_user()
);
```

**Step 2: Apply the filter to a table**
```sql
ALTER TABLE <your_table>
SET ROW FILTER training.filter_by_region ON (region);
```

This ensures users only see data for their authorized regions.
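
To make the effect concrete, the following plain-Python sketch mimics what the filter does for one user: look up the regions assigned to that user, then keep only matching rows. The sample rows, emails, and the `visible_rows` helper are hypothetical. In Unity Catalog the platform applies the filter for you at query time.

```python
def visible_rows(rows, user_permissions, current_user):
    """Keep only rows whose region is assigned to current_user."""
    allowed = {
        permission["region"]
        for permission in user_permissions
        if permission["user_email"] == current_user
    }
    return [row for row in rows if row["region"] in allowed]


# Hypothetical data for illustration only
user_permissions = [
    {"user_email": "jane.smith@company.com", "region": "EMEA"},
    {"user_email": "john.doe@company.com", "region": "APAC"},
]
sensor_rows = [
    {"device_id": "DEV-0001", "region": "EMEA"},
    {"device_id": "DEV-0002", "region": "APAC"},
    {"device_id": "DEV-0003", "region": "EMEA"},
]

print(visible_rows(sensor_rows, user_permissions, "jane.smith@company.com"))
```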

### Column Masking (Conceptual Example)

Mask sensitive columns based on user roles:

**Step 1: Create masking function**
```sql
CREATE FUNCTION training.mask_device_id(device_id STRING)
RETURN CASE 
  WHEN is_member('admin') THEN device_id
  ELSE CONCAT('***', RIGHT(device_id, 4))
END;
```

**Step 2: Apply mask to column**
```sql
ALTER TABLE <your_table>
ALTER COLUMN device_id
SET MASK training.mask_device_id;
```

Non-admin users will only see masked device IDs (e.g., "***1234").
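
The masking rule itself is easy to check outside SQL. This Python sketch reproduces the `CASE` expression above, with `is_admin` standing in for `is_member('admin')`. The device ID used here is a made-up example.

```python
def mask_device_id(device_id: str, is_admin: bool) -> str:
    """Return the full ID for admins, otherwise '***' plus the last four characters."""
    if is_admin:
        return device_id
    # Equivalent to CONCAT('***', RIGHT(device_id, 4))
    return "***" + device_id[-4:]


print(mask_device_id("SENSOR-001234", is_admin=True))
print(mask_device_id("SENSOR-001234", is_admin=False))
```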


# Create a row filter function that restricts data by region
# Note: This is a conceptual example - replace with your actual catalog/schema
spark.sql(f"""
CREATE OR REPLACE FUNCTION {CATALOG}.{WRITE_SCHEMA}.filter_by_region(region STRING)
RETURN region IN (
  SELECT region FROM {CATALOG}.{WRITE_SCHEMA}.user_permissions 
  WHERE user_email = current_user()
)
""").display()

print(f"✅ Function created: {CATALOG}.{WRITE_SCHEMA}.filter_by_region")
print("\nTo apply the filter to a table:")
print("ALTER TABLE your_table")
print(f"SET ROW FILTER {CATALOG}.{WRITE_SCHEMA}.filter_by_region ON (region);")

In [None]:
%sql
-- Create a row filter function that restricts data by region
CREATE OR REPLACE FUNCTION training.filter_by_region(region STRING)
RETURN region IN (
  SELECT region FROM training.user_permissions 
  WHERE user_email = current_user()
);

-- Apply the filter to the sensor table
-- ALTER TABLE training.sensor_silver
-- SET ROW FILTER training.filter_by_region ON (region);

-- Now when users query sensor_silver, they only see data from their authorized regions!


# Create a masking function for device IDs
# Note: This is a conceptual example
spark.sql(f"""
CREATE OR REPLACE FUNCTION {CATALOG}.{WRITE_SCHEMA}.mask_device_id(device_id STRING)
RETURN CASE 
  WHEN is_member('data_engineers') OR is_member('ml_engineers') THEN device_id
  ELSE CONCAT('***', RIGHT(device_id, 4))
END
""").display()

print(f"✅ Function created: {CATALOG}.{WRITE_SCHEMA}.mask_device_id")
print("\nTo apply the mask to a column:")
print(f"ALTER TABLE your_table")
print(f"ALTER COLUMN device_id")
print(f"SET MASK {CATALOG}.{WRITE_SCHEMA}.mask_device_id;")

In [None]:
%sql
-- Create a masking function for device IDs
CREATE OR REPLACE FUNCTION training.mask_device_id(device_id STRING)
RETURN CASE 
  WHEN is_member('data_engineers') OR is_member('ml_engineers') THEN device_id
  ELSE CONCAT('***', RIGHT(device_id, 4))
END;

-- Apply the mask to the device_id column
-- ALTER TABLE training.sensor_silver
-- ALTER COLUMN device_id
-- SET MASK training.mask_device_id;

-- Now data_analysts will see "***1234" instead of full device IDs!


**Why Use Row Filters and Column Masks?**

1. **Data Privacy Compliance** - Meet GDPR, HIPAA, and other regulatory requirements
2. **Multi-tenant Security** - Allow multiple teams to share the same table while only seeing their data
3. **Automatic Enforcement** - Security is enforced at the platform level, not in application logic
4. **Transparent to Users** - Users don't need to know about the filtering, they just query normally
5. **Centralized Management** - Change permissions in one place, affects all queries automatically

**Use Cases for IoT Project:**
- Regional managers only see factories in their region
- Contractors can't see device serial numbers (masked)
- Finance team sees cost data but not technical sensor readings
- Third-party auditors get read-only access with masked PII


# Track data access events
spark.sql(f"""
SELECT 
  event_time,
  user_email,
  action_name,
  table_full_name,
  workspace_id,
  source_ip
FROM {CATALOG}.{WRITE_SCHEMA}.audit_logs
WHERE action_name IN ('SELECT', 'UPDATE', 'DELETE', 'INSERT')
  AND event_date >= CURRENT_DATE - 7
ORDER BY event_time DESC
LIMIT 50
""").display()

In [0]:
%sql
-- Track data access events
SELECT 
  event_time,
  user_email,
  action_name,
  table_full_name,
  workspace_id,
  source_ip
FROM training.audit_logs
WHERE action_name IN ('SELECT', 'UPDATE', 'DELETE', 'INSERT')
  AND event_date >= CURRENT_DATE - 7
ORDER BY event_time DESC
LIMIT 50;


# Table access patterns by user
spark.sql(f"""
SELECT 
  user_email,
  table_full_name,
  COUNT(*) as access_count,
  COUNT(DISTINCT DATE(event_time)) as days_accessed,
  MAX(event_time) as last_accessed
FROM {CATALOG}.{WRITE_SCHEMA}.audit_logs
WHERE event_date >= CURRENT_DATE - 7
  AND action_name = 'SELECT'
GROUP BY user_email, table_full_name
ORDER BY access_count DESC
LIMIT 20
""").display()

In [0]:
%sql
-- Table access patterns by user
SELECT 
  user_email,
  table_full_name,
  COUNT(*) as access_count,
  COUNT(DISTINCT DATE(event_time)) as days_accessed,
  MAX(event_time) as last_accessed
FROM training.audit_logs
WHERE event_date >= CURRENT_DATE - 7
  AND action_name = 'SELECT'
GROUP BY user_email, table_full_name
ORDER BY access_count DESC
LIMIT 20;


# Track schema changes (CREATE, DROP, ALTER)
spark.sql(f"""
SELECT 
  event_time,
  user_email,
  action_name,
  table_full_name,
  workspace_id
FROM {CATALOG}.{WRITE_SCHEMA}.audit_logs
WHERE action_name IN ('CREATE_TABLE', 'DROP_TABLE', 'ALTER_TABLE')
  AND event_date >= CURRENT_DATE - 7
ORDER BY event_time DESC
""").display()

In [0]:
%sql
-- Track schema changes (CREATE, DROP, ALTER)
SELECT 
  event_time,
  user_email,
  action_name,
  table_full_name,
  workspace_id
FROM training.audit_logs
WHERE action_name IN ('CREATE_TABLE', 'DROP_TABLE', 'ALTER_TABLE')
  AND event_date >= CURRENT_DATE - 7
ORDER BY event_time DESC;


# Daily cost trend (optimized for line chart)
spark.sql(f"""
SELECT 
  usage_date,
  SUM(usage_quantity * list_price) as total_cost,
  SUM(CASE WHEN sku_name = 'JOBS_COMPUTE' THEN usage_quantity * list_price ELSE 0 END) as jobs_cost,
  SUM(CASE WHEN sku_name = 'SQL_COMPUTE' THEN usage_quantity * list_price ELSE 0 END) as sql_cost,
  SUM(CASE WHEN sku_name = 'ALL_PURPOSE_COMPUTE' THEN usage_quantity * list_price ELSE 0 END) as all_purpose_cost
FROM {CATALOG}.{WRITE_SCHEMA}.system_billing
GROUP BY usage_date
ORDER BY usage_date
""").display()

In [0]:
%sql
-- Daily cost trend (optimized for line chart)
SELECT 
  usage_date,
  SUM(usage_quantity * list_price) as total_cost,
  SUM(CASE WHEN sku_name = 'JOBS_COMPUTE' THEN usage_quantity * list_price ELSE 0 END) as jobs_cost,
  SUM(CASE WHEN sku_name = 'SQL_COMPUTE' THEN usage_quantity * list_price ELSE 0 END) as sql_cost,
  SUM(CASE WHEN sku_name = 'ALL_PURPOSE_COMPUTE' THEN usage_quantity * list_price ELSE 0 END) as all_purpose_cost
FROM training.system_billing
GROUP BY usage_date
ORDER BY usage_date;


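# Forecast daily costs for the next 7 days using AI
# Note: This requires a SQL Warehouse connection
# ai_forecast is a table-valued function; horizon is the exclusive end date of the forecast.
# If your warehouse requires a constant here, replace the expression with a date literal.
spark.sql(f"""
WITH daily AS (
  SELECT usage_date, SUM(usage_quantity * list_price) AS total_cost
  FROM {CATALOG}.{WRITE_SCHEMA}.system_billing
  GROUP BY usage_date
)
SELECT 
  usage_date,
  total_cost_forecast AS forecast,
  total_cost_lower AS lower_bound,
  total_cost_upper AS upper_bound
FROM ai_forecast(
  TABLE(daily),
  horizon => CURRENT_DATE + 7,
  time_col => 'usage_date',
  value_col => 'total_cost'
)
ORDER BY usage_date DESC
""").display()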

In [None]:
%sql
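-- Forecast daily costs for the next 7 days using AI
-- Note: This requires a SQL Warehouse connection
-- ai_forecast is a table-valued function; horizon is the exclusive end date of the forecast.
-- If your warehouse requires a constant here, replace the expression with a date literal.

WITH daily AS (
  SELECT usage_date, SUM(usage_quantity * list_price) AS total_cost
  FROM training.system_billing
  GROUP BY usage_date
)
SELECT 
  usage_date,
  total_cost_forecast AS forecast,
  total_cost_lower AS lower_bound,
  total_cost_upper AS upper_bound
FROM ai_forecast(
  TABLE(daily),
  horizon => CURRENT_DATE + 7,
  time_col => 'usage_date',
  value_col => 'total_cost'
)
ORDER BY usage_date DESC;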


# Forecast costs by workspace for capacity planning
# Run this on a SQL Warehouse
# group_col produces one forecast series per workspace_id
spark.sql(f"""
WITH daily AS (
  SELECT workspace_id, usage_date, SUM(usage_quantity * list_price) AS cost
  FROM {CATALOG}.{WRITE_SCHEMA}.system_billing
  GROUP BY workspace_id, usage_date
)
SELECT 
  workspace_id,
  usage_date,
  cost_forecast AS predicted_cost
FROM ai_forecast(
  TABLE(daily),
  horizon => CURRENT_DATE + 7,
  time_col => 'usage_date',
  value_col => 'cost',
  group_col => 'workspace_id'
)
WHERE usage_date >= CURRENT_DATE
ORDER BY workspace_id, usage_date
""").display()

In [None]:
%sql
-- Forecast costs by workspace for capacity planning
-- Run this on a SQL Warehouse
-- group_col produces one forecast series per workspace_id

WITH daily AS (
  SELECT workspace_id, usage_date, SUM(usage_quantity * list_price) AS cost
  FROM training.system_billing
  GROUP BY workspace_id, usage_date
)
SELECT 
  workspace_id,
  usage_date,
  cost_forecast AS predicted_cost
FROM ai_forecast(
  TABLE(daily),
  horizon => CURRENT_DATE + 7,
  time_col => 'usage_date',
  value_col => 'cost',
  group_col => 'workspace_id'
)
WHERE usage_date >= CURRENT_DATE
ORDER BY workspace_id, usage_date;


### Use Cases for AI Forecast

**Cost Management:**
- Predict monthly spending for budget planning
- Identify cost spikes before they happen
- Allocate resources based on forecasted demand

**Capacity Planning:**
- Forecast compute usage by team/project
- Plan infrastructure scaling
- Optimize reserved capacity purchases

**Reference:** [AI Forecast Documentation](https://docs.databricks.com/aws/en/sql/language-manual/functions/ai_forecast)

**💡 Pro Tip:** Create a scheduled job that runs these forecasts daily and sends alerts when predicted costs exceed thresholds.
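
The threshold check in such a job can be very small. The sketch below assumes the forecast has already been collected into `(usage_date, predicted_cost)` pairs. The values, the threshold, and the `days_over_threshold` helper are hypothetical, and how the alert is delivered (email, webhook, and so on) is left out.

```python
def days_over_threshold(forecast, threshold):
    """Return the (usage_date, predicted_cost) pairs whose cost exceeds threshold."""
    return [(day, cost) for day, cost in forecast if cost > threshold]


# Hypothetical forecast output and budget threshold
forecast = [
    ("2024-02-01", 120.0),
    ("2024-02-02", 310.5),
    ("2024-02-03", 95.0),
    ("2024-02-04", 250.0),
]
DAILY_BUDGET = 250.0

alerts = days_over_threshold(forecast, DAILY_BUDGET)
for day, cost in alerts:
    print(f"Predicted cost {cost:.2f} on {day} exceeds budget {DAILY_BUDGET:.2f}")
```

Comparing against the forecast's upper bound instead of the point estimate would make the alert more conservative.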

---

**Additional Resources:**
- [System Tables Guide](https://docs.databricks.com/aws/en/admin/system-tables/)
- [Unity Catalog Security](https://docs.databricks.com/aws/en/data-governance/unity-catalog/access-control)
- [Observability Examples](https://github.com/CodyAustinDavis/dbsql_sme/tree/main/Observability%20Dashboards%20and%20DBA%20Resources)
- [Cost Management](https://docs.databricks.com/aws/en/admin/account-settings/usage-detail-tags-aws)
- [Row Filters and Column Masks](https://docs.databricks.com/aws/en/data-governance/unity-catalog/row-and-column-filters)
- [AI Forecast Function](https://docs.databricks.com/aws/en/sql/language-manual/functions/ai_forecast)

---

**🎉 You've completed Day 3!** You now have the skills to build end-to-end data and ML pipelines, monitor costs, forecast future spending with AI, and govern your Databricks workspace effectively.
