# Dashboards and Genie: Data Visualization and AI-Powered Analysis

This notebook provides a hands-on introduction to two powerful Databricks features:
- **Dashboards** - Create interactive visualizations and reports
- **Genie** - Ask questions about your data in natural language

## What You'll Learn

✅ Create visualizations from SQL queries  
✅ Build interactive dashboards  
✅ Use filters and parameters  
✅ Set up Genie for natural language queries  
✅ Ask business questions without writing SQL  

---

## Table of Contents

1. [Dashboards Overview](#dashboards)
2. [Creating Your First Dashboard](#create-dashboard)
3. [Building the Dashboard in SQL Editor](#visualizations)
4. [Adding Interactive Features](#interactive)
5. [Genie Overview](#genie)
6. [Setting Up Genie](#genie-setup)
7. [Querying with Natural Language](#nl-queries)

---

**References:**
- [Create Dashboard Tutorial](https://docs.databricks.com/aws/en/dashboards/tutorials/create-dashboard)
- [Genie Setup Guide](https://docs.databricks.com/aws/en/genie/set-up)


In [None]:
# Configuration: Update with your catalog and schema
CATALOG = 'default'
SCHEMA = 'db_crash_course'

print(f"Using: {CATALOG}.{SCHEMA}")
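
Because `CATALOG` and `SCHEMA` are interpolated directly into the SQL strings in the cells below, a typo or stray character ends up in the query text. The following is a minimal sketch of a guard, assuming plain (unquoted) identifiers made of letters, digits, and underscores. The helper name `qualified_name` is hypothetical and not part of any Databricks API.

```python
import re

# Assumption: catalog, schema, and table names are simple unquoted identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Return catalog.schema.table after checking each part is a plain identifier."""
    for part in (catalog, schema, table):
        if not _IDENTIFIER.fullmatch(part):
            raise ValueError(f"Not a plain identifier: {part!r}")
    return f"{catalog}.{schema}.{table}"


print(qualified_name("default", "db_crash_course", "sensor_bronze"))
```

Names that need backtick quoting (for example, ones containing hyphens) would be rejected by this check, so it would need to be relaxed for such workspaces.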


## 1. Dashboards Overview <a id="dashboards"></a>

**Databricks Dashboards** let you:
- Visualize data from SQL queries
- Create interactive reports
- Share insights with stakeholders
- Schedule automatic refreshes
- Add filters and parameters for interactivity

### Key Features:

- **Multiple visualization types**: Bar, line, pie, scatter, counter, table, and more
- **Interactive filters**: Let users explore data dynamically
- **Current data**: Dashboards query your tables directly, so each refresh shows the latest data
- **Easy sharing**: Share with teams or publish externally
- **Schedule refreshes**: Keep dashboards up-to-date automatically

### Use Cases for Our IoT Dataset:

- Monitor device health across factories
- Track temperature anomalies in real time
- Analyze defect rates by model and location
- Display KPIs for operations teams


## 2. Creating Your First Dashboard <a id="create-dashboard"></a>

Let's create a dashboard to monitor our IoT sensor data!

### Step 1: Create the Base Queries

First, we'll create several SQL queries that will power our dashboard visualizations.


In [None]:
### Query 1: Device Count by Factory

# Count active devices per factory
query_devices = f"""
SELECT 
    f.factory_name,
    f.region,
    COUNT(DISTINCT d.device_id) as device_count
FROM {CATALOG}.{SCHEMA}.dim_devices d
JOIN {CATALOG}.{SCHEMA}.dim_factories f ON d.factory_id = f.factory_id
WHERE d.status = 'active'
GROUP BY f.factory_name, f.region
ORDER BY device_count DESC
"""

spark.sql(query_devices).display()


In [None]:
### Query 2: Temperature Trends Over Time

# Average temperature by day
query_temp_trends = f"""
SELECT 
    DATE(timestamp) as date,
    AVG(temperature) as avg_temperature,
    MIN(temperature) as min_temperature,
    MAX(temperature) as max_temperature
FROM {CATALOG}.{SCHEMA}.sensor_bronze
WHERE timestamp IS NOT NULL
GROUP BY DATE(timestamp)
ORDER BY date
"""

spark.sql(query_temp_trends).display()
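
To see what this query computes, here is a small pure-Python sketch of the same per-day aggregation. The readings are made-up illustrative values, not data from `sensor_bronze`, and the function name `daily_temperature_stats` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Illustrative (timestamp, temperature) pairs. The row with a None timestamp
# is skipped, mirroring the WHERE timestamp IS NOT NULL clause.
readings = [
    (datetime(2024, 1, 1, 8, 0), 70.0),
    (datetime(2024, 1, 1, 20, 0), 74.0),
    (datetime(2024, 1, 2, 9, 30), 68.0),
    (None, 99.0),
]


def daily_temperature_stats(rows):
    """Group readings by calendar day; return (date, avg, min, max) sorted by date."""
    by_day = defaultdict(list)
    for ts, temperature in rows:
        if ts is not None:
            by_day[ts.date()].append(temperature)
    return [
        (day, mean(temps), min(temps), max(temps))
        for day, temps in sorted(by_day.items())
    ]


for row in daily_temperature_stats(readings):
    print(row)
```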


In [None]:
### Query 3: Anomaly Count by Factory

# Count anomalies per factory
query_anomalies = f"""
SELECT 
    f.factory_name,
    f.city,
    COUNT(*) as anomaly_count
FROM {CATALOG}.{SCHEMA}.anomaly_detected a
JOIN {CATALOG}.{SCHEMA}.dim_factories f ON a.factory_id = f.factory_id
GROUP BY f.factory_name, f.city
ORDER BY anomaly_count DESC
"""

spark.sql(query_anomalies).display()


In [None]:
### Query 4: Defect Rate by Model

# Calculate defect rate for each model
query_defects = f"""
SELECT 
    m.model_name,
    m.model_category,
    SUM(CASE WHEN ig.defect = 1 THEN ig.count ELSE 0 END) as defects,
    SUM(ig.count) as total_inspections,
    ROUND(100.0 * SUM(CASE WHEN ig.defect = 1 THEN ig.count ELSE 0 END) / SUM(ig.count), 2) as defect_rate_pct
FROM {CATALOG}.{SCHEMA}.inspection_gold ig
JOIN {CATALOG}.{SCHEMA}.dim_models m ON ig.model_id = m.model_id
GROUP BY m.model_name, m.model_category
HAVING total_inspections > 10
ORDER BY defect_rate_pct DESC
"""

spark.sql(query_defects).display()
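
The defect rate in this query is `100 * defects / total_inspections`, rounded to two decimals, and models with 10 or fewer inspections are dropped by the `HAVING` clause. A worked sketch of that arithmetic in plain Python follows. The rows and model names are made up for illustration, and `defect_rates` is a hypothetical helper.

```python
from collections import defaultdict

# Illustrative rows shaped like inspection_gold joined to dim_models:
# (model_name, defect flag, count). These values are invented for the example.
rows = [
    ("ModelA", 1, 3),
    ("ModelA", 0, 47),
    ("ModelB", 1, 1),
    ("ModelB", 0, 4),
]


def defect_rates(inspections, min_inspections=10):
    """Return {model: defect_rate_pct} for models with more than min_inspections."""
    totals = defaultdict(lambda: [0, 0])  # model -> [defects, total_inspections]
    for model, defect, count in inspections:
        if defect == 1:
            totals[model][0] += count
        totals[model][1] += count
    result = {}
    for model, (defects, total) in totals.items():
        if total > min_inspections:  # mirrors HAVING total_inspections > 10
            result[model] = round(100.0 * defects / total, 2)
    return result


print(defect_rates(rows))
```

Here `ModelA` has 3 defects out of 50 inspections, giving 6.0 percent, while `ModelB` is excluded because it has only 5 inspections.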


In [None]:
### Query 5: Key Metrics (for Counter Visualizations)

# Overall KPIs
query_kpis = f"""
SELECT 
    COUNT(DISTINCT device_id) as total_devices,
    COUNT(*) as total_readings,
    (SELECT COUNT(*) FROM {CATALOG}.{SCHEMA}.anomaly_detected) as total_anomalies,
    ROUND(AVG(temperature), 2) as avg_temperature
FROM {CATALOG}.{SCHEMA}.sensor_bronze
"""

spark.sql(query_kpis).display()


## 3. Building the Dashboard in SQL Editor <a id="visualizations"></a>

Now let's create the actual dashboard using the Databricks UI:

### Step-by-Step Instructions:

#### 1. Create SQL Queries

1. Go to **SQL Editor** in Databricks
2. For each query above, create a new SQL query:
   - Click **Create** → **Query**
   - Paste the SQL, replacing the `{CATALOG}` and `{SCHEMA}` f-string placeholders with your actual catalog and schema names
   - Click **Run** to test
   - Click **Save** and give it a descriptive name (e.g., "IoT Device Count by Factory")
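
The SQL Editor does not understand the Python placeholders used in this notebook, so they must be substituted before pasting. Below is a minimal sketch that prints paste-ready SQL. The helper name `render_for_sql_editor` is hypothetical, and it uses plain string replacement so that any `{{...}}` query parameters are left untouched.

```python
def render_for_sql_editor(template: str, catalog: str, schema: str) -> str:
    """Replace the {CATALOG} and {SCHEMA} placeholders with concrete names."""
    return template.replace("{CATALOG}", catalog).replace("{SCHEMA}", schema)


# A plain string (not an f-string), so the placeholders are still literal text.
template = """
SELECT f.factory_name, COUNT(*) AS anomaly_count
FROM {CATALOG}.{SCHEMA}.anomaly_detected a
JOIN {CATALOG}.{SCHEMA}.dim_factories f ON a.factory_id = f.factory_id
GROUP BY f.factory_name
"""

print(render_for_sql_editor(template, "default", "db_crash_course"))
```

The queries defined earlier in this notebook are already f-strings, so for those, `print(query_devices)` shows the fully rendered SQL directly.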

#### 2. Create a New Dashboard

1. Click **Dashboards** in the left sidebar
2. Click **Create Dashboard**
3. Give it a name: "IoT Monitoring Dashboard"
4. Click **Create**

#### 3. Add Visualizations

For each saved query:

**Device Count by Factory** (Bar Chart):
1. Click **Add** → **Visualization**
2. Select your "Device Count by Factory" query
3. Choose **Bar** chart
4. Configure:
   - X-axis: `factory_name`
   - Y-axis: `device_count`
   - Group by: `region` (optional, for colored bars)
5. Click **Add to dashboard**

**Temperature Trends** (Line Chart):
1. Add another visualization
2. Select "Temperature Trends" query
3. Choose **Line** chart
4. Configure:
   - X-axis: `date`
   - Y-axis: `avg_temperature`, `min_temperature`, `max_temperature`
5. Add to dashboard

**Anomaly Count** (Pie Chart):
1. Add visualization
2. Select "Anomaly Count by Factory" query
3. Choose **Pie** chart
4. Configure:
   - Category: `factory_name`
   - Value: `anomaly_count`
5. Add to dashboard

**Defect Rate** (Table):
1. Add visualization
2. Select "Defect Rate by Model" query
3. Choose **Table**
4. Add to dashboard

**KPIs** (Counter Visualizations):
1. Add four separate counter visualizations, one per KPI
2. For each, select the KPIs query
3. Choose **Counter**
4. Select the appropriate column (`total_devices`, `total_readings`, `total_anomalies`, or `avg_temperature`)
5. Add custom labels and formatting

#### 4. Arrange and Style

1. Drag visualizations to arrange them
2. Resize by dragging corners
3. Add text boxes for titles/descriptions
4. Choose a color scheme


## 4. Adding Interactive Features <a id="interactive"></a>

### Dashboard Filters

Make your dashboard interactive by adding filters:

**Example: Add a Factory Filter**

1. Edit your dashboard
2. Click **Add** → **Filter**
3. Select a query that has `factory_name`
4. Choose `factory_name` as the filter field
5. Apply the filter to relevant visualizations

Now users can select a factory and see all metrics filtered!

### Query Parameters

Add parameters to queries for dynamic filtering:

```sql
-- Replace {CATALOG} and {SCHEMA} with your catalog and schema names before running
SELECT 
    f.factory_name,
    AVG(s.temperature) as avg_temp
FROM {CATALOG}.{SCHEMA}.sensor_bronze s
JOIN {CATALOG}.{SCHEMA}.dim_factories f ON s.factory_id = f.factory_id
WHERE f.region = {{region}}  -- Parameter syntax: the value is supplied at run time
GROUP BY f.factory_name
```
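
When running the same query from a notebook rather than the SQL Editor, one option is a named parameter marker (`:region`) bound through the `args` argument of `spark.sql`, which PySpark supports from version 3.4. The sketch below assumes that version and assumes `sensor_bronze` has a `factory_id` column, as the query above implies. The helper name `avg_temp_by_factory` is hypothetical.

```python
def avg_temp_by_factory(spark, catalog: str, schema: str, region: str):
    """Run the region-filtered query, binding region as a named parameter."""
    query = f"""
    SELECT
        f.factory_name,
        AVG(s.temperature) AS avg_temp
    FROM {catalog}.{schema}.sensor_bronze s
    JOIN {catalog}.{schema}.dim_factories f ON s.factory_id = f.factory_id
    WHERE f.region = :region
    GROUP BY f.factory_name
    """
    # The region value is bound by Spark instead of being pasted into the SQL text.
    return spark.sql(query, args={"region": region})
```

In a notebook cell this could be called as `avg_temp_by_factory(spark, CATALOG, SCHEMA, "West").display()`, where `"West"` is a placeholder for one of your actual region values.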

### Dashboard Features:

- **Schedule Refresh**: Keep data up-to-date automatically
- **Share**: Grant access to other users or groups
- **Subscribe**: Receive the dashboard by email each time a scheduled refresh runs
- **Full Screen**: Present dashboards in meetings
- **Export**: Download as PDF or CSV


## 5. Genie Overview <a id="genie"></a>

**Databricks Genie** is an AI-powered analytics tool that lets you query your data using natural language. No SQL required!

### What is Genie?

Genie allows business users to:
- Ask questions in plain English
- Get instant answers with visualizations
- Explore data without knowing SQL
- Share insights with colleagues

### How It Works:

1. You describe your data sources and business terms (create a "Genie Space")
2. Users ask questions in natural language
3. Genie translates to SQL and executes queries
4. Results are presented with appropriate visualizations

### Example Questions You Can Ask:

- "What is the average temperature by factory?"
- "Show me devices with the most anomalies"
- "Which model has the highest defect rate?"
- "How many sensor readings do we have per day?"
- "What's the trend in rotation speed over time?"

### Benefits:

✅ **Democratize data access** - Anyone can query data  
✅ **Reduce analyst workload** - Self-service analytics  
✅ **Faster insights** - Get answers in seconds  
✅ **Iterative exploration** - Ask follow-up questions  
✅ **Automatic visualizations** - Genie suggests a suitable chart type


## 6. Setting Up Genie <a id="genie-setup"></a>

### Prerequisites:

- Unity Catalog-enabled workspace
- SQL warehouse with access to your data
- Appropriate permissions to create Genie Spaces

### Step 1: Create a Genie Space

1. Navigate to **Genie** in the Databricks sidebar
2. Click **Create Genie Space**
3. Give it a name: "IoT Sensor Analytics"
4. Add a description: "Analyze IoT sensor data, anomalies, and device performance"

### Step 2: Add Data Sources

Select the tables that Genie can query:

1. Click **Add tables**
2. Navigate to your schema
3. Select relevant tables:
   - `sensor_bronze`
   - `dim_factories`
   - `dim_models`
   - `dim_devices`
   - `inspection_gold`
   - `anomaly_detected`
4. Click **Add**

### Step 3: Define Business Terms (Semantic Layer)

Help Genie understand your domain by adding business terms:

**Example definitions:**

- **Anomaly**: "A sensor reading that exceeds normal operating thresholds"
- **Defect rate**: "The percentage of inspections that found defects"
- **High temperature**: "Temperature readings above 80 degrees Fahrenheit"
- **Active device**: "A device with status = 'active'"

**To add definitions:**
1. Open your Genie Space settings
2. Click **Add business term**
3. Enter term and definition
4. Save
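
Definitions work best when they are precise enough to translate into a condition. As an illustration only (the steps above give Genie the definitions as text), the two threshold-style terms could be written as Python predicates. The dictionary keys `temperature` and `status` follow the column names used earlier in this notebook, and the function names are hypothetical.

```python
HIGH_TEMPERATURE_THRESHOLD = 80  # from the "High temperature" definition above


def is_high_temperature(reading: dict) -> bool:
    """True when the reading's temperature is strictly above the threshold."""
    return reading["temperature"] > HIGH_TEMPERATURE_THRESHOLD


def is_active_device(device: dict) -> bool:
    """True when the device's status is exactly 'active'."""
    return device["status"] == "active"


print(is_high_temperature({"temperature": 85.2}))
print(is_active_device({"status": "inactive"}))
```

If a term cannot be expressed this unambiguously (for example, "exceeds normal operating thresholds" without stating the thresholds), it is worth tightening the wording before adding it.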

### Step 4: Add Instructions (Optional)

Provide context that helps Genie interpret questions, such as units and the topics the space covers:

```
This Genie Space helps you analyze IoT sensor data from our manufacturing facilities.

Key metrics:
- Temperature: Measured in Fahrenheit
- Rotation speed: RPM (revolutions per minute)
- Air pressure: Measured in pascals

You can ask questions about:
- Device performance by factory or model
- Temperature trends and anomalies
- Defect rates and quality metrics
```

### Step 5: Set Permissions

1. Click **Share**
2. Add users or groups who can use this Genie Space
3. Set permissions (Can view, Can edit, etc.)


## 7. Querying with Natural Language <a id="nl-queries"></a>

### Try These Questions in Your Genie Space:

Once your Genie Space is set up, try asking these questions:

#### Basic Questions:

1. **"How many devices do we have?"**
   - Genie will count distinct devices

2. **"What is the average temperature?"**
   - Genie will aggregate sensor readings

3. **"Show me all factories"**
   - Genie will query the factories dimension table

#### Intermediate Questions:

4. **"What is the average temperature by factory?"**
   - Genie will join sensor readings with factories and group by factory

5. **"Which device has the most anomalies?"**
   - Genie will query `anomaly_detected` and rank devices

6. **"Show me defect rate by model category"**
   - Genie will calculate defect percentages grouped by category

#### Advanced Questions:

7. **"Show me the trend in average temperature over time"**
   - Genie will create a line chart of temperature trends

8. **"Which factories have the highest defect rates and show me their locations?"**
   - Genie will join multiple tables and sort results

9. **"Compare rotation speed between SkyJet and EcoJet models"**
   - Genie will filter and compare model families

10. **"What percentage of readings are anomalies?"**
    - Genie will calculate the ratio

### Tips for Better Results:

✅ **Be specific** - Include table or column names if Genie struggles  
✅ **Use business terms** - Reference the terms you defined  
✅ **Ask follow-ups** - Refine your question based on results  
✅ **Request visualizations** - "Show me as a bar chart"  
✅ **Provide context** - "In the last 30 days..." or "For factory A06..."  

### Example Genie Conversation:

**User:** "What is the average temperature by factory?"

**Genie:** [Generates query, shows bar chart with avg temp per factory]

**User:** "Now show only factories with average temperature above 70"

**Genie:** [Filters previous results]

**User:** "Add the number of devices per factory to that"

**Genie:** [Adds device count column]
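
Each follow-up refines the previous result rather than starting over. Here is a rough pure-Python sketch of what the three turns compute, using made-up per-factory values. The SQL that Genie actually generates is not shown here and may differ.

```python
# Illustrative data: factory -> temperature readings, and factory -> device count.
temps = {"A06": [72.0, 76.0], "B11": [65.0, 67.0]}
device_counts = {"A06": 12, "B11": 8}

# Turn 1: average temperature by factory.
avg_by_factory = {name: sum(v) / len(v) for name, v in temps.items()}

# Turn 2: keep only factories with an average temperature above 70.
hot = {name: avg for name, avg in avg_by_factory.items() if avg > 70}

# Turn 3: add the device count for each remaining factory.
result = [(name, avg, device_counts[name]) for name, avg in hot.items()]

print(result)
```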


## Practice Exercises

### Exercise 1: Create a Custom Dashboard

Build a dashboard that answers these business questions:
1. What is our overall equipment effectiveness?
2. Which factories need attention?
3. Are there seasonal patterns in defects?
4. Which models are most reliable?

### Exercise 2: Use Genie for Ad-hoc Analysis

Open your Genie Space and answer:
1. What is the correlation between temperature and defects?
2. Which time of day has the most anomalies?
3. How does performance vary by device age?
4. What is the failure rate for each model family?

### Exercise 3: Create a Real-time Monitoring Dashboard

1. Create queries that show:
   - Current device status (up/down)
   - Recent anomalies (last hour)
   - Live sensor readings
2. Add filters for factory and model
3. Schedule automatic refresh every 5 minutes
4. Share with operations team


## Summary

In this notebook, you learned:

✅ **Create Dashboards** - Build interactive visualizations from SQL queries  
✅ **Visualization types** - Bar, line, pie, counter, table, and more  
✅ **Interactive features** - Filters, parameters, scheduled refreshes  
✅ **Genie Spaces** - Set up AI-powered natural language querying  
✅ **Natural language queries** - Ask questions without writing SQL  

### Key Takeaways:

1. **Dashboards** provide visual monitoring and reporting
2. **SQL queries** power dashboard visualizations
3. **Filters and parameters** make dashboards interactive
4. **Genie** democratizes data access for non-technical users
5. **Business terms** help Genie understand your domain

### Best Practices:

**For Dashboards:**
- Start with key questions you need to answer
- Use appropriate visualization types for each metric
- Add filters for user interactivity
- Schedule refreshes to keep data current
- Document what each visualization shows

**For Genie:**
- Carefully select relevant tables
- Define clear business terms
- Provide usage instructions
- Test with common questions
- Iterate based on user feedback

### Next Steps:

- Build a production dashboard for your team
- Create a Genie Space for business users
- Explore **AutoML** to add predictive insights
- Combine dashboards with **Alerts** for proactive monitoring

---

**Additional Resources:**
- [Dashboards Documentation](https://docs.databricks.com/aws/en/dashboards/)
- [Genie Documentation](https://docs.databricks.com/aws/en/genie/)
- [Dashboard Tutorials](https://docs.databricks.com/aws/en/dashboards/tutorials/)
