# Databricks Genie API - SDK Methods

This notebook demonstrates how to call the Databricks Genie API through the official SDK methods described in the [Databricks SDK documentation](https://databricks-sdk-py.readthedocs.io/en/stable/workspace/dashboards/genie.html).

## Key Advantages of SDK Methods:
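- ✅ **Built-in waiting**: The `*_and_wait()` methods poll until the message finishes, so no manual polling is required
- ✅ **Type safety**: Responses are typed Python objects instead of raw JSON
- ✅ **Error handling**: Failed requests raise Python exceptions instead of returning status codes to inspect
- ✅ **Simpler code**: One call replaces the start, poll, and fetch sequence
- ✅ **Official support**: These are the documented, recommended methods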


## Setup and Configuration

### Prerequisites
- Access to an Azure Databricks workspace with the Databricks SQL entitlement
- CAN USE privileges on a SQL pro or serverless SQL warehouse
- A well-curated Genie space
- Authentication configured in `~/.databrickscfg`

### Configuration
- **Workspace URL**: `https://adb-984752964297111.11.azuredatabricks.net`
- **Genie Space ID**: `01f06a3068a81406a386e8eaefc74545`
- **Profile**: `DEFAULT_azure`
- **Environment**: `azure_databricks` conda environment
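
Before initializing the client, it can help to confirm that the profile named above actually exists in the config file. The following is a minimal sketch, assuming `~/.databrickscfg` uses its usual INI layout; the helper name `profile_exists` is hypothetical and not part of the SDK.

```python
import configparser
from pathlib import Path


def profile_exists(profile: str, cfg_path: Path = Path.home() / ".databrickscfg") -> bool:
    """Return True if `profile` is a named section in the config file.

    Hypothetical helper, not part of the Databricks SDK.
    """
    # Disable interpolation so '%' characters in tokens are never interpreted
    parser = configparser.ConfigParser(interpolation=None)
    # ConfigParser.read silently skips files that do not exist
    parser.read(cfg_path)
    # Note: the special [DEFAULT] section is not reported by has_section
    return parser.has_section(profile)
```

Calling `profile_exists("DEFAULT_azure")` before creating `WorkspaceClient()` gives a clearer failure message than an authentication error later in the notebook.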


In [None]:
# Import required libraries
import os
from databricks.sdk import WorkspaceClient

# Set the correct profile
os.environ['DATABRICKS_CONFIG_PROFILE'] = 'DEFAULT_azure'

# Initialize the Databricks client
client = WorkspaceClient()

print("✅ Databricks client initialized successfully")
print(f"Host: {client.config.host}")
print(f"Profile: {client.config.profile}")
print(f"Auth Type: {client.config.auth_type}")

# Initialize Genie API
genie_api = client.genie
print("✅ Genie API initialized successfully")


## Method 1: Simple Query with Built-in Waiting

The easiest way to use the Genie API is `start_conversation_and_wait()`, which starts a conversation and polls until the message completes.


In [None]:
# Configuration
space_id = "01f06a3068a81406a386e8eaefc74545"
test_question = "What is the distribution of total charges for claims?"

print("🔮 Querying Genie with SDK method...")
print(f"Question: {test_question}")

try:
    # This method handles all the polling automatically!
    message = genie_api.start_conversation_and_wait(space_id, test_question)
    
    print("✅ Message completed successfully!")
    print(f"Status: {message.status}")
    print(f"Conversation ID: {message.conversation_id}")
    print(f"Message ID: {message.id}")
    
    # Check if there are attachments (query results)
    if message.attachments:
        print(f"\n📎 Found {len(message.attachments)} attachments:")
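        for i, attachment in enumerate(message.attachments):
            # Attachments may not expose a `type` field in every SDK version,
            # so infer the kind from which payload is populated
            kind = "query" if getattr(attachment, "query", None) else "text"
            print(f"Attachment {i+1}:")
            print(f"  - Type: {kind}")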
            print(f"  - Attachment ID: {attachment.attachment_id}")
            if hasattr(attachment, 'query') and attachment.query:
                print(f"  - SQL Query: {attachment.query.query}")
                print(f"  - Description: {attachment.query.description}")
    else:
        print("📎 No attachments found")
        
except Exception as e:
    print(f"❌ Error: {e}")
    import traceback
    traceback.print_exc()
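
The attachment inspection above can also be packaged as a reusable helper. This is a sketch only: `summarize_attachments` is a hypothetical name, the `text.content` field is an assumption about the SDK's text attachment type, and the demo object is a stand-in rather than a real SDK message.

```python
from types import SimpleNamespace


def summarize_attachments(message) -> list:
    """Flatten a Genie message's attachments into plain dicts.

    Hypothetical helper; fields are read defensively with getattr because
    the attribute set may differ between SDK versions.
    """
    summaries = []
    for attachment in getattr(message, "attachments", None) or []:
        query = getattr(attachment, "query", None)
        text = getattr(attachment, "text", None)
        summaries.append({
            "attachment_id": getattr(attachment, "attachment_id", None),
            "kind": "query" if query else "text" if text else "unknown",
            "sql": getattr(query, "query", None),
            "description": getattr(query, "description", None),
            "text": getattr(text, "content", None),  # assumed field name
        })
    return summaries


# Stand-in shaped like the message used above (not a real SDK type)
fake_message = SimpleNamespace(attachments=[
    SimpleNamespace(
        attachment_id="a1",
        query=SimpleNamespace(query="SELECT 1", description="demo"),
        text=None,
    )
])
print(summarize_attachments(fake_message))
```

With a real response, pass the `message` returned by `start_conversation_and_wait()` instead of `fake_message`.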


## Method 2: Get Query Results

Once you have a message with attachments, you can fetch the actual query results with `get_message_attachment_query_result()`.


In [None]:
# Get query results if we have attachments
if 'message' in locals() and message.attachments:
    attachment = message.attachments[0]  # Get first attachment
    attachment_id = attachment.attachment_id
    
    print(f"🔍 Getting query results for attachment: {attachment_id}")
    
    try:
        # Use the SDK method to get query results
        query_results = genie_api.get_message_attachment_query_result(
            space_id, 
            message.conversation_id, 
            message.id, 
            attachment_id
        )
        
        print("✅ Query results retrieved successfully!")
        
        # Extract and display the data
        if hasattr(query_results, 'statement_response') and query_results.statement_response:
            result = query_results.statement_response.result
            if hasattr(result, 'data_array') and result.data_array:
                print(f"\n📊 Data rows: {len(result.data_array)}")
                for i, row in enumerate(result.data_array):
                    print(f"Row {i+1}: {row}")
            else:
                print("📊 No data array found in results")
        else:
            print("📊 No statement response found")
            
    except Exception as e:
        print(f"❌ Error getting query results: {e}")
        import traceback
        traceback.print_exc()
else:
    print("❌ No attachments available to get query results")
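
Rows in `data_array` are positional lists, so pairing them with column names makes them easier to work with. The sketch below assumes the column names are available under `statement_response.manifest.schema.columns`, as in the SQL Statement Execution response types; `rows_as_dicts` is a hypothetical helper, and the demo object is a stand-in for a real response.

```python
from types import SimpleNamespace


def rows_as_dicts(query_results) -> list:
    """Pair each row of data_array with its column names.

    Hypothetical helper. Falls back to col_0, col_1, ... when no schema
    is present in the response.
    """
    statement = getattr(query_results, "statement_response", None)
    result = getattr(statement, "result", None)
    rows = getattr(result, "data_array", None) or []
    manifest = getattr(statement, "manifest", None)
    schema = getattr(manifest, "schema", None)
    columns = getattr(schema, "columns", None) or []

    names = [col.name for col in columns]
    if not names and rows:
        names = [f"col_{i}" for i in range(len(rows[0]))]
    return [dict(zip(names, row)) for row in rows]


# Stand-in shaped like the query_results object above (not a real SDK type)
fake_results = SimpleNamespace(statement_response=SimpleNamespace(
    manifest=SimpleNamespace(schema=SimpleNamespace(columns=[
        SimpleNamespace(name="provider"), SimpleNamespace(name="avg_charge"),
    ])),
    result=SimpleNamespace(data_array=[["A", "10.5"], ["B", "7"]]),
))
print(rows_as_dicts(fake_results))
```

Values are kept exactly as returned; they often arrive as strings, so convert numeric columns explicitly before doing arithmetic.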


## Method 3: Complete Workflow Function

Here's a complete function that uses the proper SDK methods for the entire Genie API workflow.


In [None]:
from datetime import timedelta  # needed for the timeout argument below

def query_genie_sdk(question, space_id, timeout_minutes=2):
    """
    Complete Genie API workflow using proper SDK methods
    
    Args:
        question (str): The question to ask Genie
        space_id (str): The Genie space ID
        timeout_minutes (int): Timeout in minutes (default: 2)
    
    Returns:
        dict: Complete response with question, message, and results
    """
    
    print(f"🔮 Querying Genie: {question}")
    
    try:
        # Step 1: Start conversation and wait for completion
        message = genie_api.start_conversation_and_wait(
            space_id, 
            question,
            timeout=timedelta(minutes=timeout_minutes)
        )
        
        print(f"✅ Message completed: {message.status}")
        
        # Step 2: Get query results if attachments exist
        results = None
        if message.attachments:
            attachment = message.attachments[0]
            attachment_id = attachment.attachment_id
            
            print("🔍 Getting query results...")
            query_results = genie_api.get_message_attachment_query_result(
                space_id,
                message.conversation_id,
                message.id,
                attachment_id
            )
            results = query_results
        
        return {
            'question': question,
            'status': message.status,
            'conversation_id': message.conversation_id,
            'message_id': message.id,
            'attachments': message.attachments,
            'query_results': results
        }
        
    except Exception as e:
        print(f"❌ Genie API error: {e}")
        return None

# Test the complete function
from datetime import timedelta

test_questions = [
    "What is the distribution of total charges for claims?",
    "Show me the top 5 claims by amount",
    "What is the average claim amount by provider?"
]

for question in test_questions:
    print(f"\n{'='*60}")
    result = query_genie_sdk(question, space_id)
    if result:
        print(f"✅ Successfully processed: {question}")
        print(f"Status: {result['status']}")
        if result['query_results']:
            print("📊 Query results available")
        else:
            print("📊 No query results")
    else:
        print(f"❌ Failed to process: {question}")
    print(f"{'='*60}")
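
The workflow function above returns `None` for every failure, which hides whether Genie timed out or the request itself failed. One way to keep the two apart is sketched below. It assumes the SDK's wait helper raises Python's built-in `TimeoutError` when the timeout elapses; `ask_with_timeout` is a hypothetical name, and the Genie client is passed in as a parameter so the function can be exercised without a workspace.

```python
from datetime import timedelta


def ask_with_timeout(genie_api, space_id: str, question: str, timeout_minutes: float = 2):
    """Return (message, error_kind); error_kind is None on success.

    Hypothetical wrapper around start_conversation_and_wait().
    """
    try:
        message = genie_api.start_conversation_and_wait(
            space_id, question, timeout=timedelta(minutes=timeout_minutes)
        )
        return message, None
    except TimeoutError:
        # Assumed: raised when the wait exceeds the timeout
        return None, "timeout"
    except Exception as exc:
        # Any other failure, such as a permission or API error
        return None, f"error: {exc}"
```

A caller can then retry with a longer timeout when the error kind is `"timeout"` and report other errors immediately.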


## Available SDK Methods

Based on the [Databricks SDK documentation](https://databricks-sdk-py.readthedocs.io/en/stable/workspace/dashboards/genie.html), here are the key methods available:

### Core Methods
- `start_conversation_and_wait()` - Start conversation and wait for completion
- `get_message_attachment_query_result()` - Get query results from attachments
- `list_spaces()` - List all Genie spaces
- `get_space()` - Get details of a specific space

### Conversation Management
- `list_conversations()` - List conversations in a space
- `list_conversation_messages()` - List messages in a conversation
- `delete_conversation()` - Delete a conversation
- `delete_conversation_message()` - Delete a specific message

### Advanced Methods
- `create_message()` - Create new message in existing conversation
- `execute_message_attachment_query()` - Re-execute expired queries
- `send_message_feedback()` - Send feedback on messages
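
For follow-up questions in an existing conversation, the SDK also provides `create_message_and_wait()`, the waiting counterpart of `create_message()`. The sketch below wraps it in a hypothetical helper, `ask_follow_up`, which takes the Genie client as a parameter.

```python
from datetime import timedelta


def ask_follow_up(genie_api, space_id, conversation_id, question, timeout_minutes=2):
    """Send a follow-up question in an existing conversation and wait for it.

    Hypothetical helper around create_message_and_wait().
    """
    return genie_api.create_message_and_wait(
        space_id,
        conversation_id,
        question,
        timeout=timedelta(minutes=timeout_minutes),
    )
```

With the objects from the earlier cells, a call might look like `ask_follow_up(genie_api, space_id, message.conversation_id, "Break that down by provider")`, which keeps the context of the first question.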


## Comparison: SDK Methods vs Manual API Calls

| Aspect | SDK Methods | Manual API Calls |
|--------|-------------|------------------|
| **Code Complexity** | ✅ One call per step | ❌ Hand-written HTTP requests and JSON parsing |
| **Polling** | ✅ Built-in waiting | ❌ Manual polling required |
| **Error Handling** | ✅ Exceptions raised by the SDK | ❌ Manual status checks |
| **Type Safety** | ✅ Typed response objects | ❌ Raw JSON |
| **Maintenance** | ✅ Official support | ❌ Custom implementation |
| **Documentation** | ✅ Well documented | ❌ Limited examples |
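
To make the "Polling" row concrete, this is roughly the loop that manual API calls require. It is a generic sketch, not the SDK's implementation: `fetch_status` is a hypothetical callable that returns the current message status as a string, and the set of terminal status names is an assumption.

```python
import time

# Assumed terminal status names; check the API reference for the full list
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED"}


def poll_until_done(fetch_status, timeout_s=120.0, interval_s=2.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a terminal status or time runs out."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be tested without real delays. `start_conversation_and_wait()` removes the need to write and maintain a loop like this.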

## Key Advantages of SDK Methods:

1. **`start_conversation_and_wait()`** - Handles all polling automatically
2. **Type safety** - Typed return objects instead of raw JSON
3. **Error handling** - Failures surface as Python exceptions
4. **Official support** - These are the recommended methods
5. **Easier upgrades** - API changes are picked up by upgrading the SDK rather than rewriting request code


## Conclusion

The **SDK methods are the recommended approach** for using the Databricks Genie API. They provide:

- ✅ **Simpler code** - One call replaces the start, poll, and fetch sequence
- ✅ **Built-in functionality** - No manual polling or status checking
- ✅ **Better error handling** - Failures surface as Python exceptions
- ✅ **Type safety** - Typed return objects instead of raw JSON
- ✅ **Official support** - These are the documented methods
- ✅ **Easier upgrades** - API changes are picked up by upgrading the SDK

### When to Use Each Approach:

- **Use SDK Methods** - For all new implementations and production code
- **Use Manual API Calls** - Only for learning purposes or when you need control that the SDK does not expose

---

**Reference**: [Databricks SDK Documentation - Genie](https://databricks-sdk-py.readthedocs.io/en/stable/workspace/dashboards/genie.html)  
**Last Updated**: September 28, 2025
