# Lab 4: Direct Lake Fallback Behaviour

## Overview

This lab explores **Direct Lake fallback mechanisms**: when, why, and how Direct Lake automatically switches to SQL Endpoint mode (DirectQuery over the lakehouse's SQL analytics endpoint) for system protection and query reliability.

## What You'll Learn

- **Fallback Triggers**: Understanding conditions that cause fallback to SQL Endpoint
- **Execution Modes**: Exploring Automatic, DirectLakeOnly, and DirectQueryOnly behaviors  
- **Performance Comparison**: Analyzing execution time and resource usage differences
- **Troubleshooting Techniques**: Identifying and resolving fallback scenarios
- **Production Strategies**: Configuring appropriate fallback behavior for enterprise use

## Prerequisites

- **Lab 2 Completion**: BigData lakehouse with billion-row tables and semantic model
- **Direct Lake Understanding**: Basic concepts from Labs 1-3
- **Performance Monitoring**: Familiarity with tracing and DMV analysis

## Architecture Overview

**Direct Lake Fallback Protection Flow:**
```
User Query → Direct Lake Attempt → Guardrail Check → Mode Decision
     ↓              ↓                   ↓              ↓
DAX Request → Memory Available? → Within Limits? → Execute/Fallback
     ↓              ↓                   ↓              ↓
Processing  → Column Loading    → Success/Fail  → Direct Lake/SQL
     ↓              ↓                   ↓              ↓
Results     → Performance       → Protection    → Reliable Results
```
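
The flow above can be sketched as a small decision function. This is a deliberately simplified model, not the engine's actual logic: the names `Behavior`, `GuardrailExceeded`, and `choose_execution_path` are hypothetical, and the real engine evaluates more conditions than the single `within_guardrails` flag assumed here.

```python
from enum import Enum


class Behavior(Enum):
    """The three fallback settings explored in this lab."""
    AUTOMATIC = "Automatic"
    DIRECT_LAKE_ONLY = "DirectLakeOnly"
    DIRECT_QUERY_ONLY = "DirectQueryOnly"


class GuardrailExceeded(Exception):
    """Hypothetical error: fallback is disabled and Direct Lake cannot serve the query."""


def choose_execution_path(behavior: Behavior, within_guardrails: bool) -> str:
    """Return the path a query takes under this simplified model."""
    if behavior is Behavior.DIRECT_QUERY_ONLY:
        # Direct Lake is never attempted
        return "SQL Endpoint"
    if within_guardrails:
        # Columns can be loaded into memory, so Direct Lake serves the query
        return "Direct Lake"
    if behavior is Behavior.AUTOMATIC:
        # Limits exceeded, but fallback is allowed
        return "SQL Endpoint"
    # DirectLakeOnly with limits exceeded: the query fails instead of falling back
    raise GuardrailExceeded("Query exceeds Direct Lake limits and fallback is disabled")


print(choose_execution_path(Behavior.AUTOMATIC, within_guardrails=False))
```

The three branches correspond to the outcomes this lab demonstrates: a fallback in Automatic mode, an error in DirectLakeOnly mode, and a successful run after Automatic mode is restored.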

## Lab Workflow

1. **Environment Setup**: Configure BigData lakehouse and semantic model access
2. **Fallback Analysis**: Explore different fallback trigger scenarios
3. **Mode Configuration**: Test Automatic, DirectLakeOnly, and DirectQueryOnly modes
4. **Performance Comparison**: Measure execution differences across modes
5. **Troubleshooting**: Identify and resolve common fallback issues
6. **Production Configuration**: Implement optimal fallback strategies

## Expected Outcomes

By completing this lab, you will master Direct Lake's protection mechanisms:

- ✅ **Fallback Understanding**: Complete knowledge of when and why fallback occurs
- ✅ **Mode Configuration**: Expert ability to configure optimal fallback behavior
- ✅ **Performance Analysis**: Skills to measure and optimize fallback scenarios
- ✅ **Troubleshooting Mastery**: Ability to diagnose and resolve fallback issues
- ✅ **Production Readiness**: Confidence in deploying Direct Lake with appropriate safeguards

### Prerequisites and Lab Dependencies

This lab builds directly on **Lab 2's infrastructure** to demonstrate fallback behavior with real billion-row scenarios:

#### Required Artifacts from Lab 2:
- **BigData lakehouse**: With OneLake shortcuts to billion-row tables
- **BigData_model semantic model**: Configured with relationships and measures
- **Billion-row tables**: fact_myevents_1bln and fact_myevents_2bln for stress testing
- **Performance monitoring setup**: Tracing and DMV capabilities established

#### Why Lab 2 is Essential:
- **Realistic scale**: Billion-row tables naturally trigger fallback scenarios
- **Performance stress**: Large datasets push Direct Lake to its limits
- **Comparative analysis**: Compare fallback vs. Direct Lake performance
- **Production relevance**: Real-world scale scenarios for enterprise learning

#### Fallback Learning Strategy:
1. **Baseline establishment**: Document normal Direct Lake behavior
2. **Fallback triggering**: Intentionally exceed limits to observe fallback
3. **Mode configuration**: Explore different fallback behavior settings
4. **Performance analysis**: Compare execution paths and timing
5. **Troubleshooting**: Identify and resolve fallback causes

## 1. Install Semantic Link Labs Python Library

Installs required libraries for fallback behavior analysis and semantic model configuration.

In [None]:
%pip install -q --upgrade pip
%pip install -q semantic-link-labs azure-core==1.31.0 PyJWT==2.6.0

## 2. Configure Environment for Fallback Behavior Testing

Establishes a connection to the BigData lakehouse and prepares the semantic model for fallback analysis. The environment details captured here provide:

- **Troubleshooting context**: Environment details for debugging fallback issues
- **Performance baseline**: Established context for before/after comparisons

**Expected outcome**: Validated environment with access to Lab 2's billion-row infrastructure, ready for systematic fallback behavior testing and analysis.

In [None]:
import sempy_labs as labs
from sempy import fabric
import sempy

LakehouseName = "BigData"
# Look up the lakehouses once; prefer any whose name starts with "Big"
lakehouses = labs.list_lakehouses()["Lakehouse Name"]
for name in lakehouses:
    if name.startswith("Big"):
        LakehouseName = name

SemanticModelName = f"{LakehouseName}_model"

if LakehouseName not in lakehouses.values:
    # Stop here: the lookups below would fail without the Lab 2 lakehouse
    raise RuntimeError("You need to complete Lab 2 to create the required lakehouse for this lab")

# Fetch the lakehouse properties once and reuse them for both IDs
lakehouseProps = notebookutils.lakehouse.getWithProperties(LakehouseName)
lakehouseId = lakehouseProps["id"]
workspaceId = lakehouseProps["workspaceId"]
workspaceName = sempy.fabric.resolve_workspace_name(workspaceId)
print(f"WorkspaceId = {workspaceId}, LakehouseID = {lakehouseId}, Workspace Name = {workspaceName}")



## 3. Enhanced Trace Function for Fallback Analysis

Defines comprehensive tracing function to monitor Direct Lake execution and fallback behavior patterns.

In [None]:
import warnings
import time
from Microsoft.AnalysisServices.Tabular import TraceEventArgs
from typing import Dict, List, Optional, Callable
import pandas

def runDMV():
    df = sempy.fabric.evaluate_dax(
        dataset=SemanticModelName, 
        dax_string="""
        
        SELECT 
            MEASURE_GROUP_NAME AS [TABLE],
            ATTRIBUTE_NAME AS [COLUMN],
            DATATYPE ,
            DICTIONARY_SIZE 		    AS SIZE ,
            DICTIONARY_ISPAGEABLE 		AS PAGEABLE ,
            DICTIONARY_ISRESIDENT		AS RESIDENT ,
            DICTIONARY_TEMPERATURE		AS TEMPERATURE,
            DICTIONARY_LAST_ACCESSED	AS LASTACCESSED 
        FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMNS 
        ORDER BY 
            [DICTIONARY_TEMPERATURE] DESC
        
        """)
    display(df)

def filter_func(e):
    # Drop noisy internal storage-engine scan events; keep everything else
    return e.EventSubclass.ToString() != "VertiPaqScanInternal"

def runQueryWithTrace(
    expr: str,
    workspaceName: str,
    SemanticModelName: str,
    Result: Optional[bool] = True,
    Trace: Optional[bool] = True,
    DMV: Optional[bool] = True,
    ClearCache: Optional[bool] = True,
) -> pandas.DataFrame:
    # Define events to trace and their corresponding columns
    event_schema = fabric.Trace.get_default_query_trace_schema()
    event_schema.update({"ExecutionMetrics": ["EventClass", "TextData"]})
    # Remove events that are not needed for this analysis
    del event_schema['VertiPaqSEQueryBegin']
    del event_schema['VertiPaqSEQueryCacheMatch']
    del event_schema['DirectQueryBegin']

    # Suppress library warnings to keep the notebook output readable
    warnings.filterwarnings("ignore")

    # Local alias used when opening the trace connection
    WorkspaceName = workspaceName

    if ClearCache:
        labs.clear_cache(SemanticModelName)

    with fabric.create_trace_connection(SemanticModelName,WorkspaceName) as trace_connection:
        # create trace on server with specified events
        with trace_connection.create_trace(
            event_schema=event_schema, 
            name="Simple Query Trace",
            filter_predicate=filter_func,
            stop_event="QueryEnd"
            ) as trace:

            trace.start()

            df=sempy.fabric.evaluate_dax(
                dataset=SemanticModelName, 
                dax_string=expr)

            if Result:
                displayHTML(f"<H2>####### DAX QUERY RESULT #######</H2>")
                display(df)

            # Wait 5 seconds for trace data to arrive
            time.sleep(5)

            # stop Trace and collect logs
            final_trace_logs = trace.stop()

    if Trace:
        displayHTML(f"<H2>####### SERVER TIMINGS #######</H2>")
        display(final_trace_logs)
    
    if DMV:
        displayHTML(f"<H2>####### SHOW DMV RESULTS #######</H2>")
        runDMV()

    return final_trace_logs


In [None]:
runDMV()
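
The DMV output is easier to compare across runs when it is reduced to a count of resident columns per table. The sketch below assumes the rows carry the `TABLE` and `RESIDENT` aliases used in the DMV query above; `resident_columns_by_table` and the sample rows are hypothetical, and `runDMV` would need to return its DataFrame (rather than only display it) to feed real data in.

```python
from typing import Dict, Iterable, Mapping


def resident_columns_by_table(rows: Iterable[Mapping]) -> Dict[str, int]:
    """Count columns whose dictionary is resident in memory, per table."""
    counts: Dict[str, int] = {}
    for row in rows:
        table = row["TABLE"]
        # Register every table so fully paged-out tables show up with 0
        counts.setdefault(table, 0)
        if bool(row.get("RESIDENT")):
            counts[table] += 1
    return counts


# Illustrative rows only; not real DMV output
sample_rows = [
    {"TABLE": "dim_Date", "COLUMN": "FirstDateofMonth", "RESIDENT": True},
    {"TABLE": "dim_Date", "COLUMN": "DateKey", "RESIDENT": False},
    {"TABLE": "fact_myevents_1bln", "COLUMN": "DateKey", "RESIDENT": False},
]

print(resident_columns_by_table(sample_rows))
```

With a pandas DataFrame `df`, the rows could be passed as `df.to_dict("records")`.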

## 4. Demonstrate Natural Fallback Scenario (Automatic Mode)

Executes billion-row query in Automatic mode to demonstrate intelligent fallback to SQL Endpoint.

In [None]:
trace1 = runQueryWithTrace(
    """
    EVALUATE
        SUMMARIZECOLUMNS(
                dim_Date[FirstDateofMonth] ,
                "Count of Transactions" , COUNTROWS(fact_myevents_1bln) ,
                "Sum of Sales (1bln)" , [Sum of Sales (1bln)] ,
                "Sum of Sales (2bln)" , [Sum of Sales (2bln)]
        )
        ORDER BY [FirstDateofMonth]
    """ , workspaceName , SemanticModelName
)
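
One way to check whether the query above fell back is to look for DirectQuery events in the returned trace. The helper below is a sketch: `used_sql_fallback` is a hypothetical name, and the `Event Class` column name is an assumption about how the trace DataFrame labels its columns, so it is exposed as a parameter.

```python
from typing import Iterable, Mapping


def used_sql_fallback(trace_rows: Iterable[Mapping], event_column: str = "Event Class") -> bool:
    """Return True if any trace row is a DirectQuery event (assumed fallback signal)."""
    return any(
        str(row.get(event_column, "")).startswith("DirectQuery")
        for row in trace_rows
    )


# Illustrative rows only; not real trace output
direct_lake_run = [
    {"Event Class": "QueryBegin"},
    {"Event Class": "VertiPaqSEQueryEnd"},
    {"Event Class": "QueryEnd"},
]
fallback_run = [
    {"Event Class": "QueryBegin"},
    {"Event Class": "DirectQueryEnd"},
    {"Event Class": "QueryEnd"},
]

print(used_sql_fallback(direct_lake_run), used_sql_fallback(fallback_run))
```

With the lab's trace output this could be called as `used_sql_fallback(trace1.to_dict("records"))`, provided the column name matches.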

## 5. Configure DirectLakeOnly Mode for Strict Testing

Configures DirectLakeOnly mode to test Direct Lake limits without SQL Endpoint fallback protection.

In [None]:
tom = labs.tom.TOMWrapper(dataset=SemanticModelName, workspace=workspaceName, readonly=False)
tom.set_direct_lake_behavior("DirectLakeOnly") ##  Can be set to any of ['Automatic', 'DirectLakeOnly', 'DirectQueryOnly'].
tom.model.SaveChanges()
print("Model changed")
fabric.refresh_dataset(refresh_type="calculate",dataset=SemanticModelName)
print("Model recalculated")
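
Because the behavior is passed as a plain string, a typo only surfaces when the model is changed. A small guard can catch it earlier. This is a sketch: `normalize_behavior` is a hypothetical helper, and the accepted values are taken from the comment in the cell above.

```python
VALID_BEHAVIORS = ("Automatic", "DirectLakeOnly", "DirectQueryOnly")


def normalize_behavior(value: str) -> str:
    """Return the canonical spelling of a behavior name, or raise ValueError."""
    lookup = {name.lower(): name for name in VALID_BEHAVIORS}
    key = value.strip().lower()
    if key not in lookup:
        raise ValueError(
            f"Unknown Direct Lake behavior {value!r}; expected one of {VALID_BEHAVIORS}"
        )
    return lookup[key]


print(normalize_behavior(" directlakeonly "))
```

The returned value can then be passed to `tom.set_direct_lake_behavior(...)` as in the cell above.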

## 6. Test Query Failure in DirectLakeOnly Mode

Attempts same billion-row query in DirectLakeOnly mode to demonstrate memory limit protection.

In [None]:
# Alias the exception that sempy raises for engine-side query errors
FabricAdomdException = sempy.fabric._client._adomd_connection.FabricAdomdException

try:
    runQueryWithTrace(
        """
        EVALUATE
            SUMMARIZECOLUMNS(
                    dim_Date[FirstDateofMonth] ,
                    "Count of Transactions" , COUNTROWS(fact_myevents_1bln) ,
                    "Sum of Sales (1bln)" , [Sum of Sales (1bln)] ,
                    "Sum of Sales (2bln)" , [Sum of Sales (2bln)]
            )
            ORDER BY [FirstDateofMonth]
        """ , workspaceName , SemanticModelName
    )
except FabricAdomdException as adomd_error:
    # Expected in DirectLakeOnly mode: the query fails instead of falling back
    print(adomd_error)
except Exception as e:
    # Any other failure (connection, trace setup, etc.)
    print(e)
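
To compare the failing run with the successful ones, it can help to record each attempt in a uniform shape instead of only printing the error. The wrapper below is a sketch using only the standard library; `attempt` is a hypothetical helper, and the stand-in callables replace the real `runQueryWithTrace` call.

```python
import time
from typing import Any, Callable, Dict


def attempt(run_query: Callable[[], Any]) -> Dict[str, Any]:
    """Run a query callable and record success, error text, and elapsed seconds."""
    started = time.perf_counter()
    try:
        result = run_query()
        return {"ok": True, "result": result, "error": None,
                "seconds": time.perf_counter() - started}
    except Exception as exc:
        # Capture the failure instead of raising so runs can be compared later
        return {"ok": False, "result": None, "error": str(exc),
                "seconds": time.perf_counter() - started}


def failing_query():
    # Stand-in for a query rejected in DirectLakeOnly mode
    raise RuntimeError("simulated failure")


success = attempt(lambda: "rows")
failure = attempt(failing_query)
print(success["ok"], failure["ok"], failure["error"])
```

In the notebook, the callable could be a `lambda` that wraps the `runQueryWithTrace(...)` call from the cell above.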

## 7. Restore Automatic Mode for Production-Ready Behavior

Restores Automatic mode configuration for optimal production deployment with intelligent fallback.

In [None]:
tom = labs.tom.TOMWrapper(dataset=SemanticModelName, workspace=workspaceName, readonly=False)
tom.set_direct_lake_behavior("Automatic") ##  ['Automatic', 'DirectLakeOnly', 'DirectQueryOnly'].
tom.model.SaveChanges()
print("Model changed")
fabric.refresh_dataset(refresh_type="calculate",dataset=SemanticModelName)
print("Model recalculated")

## 8. Comparative Analysis: Automatic vs. DirectLakeOnly Performance

Analyzes performance differences and operational benefits between fallback modes for production optimization.

| Run | Mode | Expected Result | Purpose |
|-----|------|-----------------|---------|
| **Third run** | Automatic | Consistent with first run | Behavior stability |

#### Key Performance Metrics:
- **Execution time**: Consistency across automatic mode executions
- **Resource usage**: Memory allocation patterns
- **Result accuracy**: Identical analytical outputs regardless of execution mode
- **System stability**: Reliable performance after mode switching
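
The execution-time metric can be made concrete by measuring how far repeated Automatic-mode runs deviate from each other. The sketch below is one possible measure, not a method prescribed by this lab: `relative_spread`, `is_consistent`, and the 10% tolerance are hypothetical, and the durations are illustrative rather than measured.

```python
from statistics import mean
from typing import Iterable


def relative_spread(durations_ms: Iterable[float]) -> float:
    """Return (max - min) / mean for a set of run durations."""
    values = list(durations_ms)
    if not values:
        raise ValueError("at least one duration is required")
    average = mean(values)
    if average == 0:
        return 0.0
    return (max(values) - min(values)) / average


def is_consistent(durations_ms: Iterable[float], tolerance: float = 0.10) -> bool:
    """Treat runs as consistent when their spread stays within the tolerance."""
    return relative_spread(durations_ms) <= tolerance


# Illustrative durations for a first and third run; not measured values
print(is_consistent([100, 105]), is_consistent([100, 200]))
```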

### Fallback Behavior Mastery Summary

#### Production Insights Gained:
- ✅ **Fallback triggers**: Understanding conditions that cause SQL Endpoint fallback
- ✅ **Performance impact**: Comparing Direct Lake vs. SQL Endpoint execution characteristics
- ✅ **Resource limits**: Exact boundaries of Direct Lake memory and performance constraints
- ✅ **Configuration strategies**: Appropriate fallback mode selection for different scenarios
- ✅ **Troubleshooting skills**: Identifying and resolving fallback-related issues

**Expected outcome**: Comprehensive understanding of Direct Lake fallback behavior with practical insights for production deployment and optimization strategies.

In [None]:
trace2 = runQueryWithTrace(
    """
    EVALUATE
        SUMMARIZECOLUMNS(
                dim_Date[FirstDateofMonth] ,
                "Count of Transactions" , COUNTROWS(fact_myevents_1bln) ,
                "Sum of Sales (1bln)" , [Sum of Sales (1bln)] ,
                "Sum of Sales (2bln)" , [Sum of Sales (2bln)]
        )
        ORDER BY [FirstDateofMonth]
    """ , workspaceName , SemanticModelName, Trace=True, DMV=False
)

🎯 **Ready for the next lab?** Let's explore advanced framing and refresh strategies!

---

## Lab Summary

### What You Accomplished
In this lab, you mastered **Direct Lake fallback mechanisms** and protection strategies:

- ✅ **Fallback Understanding**: Comprehensive knowledge of when and why fallback occurs
- ✅ **Mode Configuration**: Expert configuration of Automatic, DirectLakeOnly, and DirectQueryOnly modes
- ✅ **Protection Validation**: Demonstrated system protection against memory exhaustion scenarios
- ✅ **Performance Analysis**: Measured execution differences between Direct Lake and SQL Endpoint modes
- ✅ **Troubleshooting Skills**: Developed ability to diagnose and resolve fallback issues
- ✅ **Production Strategy**: Established optimal fallback configuration for enterprise deployment

### Architecture Overview

**Direct Lake Fallback Protection System:**
```
Query Execution → Guardrail Check → Mode Decision → Reliable Results
       ↓               ↓              ↓              ↓
User Request   → Memory Available? → Execute/Fallback → Success
Complex Query  → Within Limits?    → Direct Lake     → Fast Response
Heavy Load     → Resource Check    → SQL Endpoint    → Guaranteed Results
       ↓               ↓              ↓              ↓
Protection     → System Health     → Smart Execution → Business Value
```

### Key Takeaways

- **Intelligent Protection**: Direct Lake automatically protects against memory exhaustion and system failures
- **Transparent Fallback**: Users always receive results regardless of execution mode complexity
- **Mode Configuration**: Automatic mode provides optimal balance of performance and reliability
- **Performance Monitoring**: Tracing reveals execution mode decisions and performance characteristics
- **Production Readiness**: Fallback mechanisms ensure reliable operation in enterprise environments

### Performance Results

- **Automatic Mode**: Optimal balance of performance and reliability with intelligent fallback
- **DirectLakeOnly**: Maximum performance when within limits, protective failure when exceeded
- **SQL Endpoint Fallback**: Guaranteed query completion with acceptable performance for complex scenarios
- **System Protection**: Zero system failures or memory exhaustion incidents

### Technical Skills Gained

- **Fallback Configuration**: Expert ability to configure appropriate fallback behavior for different scenarios
- **Performance Monitoring**: Advanced tracing and analysis of execution mode decisions
- **Troubleshooting Mastery**: Diagnostic skills for identifying and resolving fallback-related issues
- **Production Planning**: Strategic planning for fallback behavior in enterprise deployments
- **System Protection**: Understanding of Direct Lake's built-in protection mechanisms

### Next Steps

**Continue to Lab 5** to learn about:
- Advanced framing and refresh optimization strategies
- Intelligent refresh scheduling and coordination
- Performance optimization through strategic refresh patterns
- Enterprise-grade refresh automation and monitoring

**For Production Fallback Management:**
- Implement monitoring and alerting for fallback frequency
- Establish baselines for acceptable fallback scenarios
- Configure optimal fallback modes for different use cases
- Document troubleshooting procedures for fallback issues
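
Monitoring fallback frequency against a baseline can start as a simple rate calculation. The sketch below assumes each query has already been classified as fallback or not; `fallback_rate`, `should_alert`, and the 25% threshold are hypothetical and would need tuning to your own baseline.

```python
from typing import Iterable


def fallback_rate(fallback_flags: Iterable[bool]) -> float:
    """Return the fraction of queries that fell back (0.0 for no queries)."""
    flags = list(fallback_flags)
    if not flags:
        return 0.0
    return sum(1 for flag in flags if flag) / len(flags)


def should_alert(rate: float, threshold: float = 0.25) -> bool:
    """Alert only when the observed rate exceeds the accepted baseline."""
    return rate > threshold


# Illustrative classification of four queries; not real monitoring data
observed = [True, False, False, False]
rate = fallback_rate(observed)
print(rate, should_alert(rate))
```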

---

### Comprehensive Learning Achievement Summary

Congratulations! 🎉 You have successfully completed the **Direct Lake Fallback Behaviour Workshop**, gaining essential skills for **production deployment** and **performance optimization** of Direct Lake solutions.

### Core Competencies Developed

#### 🔍 **1. Fallback Mechanism Understanding**
- **Automatic mode benefits**: Intelligent fallback providing reliability and performance optimization
- **DirectLakeOnly constraints**: Understanding exact limitations and appropriate use cases
- **Trigger identification**: Recognizing conditions that cause SQL Endpoint fallback
- **Performance implications**: Comparing execution characteristics across different modes

#### ⚡ **2. Performance Analysis Expertise**
- **DMV interpretation**: Reading column residency, temperature, and dictionary size from `DISCOVER_STORAGE_TABLE_COLUMNS`
- **Trace analysis**: Identifying performance bottlenecks and optimization opportunities
- **Comparative benchmarking**: Evaluating relative performance across fallback modes
- **Resource monitoring**: Understanding memory, CPU, and I/O impacts

#### 🎯 **3. Production Optimization Strategies**
- **Configuration selection**: Choosing appropriate fallback modes for different scenarios
- **Capacity planning**: Understanding resource requirements and limitations
- **Troubleshooting workflows**: Systematic approaches to fallback-related issues
- **Performance tuning**: Optimizing queries and models for Direct Lake efficiency

### Practical Production Applications

#### **Scenario-Based Decision Making:**

| Business Scenario | Recommended Mode | Rationale |
|-------------------|------------------|-----------|
| **Real-time dashboards** | Automatic | Balance performance with reliability |
| **Critical reporting** | Automatic | Ensure queries complete successfully |
| **Performance testing** | DirectLakeOnly | Identify exact Direct Lake capabilities |
| **Resource validation** | DirectLakeOnly | Test system boundaries and limits |

#### **Troubleshooting Workflow Mastery:**
1. **Identify**: Use DMVs to detect fallback occurrences
2. **Analyze**: Examine traces to understand why fallback occurred
3. **Optimize**: Modify queries or model to improve Direct Lake compatibility
4. **Validate**: Test with DirectLakeOnly mode to confirm improvements
5. **Deploy**: Use Automatic mode for production reliability

### Advanced Insights Achieved

#### **Performance Characteristics Understanding:**
- ✅ **Memory boundaries**: Exact limits where Direct Lake falls back to SQL Endpoint
- ✅ **Query complexity factors**: Identifying operations that trigger fallback
- ✅ **Data volume impacts**: Understanding how table size affects performance mode
- ✅ **Concurrency effects**: Recognizing how multiple users impact fallback behavior

#### **Enterprise Deployment Readiness:**
- ✅ **Monitoring strategies**: Implementing fallback detection in production
- ✅ **Capacity planning**: Estimating resource requirements for optimal performance
- ✅ **User communication**: Explaining performance variations to business stakeholders
- ✅ **Optimization roadmaps**: Creating systematic improvement plans

### Next Steps for Continued Learning

#### **Immediate Applications:**
1. **Apply learnings** to your production Direct Lake models
2. **Implement monitoring** using the DMV queries learned today
3. **Optimize existing queries** based on fallback behavior insights
4. **Share knowledge** with your team about appropriate mode selection

#### **Advanced Workshop Preparation:**
- **Lab 5 - Framing**: Advanced refresh strategies and optimization techniques
- **Lab 6 - Column Partitioning**: Performance optimization through strategic partitioning
- **Lab 7 - High Cardinality Optimization**: Specialized techniques for complex data scenarios
- **Lab 8 - Hybrid Scenarios**: Combining Direct Lake with Import mode effectively

### Final Technical Validation

Your completion of this workshop demonstrates:
- 🎯 **Expert-level understanding** of Direct Lake fallback mechanisms
- 🚀 **Production-ready skills** for enterprise deployment
- 🔧 **Advanced troubleshooting capabilities** for complex scenarios
- 📊 **Performance optimization expertise** for maximum efficiency

**Congratulations on achieving Direct Lake Fallback Behaviour mastery!** You are now equipped with essential skills for successful enterprise deployment and optimization of Microsoft Fabric Direct Lake solutions.

In [None]:
# Stop the Spark session to release capacity resources
notebookutils.session.stop()