# Lab 8: Direct Lake over OneLake with Import Mode

This lab demonstrates how to combine Direct Lake and Import mode tables within a single semantic model. You will clone an existing model, convert it from Direct Lake over SQL (DL/SQL) to Direct Lake over OneLake (DL/OL), switch one table to Import mode, refresh it, and observe how guardrails and fallback behaviour differ between the two Direct Lake variants.

**Prerequisites:** Lab 2 must be completed (Big Data lakehouse and semantic model).

## Step 1: Install Semantic Link Labs

Install the Semantic Link Labs Python library used throughout this lab.

In [None]:
%pip install -q semantic-link-labs

## Step 2: Load Libraries and Setup Parameters

Import required libraries and configure workspace, lakehouse, and model identifiers.
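
The setup cell in this step also derives a per-user trace name from the `upn` claim of the notebook's access token. The following standalone sketch shows the same idea on a synthetic token, so it runs without a Fabric session. The helper names `decode_jwt_payload` and `make_trace_name` and the sample UPN are illustrative assumptions, not part of the lab.

```python
import base64
import json
import re


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    payload = token.split(".")[1]
    # JWT segments are base64url-encoded without padding; restore it first
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def make_trace_name(upn: str, lab_number: int) -> str:
    """Build a trace name from the user part of a UPN."""
    user_id = upn.split("@")[0]
    # Replace characters that are not allowed in trace names with "_"
    user_id_clean = re.sub(r"[.,;'`:/\\*|?\"&%$!+=(){}\[\]<>]", "_", user_id)
    return f"Lab{lab_number}_{user_id_clean}"


# Synthetic, unsigned token used only for illustration (assumed sample UPN)
claims = {"upn": "SQLKDL.user39@example.com"}
body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("=")
fake_token = f"header.{body}.signature"

trace_name_demo = make_trace_name(decode_jwt_payload(fake_token)["upn"], 8)
print(trace_name_demo)
```

Decoding with the URL-safe alphabet matters because JWT segments may contain `-` and `_`, which the standard Base64 decoder silently discards.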

In [None]:
# Import libraries for hybrid storage mode operations and model management
import sempy_labs as labs
import sempy
from sempy import fabric
import pandas as pd
import json
import time
import uuid
from sempy_labs.tom._model import TOMWrapper, connect_semantic_model

# Import specialized helper functions for advanced operations
from sempy_labs._helper_functions import (
    format_dax_object_name,
    generate_guid,
    _make_list_unique,
    resolve_dataset_name_and_id,
    resolve_workspace_name_and_id,
    _base_api,
    resolve_workspace_id,
    resolve_item_id,
    resolve_lakehouse_id,
    resolve_lakehouse_name_and_id
)

# Initialize Analysis Services for advanced model operations
fabric._client._utils._init_analysis_services()
import Microsoft.AnalysisServices.Tabular as TOM
import Microsoft.AnalysisServices
import warnings
from Microsoft.AnalysisServices.Tabular import TraceEventArgs
from typing import Dict, List, Optional, Callable

# Configure model names for hybrid storage mode testing
LakehouseName = "BigData"
SemanticModelName = f"{LakehouseName}_model"
ClonedModelName = SemanticModelName + "_clone"
workspace = None

(workspace_name, workspace_id) = resolve_workspace_name_and_id(workspace)
(lakehouse_name, lakehouse_id) = resolve_lakehouse_name_and_id(lakehouse=LakehouseName, workspace=workspace)

#### Generate Unique Trace Name - Start ####
import json, base64, re
token = notebookutils.credentials.getToken("pbi")
payload = token.split(".")[1]
# JWT segments are base64url-encoded without padding; restore the padding before decoding
payload += "=" * (-len(payload) % 4)
upn = json.loads(base64.urlsafe_b64decode(payload)).get("upn")

# Extract just the user part (e.g. "SQLKDL.user39")
user_id = upn.split("@")[0]
lab_number = 8  # set per lab

# Remove characters not allowed in trace names: . , ; ' ` : / \ * | ? " & % $ ! + = ( ) [ ] { } < >
user_id_clean = re.sub(r"[.,;'`:/\\*|?\"&%$!+=(){}\[\]<>]", "_", user_id)
trace_name = f"Lab{lab_number}_{user_id_clean}"
#### Generate Unique Trace Name - End ####

def runDMV():
    df = sempy.fabric.evaluate_dax(
        dataset=SemanticModelName, 
        dax_string="""
        
        SELECT 
            MEASURE_GROUP_NAME AS [TABLE],
            ATTRIBUTE_NAME AS [COLUMN],
            DATATYPE ,
            DICTIONARY_SIZE 		    AS SIZE ,
            DICTIONARY_ISPAGEABLE 		AS PAGEABLE ,
            DICTIONARY_ISRESIDENT		AS RESIDENT ,
            DICTIONARY_TEMPERATURE		AS TEMPERATURE,
            DICTIONARY_LAST_ACCESSED	AS LASTACCESSED 
        FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMNS 
        ORDER BY 
            [DICTIONARY_TEMPERATURE] DESC
        
        """)
    display(df)

def filter_func(e):
    # Keep every trace event except internal VertiPaq scans
    return e.EventSubclass.ToString() != "VertiPaqScanInternal"

# Run a DAX query under a trace and optionally display the result, the server timings, and the DMV output
def runQueryWithTrace (expr:str,workspaceName:str,SemanticModelName:str,Result:Optional[bool]=True,Trace:Optional[bool]=True,DMV:Optional[bool]=True,ClearCache:Optional[bool]=True) -> pd.DataFrame :
    event_schema = fabric.Trace.get_default_query_trace_schema()
    event_schema.update({"ExecutionMetrics":["EventClass","TextData"]})
    del event_schema['VertiPaqSEQueryBegin']
    del event_schema['VertiPaqSEQueryCacheMatch']
    del event_schema['DirectQueryBegin']

    warnings.filterwarnings("ignore")

    WorkspaceName = workspaceName

    if ClearCache:
        labs.clear_cache(SemanticModelName)

    with fabric.create_trace_connection(SemanticModelName,WorkspaceName) as trace_connection:
        # create trace on server with specified events
        with trace_connection.create_trace(
            event_schema=event_schema, 
            name=trace_name,
            filter_predicate=filter_func,
            stop_event="QueryEnd"
            ) as trace:

            trace.start()

            df=sempy.fabric.evaluate_dax(
                dataset=SemanticModelName, 
                dax_string=expr)

            if Result:
                displayHTML(f"<H2>####### DAX QUERY RESULT #######</H2>")
                display(df)

            # Wait 5 seconds for trace data to arrive
            time.sleep(5)

            # stop Trace and collect logs
            final_trace_logs = trace.stop()

    if Trace:
        displayHTML(f"<H2>####### SERVER TIMINGS #######</H2>")
        display(final_trace_logs)
    
    if DMV:
        displayHTML(f"<H2>####### SHOW DMV RESULTS #######</H2>")
        runDMV()
    
    return final_trace_logs

## Step 3: Clone the BigData Semantic Model

Create a copy of the existing BigData semantic model to use for hybrid storage mode experiments without affecting the original.

In [None]:
#Clear any existing cloned model if re-running
df = fabric.list_items()
if ClonedModelName in df.values:
    model_id = df.at[df[df['Display Name'] == ClonedModelName].index[0], 'Id']
    fabric.delete_item(model_id)
    print("Cloned model deleted")

with labs.tom.connect_semantic_model(dataset=SemanticModelName, readonly=False) as tom:
    # Clone the database object and its model from the source semantic model
    newDB = tom._tom_server.Databases.GetByName(SemanticModelName).Clone()
    newModel = tom._tom_server.Databases.GetByName(SemanticModelName).Model.Clone()
    newDB.Name = ClonedModelName
    newDB.ID = str(uuid.uuid4())
    # Copy the cloned model definition into the new database, then register it on the server
    newModel.CopyTo(newDB.Model)
    tom._tom_server.Databases.Add(newDB)

    # Push the full definition of the new database to the service
    newDB.Update(Microsoft.AnalysisServices.UpdateOptions.ExpandFull)

## Step 4: Frame the Cloned Model

Reframe the cloned model to initialise all data connections and relationships.

In [None]:
# Refresh the cloned model to initialize data connections
labs.refresh_semantic_model(dataset=ClonedModelName)

## Step 5: Clear Schema Name

This temporary workaround clears the `SchemaName` property on every entity partition to resolve a known issue.

In [None]:
with labs.tom.connect_semantic_model(dataset=ClonedModelName, readonly=False) as tom:
    for t in tom.model.Tables:
        for p in t.Partitions:
            if isinstance(p.Source,Microsoft.AnalysisServices.Tabular.EntityPartitionSource):
                p.Source.SchemaName=None

## Step 6: Reframe After Schema Change

Reframe the model again so it picks up the schema name change from the previous step.

In [None]:
# Refresh again so the model picks up the cleared schema name
labs.refresh_semantic_model(dataset=ClonedModelName)

## Step 7: Check Direct Lake Version

Identify whether the model currently uses Direct Lake over SQL (`Sql.Database`) or Direct Lake over OneLake (`AzureStorage.DataLake`, the connector that Step 10 writes into the model expression).

- `Sql.Database(...)` = **DL/SQL**
- `AzureStorage.DataLake(...)` = **DL/OL**
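
The same check can be done programmatically instead of by reading the printed expression. The sketch below classifies an expression string by the connector call it contains, matching the `AzureStorage.DataLake(...)` call that Step 10 writes into the model. The function name `classify_direct_lake` and the sample expression are illustrative assumptions.

```python
def classify_direct_lake(expression: str) -> str:
    """Return 'DL/SQL', 'DL/OL', or 'unknown' for a shared M expression."""
    if "Sql.Database(" in expression:
        return "DL/SQL"
    if "AzureStorage.DataLake(" in expression:
        return "DL/OL"
    return "unknown"


# Hypothetical expression text with placeholder IDs, shaped like the one in Step 10
sample = """
let
    Source = AzureStorage.DataLake("https://onelake.dfs.fabric.microsoft.com/<workspace-id>/<lakehouse-id>", [HierarchicalNavigation=true])
in
    Source"""

print(classify_direct_lake(sample))
```

In the notebook, the function could be applied to each `e.Expression` printed by the cell in this step.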

In [None]:
with labs.tom.connect_semantic_model(dataset=ClonedModelName, readonly=True) as tom:
    for e in tom.model.Expressions:
        print(e.Expression)

## Step 8: Show Storage Mode for Each Table

Display the current storage mode of every table in the cloned model.

In [None]:
# Collect the storage mode of each table's partition; read-only access is sufficient
objects = {}
with labs.tom.connect_semantic_model(dataset=ClonedModelName, readonly=True) as tom:
    for t in tom.model.Tables:
        for p in t.Partitions:
            objects[t.Name] = str(p.Mode)

df = pd.DataFrame([objects])
display(df)

## Step 9: Attempt Import Conversion (DL/SQL — Will Fail)

Try to convert a Direct Lake table to Import mode while the model is still using DL/SQL. This is expected to fail, demonstrating that Import conversion requires DL/OL.

In [None]:
# Attempt to convert Direct Lake table to Import mode (will fail if Direct Lake over SQL)
try:
    with labs.tom.connect_semantic_model(dataset=ClonedModelName, readonly=False) as tom:
        tom.convert_direct_lake_to_import(
            table_name="dim_Date" ,
            entity_name="dim_Date" ,
            source="BigData",
            source_type = "Lakehouse"
        )
except Exception as e:
    print(e)

## Step 10: Convert to Direct Lake over One Lake

Switch the cloned model from DL/SQL to DL/OL so that Import mode conversion becomes possible.
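
The conversion amounts to rewriting the model's shared expression so that it points at the lakehouse's OneLake path. As a sketch of how that expression text is assembled, the helper below builds it from the two IDs and rejects malformed GUIDs before anything is written to the model. The function name `onelake_expression` and the placeholder GUIDs are assumptions for illustration.

```python
import uuid


def onelake_expression(workspace_id: str, lakehouse_id: str) -> str:
    """Build the M expression for Direct Lake over OneLake."""
    # uuid.UUID raises ValueError for malformed IDs, so typos surface early
    uuid.UUID(workspace_id)
    uuid.UUID(lakehouse_id)
    url = f"https://onelake.dfs.fabric.microsoft.com/{workspace_id}/{lakehouse_id}"
    return (
        "let\n"
        f'    Source = AzureStorage.DataLake("{url}", [HierarchicalNavigation=true])\n'
        "in\n"
        "    Source"
    )


# Placeholder GUIDs, not real workspace or lakehouse IDs
expr = onelake_expression(
    "00000000-0000-0000-0000-000000000001",
    "00000000-0000-0000-0000-000000000002",
)
print(expr)
```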

In [None]:
with labs.tom.connect_semantic_model(dataset=ClonedModelName, readonly=False) as tom:

    for e in tom.model.Expressions:
        e.Expression = f"""
        let
            Source = AzureStorage.DataLake("https://onelake.dfs.fabric.microsoft.com/{workspace_id}/{lakehouse_id}", [HierarchicalNavigation=true])
        in
            Source"""
        
print("Converted semantic model to use Direct Lake over OneLake")

## Step 11: Convert Direct Lake Table to Import (DL/OL — Will Succeed)

Now that the model uses DL/OL, convert a table to Import mode. This creates a hybrid model with both Direct Lake and Import tables.

In [None]:
# Convert Direct Lake table to Import mode (should work with Direct Lake over One Lake)
with labs.tom.connect_semantic_model(dataset=ClonedModelName, readonly=False) as tom:
    tom.convert_direct_lake_to_import(
        table_name="dim_Date" ,
        entity_name="dim_Date" ,
        source="BigData",
        source_type = "Lakehouse"
    )

## Step 12: Show Storage Modes After Conversion

Display the updated storage modes, confirming the hybrid configuration with both Direct Lake and Import tables.
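
Rather than confirming the hybrid configuration by eye, the table-to-mode mapping can be checked in code. The sketch below groups tables by mode and tests whether both modes are present. The fact table names are hypothetical, and the mode strings `"DirectLake"` and `"Import"` are assumed to match what `str(p.Mode)` returns in the cell for this step.

```python
from collections import defaultdict
from typing import Dict, List


def group_tables_by_mode(modes: Dict[str, str]) -> Dict[str, List[str]]:
    """Invert a table -> mode mapping into mode -> sorted table names."""
    grouped = defaultdict(list)
    for table, mode in modes.items():
        grouped[mode].append(table)
    return {mode: sorted(tables) for mode, tables in grouped.items()}


def is_hybrid(modes: Dict[str, str]) -> bool:
    """True when the model contains both Direct Lake and Import tables."""
    return {"DirectLake", "Import"} <= set(modes.values())


# Hypothetical mapping; only dim_Date is taken from the lab itself
modes = {
    "dim_Date": "Import",
    "fact_sales_1bln": "DirectLake",
    "fact_sales_2bln": "DirectLake",
}

print(group_tables_by_mode(modes))
print(is_hybrid(modes))
```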

In [None]:
# Collect the storage mode of each table's partition; read-only access is sufficient
objects = {}
with labs.tom.connect_semantic_model(dataset=ClonedModelName, readonly=True) as tom:
    for t in tom.model.Tables:
        for p in t.Partitions:
            objects[t.Name] = str(p.Mode)

df = pd.DataFrame([objects])
display(df)

## Step 13: Set Credentials and Enable Large Model Format (Manual Step)

This step must be performed in the Power BI service; it cannot be done from a notebook. Create a new Shared Cloud Connection for refreshing the Import table. For this lab, authenticate with OAuth2 (a Service Principal is recommended for production). Make sure the Privacy Level is set to Organizational, which is the default.

## Step 14: Refresh the Import Table

Trigger a refresh of the Import mode table so it loads data from the lakehouse.

In [None]:
labs.refresh_semantic_model(dataset=ClonedModelName)

## Step 16: Verify Direct Lake Version

Confirm which Direct Lake variant the model is currently using.

In [None]:
with labs.tom.connect_semantic_model(dataset=ClonedModelName, readonly=True) as tom:
    for e in tom.model.Expressions:
        print(e.Expression)

## Step 18: Query the 1 Billion Row Table

Run a DAX query against the table with approximately one billion rows to verify the hybrid model works.

In [None]:
df = runQueryWithTrace("""
        EVALUATE
	        SUMMARIZECOLUMNS(
		        dim_Date[DateKey],
		        "Quantity" , [Sum of Sales (1bln)]
		        )
""",workspace_name,ClonedModelName,DMV=False)

display(df)

## Step 19: Query the 2 Billion Row Table

Run a DAX query against the table with approximately two billion rows. This query is expected to fail due to Direct Lake guardrail limits.

In [None]:
try:
    df = runQueryWithTrace("""
        EVALUATE
            SUMMARIZECOLUMNS(
                dim_Date[DateKey],
                "Quantity" , [Sum of Sales (2bln)]
                )
    """,workspace_name,ClonedModelName,DMV=False)

    display(df)
except Exception as e:
    print(e)

## Step 20: Convert Back to DL/SQL

Switch the cloned model back to Direct Lake over SQL to observe the difference in fallback behaviour.

In [None]:
df=pd.DataFrame(labs.list_lakehouses())
endpointid = df[df['Lakehouse Name']==LakehouseName]['SQL Endpoint ID'].iloc[0]
server = df[df['Lakehouse Name']==LakehouseName]['SQL Endpoint Connection String'].iloc[0]

with labs.tom.connect_semantic_model(dataset=ClonedModelName, readonly=False) as tom:

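    # Convert Import tables back to Direct Lake
    for t in tom.model.Tables:
        # Iterate over a snapshot, because partitions are removed inside the loop
        for p in list(t.Partitions):
            if p.Mode == TOM.ModeType.Import:
                t.Partitions.Remove(p)
                tom.add_entity_partition(table_name=t.Name, entity_name=t.Name)
                print(f"Table {t.Name} converted")
        # Clear the schema name on entity partitions only (same workaround as Step 5)
        for p in t.Partitions:
            if isinstance(p.Source, TOM.EntityPartitionSource):
                p.Source.SchemaName = None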

    #Switch Model to Direct Lake over SQL
    for e in tom.model.Expressions:
        e.Expression = f"""
        let
            Source = Sql.Database("{server}", "{endpointid}")
        in
            Source"""

print("Converted to Direct Lake over SQL")

## Step 21: Confirm Direct Lake Version

Verify that the model is now back on DL/SQL.

- `Sql.Database(...)` = DL/SQL
- `AzureStorage.DataLake(...)` = DL/OL

In [None]:
with labs.tom.connect_semantic_model(dataset=ClonedModelName, readonly=True) as tom:
    for e in tom.model.Expressions:
        print(e.Expression)

## Step 22: Show Storage Modes

Display the storage mode of each table after converting back to DL/SQL.

In [None]:
# Collect the storage mode of each table's partition; read-only access is sufficient
objects = {}
with labs.tom.connect_semantic_model(dataset=ClonedModelName, readonly=True) as tom:
    for t in tom.model.Tables:
        for p in t.Partitions:
            objects[t.Name] = str(p.Mode)

df = pd.DataFrame([objects])
display(df)

## Step 23: Query the 2 Billion Row Table (DL/SQL with Fallback)

Re-run the 2 billion row query now that the model is on DL/SQL. The query should succeed by falling back to the SQL Endpoint. Run twice if the first attempt returns an error.
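
The contrast between Step 19 and this step can be summarised as a small decision rule. The sketch below is a toy model of the behaviour described in this lab, not the engine's actual logic. The function name, the return labels, and the 1.5-billion-row threshold are assumptions, chosen so that the one-billion-row table passes and the two-billion-row table does not; real guardrails depend on the capacity SKU.

```python
ASSUMED_GUARDRAIL_ROWS = 1_500_000_000  # illustrative threshold, not a documented constant


def resolve_query_path(variant: str, table_rows: int,
                       guardrail_rows: int = ASSUMED_GUARDRAIL_ROWS) -> str:
    """Return how a query would be served under the lab's described behaviour."""
    if variant not in ("DL/SQL", "DL/OL"):
        raise ValueError(f"unknown Direct Lake variant: {variant}")
    if table_rows <= guardrail_rows:
        # Within the guardrail: both variants serve the query from Direct Lake
        return "direct_lake"
    if variant == "DL/SQL":
        # Over the guardrail: DL/SQL can route the query through the SQL Endpoint
        return "sql_endpoint_fallback"
    # Over the guardrail on DL/OL: there is no fallback, so the query fails
    raise RuntimeError("guardrail exceeded and no fallback is available")


print(resolve_query_path("DL/OL", 1_000_000_000))
print(resolve_query_path("DL/SQL", 2_000_000_000))
```

Calling `resolve_query_path("DL/OL", 2_000_000_000)` raises `RuntimeError`, which mirrors the failure observed in Step 19.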

In [None]:
df = runQueryWithTrace("""
        EVALUATE
	        SUMMARIZECOLUMNS(
		        dim_Date[DateKey],
		        "Quantity" , [Sum of Sales (2bln)]
		        )
""",workspace_name,ClonedModelName,DMV=False)

display(df)

## Step 24: Show TMSL for the Cloned Model

Display the full TMSL definition of the cloned model for reference.
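
Once the definition is available as a Python dictionary, it can be inspected programmatically as well as printed. The sketch below lists each partition's mode from a hand-written, heavily trimmed dictionary. The assumed shape (`model` > `tables` > `partitions` > `mode`), the mode strings, and the `fact_sales` table name are illustrative, so check them against the real output of the cell in this step.

```python
import json


def partition_modes(bim: dict) -> dict:
    """Map 'table/partition' to the partition's mode in a TMSL-style dict."""
    modes = {}
    for table in bim.get("model", {}).get("tables", []):
        for partition in table.get("partitions", []):
            key = f'{table["name"]}/{partition["name"]}'
            # A partition without an explicit mode is reported as "unspecified"
            modes[key] = partition.get("mode", "unspecified")
    return modes


# Hand-written sample, far smaller than a real model definition
sample_bim = {
    "model": {
        "tables": [
            {"name": "dim_Date", "partitions": [{"name": "dim_Date", "mode": "import"}]},
            {"name": "fact_sales", "partitions": [{"name": "fact_sales", "mode": "directLake"}]},
        ]
    }
}

print(json.dumps(partition_modes(sample_bim), indent=4))
```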

In [None]:
import json
with labs.tom.connect_semantic_model(dataset=ClonedModelName, readonly=True) as tom:
    bim = tom.get_bim()

# Pretty-print the model definition
print(json.dumps(bim, indent=4))

## Step 25: Stop the Spark Session

Stop the Spark session.

In [None]:
notebookutils.session.stop()

## Lab Summary

In this lab you explored hybrid storage modes by cloning a Direct Lake model, converting it between DL/SQL and DL/OL, and switching one table to Import mode. You observed that Import conversion is only possible under DL/OL, tested guardrail behaviour on tables with one and two billion rows, and saw how DL/SQL provides automatic SQL Endpoint fallback when guardrails are exceeded.

**Key Takeaways**
- Direct Lake has two variants: DL/SQL (`Sql.Database`) and DL/OL (`AzureStorage.DataLake`). The variant affects which features are available.
- Converting a table from Direct Lake to Import mode requires the model to be on DL/OL. The attempt fails under DL/SQL.
- Hybrid models (Direct Lake + Import tables) allow you to keep most tables on Direct Lake while importing specific tables that benefit from in-memory storage.
- After refreshing an Import table, relationship indexes must be recalculated for cross-storage-mode joins to work.
- DL/OL does not support automatic fallback to the SQL Endpoint — queries that hit guardrails will fail outright.
- DL/SQL supports automatic fallback, so the same two-billion-row query that fails under DL/OL succeeds under DL/SQL by routing through the SQL Endpoint.
- Shared Cloud Connections with OAuth2 (or Service Principal for production) are required to refresh Import tables in the service.

**Workshop Complete.** You have now covered all core Direct Lake topics: model creation, big data, Delta analysis, fallback behaviour, framing, column partitioning, high cardinality optimisation, and hybrid storage modes.
