## Query source lakehouse, save to target lakehouse, and add the table to the semantic model
This notebook runs a defined SQL query against the source lakehouse and saves the result set as a table in the target lakehouse.

Next, it uses Semantic Link Labs to add the resulting table from the target lakehouse to the defined semantic model. 

In [1]:
# Install Semantic Link Labs into the notebook session, then import it
%pip install semantic-link-labs
import sempy_labs as labs

Collecting semantic-link-labs
  Downloading semantic_link_labs-0.9.11-py3-none-any.whl.metadata (26 kB)
Collecting semantic-link-sempy>=0.10.2 (from semantic-link-labs)
  Downloading semantic_link_sempy-0.10.2-py3-none-any.whl.metadata (10 kB)
Collecting anytree (from semantic-link-labs)
  Downloading anytree-2.13.0-py3-none-any.whl.metadata (8.0 kB)
Collecting polib (from semantic-link-labs)
  Downloading polib-1.2.0-py2.py3-none-any.whl.metadata (15 kB)
Collecting jsonpath_ng (from semantic-link-labs)
  Downloading jsonpath_ng-1.7.0-py3-none-any.whl.metadata (18 kB)
Downloading semantic_link_labs-0.9.11-py3-none-any.whl (713 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m713.5/713.5 kB[0m [31m32.5 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading semantic_link_sempy-0.10.2-py3-none-any.whl (3.2 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m3.2/3.2 MB[0m [31m133.8 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading anytree-2.13.0-py3-none-any.whl (

In [2]:
# Define source and target lakehouses (schemas/databases)
source_lakehouse = "LH_STORE_Silver"
target_lakehouse = "LH_STORE_Gold"

#### Create table in target lakehouse

In [3]:
# Define source table and target table names
source_table = "product"
target_table = "product"  # new table in Gold to save results

In [4]:
# Select the product columns and rename the English-prefixed ones
sql_query = f"""
SELECT 
    ProductKey, 
    EnglishProductName AS ProductName, 
    EnglishDescription AS Description, 
    Color
FROM {source_lakehouse}.{source_table}
"""

# Run the SQL query and get a DataFrame
df = spark.sql(sql_query)

# Show first 5 rows of the resulting DataFrame
df.show(5)


+----------+--------------------+--------------------+------+
|ProductKey|         ProductName|         Description| Color|
+----------+--------------------+--------------------+------+
|        21|           Freewheel|                NULL|Silver|
|       397|LL Mountain Handl...|All-purpose bar f...|    NA|
|       398|LL Mountain Handl...|All-purpose bar f...|    NA|
|       410|LL Mountain Front...|Replacement mount...| Black|
|       411|ML Mountain Front...|Replacement mount...| Black|
+----------+--------------------+--------------------+------+
only showing top 5 rows
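
The query above interpolates the lakehouse and table names straight into the SQL text. A minimal sketch of one way to build the same statement from a column mapping while rejecting unexpected identifiers is shown below. The helper name `build_select` and the identifier rule are assumptions for illustration, not part of the original notebook.

```python
import re

# Hypothetical rule: plain letters, digits and underscores only.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_select(lakehouse, table, columns):
    """Build a SELECT statement.

    `columns` maps each source column name to its output alias,
    or to None when the column keeps its original name.
    """
    aliases = [alias for alias in columns.values() if alias]
    for name in (lakehouse, table, *columns, *aliases):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"Unsafe identifier: {name!r}")

    select_list = ",\n    ".join(
        f"{src} AS {alias}" if alias and alias != src else src
        for src, alias in columns.items()
    )
    return f"SELECT\n    {select_list}\nFROM {lakehouse}.{table}"


# Same projection as the query in this notebook
sql_query = build_select(
    "LH_STORE_Silver",
    "product",
    {
        "ProductKey": None,
        "EnglishProductName": "ProductName",
        "EnglishDescription": "Description",
        "Color": None,
    },
)
print(sql_query)
```

The resulting string can be passed to `spark.sql(sql_query)` exactly as in the cell above.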



In [5]:
# Write the DataFrame as a new table in the target lakehouse
df.write.mode("overwrite").saveAsTable(f"{target_lakehouse}.{target_table}")

print(f"✅ Saved filtered data to table '{target_table}' in lakehouse '{target_lakehouse}'")

✅ Saved filtered data to table 'product' in lakehouse 'LH_STORE_Gold'


#### Update Semantic Model

In [6]:
# Name of the Direct Lake semantic model that receives the new table
semanticmodel_name = "Sales Analysis"

In [7]:
labs.directlake.add_table_to_direct_lake_semantic_model(
    dataset=semanticmodel_name,
    table_name=target_table,  # reuse the lakehouse table name as the model table name
    lakehouse_table_name=target_table,
    refresh=False,
    workspace=None,  # if not specified, the workspace the notebook runs in is used
)

🟢 The 'product' table has been added to the 'Sales Analysis' semantic model within the 'Reaching maximum automation' workspace.
🟢 The 'product' partition has been added to the 'product' table in the 'Sales Analysis' semantic model within the 'Reaching maximum automation' workspace.
🟢 The 'ProductKey' column has been added to the 'product' table as a 'Int64' data type in the 'Sales Analysis' semantic model within the 'Reaching maximum automation' workspace.
🟢 The 'ProductName' column has been added to the 'product' table as a 'String' data type in the 'Sales Analysis' semantic model within the 'Reaching maximum automation' workspace.
🟢 The 'Description' column has been added to the 'product' table as a 'String' data type in the 'Sales Analysis' semantic model within the 'Reaching maximum automation' workspace.
🟢 The 'Color' column has been added to the 'product' table as a 'String' data type in the 'Sales Analysis' semantic model within the 'Reaching maximum automation' workspace.
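
When several Gold tables need to be added to the model, the call above can be wrapped in a loop. The following is a minimal sketch under stated assumptions: the helper `add_tables_to_model` is hypothetical, and the function that performs the actual work is passed in as `add_table` so the loop can be tried without a Fabric session. In this notebook, `add_table` would be `labs.directlake.add_table_to_direct_lake_semantic_model`, called with the same keyword arguments as in the cell above.

```python
from typing import Callable, Dict, Iterable, Optional


def add_tables_to_model(
    dataset: str,
    tables: Iterable[str],
    add_table: Callable[..., object],
    workspace: Optional[str] = None,
) -> Dict[str, str]:
    """Call `add_table` once per table and record the outcome per table."""
    results: Dict[str, str] = {}
    for table in tables:
        try:
            add_table(
                dataset=dataset,
                table_name=table,  # model table name mirrors the lakehouse table name
                lakehouse_table_name=table,
                refresh=False,
                workspace=workspace,
            )
            results[table] = "added"
        except Exception as exc:  # keep going so one failure does not stop the batch
            results[table] = f"failed: {exc}"
    return results


# In the notebook this might be invoked as:
# add_tables_to_model(
#     semanticmodel_name,
#     [target_table],
#     labs.directlake.add_table_to_direct_lake_semantic_model,
# )
```

Collecting the outcome per table, rather than stopping at the first exception, makes it easier to see which tables still need attention after a batch run.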
