# Refresh specific partitions in Semantic Model
This notebook uses Fabric Semantic Link (sempy) to query semantic model metadata and dynamically identify the partitions that belong to the current year. Those partitions are then refreshed with any incremental refresh policy bypassed. The notebook can therefore run full refreshes of a model or of selected partitions, for example to pick up data changes in historical partitions that the policy would otherwise leave untouched.

Throughout this notebook, the term dataset is used intentionally, because the parameters in the Semantic Link module still use it.
For more detail on this notebook, see this blog post: https://data-marc.com/2024/05/28/dynamically-refreshing-historical-partitions-in-power-bi-incremental-refresh-semantic-models-using-fabric-semantic-link/


In [1]:
# Set the base parameters
workspace_name = "Semantic Link for Power BI folks" # Fill in your workspace name here.
dataset_name = "IncrementalRefreshPartitioning" # Fill in your semantic model name here. 


In [7]:
# import libraries
import sempy.fabric as fabric
import pandas as pd
import datetime
import json


#### Read metadata
The section below reads semantic model metadata using the [evaluate_dax](https://learn.microsoft.com/en-us/python/api/semantic-link-sempy/sempy.fabric?view=semantic-link-python#sempy-fabric-evaluate-dax) function in Semantic Link. With this function, Dynamic Management Views (DMVs) such as Tables and Partitions can be queried.

In [8]:
## Get tables through DMV
dftablesraw = fabric.evaluate_dax(
    dataset=dataset_name,
    workspace=workspace_name,  # resolve the model in the workspace set above
    dax_string="select * from $SYSTEM.TMSCHEMA_TABLES",
)
# Select the needed columns and rename on the resulting copy, leaving the raw result untouched
dftables = dftablesraw[["ID", "Name", "Description"]].rename(columns={"Name": "TableName"})

dftables.head(20)


Unnamed: 0,ID,TableName,Description
0,14,IncrementalRefreshDemo,


In [9]:
## Get table partitions through DMV
dfpartitionsraw = fabric.evaluate_dax(
    dataset=dataset_name,
    workspace=workspace_name,  # resolve the model in the workspace set above
    dax_string="select * from $SYSTEM.TMSCHEMA_PARTITIONS",
)
# Select the needed columns and rename on the resulting copy, leaving the raw result untouched
dfpartitions = dfpartitionsraw[["TableID", "Name", "RangeStart", "RangeEnd"]].rename(
    columns={"Name": "PartitionName"}
)
dfpartitions.head(20)


Unnamed: 0,TableID,PartitionName,RangeStart,RangeEnd
0,14,2019,2019-01-01,2020-01-01
1,14,2020,2020-01-01,2021-01-01
2,14,2021,2021-01-01,2022-01-01
3,14,2022,2022-01-01,2023-01-01
4,14,2023,2023-01-01,2024-01-01
5,14,2024Q1,2024-01-01,2024-04-01
6,14,2024Q204,2024-04-01,2024-05-01
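
Both DMV queries above use `select *` and trim the columns afterwards in pandas. As a minimal sketch, the query text itself could be limited to the needed columns instead. `build_dmv_query` is a hypothetical helper, not part of Semantic Link; it only builds the string that would be passed as `dax_string`, and the column names in the returned DataFrame should be checked before relying on them.

```python
import re

# Hypothetical helper (not part of sempy): it only builds the DMV query text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_dmv_query(view, columns=None):
    """Return a query on $SYSTEM.<view>, optionally limited to the given columns."""
    cols = list(columns or [])
    # Reject anything that is not a plain identifier before building the string.
    for name in [view] + cols:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"Unexpected identifier: {name!r}")
    column_list = ", ".join(f"[{c}]" for c in cols) if cols else "*"
    return f"select {column_list} from $SYSTEM.{view}"


tables_query = build_dmv_query("TMSCHEMA_TABLES", ["ID", "Name", "Description"])
partitions_query = build_dmv_query("TMSCHEMA_PARTITIONS")
print(tables_query)
print(partitions_query)
```

The resulting strings can be passed to `fabric.evaluate_dax` in place of the literal queries used above.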


In [10]:
# Join table and partition dataframes based on TableID
dfoverview = pd.merge(dftables, dfpartitions, left_on='ID', right_on='TableID', how='inner')
dfoverview.head(20)


Unnamed: 0,ID,TableName,Description,TableID,PartitionName,RangeStart,RangeEnd
0,14,IncrementalRefreshDemo,,14,2019,2019-01-01,2020-01-01
1,14,IncrementalRefreshDemo,,14,2020,2020-01-01,2021-01-01
2,14,IncrementalRefreshDemo,,14,2021,2021-01-01,2022-01-01
3,14,IncrementalRefreshDemo,,14,2022,2022-01-01,2023-01-01
4,14,IncrementalRefreshDemo,,14,2023,2023-01-01,2024-01-01
5,14,IncrementalRefreshDemo,,14,2024Q1,2024-01-01,2024-04-01
6,14,IncrementalRefreshDemo,,14,2024Q204,2024-04-01,2024-05-01


In [11]:
# Get the current year as a string
current_year = str(datetime.datetime.now().year)

# Flag partitions whose name starts with the current year.
# This relies on partition names beginning with a four-digit year, as in the overview above.
dfoverview['PartitionCY'] = dfoverview['PartitionName'].str[:4] == current_year

dfoverview.head(20)


Unnamed: 0,ID,TableName,Description,TableID,PartitionName,RangeStart,RangeEnd,PartitionCY
0,14,IncrementalRefreshDemo,,14,2019,2019-01-01,2020-01-01,False
1,14,IncrementalRefreshDemo,,14,2020,2020-01-01,2021-01-01,False
2,14,IncrementalRefreshDemo,,14,2021,2021-01-01,2022-01-01,False
3,14,IncrementalRefreshDemo,,14,2022,2022-01-01,2023-01-01,False
4,14,IncrementalRefreshDemo,,14,2023,2023-01-01,2024-01-01,False
5,14,IncrementalRefreshDemo,,14,2024Q1,2024-01-01,2024-04-01,True
6,14,IncrementalRefreshDemo,,14,2024Q204,2024-04-01,2024-05-01,True
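
Matching on the first four characters of the partition name works for the names shown above, but it depends on that naming convention. A possible alternative, sketched here under the assumption that `RangeStart` is inclusive and `RangeEnd` is exclusive (which the adjoining ranges in the overview suggest), is to select partitions whose date range overlaps the year. The sample rows are copied from the overview, and the year is fixed to 2024 so the result is reproducible; `overlaps_year` is an illustrative name.

```python
from datetime import date


def overlaps_year(range_start, range_end, year):
    """True if the half-open range [range_start, range_end) overlaps the given year."""
    year_start = date(year, 1, 1)
    next_year_start = date(year + 1, 1, 1)
    return range_start < next_year_start and range_end > year_start


# (PartitionName, RangeStart, RangeEnd) rows taken from the overview above.
partitions = [
    ("2023", "2023-01-01", "2024-01-01"),
    ("2024Q1", "2024-01-01", "2024-04-01"),
    ("2024Q204", "2024-04-01", "2024-05-01"),
]

selected = [
    name
    for name, start, end in partitions
    if overlaps_year(date.fromisoformat(start), date.fromisoformat(end), 2024)
]
print(selected)
```

In the notebook itself, the year would come from `datetime.datetime.now().year`, and the range columns may need converting to dates first, depending on the type the DMV returns.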


In [12]:
# Keep only the table and partition names of the partitions flagged for the current year
filtered_df = dfoverview.loc[dfoverview["PartitionCY"], ["TableName", "PartitionName"]].copy()

# Define columns to rename
columns_to_rename = {
    "TableName": "table",
    "PartitionName": "partition"
}

# Rename columns
filtered_df.rename(columns=columns_to_rename, inplace=True)

# Convert the modified DataFrame to a list of dictionaries
filtered_dicts = filtered_df.to_dict(orient='records')

# Convert the list of dictionaries to a JSON string
json_string = json.dumps(filtered_dicts, indent=4)

# Print the JSON string properly formatted
print(json_string)


[
    {
        "table": "IncrementalRefreshDemo",
        "partition": "2024Q1"
    },
    {
        "table": "IncrementalRefreshDemo",
        "partition": "2024Q204"
    }
]
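
The JSON string above is mainly useful for inspection; what gets passed to the refresh is the list of `table`/`partition` dictionaries. As a minimal sketch, that list could also be built directly, with duplicates dropped and a guard for the case where no partition matched. `build_refresh_objects` is a hypothetical helper, and the guard is a precaution I am assuming is useful: I have not verified how an empty list is treated by the refresh call.

```python
def build_refresh_objects(rows):
    """Turn (table, partition) pairs into a de-duplicated list of dictionaries."""
    seen = set()
    objects = []
    for table, partition in rows:
        key = (table, partition)
        if key in seen:
            continue  # skip pairs that were already added
        seen.add(key)
        objects.append({"table": table, "partition": partition})
    return objects


rows = [
    ("IncrementalRefreshDemo", "2024Q1"),
    ("IncrementalRefreshDemo", "2024Q204"),
    ("IncrementalRefreshDemo", "2024Q1"),  # duplicate, dropped by the helper
]
refresh_objects = build_refresh_objects(rows)

# Stop early rather than send a refresh request without any target partitions.
if not refresh_objects:
    raise ValueError("No partitions matched; not sending an empty object list.")

print(refresh_objects)
```

With pandas, `filtered_df.to_dict(orient='records')` already yields the same shape, so the JSON round trip is optional.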


#### Refresh semantic model
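The section below refreshes the semantic model for the selected partitions only. Setting `apply_refresh_policy` to `False` is what bypasses the incremental refresh policy. Additional parameters are described in [this documentation](https://learn.microsoft.com/en-us/python/api/semantic-link-sempy/sempy.fabric?view=semantic-link-python#sempy-fabric-refresh-dataset).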

In [13]:
# Refresh the dataset
fabric.refresh_dataset(
    workspace=workspace_name,
    dataset=dataset_name, 
    objects=json.loads(json_string), # The function expects a list of dictionaries, so parse the JSON string back into one
    refresh_type = "full",
    apply_refresh_policy = False
)


'd6c74125-18d5-464b-bc79-e7b0c982458f'

In [14]:
# List the refresh requests
dflistrefreshrequests = fabric.list_refresh_requests(dataset=dataset_name, workspace=workspace_name)

# Show the first 5 rows, which are the most recent requests in the output below
dflistrefreshrequests.head(5)


Unnamed: 0,Id,Request Id,Start Time,End Time,Refresh Type,Service Exception Json,Status,Extended Status,Refresh Attempts
0,859282034,d6c74125-18d5-464b-bc79-e7b0c982458f,2024-05-28 15:12:04.637000+00:00,NaT,ViaEnhancedApi,,Unknown,NotStarted,[]
1,859275736,d6c69f35-f8be-ffaf-55c3-f49a29d3f3e2,2024-05-28 15:06:27.247000+00:00,2024-05-28 15:09:20.083000+00:00,OnDemand,,Completed,,"[{'attemptId': 1, 'startTime': '2024-05-28T15:..."
2,859248011,88fa893e-e327-4d0a-ab23-0298457fd35c,2024-05-28 14:34:04.233000+00:00,2024-05-28 14:34:06.103000+00:00,ViaEnhancedApi,,Completed,Completed,"[{'attemptId': 1, 'startTime': '2024-05-28T14:..."
3,859202684,ef4f7960-d5f3-4590-9e50-f332e10490e5,2024-05-28 13:46:26.023000+00:00,2024-05-28 13:46:27.093000+00:00,ViaEnhancedApi,,Completed,Completed,"[{'attemptId': 1, 'startTime': '2024-05-28T13:..."
4,859201448,726ad2c4-83c3-4ccc-8db7-458481e896f9,2024-05-28 13:45:19.080000+00:00,2024-05-28 13:45:20.273000+00:00,ViaEnhancedApi,,Completed,Completed,"[{'attemptId': 1, 'startTime': '2024-05-28T13:..."


In [15]:
# Get details about the refresh
fabric.get_refresh_execution_details(
    dataset=dataset_name, 
    workspace=workspace_name, 
    refresh_request_id=dflistrefreshrequests.iloc[0]["Request Id"],  # first row, the most recent request in the list above
)


RefreshExecutionDetails(start_time=datetime.datetime(2024, 5, 28, 15, 12, 4, 637000, tzinfo=datetime.timezone.utc), end_time=None, type='Full', commit_mode='Transactional', status='Unknown', extended_status='NotStarted', current_refresh_type='Full', number_of_attempts=0, objects=                    Table Partition      Status
0  IncrementalRefreshDemo    2024Q1  NotStarted
1  IncrementalRefreshDemo  2024Q204  NotStarted, messages=Empty DataFrame
Columns: [Message, Type]
Index: [], refresh_attempts=Empty DataFrame
Columns: [Attempt Id, Start Time, End Time, Type]
Index: [])
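
The details above were fetched right after the request was submitted, so the refresh still reports `Unknown` / `NotStarted`. A minimal sketch of waiting for the outcome is to poll until a final status appears. The set of terminal statuses below is an assumption and should be checked against what the service actually returns; `wait_for_refresh` is a hypothetical helper, and a stand-in status source is used so the sketch runs without a Fabric session.

```python
import time

# Assumed terminal states; verify against the statuses actually returned.
TERMINAL_STATUSES = {"Completed", "Failed", "Cancelled", "Disabled", "TimedOut"}


def wait_for_refresh(get_status, poll_seconds=30, timeout_seconds=3600, sleep=time.sleep):
    """Call get_status() until it returns a terminal status or the timeout elapses."""
    waited = 0
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"Refresh still '{status}' after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Stand-in for the real status call: two in-progress polls, then completion.
fake_statuses = iter(["Unknown", "Unknown", "Completed"])
final_status = wait_for_refresh(
    lambda: next(fake_statuses), poll_seconds=0, sleep=lambda seconds: None
)
print(final_status)
```

In the notebook, `get_status` could wrap `fabric.get_refresh_execution_details(...)` and return its `status` attribute, using the request ID that `fabric.refresh_dataset` returned rather than looking it up in the request list.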