# Tutorial: Extract and calculate Power BI measures from a Jupyter notebook
This tutorial illustrates how to use SemPy to calculate measures in semantic models (Power BI datasets).

### In this tutorial, you learn how to:
- Evaluate Power BI measures programmatically through semantic link's Python library ([SemPy](https://learn.microsoft.com/en-us/python/api/semantic-link-sempy)), including filtering and grouping the results.
- Get familiar with the components of SemPy that help bridge the gap between AI and BI. These components include:
    - FabricDataFrame - a pandas-like structure enhanced with additional semantic information.
    - Useful functions that allow you to fetch semantic models, including raw data, configurations, and measures.

### Prerequisites

* A [Microsoft Fabric subscription](https://learn.microsoft.com/fabric/enterprise/licenses). Or sign up for a free [Microsoft Fabric (Preview) trial](https://learn.microsoft.com/fabric/get-started/fabric-trial).
* Sign in to [Microsoft Fabric](https://fabric.microsoft.com/).
* Go to the Data Science experience in Microsoft Fabric.
* Select **Workspaces** from the left navigation pane to find and select your workspace. This workspace becomes your current workspace.
* Download the [_Retail Analysis Sample PBIX.pbix_](https://download.microsoft.com/download/9/6/D/96DDC2FF-2568-491D-AAFA-AFDD6F763AE3/Retail%20Analysis%20Sample%20PBIX.pbix) dataset and upload it to your workspace.
* Open your notebook. You have two options:
    * [Import this notebook into your workspace](https://learn.microsoft.com/en-us/fabric/data-engineering/how-to-use-notebook#import-existing-notebooks). You can import from the Data Science homepage.
    * Alternatively, you can create [a new notebook](https://learn.microsoft.com/fabric/data-engineering/how-to-use-notebook#create-notebooks) to copy/paste code into cells.
* In the Lakehouse explorer section of your notebook, add a new or existing lakehouse to your notebook. For more information on how to add a lakehouse, see [Attach a lakehouse to your notebook](https://learn.microsoft.com/en-us/fabric/data-science/tutorial-data-science-prepare-system#attach-a-lakehouse-to-the-notebooks).

## Set up the notebook

In this section, you set up a notebook environment with the necessary modules and data.

1. Install `SemPy` from PyPI by using the `%pip` in-line installation capability in the notebook:

In [None]:
%pip install semantic-link


Collecting semantic-link
  Downloading semantic_link-0.3.4-py3-none-any.whl (8.2 kB)
Collecting semantic-link-functions-geopandas==0.3.4
  Downloading semantic_link_functions_geopandas-0.3.4-py3-none-any.whl (4.0 kB)
Collecting semantic-link-functions-holidays==0.3.4
  Downloading semantic_link_functions_holidays-0.3.4-py3-none-any.whl (4.2 kB)
Collecting semantic-link-sempy==0.3.4
  Downloading semantic_link_sempy-0.3.4-py3-none-any.whl (2.9 MB)
Collecting semantic-link-functions-phonenumbers==0.3.4
  Downloading semantic_link_functions_phonenumbers-0.3.4-py3-none-any.whl (4.3 kB)
Collecting semantic-link-functions-meteostat==0.3.4
  Downloading semantic_link_functions_meteostat-0.3.4-py3-none-any.whl (4.5 kB)
Collecting semantic-link-functions-validators==0.3.4
  Downloading semantic_link_functions_validators-0.3.4-py3-none-any.whl (4.8 kB)
Collecting mapclass




2. Import the modules that you need later:

In [None]:
import sempy.fabric as fabric


3. Connect to the Power BI workspace and list the semantic models that it contains:

In [None]:
fabric.list_datasets()


| | Dataset Name | Dataset ID | Created Timestamp | Last Update |
|---|---|---|---|---|
| 0 | semantic_link_data | 7e659c31-a025-4f17-85f3-f259c7cdef19 | 2021-02-12 23:00:58 | 0001-01-01 00:00:00 |
| 1 | Customer Profitability Sample | 3bb45e13-8773-445c-8356-ada632997731 | 2014-07-22 03:50:22 | 0001-01-01 00:00:00 |
| 2 | Retail Analysis Sample PBIX | 0a4b5b1f-76c9-4dda-9e8c-7739790a9c98 | 2014-05-30 20:16:22 | 0001-01-01 00:00:00 |


In this tutorial, you use the _Retail Analysis Sample PBIX_ semantic model:

In [None]:
dataset = "Retail Analysis Sample PBIX"


## List workspace measures

Start by listing measures in the semantic model, using SemPy's `list_measures` function as follows:

In [None]:
fabric.list_measures(dataset)


| | Table Name | Measure Name | Measure Expression | Measure Data Type |
|---|---|---|---|---|
| 0 | Store | Average Selling Area Size | `AVERAGE([SellingAreaSize])` | Double |
| 1 | Store | New Stores | `CALCULATE(COUNTA([Store Type]), FILTER(ALL(Sto...` | Int64 |
| 2 | Store | New Stores Target | `14` | Int64 |
| 3 | Store | Total Stores | `COUNTA([StoreNumberName])` | Int64 |
| 4 | Store | Open Store Count | `COUNTA([OpenDate])` | Int64 |
| 5 | Store | Count of OpenDate | `COUNTA('Store'[OpenDate])` | Int64 |
| 6 | Sales | Regular_Sales_Dollars | `SUM([Sum_Regular_Sales_Dollars])` | Double |
| 7 | Sales | Markdown_Sales_Dollars | `SUM([Sum_Markdown_Sales_Dollars])` | Double |
| 8 | Sales | TotalSales | `[Regular_Sales_Dollars]+[Markdown_Sales_Dollars]` | Double |
| 9 | Sales | TotalSalesLY | `CALCULATE([TotalSales], Sales[ScenarioID]=2)` | Double |
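
The `Measure Expression` column shows that some measures are defined in terms of others; for example, `TotalSales` adds two other measures. As a rough sketch (not part of SemPy), the following snippet scans each expression for bracketed names that match another measure's name. The `measures` dictionary is copied by hand from the output above, and the helper name `referenced_measures` is made up for this illustration. The bracket scan is a simplification, not a DAX parser.

```python
import re

# Measure names and expressions copied by hand from the list_measures output above.
measures = {
    "Average Selling Area Size": "AVERAGE([SellingAreaSize])",
    "Total Stores": "COUNTA([StoreNumberName])",
    "Regular_Sales_Dollars": "SUM([Sum_Regular_Sales_Dollars])",
    "Markdown_Sales_Dollars": "SUM([Sum_Markdown_Sales_Dollars])",
    "TotalSales": "[Regular_Sales_Dollars]+[Markdown_Sales_Dollars]",
    "TotalSalesLY": "CALCULATE([TotalSales], Sales[ScenarioID]=2)",
}


def referenced_measures(expression, known_measures):
    """Return bracketed names in the expression that match a known measure name."""
    bracketed = re.findall(r"\[([^\]]+)\]", expression)
    return [name for name in bracketed if name in known_measures]


# Hypothetical helper output: measure name -> measures that its expression mentions.
dependencies = {
    name: referenced_measures(expression, measures)
    for name, expression in measures.items()
}

for name, deps in dependencies.items():
    if deps:
        print(f"{name} depends on: {', '.join(deps)}")
```

Bracketed names that aren't measures, such as the column `SellingAreaSize`, are ignored because they don't appear as keys in `measures`.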


## Evaluate measures

### Evaluate a raw measure

Use SemPy's `evaluate_measure` function to calculate a preconfigured measure named "Average Selling Area Size". You can see the underlying formula for this measure in the output of the previous cell.

In [None]:
fabric.evaluate_measure(dataset, measure="Average Selling Area Size")


| | Average Selling Area Size |
|---|---|
| 0 | 24326.923077 |


### Evaluate a measure with `groupby_columns`

You can group the measure output by certain columns by supplying the additional parameter `groupby_columns`:

In [None]:
fabric.evaluate_measure(dataset, measure="Average Selling Area Size", groupby_columns=["Store[Chain]", "Store[DistrictName]"])


| | Chain | DistrictName | Average Selling Area Size |
|---|---|---|---|
| 0 | Fashions Direct | FD - District #1 | 43888.888889 |
| 1 | Fashions Direct | FD - District #2 | 47777.777778 |
| 2 | Fashions Direct | FD - District #3 | 50000.0 |
| 3 | Fashions Direct | FD - District #4 | 50500.0 |
| 4 | Lindseys | LI - District #1 | 10384.615385 |
| 5 | Lindseys | LI - District #2 | 10909.090909 |
| 6 | Lindseys | LI - District #3 | 10333.333333 |
| 7 | Lindseys | LI - District #4 | 12500.0 |
| 8 | Lindseys | LI - District #5 | 11785.714286 |


In the previous code, you grouped by the columns `Chain` and `DistrictName` of the `Store` table in the semantic model.

### Evaluate a measure with filters

You can also use the `filters` parameter to restrict the result to specific values of particular columns:

In [None]:
fabric.evaluate_measure(
    dataset,
    measure="Total Units Last Year",
    groupby_columns=["Store[Territory]"],
    filters={"Store[Territory]": ["PA", "TN", "VA"], "Store[Chain]": ["Lindseys"]},
)


| | Territory | Total Units Last Year |
|---|---|---|
| 0 | PA | 11309 |
| 1 | TN | 81663 |
| 2 | VA | 160863 |


Note that `Store` is the name of the table, `Territory` is the name of the column, and `PA` is one of the values that are allowed by the filter.
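
Both `groupby_columns` and the keys of `filters` use the `Table[Column]` notation. A minimal sketch of building those arguments from separate table and column names follows. The helpers `qualify` and `build_filters` are hypothetical conveniences, not SemPy functions; the resulting values match the arguments passed in the filter example.

```python
def qualify(table, column):
    """Format a column reference in the Table[Column] notation."""
    return f"{table}[{column}]"


def build_filters(table, allowed_values):
    """Map each column of one table to the list of values that the filter allows."""
    return {
        qualify(table, column): list(values)
        for column, values in allowed_values.items()
    }


# Hypothetical helpers applied to the Store table from the tutorial.
filters = build_filters("Store", {"Territory": ["PA", "TN", "VA"], "Chain": ["Lindseys"]})
groupby_columns = [qualify("Store", "Territory")]

print(filters)
print(groupby_columns)
```

In a Fabric notebook, you could then pass these values as `fabric.evaluate_measure(dataset, measure="Total Units Last Year", groupby_columns=groupby_columns, filters=filters)`.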

### Evaluate a measure across multiple tables

The columns that you pass to `groupby_columns` can come from multiple tables in the semantic model.

In [None]:
fabric.evaluate_measure(dataset, measure="Total Units Last Year", groupby_columns=["Store[Territory]", "Sales[ItemID]"])


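| | Territory | ItemID | Total Units Last Year |
|---|---|---|---|
| 0 | DE | 18049 | 1 |
| 1 | DE | 18069 | 1 |
| 2 | DE | 18079 | 1 |
| 3 | DE | 18085 | 1 |
| 4 | DE | 18087 | 3 |
| ... | ... | ... | ... |
| 178636 | WV | 244167 | 13 |
| 178637 | WV | 244223 | 4 |
| 178638 | WV | 244242 | 2 |
| 178639 | WV | 244246 | 2 |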


### Evaluate multiple measures

The function `evaluate_measure` allows you to supply identifiers of multiple measures and output the calculated values in the same DataFrame:

In [None]:
fabric.evaluate_measure(dataset, measure=["Average Selling Area Size", "Total Stores"], groupby_columns=["Store[Chain]", "Store[DistrictName]"])


| | Chain | DistrictName | Average Selling Area Size | Total Stores |
|---|---|---|---|---|
| 0 | Fashions Direct | FD - District #1 | 43888.888889 | 9 |
| 1 | Fashions Direct | FD - District #2 | 47777.777778 | 9 |
| 2 | Fashions Direct | FD - District #3 | 50000.0 | 9 |
| 3 | Fashions Direct | FD - District #4 | 50500.0 | 10 |
| 4 | Lindseys | LI - District #1 | 10384.615385 | 13 |
| 5 | Lindseys | LI - District #2 | 10909.090909 | 11 |
| 6 | Lindseys | LI - District #3 | 10333.333333 | 15 |
| 7 | Lindseys | LI - District #4 | 12500.0 | 14 |
| 8 | Lindseys | LI - District #5 | 11785.714286 | 14 |
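
Because both measures land in the same row for each group, you can combine them afterward. As an illustration only, the following sketch multiplies the average selling area by the store count to estimate the total selling area per district. It assumes that both measures cover the same set of stores in each district. A few rows are copied from the output above as plain tuples so that the snippet runs without a Fabric session, and the name `total_area` is made up for this example.

```python
# (Chain, DistrictName, Average Selling Area Size, Total Stores), copied from the output above.
rows = [
    ("Fashions Direct", "FD - District #1", 43888.888889, 9),
    ("Fashions Direct", "FD - District #4", 50500.0, 10),
    ("Lindseys", "LI - District #1", 10384.615385, 13),
    ("Lindseys", "LI - District #4", 12500.0, 14),
]

# Average area per store times number of stores estimates the district's total area.
# Rounding absorbs the truncated decimals in the printed averages.
total_area = {
    district: round(average_area * store_count)
    for _chain, district, average_area, store_count in rows
}

for district, area in total_area.items():
    print(f"{district}: {area}")
```

With a DataFrame returned by `evaluate_measure`, the same idea would be a column-wise multiplication of the two measure columns.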


## Use Power BI XMLA connector

The default semantic model client is backed by Power BI's REST APIs. If there are any issues running queries with this client, it's possible to switch the back end to Power BI's XMLA interface using `use_xmla=True`. The SemPy parameters remain the same for measure calculation with XMLA.

In [None]:
fabric.evaluate_measure(
    dataset,
    measure=["Average Selling Area Size", "Total Stores"],
    groupby_columns=["Store[Chain]", "Store[DistrictName]"],
    filters={"Store[Territory]": ["PA", "TN", "VA"], "Store[Chain]": ["Lindseys"]},
    use_xmla=True,
)


| | Chain | DistrictName | Average Selling Area Size | Total Stores |
|---|---|---|---|---|
| 0 | Lindseys | LI - District #2 | 11000 | 10 |
| 1 | Lindseys | LI - District #5 | 12000 | 5 |
| 2 | Lindseys | LI - District #1 | 10000 | 1 |
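
The fallback described in this section can be wrapped in a small helper. The following is a sketch under assumptions, not built-in SemPy behavior: `evaluate_with_fallback` is a hypothetical name, it catches any `Exception` (you might want to narrow that), and it takes the evaluation function as an argument so that the example runs outside Fabric with a stand-in. In a notebook, you would pass `fabric.evaluate_measure` instead of `fake_evaluate`.

```python
def evaluate_with_fallback(evaluate, dataset, **kwargs):
    """Try the default client first; retry once with use_xmla=True if the call fails."""
    try:
        return evaluate(dataset, **kwargs)
    except Exception as error:  # deliberately broad for this sketch
        print(f"Default client failed ({error}); retrying with use_xmla=True")
        retry_kwargs = {**kwargs, "use_xmla": True}
        return evaluate(dataset, **retry_kwargs)


def fake_evaluate(dataset, measure, use_xmla=False):
    """Stand-in for fabric.evaluate_measure that fails unless use_xmla=True."""
    if not use_xmla:
        raise RuntimeError("simulated failure of the default client")
    return f"{measure} for {dataset} via XMLA"


result = evaluate_with_fallback(
    fake_evaluate,
    "Retail Analysis Sample PBIX",
    measure="Total Stores",
)
print(result)
```

Passing the function in, rather than importing `sempy.fabric` inside the helper, keeps the sketch testable without a workspace connection.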


## Related content

Check out other tutorials for semantic link / SemPy:
1. [Clean data with functional dependencies](https://github.com/microsoft/fabric-samples/blob/main/docs-samples/data-science/semantic-link-samples/data_cleaning_functional_dependencies_tutorial.ipynb)
1. [Analyze functional dependencies in a sample semantic model](https://github.com/microsoft/fabric-samples/blob/main/docs-samples/data-science/semantic-link-samples/powerbi_dependencies_tutorial.ipynb)
1. [Discover relationships in the _Synthea_ dataset, using semantic link](https://github.com/microsoft/fabric-samples/blob/main/docs-samples/data-science/semantic-link-samples/relationships_detection_tutorial.ipynb)
1. [Discover relationships in a semantic model, using semantic link](https://github.com/microsoft/fabric-samples/blob/main/docs-samples/data-science/semantic-link-samples/powerbi_relationships_tutorial.ipynb)