semantic-link-overview.md

title: What is semantic link?
description: Overview of semantic link.
ms.reviewer: mopeakande
ms.author: marcozo
author: eisber
reviewer: msakande
ms.topic: overview
ms.custom: ignite-2023
ms.date: 06/06/2023
ms.search.form: semantic link

What is semantic link?

Semantic link is a feature that allows you to establish a connection between semantic models and [!INCLUDE fabric-ds-name] in Microsoft Fabric. Use of semantic link is only supported in Microsoft Fabric.

For Spark 3.4 and above, semantic link is available in the default runtime when using Fabric, and there is no need to install it. If you are using Spark 3.3 or below, or if you want to update to the most recent version of semantic link, run the following command in a notebook cell:

%pip install -U semantic-link
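
To check which version is present before or after running the command, a notebook cell along these lines may help. It is a minimal sketch that uses only the Python standard library. The helper name `installed_version` is hypothetical, and the sketch assumes the distribution is registered under the same name used in the pip command above.

```python
from importlib import metadata


def installed_version(package: str = "semantic-link"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("semantic-link is not installed; run the %pip command above.")
else:
    print(f"semantic-link version: {version}")
```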

The primary goals of semantic link are to facilitate data connectivity, enable the propagation of semantic information, and integrate with established tools used by data scientists, such as notebooks. Semantic link helps you preserve domain knowledge about data semantics in a standardized way, which can speed up data analysis and reduce errors.

Overview of semantic link

The data flow starts with semantic models that contain data and semantic information. Semantic link bridges the gap between Power BI and the Data Science experience.

:::image type="content" source="media/semantic-link-overview/data-flow-with-semantic-link.png" alt-text="A diagram that shows data flow from Power BI to notebooks in Synapse Data Science and back to Power BI.":::

With semantic link, you can use semantic models from Power BI in the Data Science experience to perform tasks such as in-depth statistical analysis and predictive modeling with machine learning techniques. The output of your data science work can be stored in OneLake using Apache Spark and ingested into Power BI using Direct Lake.

Power BI connectivity

Semantic models serve as the single tabular object model, providing a reliable source for semantic definitions, such as Power BI measures. To connect to semantic models:

  • Semantic link offers data connectivity to the Python pandas ecosystem via the SemPy Python library, making it easy for data scientists to work with the data.
  • Semantic link provides access to semantic models through the Spark native connector for data scientists who are more familiar with the Apache Spark ecosystem. This implementation supports various languages, including PySpark, Spark SQL, R, and Scala.
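
As a rough sketch of the pandas route, the function below wraps two calls from the `sempy.fabric` module, `read_table` and `evaluate_measure`. The wrapper itself, its `fabric_module` parameter, and every dataset, table, measure, and column name shown in the comments are hypothetical. Because `sempy.fabric` is only expected to work inside a Fabric notebook, the module is passed in as an optional argument so the logic can be exercised elsewhere with a stand-in object.

```python
def load_table_and_measure(dataset, table, measure, groupby_columns, fabric_module=None):
    """Read one table and evaluate one measure from a semantic model.

    fabric_module defaults to sempy.fabric; it is a parameter only so that
    this sketch can be tested outside Microsoft Fabric.
    """
    if fabric_module is None:
        # Imported lazily: this import is assumed to succeed only in Fabric.
        import sempy.fabric as fabric_module

    # Table contents, returned as a pandas-compatible FabricDataFrame.
    table_df = fabric_module.read_table(dataset, table)

    # Measure values grouped by columns written as "Table[Column]".
    measure_df = fabric_module.evaluate_measure(
        dataset, measure, groupby_columns=groupby_columns
    )
    return table_df, measure_df


# In a Fabric notebook (all names below are placeholders):
# customers, revenue = load_table_and_measure(
#     "Sales Model", "Customer", "Total Revenue", ["Customer[State]"]
# )
```

Both results behave like pandas DataFrames, so the usual pandas operations should apply to them directly.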

Applications of semantic information

Semantic information in data includes Power BI data categories such as address and postal code, relationships between tables, and hierarchical information. These data categories comprise metadata that semantic link propagates into the Data Science environment to enable new experiences and maintain data lineage. Some example applications of semantic link are:

  • Intelligent suggestions of built-in semantic functions.
  • Integration for augmenting data with Power BI measures through the add-measure method.
  • Tools for data quality validation based on the relationships between tables and functional dependencies within tables.
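
To make the last item concrete: a functional dependency from column A to column B holds when each value of A maps to exactly one value of B. The sketch below checks that condition on plain Python rows. It illustrates the idea only and is not semantic link's implementation; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def dependency_violations(rows, determinant, dependent):
    """Return determinant values that map to more than one dependent value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    # Keep only the determinant values with conflicting dependent values.
    return {key: sorted(values) for key, values in seen.items() if len(values) > 1}


rows = [
    {"postal_code": "98052", "city": "Redmond"},
    {"postal_code": "98052", "city": "Redmond"},
    {"postal_code": "98101", "city": "Seattle"},
    {"postal_code": "98101", "city": "Seatle"},  # typo breaks postal_code -> city
]

print(dependency_violations(rows, "postal_code", "city"))
# prints {'98101': ['Seatle', 'Seattle']}
```

A violation like this usually points to a data entry error or to a dependency that does not actually hold in the data.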

Semantic link is a powerful tool that enables business analysts to use data effectively in a comprehensive data science environment. Semantic link facilitates seamless collaboration between data scientists and business analysts by eliminating the need to reimplement business logic embedded in Power BI measures. This approach ensures that both parties can work efficiently and productively, maximizing the potential of their data-driven insights.

FabricDataFrame data structure

FabricDataFrame is the core data structure of semantic link. It subclasses the pandas DataFrame and adds metadata, such as semantic information and lineage. FabricDataFrame is the primary data structure that semantic link uses to propagate semantic information from semantic models into the Data Science environment.

:::image type="content" source="media/semantic-link-overview/semantic-link-overview-fabric-dataframes.png" alt-text="A diagram that shows data flow from connectors to semantic models to FabricDataFrame to Semantic Functions." lightbox="media/semantic-link-overview/semantic-link-overview-fabric-dataframes.png":::

FabricDataFrame supports all pandas operations and more. It exposes semantic functions and the add-measure method that enable you to use Power BI measures in your data science work.
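
A minimal sketch of the add-measure step follows, assuming the frame exposes an `add_measure` method that accepts measure names plus a `dataset` keyword, as the SemPy `FabricDataFrame` is understood to do. The wrapper name `with_measures` and the names in the usage note are hypothetical.

```python
def with_measures(fabric_df, dataset, *measures):
    """Return a frame with one extra column per named Power BI measure."""
    if not hasattr(fabric_df, "add_measure"):
        # A plain pandas DataFrame carries no semantic metadata and has
        # no add_measure method, so fail early with a clear message.
        raise TypeError("expected a FabricDataFrame with an add_measure method")
    # Assumption: add_measure(*measures, dataset=...) returns a new frame.
    return fabric_df.add_measure(*measures, dataset=dataset)
```

In a Fabric notebook, a call such as `with_measures(df, "Sales Model", "Total Revenue")` would be applied to a `FabricDataFrame` whose columns come from that semantic model. The measure is then computed by the model itself rather than reimplemented in Python.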

Related content