# Transform and enrich data seamlessly with AI functions (Preview)

[Transform and enrich data seamlessly with AI functions - Microsoft Fabric | Microsoft Learn](https://learn.microsoft.com/en-us/fabric/data-science/ai-functions/overview?tabs=pandas#getting-started-with-ai-functions)

Use of the AI functions library in a Fabric notebook currently requires certain custom packages. The following code installs and imports those packages. Afterward, you can use AI functions with pandas or PySpark, depending on your preference.

In [1]:
# Install pinned versions of packages
%pip install -q openai==1.30
%pip install -q --force-reinstall httpx==0.27.0

# Install SynapseML-core (pinned here to 1.0.9) from blob storage
%pip install -q --force-reinstall https://mmlspark.blob.core.windows.net/pip/1.0.9/synapseml_core-1.0.9-py2.py3-none-any.whl

# Install SynapseML-Internal .whl with AI functions library from blob storage:
%pip install -q --force-reinstall https://mmlspark.blob.core.windows.net/pip/1.0.10.0-spark3.4-6-0cb46b55-SNAPSHOT/synapseml_internal-1.0.10.0.dev1-py2.py3-none-any.whl
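
Because the install cell forces specific versions, it can help to confirm what actually ended up in the environment before importing anything. The following is a minimal sketch using only the standard library; `EXPECTED_PINS` and `check_pins` are hypothetical names introduced for illustration, and the pins are copied from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install cell; extend the dict as needed (hypothetical helper data).
EXPECTED_PINS = {
    "openai": "1.30",
    "httpx": "0.27.0",
}


def check_pins(pins):
    """Return {package: (expected, installed)}; installed is None if the package is absent."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (expected, installed)
    return report


for name, (expected, installed) in check_pins(EXPECTED_PINS).items():
    # "1.30" and "1.30.0" denote the same release, so compare by prefix rather than equality.
    ok = installed is not None and installed.startswith(expected)
    print(f"{name}: expected {expected}, found {installed} -> {'OK' if ok else 'CHECK'}")
```

A `CHECK` result suggests rerunning the install cell or restarting the kernel so that the newly installed packages are picked up.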

In [12]:
# Required imports
import synapse.ml.aifunc as aifunc
import pandas as pd
import openai

# Optional import for progress bars
from tqdm.auto import tqdm
tqdm.pandas()

### Calculate similarity with `ai.similarity`

The `ai.similarity` function invokes AI to compare input text values with a single common text value, or with pairwise text values in another column. The output similarity scores are relative, and they can range from **-1** (opposites) to **1** (identical). A score of **0** indicates that the values are completely unrelated in meaning. For more detailed instructions about the use of `ai.similarity`, visit [this article](https://learn.microsoft.com/en-us/fabric/data-science/ai-functions/similarity).
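
To build intuition for the score range, the sketch below computes cosine similarity on hand-made vectors. It is an illustration only: nothing here states how `ai.similarity` derives its scores, and cosine similarity is used merely as a familiar measure that shares the -1 to 1 range. The function name and the vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


reference = [1.0, 2.0]
print(cosine_similarity(reference, [2.0, 4.0]))    # same direction: close to 1
print(cosine_similarity(reference, [-1.0, -2.0]))  # opposite direction: close to -1
print(cosine_similarity(reference, [2.0, -1.0]))   # perpendicular: 0
```

Read the three cases as analogues of "identical", "opposite", and "unrelated" in the description above.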

In [13]:
# This code uses AI. Always review output for mistakes. 
# Read terms: https://azure.microsoft.com/support/legal/preview-supplemental-terms/

df = pd.DataFrame([ 
        ("Bill Gates", "Microsoft"), 
        ("Satya Nadella", "Toyota"), 
        ("Joan of Arc", "Nike") 
    ], columns=["names", "companies"])
    
df["similarity"] = df["companies"].ai.similarity(df["names"])
display(df)

### Categorize text with `ai.classify`

The `ai.classify` function invokes AI to categorize input text according to custom labels you choose. For more information about the use of `ai.classify`, visit [this article](https://learn.microsoft.com/en-us/fabric/data-science/ai-functions/classify).

In [14]:
# This code uses AI. Always review output for mistakes. 
# Read terms: https://azure.microsoft.com/support/legal/preview-supplemental-terms/

df = pd.DataFrame([
        "This duvet, lovingly hand-crafted from all-natural fabric, is perfect for a good night's sleep.",
        "Tired of friends judging your baking? With these handy-dandy measuring cups, you'll create culinary delights.",
        "Enjoy this *BRAND NEW CAR!* A compact SUV perfect for the professional commuter!"
    ], columns=["descriptions"])

df["category"] = df['descriptions'].ai.classify("kitchen", "bedroom", "garage", "other")
display(df)
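
Since AI output should always be reviewed, one simple check is to confirm that every returned category is one of the labels you supplied. The sketch below uses plain pandas on a hand-written stand-in for the `category` column; the sample values, the `needs_review` column, and the normalization step are assumptions for illustration, not observed `ai.classify` behavior.

```python
import pandas as pd

labels = ["kitchen", "bedroom", "garage", "other"]

# Hand-written stand-in for a "category" column; not real ai.classify output.
df = pd.DataFrame({
    "descriptions": ["duvet", "measuring cups", "compact SUV"],
    "category": ["bedroom", "Kitchen ", None],
})

# Normalize case and whitespace before comparing against the allowed labels.
normalized = df["category"].str.strip().str.lower()

# Missing values and unexpected labels are both flagged for manual review.
df["needs_review"] = ~normalized.isin(labels)
print(df)
```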

### Detect sentiment with `ai.analyze_sentiment`

The `ai.analyze_sentiment` function invokes AI to identify whether the emotional state expressed by input text is positive, negative, mixed, or neutral. If AI can't make this determination, the output is left blank. For more detailed instructions about the use of `ai.analyze_sentiment`, visit [this article](https://learn.microsoft.com/en-us/fabric/data-science/ai-functions/analyze-sentiment).

In [15]:
# This code uses AI. Always review output for mistakes. 
# Read terms: https://azure.microsoft.com/support/legal/preview-supplemental-terms/

df = pd.DataFrame([
        "The cleaning spray permanently stained my beautiful kitchen counter. Never again!",
        "I used this sunscreen on my vacation to Florida, and I didn't get burned at all. Would recommend.",
        "I'm torn about this speaker system. The sound was high quality, though it didn't connect to my roommate's phone.",
        "The umbrella is OK, I guess."
    ], columns=["reviews"])

df["sentiment"] = df["reviews"].ai.analyze_sentiment()
display(df)
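
Because the output is left blank when no sentiment can be determined, downstream code usually needs to separate labeled rows from unlabeled ones. The following is a minimal pandas sketch on hand-written data. Whether a blank arrives as a missing value or as an empty string is an assumption, so the sketch treats both the same way.

```python
import pandas as pd

# Hand-written stand-in for a "sentiment" column; not real ai.analyze_sentiment output.
# Both None and "" are included because the exact form of a blank is an assumption.
df = pd.DataFrame({
    "reviews": ["review A", "review B", "review C", "review D"],
    "sentiment": ["negative", "positive", None, ""],
})

# Flag rows where no sentiment was returned.
is_blank = df["sentiment"].fillna("").str.strip().eq("")

labeled = df[~is_blank]
unlabeled = df[is_blank]
print(labeled["sentiment"].value_counts())
print(f"{len(unlabeled)} row(s) need manual review")
```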

### Extract entities with `ai.extract`

The `ai.extract` function invokes AI to scan input text and extract specific types of information designated by labels you choose—for example, locations or names. For more detailed instructions about the use of `ai.extract`, visit [this article](https://learn.microsoft.com/en-us/fabric/data-science/ai-functions/extract).

In [16]:
# This code uses AI. Always review output for mistakes. 
# Read terms: https://azure.microsoft.com/support/legal/preview-supplemental-terms/

df = pd.DataFrame([
        "MJ Lee lives in Tucson, AZ, and works as a software engineer for Microsoft.",
        "Kris Turner, a nurse at NYU Langone, is a resident of Jersey City, New Jersey."
    ], columns=["descriptions"])

df_entities = df["descriptions"].ai.extract("name", "profession", "city")
display(df_entities)

### Fix grammar with `ai.fix_grammar`

The `ai.fix_grammar` function invokes AI to correct the spelling, grammar, and punctuation of input text. For more detailed instructions about the use of `ai.fix_grammar`, visit [this article](https://learn.microsoft.com/en-us/fabric/data-science/ai-functions/fix-grammar).

In [None]:
# This code uses AI. Always review output for mistakes. 
# Read terms: https://azure.microsoft.com/support/legal/preview-supplemental-terms/

df = pd.DataFrame([
        "There are an error here.",
        "She and me go weigh back. We used to hang out every weeks.",
        "The big picture are right, but you're details is all wrong."
    ], columns=["text"])

df["corrections"] = df["text"].ai.fix_grammar()
display(df)
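
When reviewing corrections, a word-level diff makes it easy to see exactly what changed in each row. The sketch below uses the standard library's `difflib`; `word_diff` is a hypothetical helper, and the corrected sentence is written by hand for illustration rather than taken from `ai.fix_grammar`.

```python
import difflib


def word_diff(original, corrected):
    """Return the words removed from and added to `original`, in order."""
    diff = list(difflib.ndiff(original.split(), corrected.split()))
    removed = [token[2:] for token in diff if token.startswith("- ")]
    added = [token[2:] for token in diff if token.startswith("+ ")]
    return removed, added


# Hand-written correction for illustration; not real ai.fix_grammar output.
original = "There are an error here."
corrected = "There is an error here."

removed, added = word_diff(original, corrected)
print("removed:", removed)
print("added:", added)
```

Applied to the `text` and `corrections` columns row by row, a helper like this could highlight rows where the change goes beyond spelling, grammar, or punctuation.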

### Summarize text with `ai.summarize`

The `ai.summarize` function invokes AI to generate summaries of input text (either values from a single column of a DataFrame, or row values across all the columns). For more detailed instructions about the use of `ai.summarize`, visit [this dedicated article](https://learn.microsoft.com/en-us/fabric/data-science/ai-functions/summarize).

In [17]:
# This code uses AI. Always review output for mistakes. 
# Read terms: https://azure.microsoft.com/support/legal/preview-supplemental-terms/

df = pd.DataFrame([
        ("Microsoft Teams", "2017",
        """
        The ultimate messaging app for your organization—a workspace for real-time 
        collaboration and communication, meetings, file and app sharing, and even the 
        occasional emoji! All in one place, all in the open, all accessible to everyone.
        """),
        ("Microsoft Fabric", "2023",
        """
        An enterprise-ready, end-to-end analytics platform that unifies data movement, 
        data processing, ingestion, transformation, and report building into a seamless, 
        user-friendly SaaS experience. Transform raw data into actionable insights.
        """)
    ], columns=["product", "release_year", "description"])

df["summaries"] = df["description"].ai.summarize()
display(df)

### Translate text with `ai.translate`

The `ai.translate` function invokes AI to translate input text to a new language of your choice. For more detailed instructions about the use of `ai.translate`, visit [this article](https://learn.microsoft.com/en-us/fabric/data-science/ai-functions/translate).

In [18]:
# This code uses AI. Always review output for mistakes. 
# Read terms: https://azure.microsoft.com/support/legal/preview-supplemental-terms/

df = pd.DataFrame([
        "Hello! How are you doing today?", 
        "Tell me what you'd like to know, and I'll do my best to help.", 
        "The only thing we have to fear is fear itself."
    ], columns=["text"])

df["translations"] = df["text"].ai.translate("french")
display(df)

### Answer custom user prompts with `ai.generate_response`

The `ai.generate_response` function invokes AI to generate custom text based on your own instructions. For more detailed instructions about the use of `ai.generate_response`, visit [this article](https://learn.microsoft.com/en-us/fabric/data-science/ai-functions/generate-response).

In [19]:
# This code uses AI. Always review output for mistakes. 
# Read terms: https://azure.microsoft.com/support/legal/preview-supplemental-terms/

df = pd.DataFrame([
        "Scarves",
        "Snow pants",
        "Ski goggles"
    ], columns=["product"])

df["response"] = df.ai.generate_response("Write a short, punchy email subject line for a winter sale.")
display(df)
