### Microsoft Fabric Book - Azure AI Services demonstration

#### Analyze hotel reviews with AI in Fabric

*Imagine you’re planning a trip to Yellowstone National Park and need to book a hotel that meets your preferences. With hundreds of reviews to sift through, it can be overwhelming to find the most relevant information and filter hotels accordingly. However, with Fabric, you can easily translate, extract, and classify hotel reviews with zero setup effort using prebuilt AI services. Then, with Power BI, you can create a visual report that allows you to filter hotels by categories and view their ratings and comments.*

**SynapseML Installation**

We use [SynapseML](https://microsoft.github.io/SynapseML/docs/Overview/) to analyze the hotel reviews. SynapseML is an open-source library that simplifies building large-scale machine learning pipelines. Fabric has the latest SynapseML package preinstalled and integrated with prebuilt AI models, so the services used below need no extra setup.

In [1]:
import synapse.ml.core
from synapse.ml.services import *
from pyspark.sql.functions import col, flatten, udf, lower, trim
from pyspark.sql.types import StringType

In [18]:
df = spark.sql("SELECT * FROM LakeDBIA.hotel_reviews_demo WHERE name LIKE '%Best Western%'")
display(df)

In [19]:
filtered_df = df.filter(df.reviews_text.like("%el hotel%"))
display(filtered_df)

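The `like("%el hotel%")` filter above is presumably a quick way to surface Spanish-language reviews. As a rough illustration of what that pattern matches, the sketch below mimics SQL `LIKE` in plain Python. `sql_like` is a hypothetical helper (not part of PySpark), the sample reviews are made up, and escape characters are ignored for simplicity.

```python
import re

def sql_like(text: str, pattern: str) -> bool:
    """Approximate SQL LIKE: % matches any run of characters, _ matches one."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), text, flags=re.DOTALL) is not None

# Made-up sample reviews, for illustration only.
reviews = [
    "Nos gustó mucho el hotel y la piscina.",
    "El hotel estaba limpio.",                  # capital "E" does not match
    "A decent travel hotel near the highway.",  # "travel hotel" does match
]

matches = [r for r in reviews if sql_like(r, "%el hotel%")]
```

The match is a case-sensitive substring test, so it can miss reviews that start with "El hotel" and can also hit English text such as "travel hotel". It is a heuristic for browsing the data, not language detection.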

**Hotel reviews in multiple languages, Text Translation using Azure AI Translator**

Translating text in SynapseML takes a single transformer: configure `Translate()` and call its `transform()` method on the DataFrame.

In [20]:
translate = (Translate()
    .setTextCol("reviews_text")
    .setToLanguage("en")
    .setOutputCol("translation")
    .setConcurrency(5))

df_en = translate.transform(df)\
        .withColumn("translation_result", flatten(col("translation.translations")))\
        .withColumn("translation", col("translation_result.text")[0])\
        .cache()

df_en = df_en.select(df_en.columns[:6] + ["translation"])

display(df_en.tail(10))

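To see why the translation cell uses `flatten` followed by `[0]`, it helps to picture the shape of the `translation` column. The sketch below mirrors that shape with plain Python lists and dicts. The nesting is an assumption inferred from the column paths used in the cell (`translation.translations`, then `.text`), and the sample values are made up.

```python
from itertools import chain

# Assumed shape of one row's `translation` value: a list of result objects,
# each carrying its own list of translations (one per target language).
translation = [
    {"translations": [{"text": "We stayed three nights.", "to": "en"}]},
]

# col("translation.translations") -> one list of translations per result
nested = [result["translations"] for result in translation]

# flatten(...) -> a single flat list of translation objects
translation_result = list(chain.from_iterable(nested))

# col("translation_result.text")[0] -> the first translated string
texts = [t["text"] for t in translation_result]
first_text = texts[0]
```

Because only one target language (`"en"`) is requested, taking element `[0]` should yield the single English translation for each review.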

In [21]:
filtered_df_en = df_en.filter(df_en.reviews_text.like("%el hotel%"))
display(filtered_df_en)

**Key Phrase and Sentiment Extraction using Azure Text Analytics**

[Azure AI Language](https://azure.microsoft.com/products/ai-services/text-analytics/) is a cloud-based service that lets you understand and analyze text with natural language processing (NLP) features.

With the prebuilt Azure AI Language models in Fabric, you can analyze sentiment, extract key phrases, and redact sensitive information from the input text.

In [22]:
model = (AnalyzeText()
        .setTextCol("translation")
        .setKind("KeyPhraseExtraction")
        .setOutputCol("response"))

df_en_key = model.transform(df_en)\
        .withColumn("documents", col("response.documents"))\
        .withColumn("keyPhrases", col("documents.keyPhrases"))\
        .cache()

df_en_key = df_en_key.select(df_en_key.columns[:7] + ["keyPhrases"])

display(df_en_key.tail(5))

In [24]:
model = (AnalyzeText()
        .setTextCol("translation")
        .setKind("SentimentAnalysis")
        .setOutputCol("response"))

df_en_key_w_sentiment = model.transform(df_en_key)\
        .withColumn("documents", col("response.documents"))\
        .withColumn("sentiment", col("documents.sentiment"))

df_en_key_w_sentiment = df_en_key_w_sentiment.select(df_en_key_w_sentiment.columns[:8] + ["sentiment"])

df_en_key_w_sentiment.show(20)

+---------------+--------+----------+--------------------+--------------+--------------------+--------------------+--------------------+---------+
|           city|latitude| longitude|                name|reviews_rating|        reviews_text|         translation|          keyPhrases|sentiment|
+---------------+--------+----------+--------------------+--------------+--------------------+--------------------+--------------------+---------+
|       San Jose|34.44178|-119.81979|Best Western Plus...|             3|This hotel was ni...|This hotel was ni...|[The Plus categor...|    mixed|
|  San Francisco|36.55722|-121.92194|Best Western Carm...|             4|Nous avons séjour...|We stayed in the ...|[three night day,...|    mixed|
|Prescott Valley|36.55722|-121.92194|Best Western Carm...|             3|Parking was horri...|Parking was horri...|[rental car, vend...| negative|
|       Guaynabo|36.55722|-121.92194|Best Western Carm...|             5|Non economico ma ...|Not cheap but exc...|[go
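Each step above keeps its columns by position (`columns[:6]`, `[:7]`, `[:8]`) and then appends the newly derived column by name. Using the column names visible in the output table, the plain-Python sketch below traces which columns survive each step. It only illustrates the list slicing and does not call Spark; it assumes the source table's first six columns are the ones shown.

```python
# Column names as they appear in the output table above.
source_columns = [
    "city", "latitude", "longitude", "name", "reviews_rating", "reviews_text",
]

# After translation: the first six columns plus the new one.
after_translate = source_columns[:6] + ["translation"]

# After key phrase extraction: the first seven plus "keyPhrases".
after_key_phrases = after_translate[:7] + ["keyPhrases"]

# After sentiment analysis: the first eight plus "sentiment".
after_sentiment = after_key_phrases[:8] + ["sentiment"]
```

Slicing by position also drops the intermediate helper columns (such as `response` and `documents`) that each transform appends. Selecting columns by name would be a more robust alternative if the source schema might change.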

In [25]:
spark.conf.set('spark.sql.parquet.vorder.enabled', 'true')

(df_en_key_w_sentiment
.write
.mode("overwrite")
.format("delta")
.option("parquet.vorder.enabled", "true")
.saveAsTable("hotel_review_AI_services")
)

In [26]:
df_result = spark.sql("SELECT * FROM LakeDBIA.hotel_review_AI_services")
display(df_result)
