### Microsoft Fabric Book - Azure AI Services demonstration

#### Analyze hotel reviews with AI in Fabric

*Imagine you’re planning a trip to Yellowstone National Park and need to book a hotel that meets your preferences. With hundreds of reviews to sift through, it can be overwhelming to find the most relevant information and filter hotels accordingly. However, with Fabric, you can easily translate, extract, and classify hotel reviews with zero setup effort using prebuilt AI services. Then, with Power BI, you can create a visual report that allows you to filter hotels by categories and view their ratings and comments.*

**SynapseML Installation**

We will use [SynapseML](https://microsoft.github.io/SynapseML/docs/Overview/) to analyze hotel reviews. SynapseML is an open-source library that makes it easy to build large-scale machine learning pipelines. Fabric has the latest SynapseML package preinstalled and integrated with prebuilt AI models, making it a breeze to create smart and scalable systems for various domains.

In [14]:
import synapse.ml.core
from synapse.ml.services import *
from pyspark.sql.functions import col, flatten, udf, lower, trim
from pyspark.sql.types import StringType

In [15]:
df = spark.sql("SELECT * FROM LakeDBIA.hotelsreviewsdatas LIMIT 1000")
display(df)

In [16]:
filtered_df = df.filter(df.reviews_text.like("%el hotel%"))
display(filtered_df)

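
The filter above uses a SQL `LIKE` pattern, which Spark matches case-sensitively. The following is a minimal plain-Python sketch of that matching rule, not Spark's implementation: the helper `sql_like` and the sample reviews are hypothetical, escape characters are ignored, and the reading that `%el hotel%` is meant to surface Spanish-language reviews is an assumption.

```python
import re


def sql_like(value: str, pattern: str) -> bool:
    """Hypothetical helper: emulate SQL LIKE, where % matches any run of
    characters and _ matches exactly one character (escapes not handled)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None


# Made-up reviews, for illustration only.
reviews = [
    "Nos encantó el hotel, muy limpio.",
    "El hotel estaba lejos del centro.",
    "The hotel was close to the park.",
]

# Only the first review contains the lowercase substring "el hotel".
matches = [r for r in reviews if sql_like(r, "%el hotel%")]
print(matches)
```

Because the match is case-sensitive, a review that starts with "El hotel" is skipped. If that matters, one option is to lowercase the column first, for example `lower(col("reviews_text")).like("%el hotel%")`, using the `lower` function already imported in the first cell.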

**Hotel reviews in multiple languages, Text Translation using Azure AI Translator**

Translating text in SynapseML takes a single transformer: configure `Translate()` with the input column, the target language, and the output column, then call `transform()` on the DataFrame.
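
The translation cell then flattens the nested result and keeps the first translated text. The sketch below mimics that post-processing in plain Python. It assumes the response for one row is a list of results, each holding a `translations` list whose entries carry a `text` field (inferred from the column paths `translation.translations` and `translation_result.text` used in the cell); the sample payload and the helper `first_translation` are hypothetical.

```python
from itertools import chain
from typing import Optional

# Hypothetical response for a single review, shaped after the column paths
# used in the notebook (translation.translations -> text).
sample_response = [
    {"translations": [{"text": "The hotel was very clean.", "to": "en"}]}
]


def first_translation(response: list) -> Optional[str]:
    """Flatten the nested translations and return the first text, if any."""
    # Equivalent in spirit to flatten(col("translation.translations")).
    nested = [item.get("translations", []) for item in response]
    flat = list(chain.from_iterable(nested))
    # Equivalent in spirit to col("translation_result.text")[0].
    texts = [entry["text"] for entry in flat]
    return texts[0] if texts else None


print(first_translation(sample_response))
```

Only one target language is requested here, so taking the first element is enough; with several target languages the flattened list would hold one entry per language.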

In [17]:
translate = (Translate()
    .setTextCol("reviews_text")
    .setToLanguage("en")
    .setOutputCol("translation")
    .setConcurrency(5))

df_en = translate.transform(df)\
        .withColumn("translation_result", flatten(col("translation.translations")))\
        .withColumn("translation", col("translation_result.text")[0])\
        .cache()

df_en = df_en.select(df_en.columns[:6]+ ["translation"])

display(df_en.tail(10))

In [18]:
filtered_df_en = df_en.filter(df_en.reviews_text.like("%el hotel%"))
display(filtered_df_en)

**Key Phrase and Sentiment Extraction using Azure AI Language**

[Azure AI Language](https://azure.microsoft.com/products/ai-services/text-analytics/) is a cloud-based service that lets you understand and analyze text with Natural Language Processing (NLP) features.

By using the prebuilt Azure AI Language service in Fabric, you can analyze sentiment, extract key phrases, and redact sensitive information from the input text.
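
The cells that follow pull `keyPhrases` and `sentiment` out of a nested `response` column. As a rough illustration of that navigation, here is a plain-Python sketch. The payloads are hypothetical and only mirror the column paths used in the cells (`response.documents.keyPhrases` and `response.documents.sentiment`); the helper `get_path` is not part of SynapseML, and returning `None` for a missing key is a simplification of how Spark handles nested fields.

```python
from typing import Any

# Hypothetical per-review responses, one per analysis kind.
key_phrase_row = {
    "response": {"documents": {"keyPhrases": ["noisy room", "friendly staff"]}}
}
sentiment_row = {"response": {"documents": {"sentiment": "mixed"}}}


def get_path(record: Any, path: str) -> Any:
    """Follow a dotted path through nested dicts; return None if a key is missing."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


print(get_path(key_phrase_row, "response.documents.keyPhrases"))
print(get_path(sentiment_row, "response.documents.sentiment"))
```

Each analysis kind returns its own response, which is why the notebook runs `AnalyzeText` twice and carries the extracted column forward between the two steps.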

In [45]:
model = (AnalyzeText()
        .setTextCol("translation")
        .setKind("KeyPhraseExtraction")
        .setOutputCol("response"))

df_en_key = model.transform(df_en)\
        .withColumn("documents", col("response.documents"))\
        .withColumn("keyPhrases", col("documents.keyPhrases"))\
        .cache()

df_en_key = df_en_key.select(df_en_key.columns[:7]+ ["keyPhrases"])

display(df_en_key.tail(5))

In [49]:
model = (AnalyzeText()
        .setTextCol("translation")
        .setKind("SentimentAnalysis")
        .setOutputCol("response"))

df_en_key_w_sentiment = model.transform(df_en_key)\
        .withColumn("documents", col("response.documents"))\
        .withColumn("sentiment", col("documents.sentiment"))

df_en_key_w_sentiment = df_en_key_w_sentiment.select(df_en_key_w_sentiment.columns[:8]+ ["sentiment"])

df_en_key_w_sentiment.show(20)

+-------------+---------+----------+--------------------+--------------+--------------------+--------------------+--------------------+---------+
|         city| latitude| longitude|                name|reviews_rating|        reviews_text|         translation|          keyPhrases|sentiment|
+-------------+---------+----------+--------------------+--------------+--------------------+--------------------+--------------------+---------+
|         Reno| 36.55722|-121.92194|Best Western Carm...|             2|If you get the ro...|If you get the ro...|[noisy, smelly ba...|    mixed|
|    Arlington|29.949125|-90.069748|   The Whitney Hotel|             2|This Hotel, forme...|This Hotel, forme...|[Casey A. Callais...|    mixed|
|      Houston|29.949125|-90.069748|   The Whitney Hotel|             2|The hotel was alr...|The hotel was alr...|[Casey A. Callais...|    mixed|
|     Hossegor|29.949125|-90.069748|   The Whitney Hotel|             2|Nice locationGood...|Nice locationGood...|[Nice loca

In [52]:
spark.conf.set('spark.sql.parquet.vorder.enabled', 'true')

(df_en_key_w_sentiment
.write
.mode("overwrite")
.format("delta")
.option("parquet.vorder.enabled", "true")
.saveAsTable("hotel_review_AI_services")
)

In [54]:
df_result = spark.sql("SELECT * FROM LakeDBIA.hotel_review_AI_services")
display(df_result)
