# Sentiment Analysis Using a Local Model Service

We use the `/sentiment` endpoint of our model API service to analyze the reviews and assign each one a sentiment value.

The endpoint uses the [nlptown/bert-base-multilingual-uncased-sentiment](https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment) model, downloaded from Hugging Face.
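The model is published as a star-rating classifier with labels of the form `1 star` to `5 stars`, while the outputs further down in this notebook are names such as `Positive` and `Very Positive`. The service therefore presumably translates star labels into names. The sketch below shows one way that translation could look; the mapping table and the function name `star_label_to_sentiment` are assumptions for illustration, not the service's actual code.

```python
# Hypothetical mapping from star ratings to sentiment names.
# Only "Positive" and "Very Positive" are observed in this notebook;
# the remaining names are assumed.
STAR_TO_SENTIMENT = {
    1: "Very Negative",
    2: "Negative",
    3: "Neutral",
    4: "Positive",
    5: "Very Positive",
}


def star_label_to_sentiment(label: str) -> str:
    """Convert a label such as '4 stars' into a sentiment name."""
    stars = int(label.split()[0])  # leading token is the star count
    return STAR_TO_SENTIMENT[stars]
```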

In [13]:
import sys
import pandas as pd
from pyspark.sql import SparkSession,  DataFrame, functions as F

In [14]:
import requests
import json

def post_json_request(url, payload, additional_headers=None, timeout=30):
    """POST `payload` as JSON to `url` and return the `requests.Response`."""
    # Default headers for a JSON request/response exchange
    headers = {
        'Content-Type': 'application/json',
        'Accept': 'application/json',
    }

    # Merge with additional headers if provided
    if additional_headers:
        headers.update(additional_headers)

    payload_json = json.dumps(payload)

    # Perform the POST request; the timeout keeps a stalled service
    # from blocking the caller indefinitely
    response = requests.post(url, data=payload_json, headers=headers, timeout=timeout)

    # Return the response
    return response



In [15]:
spark = SparkSession.builder.appName("sentiment_analysis").getOrCreate()

24/01/23 01:28:44 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
24/01/23 01:28:44 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.


In [16]:
df = spark.read.table("lakehouse.kaggle_airbnb.reviews")

In [17]:
df_10 = df.limit(10)
df_10.toPandas().head()

                                                                                

| | listing_id | id | date | reviewer_id | reviewer_name | comments |
|---|---|---|---|---|---|---|
| 0 | 7202016 | 38917982 | 2015-07-19 | 28943674 | Bianca | Cute and cozy place. Perfect location to every... |
| 1 | 7202016 | 39087409 | 2015-07-20 | 32440555 | Frank | Kelly has a great room in a very central locat... |
| 2 | 7202016 | 39820030 | 2015-07-26 | 37722850 | Ian | Very spacious apartment, and in a great neighb... |
| 3 | 7202016 | 40813543 | 2015-08-02 | 33671805 | George | Close to Seattle Center and all it has to offe... |
| 4 | 7202016 | 41986501 | 2015-08-10 | 34959538 | Ming | Kelly was a great host and very accommodating ... |


In [18]:
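def get_sentiment(review: str) -> str:
    """Send one review to the sentiment endpoint and return its label."""
    URL = "http://model-api-svc.models.svc.cluster.local:8000/api/v1/models/sentiment"
    request = {"text": review}
    # The service replies with JSON; the label is stored under the 'result' key
    return post_json_request(URL, request).json()['result']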
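`get_sentiment` assumes every review is a non-empty string and that every request succeeds. A more defensive variant might return `None` for missing text or failed calls, so that one bad row does not fail the whole Spark job. The sketch below is one possible shape; `safe_sentiment` and its `post` parameter are hypothetical names, where `post` stands for any callable that sends the payload and returns the decoded JSON body.

```python
from typing import Callable, Optional


def safe_sentiment(
    review: Optional[str],
    post: Callable[[dict], dict],
) -> Optional[str]:
    """Return the sentiment for one review, or None if it cannot be scored."""
    # Skip null or whitespace-only reviews without calling the service
    if review is None or not review.strip():
        return None
    try:
        body = post({"text": review})
    except Exception:
        # Treat any transport or decoding failure as "no sentiment"
        return None
    # A missing 'result' key also yields None
    return body.get("result")
```

With the helpers from this notebook, `post` could be something like `lambda p: post_json_request(URL, p).json()`.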
    

In [19]:
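from pyspark.sql.functions import col, udf

# Wrap get_sentiment as a Spark UDF (return type defaults to StringType).
# Note that this issues one HTTP request per row.
get_sentiment_udf = udf(get_sentiment)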


In [20]:
df_10_sentiments = df_10.withColumn('review_sentiment', get_sentiment_udf(col("comments")))
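Because the UDF calls the service once per row, scoring the full reviews table could mean a very large number of HTTP requests. One possible mitigation is to process rows a partition at a time and reuse results for repeated comments within that partition. The sketch below shows the idea in plain Python; `score_partition` and its `score` parameter are hypothetical names, and `score` can be any function that maps a review to a label, such as `get_sentiment`.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple


def score_partition(
    comments: Iterable[str],
    score: Callable[[str], str],
) -> Iterator[Tuple[str, str]]:
    """Yield (comment, sentiment) pairs, scoring each distinct comment once."""
    cache: Dict[str, str] = {}  # per-partition cache of results
    for text in comments:
        if text not in cache:
            cache[text] = score(text)  # only call the service on a cache miss
        yield text, cache[text]
```

In Spark this could be applied with something like `df.rdd.mapPartitions(lambda rows: score_partition((r["comments"] for r in rows), get_sentiment))`, which returns an RDD of pairs rather than a DataFrame column.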

In [21]:
df_10_sentiments.select('comments', 'review_sentiment').toPandas().head()

                                                                                

| | comments | review_sentiment |
|---|---|---|
| 0 | Cute and cozy place. Perfect location to every... | Very Positive |
| 1 | Kelly has a great room in a very central locat... | Very Positive |
| 2 | Very spacious apartment, and in a great neighb... | Positive |
| 3 | Close to Seattle Center and all it has to offe... | Very Positive |
| 4 | Kelly was a great host and very accommodating ... | Very Positive |


In [None]:
df_10_sentiments.select('comments', 'review_sentiment').show(truncate=False)


In [None]:
spark.stop()