# TUTORIAL: Test Lettria model for sentiment analysis with AI notebooks

The aim is to analyze the sentiment of e-commerce site reviews with the Lettria model.

- **What is a Lettria model?**

Lettria is a start-up specializing in NLP (Natural Language Processing). Its platform enables organizations of all sizes, from start-ups to large corporations, to run textual analysis on their data and make better-informed strategic decisions.

Lettria provides text-understanding models that let users identify and extract key information from their text. These models rely on artificial intelligence and NLP techniques to extract **sentiments**, emotions and entities from a text.

### Code:
- Install dependencies
- Import Python libraries
- Load test dataset from Hugging Face hub and create a dataframe
- Use Lettria app for sentiment analysis

## Step 1 - Install dependencies

In [None]:
!pip install datasets scikit-learn

## Step 2 - Import Python libraries

In [1]:
# import dependencies
from datasets import load_dataset

import time
import json
import requests

import pandas as pd
import numpy as np

from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score

## Step 3 - Load test dataset from Hugging Face hub and create a dataframe

- **Load test dataset and process output**

In [2]:
# load test dataset function (https://huggingface.co/datasets/saattrupdan/womens-clothing-ecommerce-reviews)
def womens_clothing_ecommerce_reviews():

    # download the test split from the Hugging Face hub
    dataset = load_dataset("saattrupdan/womens-clothing-ecommerce-reviews", split="test")

    # extract the needed fields and collect them in a list of dicts
    dataset_test = []
    for row in dataset:
        info = {}
        # extract sentence (str)
        info['review_text'] = row['review_text']
        # extract rating (int 1, 2, 3, 4, 5) -> (int 0, 1, 2, 3, 4)
        info['rating'] = row['rating'] - 1
        dataset_test.append(info)

    return dataset_test

- **Use function and save data as json file**

In [3]:
# load test dataset
print("The test dataset starts loading...")
test_set = womens_clothing_ecommerce_reviews()

# create json file
dataset_test_json = '/workspace/data/dataset_test.json'
with open(dataset_test_json, 'w') as json_file:
    json.dump(test_set, json_file, indent=1)

print("The test dataset is now ready and saved as a json file!")

The test dataset starts loading...
The test dataset is now ready and saved as a json file!
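
The step title mentions a dataframe, while the cells above produce a list of dictionaries and a JSON file. A minimal sketch of how such a list could be turned into a pandas DataFrame is shown here. `sample_records`, `df_test` and `rating_counts` are hypothetical names, and the reviews are made-up values with the same keys as `test_set`, not data from the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for `test_set`: same keys, made-up values
sample_records = [
    {"review_text": "Lovely dress, fits perfectly.", "rating": 4},
    {"review_text": "Fabric feels cheap.", "rating": 1},
    {"review_text": "Nice colour but runs small.", "rating": 2},
    {"review_text": "Absolutely love it!", "rating": 4},
]

# one row per review, one column per key
df_test = pd.DataFrame(sample_records)

# number of reviews per rating class (0 to 4), sorted by class
rating_counts = df_test["rating"].value_counts().sort_index()
print(rating_counts)
```

In the notebook itself, `pd.DataFrame(test_set)` would presumably give the same layout for the real reviews, and the per-class counts are a quick way to see how unbalanced the ratings are before judging any model.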


## Step 4 - Use Lettria app for sentiment analysis

- **Test Lettria model on data**

In [4]:
# sentiment analysis function for Lettria model on test dataset
def lettria_sentiment(dataset_test):

    # call partner model from the app deployed with OVHcloud AI Deploy - Lettria
    url = "https://73f75f90-73c3-4e08-8326-e3fef84e74e5.app.gra.ai.cloud.ovh.net/predict"

    # define request headers (no token is added here)
    headers = {'content-type': 'application/json',
               'Accept-Charset': 'UTF-8'}

    # analyse sentiment on texts from test set with Lettria
    result_model_lettria = []
    for item in dataset_test:
        result = {}
        # add sentence (str)
        result['review_text'] = item['review_text']
        # send the sentence as a one-element JSON list and read its score
        inp = json.dumps([item['review_text']])
        output = requests.post(url, data=inp, headers=headers).json()
        score = output[0]['score']
        # add sentiment (float [-1;1]) -> (int 0, 1, 2, 3, 4)
        result['rate_lettria'] = 0 if score < -0.6 \
                                    else 1 if -0.6 <= score < -0.2 \
                                    else 2 if -0.2 <= score < 0.2 \
                                    else 3 if 0.2 <= score < 0.6 \
                                    else 4
        result_model_lettria.append(result)

    return result_model_lettria
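
The function above splits the score range [-1; 1] into five bins of width 0.4 to obtain a rating class. As an illustration only, the same mapping can be isolated in a small helper and checked at the bin boundaries without calling the app. `THRESHOLDS` and `score_to_rating` are hypothetical names that do not appear in the notebook.

```python
from bisect import bisect_right

# lower bounds of classes 1, 2, 3 and 4; anything below -0.6 is class 0
THRESHOLDS = [-0.6, -0.2, 0.2, 0.6]

def score_to_rating(score):
    """Map a polarity score in [-1, 1] to a rating class in 0..4."""
    # bisect_right counts how many thresholds are <= score,
    # which reproduces the chained comparisons used in lettria_sentiment
    return bisect_right(THRESHOLDS, score)

for s in (-1.0, -0.6, -0.2, 0.0, 0.2, 0.6, 1.0):
    print(s, "->", score_to_rating(s))
```

Keeping the thresholds in one list makes it easier to try other bin edges, since equal-width bins are only one possible way to compare a continuous score with a 5-star rating.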

- **Use function and time the process**

In [5]:
# run the Lettria model on the test set
print("The lettria model starts analysis...")

# time the inference
start = time.time()

# get sentiment analysis result
lettria_output = lettria_sentiment(test_set)

end = time.time()
print(f"Lettria model process time: {end - start} seconds")

The lettria model starts analysis...
Lettria model process time: 65.11147165298462 seconds
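
The total above covers one HTTP request per review, so it mixes network latency with inference time and grows with the size of the test set. A rough sketch of how an average time per review could be reported alongside the total follows. `timed_call` and `fake_sentiment` are hypothetical helpers, and `fake_sentiment` only stands in for `lettria_sentiment` so that the example runs offline.

```python
import time

def timed_call(func, items):
    """Run func(items) and return (result, total_seconds, seconds_per_item)."""
    start = time.perf_counter()
    result = func(items)
    total = time.perf_counter() - start
    per_item = total / len(items) if items else 0.0
    return result, total, per_item

# Hypothetical stand-in for lettria_sentiment (no network call)
def fake_sentiment(records):
    return [{"review_text": r["review_text"], "rate_lettria": 2} for r in records]

records = [{"review_text": "ok"}, {"review_text": "fine"}]
output, total, per_item = timed_call(fake_sentiment, records)
print(f"total: {total:.4f} s, per review: {per_item:.4f} s")
```

`time.perf_counter()` is used here because it is a monotonic, high-resolution clock intended for measuring intervals.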


- **Save data as json file**

In [6]:
# create json file
result_lettria_json = '/workspace/results/result_model_lettria.json'
with open(result_lettria_json, 'w') as json_file:
    json.dump(lettria_output, json_file, indent=1)
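
Both save cells assume that the target folders (`/workspace/data` and `/workspace/results`) already exist; otherwise `open(..., 'w')` raises `FileNotFoundError`. One possible way to guard against that is sketched here. `save_json` is a hypothetical helper, and the demo writes to a temporary folder rather than to `/workspace`.

```python
import json
import tempfile
from pathlib import Path

def save_json(records, path):
    """Write records to path as JSON, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w") as json_file:
        json.dump(records, json_file, indent=1)
    return path

# demo in a temporary folder; the "results" subfolder does not exist yet
with tempfile.TemporaryDirectory() as tmp:
    target = save_json([{"review_text": "ok", "rate_lettria": 2}],
                       Path(tmp) / "results" / "result_model_lettria.json")
    # read the file back to check that the round trip preserves the records
    with open(target) as json_file:
        reloaded = json.load(json_file)

print(reloaded)
```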