# Sentiment Prompt
This notebook uses prompts to extract relevant information from women's dress reviews, including:
- Customer sentiment and emotions
- A summary of the feedback

## Set up

In [1]:
import openai  # requires openai >= 0.27.0 and < 1.0 (for openai.ChatCompletion), Python >= 3.7.1
import os

from dotenv import load_dotenv
load_dotenv()

openai.api_key = os.getenv('openai_api_key')

In [42]:
def get_completion(prompt, model = "gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model = model,
        messages = messages,
        temperature = 0
    ) 

    return response['choices'][0]['message']['content']  # equivalent attribute form: response.choices[0].message['content']
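
The `openai.ChatCompletion` interface used above belongs to the pre-1.0 `openai` package. As a hedged sketch, assuming `openai >= 1.0` is installed instead, the same helper could be written against a client object. The names `get_completion_v1` and `client` are hypothetical and are not part of the original notebook.

```python
def get_completion_v1(prompt, client, model="gpt-3.5-turbo"):
    """Send one user message and return the reply text.

    `client` is assumed to be an `openai.OpenAI()` instance (openai >= 1.0),
    or any object exposing the same `chat.completions.create` method.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # same setting as get_completion above
    )
    return response.choices[0].message.content
```

Under that assumption, the client would be created once with `from openai import OpenAI` and `client = OpenAI(api_key=os.getenv('openai_api_key'))`, then passed to every call.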

## Review texts for analysis
The review texts come from the `review_text` column of the main dataset (over 20,000 rows). For this experiment, the project uses only the first 5,000 rows.

In [3]:
import pandas as pd

data = pd.read_csv('./data/Women-Dresses-Reviews-Dataset.csv', index_col=0) # ignore 1st col

data.head()

s.no,age,division_name,department_name,class_name,clothing_id,title,review_text,alike_feedback_count,rating,recommend_index
0,40,General,Bottoms,Jeans,1028,Amazing fit and wash,Like other reviewers i was hesitant to spend t...,0,5,1
1,62,General Petite,Tops,Blouses,850,Lovely and unique!,As is true of a bunch of the fall clothing pho...,12,5,1
2,47,General Petite,Bottoms,Skirts,993,Meh,"I so wanted this skirt to work, love the desig...",3,1,0
3,45,General Petite,Bottoms,Pants,1068,Wow,Love love this! i was hesitant to buy this at ...,0,5,1
4,37,Initmates,Intimate,Swim,24,Great for bigger busts,I absolutely love the retro look of this swims...,0,5,1


In [4]:
# Get 5000 rows
reviews = data.review_text[:5000].to_list() # convert texts into list before feeding to prompt
print(type(reviews))
print(len(reviews))

<class 'list'>
5000
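
If `review_text` has missing values, `to_list()` returns them as float `NaN` entries, which would then be interpolated into the prompt as the text `nan`. The following is a minimal sketch of filtering them out before prompting. `clean_reviews` is a hypothetical helper, and whether this dataset actually contains missing reviews is an assumption.

```python
def clean_reviews(raw_reviews):
    """Keep only non-empty string reviews; drop NaN floats and blank strings."""
    return [r.strip() for r in raw_reviews if isinstance(r, str) and r.strip()]

sample = ["Love this dress!", float("nan"), "   ", "Runs small."]
print(clean_reviews(sample))  # ['Love this dress!', 'Runs small.']
```

Dropping entries shifts list positions, so keep the original row index alongside each review if the results must be joined back to the dataset.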


In [9]:
reviews

['Like other reviewers i was hesitant to spend this much on a pair of jeans. however, i purchased them at  20% off on retailer day and...honestly...they look so good i probably would have paid full price. these jeans are fresh!',
 'As is true of a bunch of the fall clothing photos, the colors are totally washed out in these model images which is such a shame. the embroidery is bright and vivid and totally unique on this! the bib area is actually a soft corduroy which i think is nice to transition into fall and winter. in terms of fit, i do feel like this is maybe geared more towards the slender build - it is a slim cut which i found really flattering for me since i sometimes swim in tunics. at 5\'7", 128# with a very small',
 "I so wanted this skirt to work, love the design! but, it's way, way too long... i am 5, 5, 116lb, and the small is 1 inch on the floor. i step on the skirt as i walk.",
 "Love love this! i was hesitant to buy this at first - the reviews made it seem so big and i 

## Prompt Engineering
Two prompts are applied in this case:
- Prompt 1: Extracts all the items listed in the instructions
- Prompt 2: Transforms the data extracted by Prompt 1 into HTML

In [None]:
#import time
import json

In [88]:
reviews = data.review_text[600:700].to_list()
print(len(reviews))

100


In [89]:
all_responses = []

for i in range(len(reviews)):
    prompt = f"""
     Your task is to identify the following items based on product reviews:
    - department: (two most related departments, or one if reviews relate to just one department)
    - sentiment: (positive, negative)
    - emotions: (all list of emotions)
    - summary: (summary)
    
    department: From the review, extract the relevant departments. Information about departments:
    - Marketing: tailors promotions, discounts, campaigns, highlights customer-favored styles and brands to boost sales
    - Sales: Gives sales and price strategies and offers
    - Inventory: Adjusts clothing in stock, handles missing products or missing sizes in stock
    - Product: Refines clothing design, colors, style, material quality, textile and sizing accuracy, follows up trends, aligns products with customer preferences, ensures clothing meets quality standards, addresses quality-related feedback
    - Logistics: Manages timely clothing delivery, addresses shipping concerns 
      
    sentiment: What is the sentiment of the following product 
    review. Give your answer as a single word, \
    either "positive" or "negative".
                
    emotions: A list of emotions that the writer of the \
    following review is expressing. Include at most \
    five emotions, and keep only emotions that are \
    distinct from one another. \
    Format your answer as a comma-separated list. 
    
    summary: summarize the review, focusing on the \
    information relevant to the related department; \
    if it relates to Marketing, mention the promotions \
    or campaign. Limit the answer to 10 words, \
    in lower case. 

    The review is delimited by triple single quotes. \
    Format all the words in the required answers \
    in lower case.
    
    Review: '''{reviews[i]}'''
    """
    response = get_completion(prompt)
    all_responses.append(response)
    print(i, response, "\n")

    # Save the results every 100 rows
    if (i + 1) % 100 == 0:
        with open(f"./data/responses_{i + 1}.json", "w") as f:
            json.dump(all_responses, f)
    
        # Add a 60-second break time
        # time.sleep(60)

    

0 department: Product
sentiment: negative
emotions: disappointed, frustrated, dissatisfied, regretful, hopeful
summary: sweater dress not flattering for hourglass figure, quality is good 

1 department: Product
sentiment: negative
emotions: frustration, disappointment, dissatisfaction, annoyance, dissatisfaction
summary: sizing issue in bust and shoulder area 

2 department: Product
sentiment: positive
emotions: soft, airy, great, terrible, more
summary: soft and airy, great color 

3 department: Product
sentiment: negative
emotions: disappointment, frustration, dissatisfaction, skepticism, reluctance
summary: fit issues, fabric quality, elastic back, not worth the price 

4 department: Product
sentiment: positive
emotions: good, felt, perfect, happier
summary: navy dress, white crochet sweater, perfect dress for mother of the groom. 

5 department: Product
sentiment: positive
emotions: unique, gorgeous, great, ethnic/bohemian beauty, femininity
summary: unique and gorgeous top with et
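
The loop above stops at the first failed API call, and the commented-out `time.sleep(60)` suggests rate limits were a concern. The following is a minimal sketch of retrying a call with exponential backoff. `call_with_retries` and its parameters are hypothetical names, and catching every `Exception` is a simplification made for this illustration.

```python
import time

def call_with_retries(fn, *args, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(*args); on an exception, wait and retry with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args)
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            # Wait base_delay, then 2x, then 4x, ... between attempts
            sleep(base_delay * 2 ** (attempt - 1))
```

Inside the loop this would be used as `response = call_with_retries(get_completion, prompt)`. In practice it is safer to catch only the specific rate-limit error raised by the installed `openai` version rather than every exception.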

In [None]:
all_responses

In [82]:
with open('./data/responses1_100.json', 'r') as f1:
    json1 = json.load(f1)
with open('./data/responses2_100.json', 'r') as f2:
    json2 = json.load(f2)

combined_json = json1 + json2
with open('./data/combined1.json', 'w') as f:
    json.dump(combined_json, f)



In [83]:
# Initialize empty lists for each column
departments = []
sentiments = []
emotions = []
summaries = []

# Iterate through each sample and extract information
for sample in combined_json:
    lines = sample.split('\n')  # Split the sample into lines
    
    # Initialize variables for each piece of information
    department = ''
    sentiment = ''
    emotion = ''
    summary = ''
    
    for line in lines:
        # Split on the first colon only, so values that contain ': ' stay intact
        key, sep, value = line.partition(':')
        if sep:
            key = key.strip().lower()
            if key == 'department':
                department = value.strip()
            elif key == 'sentiment':
                sentiment = value.strip()
            elif key == 'emotions':
                emotion = value.strip()
            elif key == 'summary':
                summary = value.strip()
    
    # Append extracted values to respective lists
    departments.append(department)
    sentiments.append(sentiment)
    emotions.append(emotion)
    summaries.append(summary)

# Create a DataFrame
df = pd.DataFrame({
    'Department': departments,
    'Sentiment': sentiments,
    'Emotions': emotions,
    'Summary': summaries
})
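
Some responses may not follow the expected `key: value` layout, which leaves the corresponding row empty. The following is a hedged sketch of a parser that also reports which fields a response failed to provide, so such rows can be inspected or re-prompted. `FIELDS`, `parse_response`, and `missing_fields` are hypothetical names introduced for this illustration.

```python
FIELDS = ("department", "sentiment", "emotions", "summary")

def parse_response(text):
    """Parse 'key: value' lines into a dict; absent fields stay empty strings."""
    record = {field: "" for field in FIELDS}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip().lower()
        if sep and key in record:
            record[key] = value.strip()
    return record

def missing_fields(record):
    """Return the names of fields the response did not provide."""
    return [field for field in FIELDS if not record[field]]

sample = "department: Product\nsentiment: negative\nsummary: fit issues: too long"
parsed = parse_response(sample)
print(parsed["summary"])       # fit issues: too long
print(missing_fields(parsed))  # ['emotions']
```

A list of such records can be passed directly to `pd.DataFrame(records)` to build the same table.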


In [84]:
df.head()

,Department,Sentiment,Emotions,Summary
0,Product,negative,"loved, ripped, flimsy, cool, shame","flimsy leggings, cool looking, ripped belt loop"
1,Product,positive,"soft, transition, shape, nicely, blob","soft material, transition piece, nicely cut, n..."
2,,,,
3,Product,negative,"disappointment, frustration, uncertainty, diss...","oversized cardigan, bulky shoulders, too much ..."
4,Product,positive,"excited, satisfied, impressed, happy, intrigued","comfortable fabric, unique design, zipper for ..."


In [85]:
df.to_csv('./data/result4.csv', index=False)

In [37]:
df = pd.read_csv('./data/result1.csv')
df


,Department,Sentiment,Emotions,Summary
0,"Marketing, Sales",positive,"hesitant, good, fresh","20% off on retailer day, jeans look good"
1,Product and Quality Assurance,positive,"shame, bright, vivid, unique, nice","bright and vivid embroidery, unique design, sl..."
2,Product and Quality Assurance,negative,"disappointment, frustration, dissatisfaction, ...","skirt design too long, not suitable for height..."
3,"Marketing, Inventory Management",positive,"love, hesitant, perfect, fabulous, great","perfect find, fabulous color, great fit, can't..."
4,"Marketing, Product and Quality Assurance",positive,"love, excited, satisfied, relieved, confident","retro swimsuit, blogger amber fillerup-clark, ..."
...,...,...,...,...
95,Product and Quality Assurance,positive,"love, great","pants fit perfectly, great for casual days at ..."
96,"Marketing, Product and Quality Assurance",positive,"love, adore","relaxed fit dress, great for casual wear with ..."
97,,,,
98,"Marketing, Product and Quality Assurance",positive,"great, adorable, versatile, excited, soft","great skirt, versatile piece, soft as butter, ..."


In [None]:
test_response = all_responses[:10]

In [None]:
# Convert the results to HTML
prompt1 = f"""
Translate {test_response} to HTML
"""
response1 = get_completion(prompt1)
print(response1)

In [None]:
# Display HTML
from IPython.display import display, HTML
display(HTML(response1))
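
Asking the model to produce HTML costs another API call, and the markup can vary between runs. As an alternative, and not the approach used in this notebook, the following is a minimal sketch that renders already-parsed records as an HTML table locally. `records_to_html` is a hypothetical helper, and the records are assumed to be dictionaries keyed by column name.

```python
from html import escape

def records_to_html(records, columns):
    """Render a list of dicts as an HTML table, escaping every cell."""
    header = "".join(f"<th>{escape(col)}</th>" for col in columns)
    rows = []
    for record in records:
        cells = "".join(
            f"<td>{escape(str(record.get(col, '')))}</td>" for col in columns
        )
        rows.append(f"<tr>{cells}</tr>")
    return f"<table><tr>{header}</tr>{''.join(rows)}</table>"
```

The result can be shown in the notebook with `display(HTML(records_to_html(records, ["department", "sentiment", "emotions", "summary"])))`.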