# Sentiment Prompt
Apply prompts to extract relevant information from the Women Dresses Reviews dataset, including:
- Sentiment and emotions of customers
- A summary of the feedback

## Set up

In [48]:
import openai  # requires openai >= 0.27.0 and < 1.0 (legacy ChatCompletion API), Python >= 3.7.1
import os

from dotenv import load_dotenv
load_dotenv()

openai.api_key = os.getenv('openai_api_key')

In [31]:
def get_completion(prompt, model="gpt-3.5-turbo"):
    # Send the prompt as a single user message
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,
    )

    # Older form: response.choices[0].message['content']
    return response['choices'][0]['message']['content']
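
The `openai.ChatCompletion` interface used above was removed in version 1.0 of the `openai` package. A minimal sketch of the same helper for the newer client interface follows, assuming `openai >= 1.0`. The function name `get_completion_v1` and the injected `client` parameter are illustrative and not part of the original notebook.

```python
def get_completion_v1(client, prompt, model="gpt-3.5-turbo"):
    """Send one user message and return the reply text.

    `client` is assumed to be an `openai.OpenAI` instance (openai >= 1.0).
    """
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0,  # same setting as the original helper
    )
    # The 1.x client returns objects, so fields are read as attributes.
    return response.choices[0].message.content
```

With the newer package installed, the client would typically be created with `from openai import OpenAI` followed by `client = OpenAI(api_key=os.getenv('openai_api_key'))`.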

## Review texts for analysis
The `review_text` column is extracted from the main dataset (over 20,000 rows). For this experiment, the project uses only 1,200 rows.

In [42]:
import pandas as pd

data = pd.read_csv('./data/Women-Dresses-Reviews-Dataset.csv', index_col=0)  # use the first column (s.no) as the index

data.head()

s.no,age,division_name,department_name,class_name,clothing_id,title,review_text,alike_feedback_count,rating,recommend_index
0,40,General,Bottoms,Jeans,1028,Amazing fit and wash,Like other reviewers i was hesitant to spend t...,0,5,1
1,62,General Petite,Tops,Blouses,850,Lovely and unique!,As is true of a bunch of the fall clothing pho...,12,5,1
2,47,General Petite,Bottoms,Skirts,993,Meh,"I so wanted this skirt to work, love the desig...",3,1,0
3,45,General Petite,Bottoms,Pants,1068,Wow,Love love this! i was hesitant to buy this at ...,0,5,1
4,37,Initmates,Intimate,Swim,24,Great for bigger busts,I absolutely love the retro look of this swims...,0,5,1


## Prompt Engineering
Two prompts are applied in this case:
- Prompt 1: extracts all items listed in the instructions
- Prompt 2: transforms the data extracted by Prompt 1 into HTML
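
Prompt 2 itself is not shown in this notebook. As a rough illustration only, the sketch below builds a hypothetical second prompt that asks the model to turn a Prompt 1 answer into an HTML table. The function name `build_html_prompt` and the prompt wording are assumptions, not the author's actual prompt.

```python
def build_html_prompt(extraction):
    """Build a hypothetical Prompt 2 from the text returned by Prompt 1."""
    return (
        "Convert the extracted review information below into an HTML table "
        "with the columns department, sentiment, emotions and summary. "
        "Return only the HTML.\n\n"
        "The extracted information is delimited by triple single quotes.\n"
        f"Extracted information: '''{extraction}'''"
    )


# Example input in the 'key: value' layout that Prompt 1 returns
example = (
    "department: product\n"
    "sentiment: positive\n"
    "emotions: love, comfy\n"
    "summary: comfy sweater, versatile"
)
print(build_html_prompt(example))
```

The resulting string could be passed to `get_completion`, and the reply rendered with `display(HTML(...))`.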

In [43]:
import time
import json

In [44]:
reviews = data.review_text[500:1200].to_list() 
print(len(reviews))

100


In [35]:
sample_review = data.review_text[1199]  # separate name so the `reviews` list is not overwritten
print(sample_review)

Love this sweater. super comfy. can wear with everything.


In [17]:
all_responses = []

for i in range(len(reviews)):
    prompt = f"""
     Your task is to identify the following items based on product reviews:
    - department: (two most related departments, or one if reviews relate to just one department)
    - sentiment: (positive, negative)
    - emotions: (all list of emotions)
    - summary: (summary)
    
    department: From the review, extract the relevant departments. Information about departments:
    - Marketing: tailors promotions, discounts, campaigns, highlights customer-favored styles and brands to boost sales
    - Sales: gives sales and price strategies and offers
    - Inventory: adjusts clothing in stock, whether products or sizes are lacking in stock
    - Product: refines clothing design, colors, style, material quality, textile and sizing accuracy, follows up trends, aligns products with customer preferences, ensures clothing meets quality standards, addresses quality-related feedback
    - Logistics: manages timely clothing delivery, addresses shipping concerns
      
    sentiment: What is the sentiment of the following product 
    review. Give your answer as a single word, \
    either "positive" or "negative".
                
    emotions: A list of emotions that the writer of the \
    following review is expressing. Include no more than \
    five emotions, keeping only emotions that are distinct \
    from one another. \
    Format your answer as a comma-separated list.
    
    summary: Summarize the review, highlighting the \
    information relevant to the related department; \
    if it relates to Marketing, mention the promotion \
    or campaign. Limit the answer to 10 words, \
    in lower case.

    The review is delimited by triple single quotes. \
    Format all the words in the required answers \
    in lower-case words.
    
    Review: '''{reviews[i]}'''
    """
    response = get_completion(prompt)
    all_responses.append(response)
    print(i, response, "\n")

    # Save the results every 100 rows
    if (i + 1) % 100 == 0:
        with open(f"./data/responses_{i + 1}.json", "w") as f:
            json.dump(all_responses, f)
    
        # Add a 60-second break time
        # time.sleep(60)

    

0 department: Product
sentiment: positive
emotions: favorite, fun, trendy, comfy, dressy
summary: trendy and comfy dressy piece 

1 department: Product
sentiment: positive
emotions: love, great, breathing room, wearing
summary: white shirt fits great, versatile for summer outfits 

2 department: Product
sentiment: positive
emotions: love, better fit, great
summary: side zipper for better fit, hits right above knee. 

3 department: Product
sentiment: positive
emotions: love, great, beautiful, perfect, feel like
summary: love dress, great material, beautiful coloring, perfect length, pleated neckline 

4 department: Product
sentiment: positive
emotions: love, loved, flattering, vibrant, concern
summary: vibrant patterns, flattering shape, concern about fabric quality 

5 department: Product
sentiment: positive
emotions: love, feel, perfect, staple, winter
summary: red color, fabric feel, versatile for winter and fall. 

6 department: Product
sentiment: negative
emotions: excited, bad, lo
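
The loop above saves a checkpoint after every 100 reviews and leaves an optional pause commented out. A minimal sketch of that pattern as a reusable function is shown below. `process_with_checkpoints`, its parameters, and `worker` are hypothetical names; `worker` would be a function that builds the prompt for one review and calls `get_completion`. Unlike the original loop, this sketch overwrites a single checkpoint file rather than writing one file per batch.

```python
import json
import time


def process_with_checkpoints(items, worker, save_path, every=100,
                             pause_seconds=0, sleep=time.sleep):
    """Apply `worker` to each item, saving all results so far every `every` items."""
    results = []
    for count, item in enumerate(items, start=1):
        results.append(worker(item))
        if count % every == 0:
            # Overwrite the checkpoint with everything collected so far.
            with open(save_path, "w") as f:
                json.dump(results, f)
            if pause_seconds:
                # Optional break, e.g. to stay under an API rate limit.
                sleep(pause_seconds)
    # Final save so a trailing partial batch is not lost.
    with open(save_path, "w") as f:
        json.dump(results, f)
    return results
```

Passing `sleep` as a parameter keeps the function easy to try out without actually waiting; in normal use the default `time.sleep` applies.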

In [36]:
# Load the seven saved batches (responses5_100.json to responses11_100.json) and concatenate them
combined_json = []
for n in range(5, 12):
    with open(f'./data/responses{n}_100.json', 'r') as f:
        combined_json += json.load(f)

with open('./data/combined5-11.json', 'w') as f:
    json.dump(combined_json, f)


In [45]:
# Initialize empty lists for each column
departments = []
sentiments = []
emotions = []
summaries = []

# Iterate through each sample and extract information
for sample in combined_json:
    lines = sample.split('\n')  # Split the sample into lines
    
    # Initialize variables for each piece of information
    department = ''
    sentiment = ''
    emotion = ''
    summary = ''
    
    for line in lines:
        # Split at the first ': ' only, so values that contain ': ' are kept whole
        parts = line.strip().split(': ', 1)
        if len(parts) == 2:
            key, value = parts
            if key == 'department':
                department = value.strip()
            elif key == 'sentiment':
                sentiment = value.strip()
            elif key == 'emotions':
                emotion = value.strip()
            elif key == 'summary':
                summary = value.strip()
    
    # Append extracted values to respective lists
    departments.append(department)
    sentiments.append(sentiment)
    emotions.append(emotion)
    summaries.append(summary)

# Create a DataFrame
df = pd.DataFrame({
    'Department': departments,
    'Sentiment': sentiments,
    'Emotions': emotions,
    'Summary': summaries
})
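
The parsing loop above can also be written as a small function that handles one response at a time. The sketch below is one possible version, with `parse_response` and `FIELDS` as illustrative names. It splits each line at the first colon only, so a value that itself contains a colon is kept intact, and any field missing from the response stays an empty string.

```python
FIELDS = ("department", "sentiment", "emotions", "summary")


def parse_response(text):
    """Parse one 'key: value' response into a dict with all four fields."""
    record = {field: "" for field in FIELDS}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip().lower()
        if sep and key in record:
            record[key] = value.strip()
    return record


sample = (
    "department: Product\n"
    "sentiment: positive\n"
    "emotions: love, comfy\n"
    "summary: comfy sweater, versatile"
)
print(parse_response(sample))
```

A list of such dicts can be passed directly to `pd.DataFrame`, for example `pd.DataFrame([parse_response(s) for s in combined_json])`, which yields lower-case column names.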


In [46]:
df

,Department,Sentiment,Emotions,Summary
0,Product,negative,"disappointment, surprise, confusion","maternity top, not suitable for summer"
1,Product,negative,"frustration, disappointment, dissatisfaction, ...","sizing issue with bodice and skirt, flaws visi..."
2,Product,negative,"comfortable, flattering, cheap, pilled, worn, ...","cheap and low-quality shirt, fades and pills a..."
3,Product,positive,"love, feel, looking forward, trips","true to size, accurate color, love the feel, w..."
4,Product,positive,"beautiful, favorite, attractive, love","effortlessly beautiful design, favorite pajama..."
...,...,...,...,...
695,Product,positive,"satisfied, comfortable, pleased, happy, content","fits perfectly, forgiving sleeves, beautiful n..."
696,Product,positive,"flattering, lovely, nice, soft, beautiful","flattering tank top, solid navy blue chiffon, ..."
697,Product,positive,"great, love, classic, pizzazz, excellent","classic looking shirt with a little pizzazz, e..."
698,Product,positive,"perfect, great, flattering","dress is perfect for any occasion, looks great..."


In [47]:
df.to_csv('./data/result5-11.csv', index=False)

In [None]:
# Display the HTML produced by Prompt 2
# (`response1` is not defined in this notebook; it is expected to hold that HTML string)
from IPython.display import display, HTML
display(HTML(response1))

### Take 1000 rows as a sample

In [50]:
# Combine the sentiment result files
filenames = ['result1.csv', 'result2.csv', 'result3.csv', 'result4.csv', 'result5-11.csv']
data_path = os.path.join(os.getcwd(), 'data')  # 'data' is a sub-directory of the working directory
file_paths = [os.path.join(data_path, filename) for filename in filenames]

# Read each CSV file into its own DataFrame
dataframes = [pd.read_csv(path) for path in file_paths]

# Concatenate the list of DataFrames into one DataFrame
combined_df = pd.concat(dataframes, ignore_index=True)
combined_df.to_csv('./data/combined.csv', index=False)
combined_df.head()


,Department,Sentiment,Emotions,Summary
0,"Marketing, Sales",positive,"hesitant, good, fresh","20% off on retailer day, jeans look good"
1,Product and Quality Assurance,positive,"shame, bright, vivid, unique, nice","bright and vivid embroidery, unique design, sl..."
2,Product and Quality Assurance,negative,"disappointment, frustration, dissatisfaction, ...","skirt design too long, not suitable for height..."
3,"Marketing, Inventory Management",positive,"love, hesitant, perfect, fabulous, great","perfect find, fabulous color, great fit, can't..."
4,"Marketing, Product and Quality Assurance",positive,"love, excited, satisfied, relieved, confident","retro swimsuit, blogger amber fillerup-clark, ..."


In [24]:
combined_df[1095:1100]

,Department,Sentiment,Emotions,Summary
1095,Product,positive,"satisfied, comfortable, pleased, happy, content","fits perfectly, forgiving sleeves, beautiful n..."
1096,Product,positive,"flattering, lovely, nice, soft, beautiful","flattering tank top, solid navy blue chiffon, ..."
1097,Product,positive,"great, love, classic, pizzazz, excellent","classic looking shirt with a little pizzazz, e..."
1098,Product,positive,"perfect, great, flattering","dress is perfect for any occasion, looks great..."
1099,Product,positive,"love, comfy","comfy sweater, versatile"


In [51]:
sentiment = pd.read_csv('./data/combined.csv', index_col=False)
sentiment

,Department,Sentiment,Emotions,Summary
0,"Marketing, Sales",positive,"hesitant, good, fresh","20% off on retailer day, jeans look good"
1,Product and Quality Assurance,positive,"shame, bright, vivid, unique, nice","bright and vivid embroidery, unique design, sl..."
2,Product and Quality Assurance,negative,"disappointment, frustration, dissatisfaction, ...","skirt design too long, not suitable for height..."
3,"Marketing, Inventory Management",positive,"love, hesitant, perfect, fabulous, great","perfect find, fabulous color, great fit, can't..."
4,"Marketing, Product and Quality Assurance",positive,"love, excited, satisfied, relieved, confident","retro swimsuit, blogger amber fillerup-clark, ..."
...,...,...,...,...
1195,Product,positive,"satisfied, comfortable, pleased, happy, content","fits perfectly, forgiving sleeves, beautiful n..."
1196,Product,positive,"flattering, lovely, nice, soft, beautiful","flattering tank top, solid navy blue chiffon, ..."
1197,Product,positive,"great, love, classic, pizzazz, excellent","classic looking shirt with a little pizzazz, e..."
1198,Product,positive,"perfect, great, flattering","dress is perfect for any occasion, looks great..."


In [27]:
data1 = data[:1000]
print(len(data1.age))

1000
