## ACOS quadruples

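1. A span of `-1,-1` means there is no aspect
2. The category is a label such as `SERVICE#GENERAL`
3. A sentiment of `0` means “negative” (`1` is “neutral” and `2` is “positive”)
4. A span of `5,6` selects the words from index 5 (inclusive) to index 6 (exclusive) of the review text split on whitespace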
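
To make the four fields concrete, here is a minimal sketch that decodes one quadruple assembled from the values in the list above. The review sentence and the helper names `span_to_words` and `decode_quad` are invented for illustration and are not taken from the dataset or the rest of this notebook.

```python
from typing import List, Optional, Tuple

# Sentiment ids as described in the list above.
SENTIMENTS = {0: "negative", 1: "neutral", 2: "positive"}


def span_to_words(tokens: List[str], span: str) -> Optional[List[str]]:
    """Return tokens[start:end] for a 'start,end' span, or None for '-1,-1'."""
    if span == "-1,-1":
        return None
    start, end = (int(s) for s in span.split(","))
    return tokens[start:end]  # start is inclusive, end is exclusive


def decode_quad(
    tokens: List[str], quad: str
) -> Tuple[Optional[List[str]], str, str, Optional[List[str]]]:
    """Split 'aspect category sentiment opinion' and resolve both spans."""
    aspect_t, category, sentiment_t, opinion_t = quad.split()
    return (
        span_to_words(tokens, aspect_t),
        category,
        SENTIMENTS[int(sentiment_t)],
        span_to_words(tokens, opinion_t),
    )


# Hypothetical review, invented for this example (not from the dataset).
tokens = "we waited forever and felt ignored all night".split()
decoded = decode_quad(tokens, "-1,-1 SERVICE#GENERAL 0 5,6")
print(decoded)  # (None, 'SERVICE#GENERAL', 'negative', ['ignored'])
```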


In [20]:
import os

## Reading from the dataset (Modified from Eric's code)

In [57]:
from typing import Optional
from pathlib import Path

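# Path(__name__) has no directory part, so its parents all collapse to '.';
# the dataset is therefore looked up relative to the current working directory.
acos_base_dir = (Path.cwd() / 'dataset' / 'ACOS').resolve()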

laptop_acos_dir = acos_base_dir / 'Laptop-ACOS'
restaurant_acos_dir = acos_base_dir / 'Restaurant-ACOS'

laptop_acos_train_file = laptop_acos_dir / 'laptop_quad_train.tsv'
laptop_acos_dev_file = laptop_acos_dir / 'laptop_quad_dev.tsv'

restaurant_acos_train_file = restaurant_acos_dir / 'rest16_quad_train.tsv'
restaurant_acos_dev_file = restaurant_acos_dir / 'rest16_quad_dev.tsv'

In [58]:
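from typing import List

def get_words_from_indices_str(review: List[str], indices_str: str) -> Optional[List[str]]:
    """Return the tokens in the half-open span 'start,end', or None for '-1,-1'."""
    if indices_str == '-1,-1':
        return None
    start_i, end_i = [int(s) for s in indices_str.split(',')]
    return review[start_i:end_i]

def get_sentiment(sentiment_t: str) -> str:
    """Map the dataset's sentiment id ('0', '1', '2') to its label."""
    sentiment_i = int(sentiment_t)
    return {0: 'negative', 1: 'neutral', 2: 'positive'}[sentiment_i]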

In [74]:
# Lightly modified from Eric's code
# Return the ACOS quadruples, formatted as strings, for the first num_lines reviews in the file
def read_acos(num_lines, filename):
    acos = []
    # Open with a context manager so the file handle is always closed
    with open(filename) as f:
        for line_i, line in enumerate(f):
            # Stop once num_lines reviews have been read
            if line_i >= num_lines:
                break
            # Each line is: review text, then one tab-separated column per quadruple
            line_items = line.split('\t')
            review = line_items[0].strip().split()
            acos_quads = [item.strip().split() for item in line_items[1:]]
            for aspect_t, category_t, sentiment_t, opinion_t in acos_quads:
                aspect = get_words_from_indices_str(review, aspect_t)
                category = category_t
                sentiment = get_sentiment(sentiment_t)
                opinion = get_words_from_indices_str(review, opinion_t)
                acos_str = f"aspect: {aspect}, category: {category}, sentiment: {sentiment}, opinion: {opinion}"
                acos.append(acos_str)
    return acos
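
As an alternative to counting lines by hand, `itertools.islice` can cap how many lines are consumed from a file. The sketch below is only an illustration of that idea: `first_lines` is a hypothetical helper, and the in-memory sample stands in for a real `.tsv` file.

```python
import io
from itertools import islice
from typing import Iterable, Iterator


def first_lines(lines: Iterable[str], num_lines: int) -> Iterator[str]:
    """Yield at most num_lines lines, without their trailing newline.

    num_lines must be non-negative; islice raises ValueError otherwise.
    """
    for line in islice(lines, num_lines):
        yield line.rstrip("\n")


# Hypothetical stand-in for an open dataset file.
sample_file = io.StringIO("first review\nsecond review\nthird review\n")
head = list(first_lines(sample_file, 2))
print(head)  # ['first review', 'second review']
```

Because `islice` stops pulling from the handle once the limit is reached, the rest of the file is never read.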

## Using ChatGPT to generate short summaries using ACOS quadruples (Victor)

In [86]:
import openai
import pandas as pd
import numpy as np

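# The key is redacted in this notebook. Reading it from an environment variable is an
# assumption; the variable name OPENAI_API_KEY is not from the original code.
APIKEY = os.environ.get("OPENAI_API_KEY")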

In [61]:
# Set the API key used for ChatGPT requests
openai.api_key = APIKEY

In [62]:
# CAUTION: DO NOT RUN THIS FUNCTION WITHOUT CONSULTING THE TEAM. EVERY TOKEN SENT TO THE CHATGPT API COSTS MONEY,
# AND EACH FREE TRIAL GIVES US ONLY 18 DOLLARS OF CREDIT.

# Each call to the ChatGPT API also takes several seconds, and this function makes one call per quadruple,
# so expect long inputs to take a while.

# Generate a short summary for each ACOS quadruple, using a prompt engineered to get the best response.
# acos: List[str], strings in the ACOS format returned by read_acos (Eric's code)
# example: aspect: None, category: RESTAURANT#GENERAL, sentiment: positive, opinion: None
# Returns a list with one generated summary per quadruple, in input order.
def acos_to_summary(acos):
    prompt = "Generate a summary for me using the following ACOS quadruple. This quadruple was extracted from a restaurant review which commented on a specific category of that restaurant. Please make the summary sound natural, in first person view. Only summarize based on non None fields. Reply only with the summary. Do not mention no opinion."
    summaries = []
    for quad in acos:
        query = prompt + " " + quad
        output = openai.ChatCompletion.create(
            model='gpt-3.5-turbo',
            # roles: system, user, assistant
            # System: (BUGGED) provide overarching context to the system
            messages=[{"role": "user", "content": query}]
        )
        summaries.append(output['choices'][0]['message']['content'])
    return summaries
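
Since every real call is billed and slow, one option is to separate prompt construction from the network call so the loop can be exercised without spending credit. The sketch below is one way to do that, not this notebook's method: `build_messages`, `summarize_quads`, `fake_complete`, and the shortened `PROMPT` are all hypothetical, and the stub never contacts any API.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

# Shortened stand-in for the full prompt used in acos_to_summary.
PROMPT = "Summarize this ACOS quadruple in first person:"


def build_messages(prompt: str, quad: str) -> List[Message]:
    """Build the single-message chat payload for one quadruple."""
    return [{"role": "user", "content": f"{prompt} {quad}"}]


def summarize_quads(
    quads: List[str],
    prompt: str,
    complete: Callable[[List[Message]], str],
) -> List[str]:
    """Call `complete` once per quadruple and strip surrounding whitespace."""
    return [complete(build_messages(prompt, quad)).strip() for quad in quads]


def fake_complete(messages: List[Message]) -> str:
    """Stub that reports the request size instead of calling an API."""
    return f"\n\n[stub reply to {len(messages[0]['content'])} chars]"


quads = ["aspect: None, category: RESTAURANT#GENERAL, sentiment: positive, opinion: None"]
replies = summarize_quads(quads, PROMPT, fake_complete)
print(replies)
```

For a real run, `complete` would wrap the `openai.ChatCompletion.create` call shown above and return `output['choices'][0]['message']['content']`.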

In [82]:
# Read the ACOS quadruples from the first 10 reviews of the restaurant dev set, then summarize them
acos = read_acos(10, restaurant_acos_dev_file)
summaries = acos_to_summary(acos)

In [84]:
# Trimming artifacts from the output
for i in range(len(summaries)):
    summaries[i] = summaries[i].replace('\n', '')
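
Note that replacing `\n` with an empty string also joins the words on either side of an interior line break. If that matters, one alternative is to collapse every run of whitespace to a single space. This is a sketch under the assumption that single-spaced, single-line summaries are the goal; `normalize_whitespace` and the sample reply are made up for illustration.

```python
def normalize_whitespace(text: str) -> str:
    """Collapse all whitespace runs (including newlines) to single spaces."""
    return " ".join(text.split())


# Hypothetical model reply with leading and interior newlines.
raw = "\n\nThe service was great.\nI would go back."

print(raw.replace("\n", ""))      # The service was great.I would go back.
print(normalize_whitespace(raw))  # The service was great. I would go back.
```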

In [88]:
acos_summary = pd.DataFrame()
acos_summary['ACOS'] = acos
acos_summary['summary'] = summaries

In [90]:
acos_summary

| | ACOS | summary |
|---|---|---|
| 0 | aspect: None, category: RESTAURANT#GENERAL, se... | I recently visited a restaurant and had a grea... |
| 1 | aspect: ['sake', 'list'], category: DRINKS#STY... | I really enjoyed the sake list at this restaur... |
| 2 | aspect: None, category: SERVICE#GENERAL, senti... | The service at this restaurant was fantastic. |
| 3 | aspect: ['spicy', 'tuna', 'roll'], category: F... | I really appreciated the spicy tuna roll at th... |
| 4 | aspect: ['rock', 'shrimp', 'tempura'], categor... | The rock shrimp tempura at this restaurant is ... |
| 5 | aspect: ['pink', 'pony'], category: RESTAURANT... | I absolutely love the pink pony aspect of this... |
| 6 | aspect: ['place'], category: RESTAURANT#GENERA... | In my opinion, the place was the best part of ... |
| 7 | aspect: ['sea', 'urchin'], category: FOOD#QUAL... | The sea urchin at the restaurant was of high q... |
| 8 | aspect: ['prix', 'fixe', 'menu'], category: FO... | I highly recommend the prix fixe menu at this ... |
| 9 | aspect: ['prix', 'fixe', 'menu'], category: FO... | I think the prix fixe menu is worth trying at ... |
