# OpenAI and Large Language Models

Given the difficulty of interpreting the BERTopic and LDA output, I experimented with an OpenAI Large Language Model (LLM).

A Large Language Model (LLM) is an advanced type of artificial intelligence model designed to understand, generate, and work with human language. 

I sent approximately 60,000 prompts, each asking the LLM to read one review and output a Python dictionary coding a positive or negative sentiment for whichever of six topics the review covers.

Example: `{'seat comfort': 'negative', 'cabin service': 'positive'}`

This worked very well, with the vast majority of the output fitting this format. Outputs that deviated from it were cleaned up with a subsequent script, or deleted if there was no obvious indication of intent.
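
A format check of this kind could look roughly like the sketch below. It is an illustration only, not the clean-up script that was actually used; the helper name `parse_sentiment_dict` is hypothetical. It relies on `ast.literal_eval`, which parses a dictionary literal without executing arbitrary code, and keeps only the six known topics with a `'positive'` or `'negative'` value.

```python
import ast

# The six topics named in the prompt.
TOPICS = ['seat comfort', 'cabin service', 'food and beverage',
          'entertainment', 'ground service', 'value for money']


def parse_sentiment_dict(raw):
    """Return {topic: sentiment} for parseable output, or None if unusable."""
    if not raw:
        return None
    text = raw.strip()
    # The example given in the prompt has no braces, so add them if missing.
    if not text.startswith('{'):
        text = '{' + text + '}'
    try:
        parsed = ast.literal_eval(text)
    except (ValueError, SyntaxError, TypeError):
        return None  # not a Python literal: no obvious indication of intent
    if not isinstance(parsed, dict):
        return None
    # Keep only recognised topics with a recognised sentiment.
    return {k: v for k, v in parsed.items()
            if k in TOPICS and v in ('positive', 'negative')}
```

A return value of `None` marks output that cannot be parsed at all, while an empty dictionary marks output that parsed but contained no usable tags.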

I manually checked 10 rows containing 25 tags and found three errors, mostly errors of omission.
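
The arithmetic behind that spot check is shown below. With only 10 rows sampled, the resulting figure is a rough indication rather than a reliable accuracy estimate.

```python
# Figures from the manual check: 25 tags reviewed, 3 found to be wrong.
tags_checked = 25
errors_found = 3

error_rate = errors_found / tags_checked
accuracy = 1 - error_rate

print(f"Error rate: {error_rate:.0%}, accuracy: {accuracy:.0%}")
# prints "Error rate: 12%, accuracy: 88%"
```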

As a next step, I would like to try an open-source LLM and experiment further with prompting to get the most useful output.


In [28]:
import pandas as pd
import numpy as np
from tqdm import tqdm
import time
import os

import openai as ai

from concurrent.futures import ThreadPoolExecutor, as_completed

In [2]:
# Load API Key
with open('/Users/paulhershaw/brainstation_course/airplane_project/pass.txt', 'r') as file:
    API_KEY = file.read().strip()

# Pass the key read from the file above to the OpenAI client
ai.api_key = API_KEY
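
An alternative to reading the key from a file is to read it from an environment variable, which keeps local paths out of the notebook. The sketch below is one way to do that. The helper name `load_api_key` is hypothetical, and the variable name `OPENAI_API_KEY` is an assumption about how the machine is configured.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY", fallback_path=None):
    """Return the API key from an environment variable, else from a file."""
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    # Fall back to a key file if one was given and exists.
    if fallback_path and os.path.exists(fallback_path):
        with open(fallback_path, "r") as file:
            return file.read().strip()
    raise RuntimeError(f"No API key found in ${env_var} or {fallback_path!r}")
```

It could then be used as `ai.api_key = load_api_key(fallback_path='pass.txt')`, where the file name is a placeholder.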

In [None]:
# Load cleaned data
df = pd.read_csv('/Users/paulhershaw/brainstation_course/airplane_project/data/airline_reviews_cleaned.csv')

In [None]:
# Isolate to the customer review column
df = df[['customer_review']].copy()

In [39]:
# Function to send API calls to Open AI.
def generate_gpt3_response(user_text, print_output=False):
    try:
        time.sleep(0.1)  # Small delay to prevent rapid requests
        # Legacy Completions interface (openai<1.0); the module is imported as `ai` above
        completions = ai.Completion.create(
            engine='gpt-3.5-turbo-instruct',  # Fast and effective engine. 
            temperature=0.1, # I want the output to be very structured, hence a low temperature
            prompt=user_text,
            max_tokens=500,
            n=1, # Number of completions to generate - May want to experiment with this setting in the future.
            stop=None,
        )

        if print_output:
            print(completions)

        response = completions.choices[0].text.strip() if completions.choices else None
        return response

    except Exception as e:
        print(f"An error occurred: {e}")
        return None


# Process in chunks to avoid memory issues
def process_chunk(chunk, start_index):
    # Function to generate request text for GPT-3
    def make_request_text(row):
        return "You are a topic modeling algorithm. " \
               "You will be shown flight reviews of mean length 170 words. " \
               "You will output one or more topics in a comma-separated list. " \
               "For each topic, identify if there is a 'positive' or 'negative' sentiment. " \
               "Topics include: 'seat comfort', 'cabin service', 'food and beverage', 'entertainment', 'ground service', 'value for money' " \
               "Examples: 'seat comfort': 'negative', 'cabin service': 'positive' " \
               "{}".format(row['customer_review'])

    # Process each row in parallel
    with ThreadPoolExecutor(max_workers=10) as executor:
        future_to_row = {executor.submit(generate_gpt3_response, make_request_text(row)): idx for idx, row in chunk.iterrows()}
        for future in as_completed(future_to_row):
            idx = future_to_row[future]
            try:
                response = future.result()
                chunk.at[idx, 'GPT'] = response.replace('\n\n', ' ').replace('\n', ' ') if response else None
            except Exception as e:
                print(f"An error occurred while processing row {idx}: {e}")

    # Parsing topics and sentiments
    for topic in topics:
        chunk[topic] = chunk['GPT'].apply(lambda x: parse_topics(x).get(topic))

    # Save the processed chunk to a CSV file
    output_directory = '/Users/paulhershaw/brainstation_course/airplane_project/data'
    try:
        chunk.to_csv(f'{output_directory}/GPT3_output_chunk_{start_index}.csv', index=True)
        print(f"File saved: GPT3_output_chunk_{start_index}.csv")
    except Exception as e:
        print(f"An error occurred while saving the file: {e}")


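# Topics the model was asked to tag (must match the prompt in make_request_text)
topics = ['seat comfort', 'cabin service', 'food and beverage',
          'entertainment', 'ground service', 'value for money']


def parse_topics(output):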
    topic_sentiments = dict.fromkeys(topics, None)  # Initialize all topics with None
    if output:
        for topic in topics:
            if topic in output:
                try:
                    sentiment = output.split(topic)[1].split(',')[0].strip().split(' ')[-1].strip("'")
                    topic_sentiments[topic] = sentiment
                except IndexError:
                    pass  # If parsing fails, leave it as None
    return topic_sentiments
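
`generate_gpt3_response` catches every exception and returns `None`, so a request that fails (for example on a rate limit) is never retried. One common remedy is exponential backoff. The sketch below is a generic wrapper, not part of the original run; `call_with_retries` and its parameters are hypothetical names.

```python
import random
import time


def call_with_retries(func, *args, max_attempts=5, base_delay=1.0, **kwargs):
    """Call func, retrying with exponential backoff if it raises."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Wait 1x, 2x, 4x, ... the base delay, plus a little jitter.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.1f}s")
            time.sleep(delay)
```

For this wrapper to help, the wrapped function has to let exceptions propagate rather than swallow them, so the `try`/`except` inside `generate_gpt3_response` would need to be removed or narrowed first.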

In [40]:
# Define the chunk size and total number of rows to process
chunk_size = 2500
num_rows = 59392  # Replace with the actual number of rows, or use a smaller number for testing
start_row = 0
# Process the DataFrame in chunks
for start in range(start_row, num_rows, chunk_size):
    end = start + chunk_size
    print(f"Processing rows {start} to {end}...")
    chunk = df.iloc[start:end].copy()
    process_chunk(chunk, start)
    print(f"Rows {start} to {end} processed and saved to CSV.")

Processing rows 20998 to 23498...
File saved: GPT3_output_chunk_20998.csv
Rows 20998 to 23498 processed and saved to CSV.
Processing rows 23498 to 25998...
File saved: GPT3_output_chunk_23498.csv
Rows 23498 to 25998 processed and saved to CSV.
Processing rows 25998 to 28498...
File saved: GPT3_output_chunk_25998.csv
Rows 25998 to 28498 processed and saved to CSV.
Processing rows 28498 to 30998...
File saved: GPT3_output_chunk_28498.csv
Rows 28498 to 30998 processed and saved to CSV.
Processing rows 30998 to 33498...
File saved: GPT3_output_chunk_30998.csv
Rows 30998 to 33498 processed and saved to CSV.
Processing rows 33498 to 35998...
File saved: GPT3_output_chunk_33498.csv
Rows 33498 to 35998 processed and saved to CSV.
Processing rows 35998 to 38498...
File saved: GPT3_output_chunk_35998.csv
Rows 35998 to 38498 processed and saved to CSV.
Processing rows 38498 to 40998...
File saved: GPT3_output_chunk_38498.csv
Rows 38498 to 40998 processed and saved to CSV.
Processing rows 40998 to
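
The log reports each chunk as `start` to `start + chunk_size`, which overshoots the end of the data on the final chunk (`iloc` slicing itself is unaffected). A small generator can clamp the upper bound so the printed ranges match the rows actually processed. This is a sketch; `chunk_bounds` is a hypothetical helper, and the starting row of 20998 is taken from the log above.

```python
def chunk_bounds(num_rows, chunk_size, start_row=0):
    """Yield (start, end) pairs from start_row up to num_rows, never past it."""
    for start in range(start_row, num_rows, chunk_size):
        yield start, min(start + chunk_size, num_rows)


bounds = list(chunk_bounds(59392, 2500, start_row=20998))
print(bounds[0])   # (20998, 23498)
print(bounds[-1])  # (58498, 59392)
```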

In [42]:
# Initialize an empty list to store DataFrames
dfs = []

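# Loop to read each CSV file and append to the list.
# process_chunk names files by start row, so this assumes the files were renamed 1 to 16.
for i in range(1, 17):  # Chunk files numbered 1 to 16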
    filename = f'/Users/paulhershaw/brainstation_course/airplane_project/data/GPT3_output_chunk_{i}.csv'
    read = pd.read_csv(filename)
    dfs.append(read)

# Concatenate all DataFrames in the list
gpt_results = pd.concat(dfs, ignore_index=True)
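
Because the chunk files are written with their start row in the file name, a hard-coded range is fragile. One alternative is to discover the files and order them by that numeric suffix. The sketch below uses only the standard library; `sorted_chunk_files` is a hypothetical helper, and the file-name prefix is taken from `process_chunk`.

```python
import re
from pathlib import Path


def sorted_chunk_files(directory, prefix="GPT3_output_chunk_"):
    """Return chunk CSV paths in `directory`, ordered by numeric suffix."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.csv")
    matches = []
    for path in Path(directory).glob(f"{prefix}*.csv"):
        found = pattern.fullmatch(path.name)
        if found:
            # Sort on the integer so that 2500 comes before 10000.
            matches.append((int(found.group(1)), path))
    return [path for _, path in sorted(matches)]
```

The result can be passed straight to pandas, for example `pd.concat((pd.read_csv(p) for p in sorted_chunk_files(data_dir)), ignore_index=True)`, where `data_dir` stands for the output directory.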

In [58]:
# Check if there are any duplicate index values
has_duplicates = gpt_results.index.duplicated().any()
print("Are there any duplicate index values? ", has_duplicates)

Are there any duplicate index values?  False


In [5]:
gpt_results = pd.read_csv('/Users/paulhershaw/brainstation_course/airplane_project/data/gpt_results.csv')

In [6]:

# Some GPT-3 outputs do not follow the specification but are still useful.
# This dictionary maps each observed value to 'positive', 'negative', or '' (discard).
unique_values_dict = {'routine': '',
                       'mentioned': '',
                       'average': 'positive',
                       'comfort': '',
                       'seamless': 'positive',
                       'flight': '',
                       'snack': '',
                       'efficient': 'positive',
                       'delayed': 'negative',
                       'Airways': '',
                       'good': 'positive',
                       'unclear': 'negative',
                       'meal': '',
                       'nightmare': 'negative',
                       'selection': '',
                       'hit-or-miss': 'negative',
                       'movies': '',
                       'satisfactory': 'positive',
                       'variable': '',
                       'negative)': 'negative',
                       'system': '',
                       'nan': '',
                       '(mixed)': '',
                       '(average)': 'positive',
                       'decent': 'positive',
                       'smooth': 'positive',
                       'neutral': 'positive',
                       'options': '',
                       'none': '',
                       'disorganized': 'negative',
                       'opinions': '',
                       'slow': 'negative',
                       '(neutral)': 'positive',
                       'wine': '',
                       '(negative)': 'negative',
                       'food': '',
                       'class)': '',
                       'adequate': 'positive',
                       'excellent': 'positive',
                       'mixed': '',
                       'great': 'positive',
                       'quality': '',
                       'Airlines': '',
                       'surprised': '',
                       'feelings': '',
                       'unknown': '',
                       '(neutral/unpleasant)': 'negative',
                       'selection)': '',
                       'fair': '',
                       'malfunction': '',
                       'miss)': '',
                       'mentioned)': '',
                       'disappointing': 'negative',
                       'service': '',
                       'positive': 'positive',
                       'miss': '',
                       'improvement)': '',
                       'inconsistent': 'negative',
                       'pyjamas': '',
                       'negative': 'negative',
                       'outdated': '',
                       '(positive)': 'positive',
                       'options)': '',
                       'limited': '',
                       2: ''}

In [8]:

columns_to_check = ['seat comfort', 'cabin service', 'food and beverage', 'entertainment', 'ground service', 'value for money']

# Loop through the specified columns and update values using the dictionary
for column in columns_to_check:
    gpt_results[column] = gpt_results[column].map(unique_values_dict)
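
Many keys in the mapping differ only by stray brackets, such as `'negative)'` and `'(negative)'`. Normalising each label before the lookup would shrink the dictionary considerably. The sketch below illustrates the idea with a deliberately small lookup; `normalise_label`, `to_sentiment` and `SENTIMENT_LOOKUP` are hypothetical names, and the entries are a subset of the mapping above.

```python
def normalise_label(raw):
    """Lower-case a raw label and strip surrounding brackets and whitespace."""
    if raw is None:
        return ''
    return str(raw).strip().strip('()').strip().lower()


# Illustrative subset of the full mapping, keyed on normalised labels.
SENTIMENT_LOOKUP = {'positive': 'positive', 'good': 'positive',
                    'negative': 'negative', 'slow': 'negative'}


def to_sentiment(raw):
    """Map a raw label to 'positive', 'negative', or '' if unrecognised."""
    return SENTIMENT_LOOKUP.get(normalise_label(raw), '')
```

Passing the function to `Series.map`, as in `gpt_results[column].map(to_sentiment)`, returns `''` for unrecognised labels, whereas mapping with a plain dictionary leaves them as `NaN`.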

In [9]:
# In this section, we will create new columns to store the positive and negative sentiments for each topic.
for column in columns_to_check:
    gpt_results[f'{column}_pos'] = ''
    gpt_results[f'{column}_neg'] = ''

In [10]:
# Loop through the specified columns and mark 'pos' and 'neg' columns accordingly
for column in columns_to_check:
    gpt_results[f'{column}_pos'] = gpt_results[column].apply(lambda x: 1 if x == 'positive' else 0)
    gpt_results[f'{column}_neg'] = gpt_results[column].apply(lambda x: 1 if x == 'negative' else 0)


In [11]:
columns_to_drop = ['Unnamed: 0',
                   'customer_review','GPT', 
                   'seat comfort', 
                   'cabin service', 
                   'food and beverage', 
                   'entertainment', 
                   'ground service', 
                   'value for money']

gpt_results.drop(columns_to_drop, axis=1, inplace=True)

In [12]:
gpt_results

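| | seat comfort_pos | seat comfort_neg | cabin service_pos | cabin service_neg | food and beverage_pos | food and beverage_neg | entertainment_pos | entertainment_neg | ground service_pos | ground service_neg | value for money_pos | value for money_neg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 |
| 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 2 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| 3 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 4 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 59386 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
| 59387 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| 59388 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| 59389 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |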


In [13]:
gpt_results.to_csv('/Users/paulhershaw/brainstation_course/airplane_project/data/gpt_final.csv', index=True)