# Large Language Models (LLM) Analysis Tool

### Basic logic
- The app ingests a series of reviews (or other customer feedback, or indeed any form of text) and a number of user-defined categories on which to assess that text. For each category, the output is a written assessment in English of at most 50 characters and an integer score from 1 (very negative) to 5 (very positive), or 0 when the text does not contain enough information.
- The script is tailored for (somewhat) larger datasets that require batching of the input: every LLM has a fixed token window, and the input and output of a call together have to stay within that window to be processed in full. For the LLM used here, gpt-3.5-turbo, the token window is 4,096 tokens. The function create_batches below uses the length of each review and the number of categories to create batches that fall within the token window.
- The final output is a simple panel app that takes the categories as input. A sanity check first estimates the associated cost (gpt-3.5-turbo doesn't come for free, alas) before the full script is executed.


### Libraries
As always, we start by importing the required libraries, grouped by the function that first needs them.

In [1]:
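#!pip install "openai<1" python-dotenv  # get_completion below uses the pre-1.0 openai.ChatCompletion interface
#!pip install unidecode transformers panel seaborn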

# Generic
import pandas as pd

# To get started
import openai
import os

# For preprocess_data
import glob
from datetime import datetime, timedelta
from unidecode import unidecode

# For create_batches
from transformers import GPT2Tokenizer

# For get_ratings
import re
import json

# For plot_output
import numpy as np
import matplotlib.pyplot as plt

# For panel
import panel as pn
import seaborn as sns
import random

### Get started with OpenAI API

The easiest and cleanest way to work with the OpenAI API (and others): keep your API key in a local .env file and load it from there, so that it never becomes publicly visible!

In [2]:
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) 

openai.api_key  = os.getenv('OPENAI_API_KEY')
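
If the .env file is missing or the variable name is misspelled, `os.getenv` returns `None` without complaint, and the problem only surfaces at the first API call. Below is a minimal sketch of a fail-fast check. `require_env` is a hypothetical helper, not part of the original notebook.

```python
import os

def require_env(name, environ=None):
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper: `environ` defaults to os.environ and is only a
    parameter so the function can be tried out with a plain dict.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value

# Possible use in the notebook (assumption):
# openai.api_key = require_env("OPENAI_API_KEY")
```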

### Function to generate LLM response based on prompt

Basic function to initiate a chat conversation with a given model (in this case gpt-3.5-turbo), set for maximal predictability (temperature=0). We will define the prompt further below.

In [3]:
def get_completion(prompt, model="gpt-3.5-turbo"): 
    
    '''Simple function to open a chat conversation with a specified model, based on a given prompt, 
    returns the model's answer'''
    
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, 
    )
    
    print(f'{response["usage"]["prompt_tokens"]} prompt tokens used.')
    
    return response.choices[0].message["content"]
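
API calls can fail for transient reasons such as rate limits or timeouts, and with many batches one failure would abort the whole run. The sketch below retries a call with exponential backoff. `with_retries` is a hypothetical helper and the retry counts and delays are assumptions. In the notebook, `call` could be `lambda: get_completion(prompt)`.

```python
import time

def with_retries(call, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call() and retry on any exception, doubling the wait each time.

    Hypothetical helper; `sleep` is injectable so the logic can be tested
    without actually waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            sleep(base_delay * 2 ** (attempt - 1))  # waits 1s, 2s, 4s, ...
```

Catching every exception keeps the sketch short. In practice it would be safer to retry only on the error types that are known to be transient.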

### Function to preprocess data

This example was built using a set of Google Maps reviews. The function preprocess_data does the parsing required for the analysis and visualisation further downstream. It also provides a reviews_list, which serves as input to the LLM.

In [4]:
def preprocess_data(data):
    
    '''Function to preprocess raw input from an Outscraper file into a reviews_list to feed into the LLM
    Includes transliteration of non-ASCII characters to ASCII (via unidecode)
    Returns the reviews_list, the trimmed dataframe and the full dataframe''' 
    
    # Parse date information
    data['review_date'] = pd.to_datetime(data['review_datetime_utc'])
    data['owner_answer_date'] = pd.to_datetime(data['owner_answer_timestamp_datetime_utc'])
    data.loc[:,'review_year'] = data['review_date'].dt.year
    data.loc[:,'answer_time'] = (data.loc[:,'owner_answer_date'] - data.loc[:,'review_date']).dt.days
    
    data.loc[:,'review_text_unicode'] = data.loc[:,'review_text'].apply(unidecode)
    
    # Only keep required columns
    columns_to_select = ['name', 'review_text_unicode', 'review_rating', 'review_likes', 'review_year',
                         'review_date', 'owner_answer', 'owner_answer_date', 'answer_time']
    df = data[columns_to_select].drop_duplicates()
   
    # Tag each review with its row index so the LLM output can be matched back to the original row
    reviews_list = [f'<ID={idx}; review={unidecode(row["review_text"])}>' for idx, row in data.iterrows()]
    
    # Output reviews_list, trimmed df and full data
    return reviews_list, df, data
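
To make the reviews_list format concrete, here is a small standalone sketch that produces the same `<ID=...; review=...>` tags. It uses the standard library's `unicodedata` as a rough stand-in for unidecode (a swapped-in technique, so the notebook itself does not need changing). `to_ascii` and `tag_reviews` are hypothetical names.

```python
import unicodedata

def to_ascii(text):
    """Strip accents with the standard library (rough stand-in for unidecode)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

def tag_reviews(texts):
    """Wrap each review in the <ID=...; review=...> format used by the prompt."""
    return [f"<ID={idx}; review={to_ascii(text)}>" for idx, text in enumerate(texts)]

print(tag_reviews(["Café très bon", "Great sofa!"]))
# ['<ID=0; review=Cafe tres bon>', '<ID=1; review=Great sofa!>']
```

Unlike unidecode, this stand-in simply drops characters that have no ASCII decomposition, so it is only suitable for illustration.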

### Function to create JSON output format for LLM query

To make sure our LLM outputs in a structured format, we instruct it to deliver JSON following a given template.

In [5]:
def review_categories(categories):
    
    '''Simple function to create a JSON template based on a list of review categories'''
    
    categories_loop = [f'''"Id": "ID", "category": "{category}", "assessment": <your findings in English>, "score": <your score>''' 
                       for category in categories]
    
    return categories_loop
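
The template strings above are JSON-like fragments without surrounding braces, so the model has to infer the object boundaries. One possible variation, sketched here under the assumption that fully valid JSON objects make a clearer template, builds each entry with `json.dumps`. `review_categories_json` is a hypothetical name, not the notebook's function.

```python
import json

def review_categories_json(categories):
    """Hypothetical variant of review_categories: one valid JSON object per category."""
    return [
        json.dumps({
            "Id": "ID",
            "category": category,
            "assessment": "<your findings in English>",
            "score": "<your score>",
        })
        for category in categories
    ]

for line in review_categories_json(["Staff", "Delivery"]):
    print(line)
```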

### Function to create prompt

Here we can make changes to the prompt - this function is mostly used for easy calling downstream.

In [6]:
def create_prompt(reviews_list, categories_loop):
    
    '''Function intended to easily change the prompt; to build the prompt without reviews or template, call create_prompt("", "")'''
    
    prompt = f"""
    You are a skilled analyzer of google maps reviews, multilingual with an eye for nuance.\

    Your output must be in valid JSON. Do not output anything other than the JSON.\

    You will analyze a list of reviews, where each review is surrounded by tags and is structured like \
    <ID=number; review=review>, based on a number of categories. For each category you will include\
    your findings in English and an integer score between 1 (very negative) and 5 (very positive).\
    If you do not have enough information, give a score of 0. Your assessment should not be a translation\
    of the review, but a different, succinct summary of no more than 50 characters for each of the categories.\

    The categories are listed in the JSON template below.\

    Your output will be nothing other than a valid JSON, structured as follows:\
    '''{categories_loop}'''

    Surround your JSON output with <result></result> tags.

    Review text: '''{reviews_list}'''
    """
    
    return prompt
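
Because reviews_list is interpolated straight into the f-string, the model receives the Python representation of the list, brackets and quotes included. The sketch below shows that rendering next to a newline-joined alternative. The alternative is an assumption about what might read more cleanly, not what the notebook does.

```python
reviews_list = ["<ID=0; review=Great sofa>", "<ID=1; review=Late delivery>"]

# What the notebook's f-string produces: the list's repr, brackets and quotes included
as_repr = f"{reviews_list}"

# Hypothetical alternative: one tagged review per line
as_lines = "\n".join(reviews_list)

print(as_repr)
print(as_lines)
```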

### Function to create batches of reviews to stay below token limit

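LLMs have a token limit per API call. This function ensures we stay below the 4,096-token limit of our chosen gpt-3.5-turbo model by budgeting 4,000 tokens per batch for the prompt, the reviews and the estimated output. Token counts come from the GPT-2 tokenizer, so they approximate rather than match the counts of gpt-3.5-turbo.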

In [7]:
def create_batches(data, categories, total_token_limit=4000):

    '''Given the token limit of the LLM, create batches of input and estimated output that stay below the limit'''
    
    # GPT-2 tokenizer, used as an approximation of the gpt-3.5-turbo token count
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")

    # The fixed part of the prompt includes the category template, so count it too
    categories_loop = review_categories(categories)
    prompt_noreviews = create_prompt("", categories_loop)

    # Constants
    estimated_max_output_per_review = sum([len(tokenizer.tokenize(s)) for s in categories_loop])
    initial_prompt_tokens = len(tokenizer.tokenize(prompt_noreviews))
    reviews = preprocess_data(data)[0]

    # Set up loop
    batches = []
    current_batch = []
    current_token_count = initial_prompt_tokens

    for review in reviews:

        # Tokens for the review itself plus the output we expect for it
        review_token_count = len(tokenizer.tokenize(review)) + estimated_max_output_per_review

        if current_batch and current_token_count + review_token_count > total_token_limit:
            # Batch is full: store it and start a new one
            batches.append(current_batch)
            current_batch = []
            current_token_count = initial_prompt_tokens

        current_batch.append(review)
        current_token_count += review_token_count

    if current_batch:
        batches.append(current_batch)
    
    return batches, current_token_count
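
The batching idea can be tried out without downloading a tokenizer. The sketch below is a standalone version of the same greedy approach: a fixed prompt cost, a per-review output estimate, and a crude whitespace word count standing in for real tokenization. `greedy_batches` and `word_count` are hypothetical names and all numbers are made up for illustration.

```python
def greedy_batches(items, fixed_cost, per_item_overhead, limit, count_tokens):
    """Group items so that fixed_cost plus the cost of the items stays within limit.

    Each item costs count_tokens(item) + per_item_overhead (the estimated output).
    An item that does not fit on its own still gets a batch of its own.
    """
    batches, current, used = [], [], fixed_cost
    for item in items:
        cost = count_tokens(item) + per_item_overhead
        if current and used + cost > limit:
            batches.append(current)
            current, used = [], fixed_cost
        current.append(item)
        used += cost
    if current:
        batches.append(current)
    return batches

def word_count(text):
    """Crude whitespace 'tokenizer', for illustration only."""
    return len(text.split())

reviews = ["good sofa", "late delivery again", "rude staff", "love it"]
print(greedy_batches(reviews, fixed_cost=5, per_item_overhead=2,
                     limit=15, count_tokens=word_count))
# [['good sofa', 'late delivery again'], ['rude staff', 'love it']]
```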

### Function to feed the data and review categories into the LLM

This is where the magic happens: feed each batch in turn into the LLM, and run the analysis for a given set of categories.

In [8]:
def get_ratings(batch, categories):
    
    '''Function to string together previous functions and produce a table with tailor-made
    review categories'''

    # Fill prompt
    categories_loop = review_categories(categories)
    prompt = create_prompt(reviews_list=batch, categories_loop=categories_loop)
    
    # Execute LLM prompting
    ratings = get_completion(prompt)

    # Parse output
    json_string = ratings.split('<result>')[1].split('</result>')[0]
    data_list = json.loads(json_string)
    df = pd.DataFrame(data_list)
    
    pivot_table = df.pivot_table(index='Id',
                               columns='category', 
                               values=['assessment', 'score'], 
                               aggfunc='first')

    return pivot_table
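
The split-based parsing above raises an IndexError when the model forgets the `<result>` tags, and the message gives little to go on. Below is a sketch of a more defensive extraction step, using a regular expression and a clearer error. `extract_result` is a hypothetical helper and `mock_answer` is made-up model output.

```python
import json
import re

def extract_result(raw):
    """Pull the JSON payload out of <result>...</result> and parse it.

    Raises ValueError if the tags are missing or the payload is not valid JSON.
    """
    match = re.search(r"<result>(.*?)</result>", raw, flags=re.DOTALL)
    if match is None:
        raise ValueError(f"No <result> block found in: {raw[:80]!r}")
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError as err:
        raise ValueError(f"Invalid JSON in <result> block: {err}") from err

mock_answer = """<result>
[{"Id": "0", "category": "Staff", "assessment": "Friendly staff", "score": 5}]
</result>"""

print(extract_result(mock_answer)[0]["score"])  # 5
```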

### Function to execute LLM per batch, then append results to original data

Bring it all together.

In [9]:
def execute_script(data, categories):
    
    '''Bring all the above functions together to create an output table with all the original data and the results
    of the LLM querying'''

    data_df = preprocess_data(data)[1]
    
    batches = create_batches(data=data, categories=categories)[0]
        
    outputs = [get_ratings(batch=batch, categories=categories) for batch in batches]
    combined_df = pd.concat(outputs)

    data_df.index = data_df.index.astype(int)
    combined_df.index = combined_df.index.astype(int)

    final_df = pd.concat([data_df,combined_df], axis=1)
    
    return final_df, combined_df

### Simple interface to run the app

We made a simple interface with the panel library.

In [18]:
def run_panel(data):
    
    '''Simple panel interface to run the entire script, given specified input data'''
        
    # Initialize the extension
    pn.extension()

    # Define a function to calculate the cost based on the number of categories
    def calculate_cost(data, categories):
        cost_per_token = 0.000002  
        # Batch only once. Rough upper bound: every batch is counted at the full
        # 4,000-token budget, plus the running token count of the last batch
        batches, last_batch_tokens = create_batches(data[:records_input.value], categories)
        estimated_tokens = len(batches)*4000 + last_batch_tokens
        total_estimated_cost = estimated_tokens * cost_per_token
        return total_estimated_cost

    # Define a function to process the review category and return some output
    def process_category(event):
        categories = category_input.value.split(",")
        categories = [category.strip() for category in categories]
        cost = calculate_cost(data, categories)

        categories_size = len(categories)
        cost_pane.object = f"The estimated cost for processing {records_input.value} records for\
        {categories_size} categories is €{round(cost,3)}. Do you want to proceed?"
        confirm_button.visible = True

    def confirm_execution(event):
        categories = category_input.value.split(",") 
        categories = [category.strip() for category in categories]
        result = f"You entered the categories: {categories}. \n\n Please be patient while the Analyst does her magic."
        output_pane.object = result

        analysis = execute_script(data=data[:records_input.value], categories=categories)[0]
        df_pane.object = analysis.loc[:, ['review_text_unicode'] + list(analysis.columns[-2*len(categories):])]


    # Create the widgets
    category_input = pn.widgets.TextInput(name="Enter review categories, separated by commas",
                                          placeholder='E.g. Friendliness of staff, quality of products, ...')
    process_button = pn.widgets.Button(name="Calculate Cost", button_type="primary")
    process_button.on_click(process_category)

    confirm_button = pn.widgets.Button(name="Confirm & Execute", button_type="success", visible=False)
    confirm_button.on_click(confirm_execution)
    
    records_input = pn.widgets.IntSlider(name="Number of records to process", 
                                         start=1, 
                                         end=len(data), 
                                         value=5, 
                                         step=1)

    # Create an output pane to display the result
    output_pane = pn.pane.Markdown("Output will be displayed here.")
    cost_pane = pn.pane.Markdown()  
    df_pane = pn.pane.DataFrame()

    # Create a layout for the app
    layout = pn.Column(
        pn.pane.Markdown("## My Google Maps Review Analyst"),
        pn.pane.Markdown("Analyse your reviews the way you want it!\nWe use fictional data for this example"),
        category_input,
        records_input,
        process_button,
        cost_pane,
        confirm_button,
        output_pane,
        df_pane
    )

    # Display the app
    return layout
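
The cost estimate shown in the app is plain arithmetic: an estimated number of tokens multiplied by a per-token price. Here is a standalone sketch of that calculation, using the same 4,000-token budget and 0.000002 price per token as the notebook. `estimate_cost` is a hypothetical name and the batch counts are made up.

```python
def estimate_cost(n_batches, last_batch_tokens, batch_limit=4000, cost_per_token=0.000002):
    """Rough upper bound: each batch at the full budget, plus the last batch's count."""
    estimated_tokens = n_batches * batch_limit + last_batch_tokens
    return estimated_tokens * cost_per_token

# Three batches, the last of which holds 1,500 tokens: 13,500 tokens in total
print(round(estimate_cost(n_batches=3, last_batch_tokens=1500), 3))  # 0.027
```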

### Create mock dataset

With thanks to GPT-4 ;)

In [11]:
# Sample reviews with varying sentiments.
reviews = [
"Absolutely love my new sofa! Staff was friendly and the delivery was prompt.",
"Bought a table that broke within a week. Staff was dismissive when I called.",
"Decent selection, but delivery took much longer than promised.",
"Quality furniture and great in-store service, but the delivery was late.",
"Staff were rude. I won't be coming back despite the good product range.",
"Superb quality! Bought a bed and it's very comfortable. Staff was kind.",
"Expected more from a store with such reviews. Products are mediocre.",
"Impressed with the quality of the chairs. Quick delivery!",
"Terrible service! Waited weeks for my order and it came damaged.",
"Staff went out of their way to help. But, the dining set is average at best.",
"Ordered a shelf; poor quality and wobbly. Expected better from this store.",
"Amazing service! Staff guided me through every step. Happy with my purchase.",
"Delivery was a disaster. Two weeks late and the package was damaged.",
"Customer service is top-notch. They resolved my issue promptly.",
"Stunning collection! I always find unique pieces here. Staff is also very knowledgeable.",
"Bought a desk and it's of terrible quality. Very disappointed.",
"Highly recommend! Great quality furniture and excellent customer service.",
"Delivery guys were rude and didn't handle the furniture well.",
"Lovely store ambiance. The staff makes you feel very welcomed.",
"Products are overpriced for the quality they offer.",
"Great deals during the sale! Got a fabulous couch at half the price.",
"Horrible experience. The table I bought had termites!",
"The installation team did a great job setting up my bedroom.",
"Delivery time is a joke! They always deliver later than promised.",
"Found the perfect lamp for my study. Thanks to the staff for their suggestions.",
"Expected better quality. The drawers of the chest I bought are misaligned.",
"Been buying from here for years. Quality has gone down.",
"Exceptional service! The staff was patient and helped me choose the right furniture.",
"Very displeased. The sofa color looked different from what was shown online.",
"Always a pleasure shopping here. They never disappoint with their collection.",
"Bought a bed that started creaking within a month. Not worth the price.",
"Love their eco-friendly collection. Good quality and looks elegant.",
"Staff wasn't helpful. Had to figure everything out on my own.",
"The dining set I purchased is the star of my home. Absolutely beautiful!",
"Worst purchase ever. The wardrobe doors are uneven.",
"Great store layout. Makes shopping easy and pleasant.",
"Delivery was quick, but they sent the wrong furniture piece!",
"Quality has deteriorated. Not the same store it used to be.",
"Kudos to the staff! Made my shopping experience delightful.",
"Bought a rug. The color faded after one wash. Very disappointed.",
"Fantastic offers during the festive season. Made a great purchase.",
"The coffee table I bought is wobbly. Expected better stability.",
"The store offers a great blend of traditional and modern designs. Love it!",
"Terrible customer support. They didn't resolve my complaint.",
"Best place to buy furniture. Quality and design are unmatched.",
"The mattress I bought is uncomfortable. Woke up with back pain.",
"Love their kids' furniture collection. Fun and safe designs.",
"Staff didn't have knowledge about the products. Gave misleading information.",
"My go-to store for home decor. Always find something interesting.",
"Very disappointed. The bookshelf I bought can't even handle weight properly.",
"Great customization options. Got the furniture tailored to my needs.",
"The installation was rushed. They left marks on my wall.",
"Their vintage collection is to die for! Bought some classic pieces.",
"Bought chairs and they're very uncomfortable. Regret the purchase.",
"Helpful staff and a wide range of furniture options. Highly recommend.",
"Delivery was a hassle. Multiple delays and lack of communication.",
"Found the perfect centerpiece for my living room. Quality is top-notch.",
"Expected a smooth delivery. Instead, got a damaged product.",
"Staff is courteous and they have a great return policy. Trustworthy store.",
"Bought curtains and they were of a different length. Poor quality check.",
"Fantastic service. They even helped me visualize the setup in my home.",
"The sofa set I ordered has a different fabric. Not what I expected.",
"Great store for budget buys. Good quality at affordable prices.",
"Terrible experience. The staff lost my order and there was a long delay.",
"Found a gem of a table. Love the intricate designs they offer.",
"The wardrobe I bought doesn't close properly. Poor design.",
"Excellent store for all home needs. Always satisfied with my purchases.",
"Got a damaged product and the return process was a nightmare.",
"The staff gives great recommendations. Found the perfect dresser.",
"The bed I bought is squeaky. It's been just a few months!",
"A wide range of options and styles. Always something for everyone.",
"Very unhappy. The couch cushions sagged within a week.",
"Always get compliments for the furniture I buy from here. Love this store.",
"The dining set I received had scratches. Quality control issues.",
"Coolest designs and the store keeps up with the latest trends.",
"The desk I bought is flimsy. Not worth the high price.",
"Store offers great post-purchase support. They address issues promptly.",
"Delivery team mishandled my furniture. Saw them being careless.",
"Love the minimalist designs. My home looks modern and chic now.",
"Bought a lamp and it stopped working within days. Poor quality.",
"Staff takes time to understand your needs. Very personalized service.",
"The rug I ordered sheds a lot. Very messy and not up to the mark.",
"Beautiful store with a vast collection. Always a joy to visit.",
"The furniture assembly was a disaster. Incomplete and messy.",
"Excellent craftsmanship. The woodwork on the furniture is impeccable.",
"The mirror I bought had a crack. Very disappointed with the quality.",
"Very innovative designs. They offer unique pieces not found elsewhere.",
"Terrible experience. The mattress had a weird smell that won't go away.",
"Love shopping during their sales. Great discounts on premium products.",
"The TV unit I bought is not stable. Safety concerns for my family.",
"The store has a great ambiance. Staff is attentive and helpful.",
"The wardrobe I ordered came with missing parts. Incomplete delivery.",
"Extensive collection. I always find what I'm looking for.",
"Delivery was late and they didn't even apologize. Very unprofessional.",
"Beautiful artifacts to complement the furniture. Bought some lovely pieces.",
"The table I received is of a different color. Not as per the online image.",
"Store has great bundle offers. Bought a complete room set at a good price.",
"The dresser I got is chipped. Not a pleasant experience.",
"Stylish designs that are also functional. Very satisfied with my purchases.",
"Bought a couch and the stitching is coming off. Poor craftsmanship."
]

# Generating other columns
likes = [random.randint(0, 50) for _ in range(100)]
years = [2021, 2022, 2023] * 34  # This will give 102 items, we'll slice it down to 100 later
random.shuffle(years)
review_rating = [random.randint(1, 5) for _ in range(100)]  # star ratings run from 1 to 5
dates = [datetime(year, random.randint(1, 12), random.randint(1, 28)) for year in years][:100]
owner_answers = ["Thank you for your feedback!", "We apologize for the inconvenience."] * 50
answer_dates = [date + timedelta(days=random.randint(0, 15)) for date in dates]
answer_time = [(ad - rd).days for rd, ad in zip(dates, answer_dates)]

# Constructing the DataFrame
sample_data = pd.DataFrame({
    'name': 'Furniture Store ABC',
    'review_text': reviews[:100],
    'review_rating': review_rating,
    'review_likes': likes,
    'review_year': years[:100],
    'review_datetime_utc': dates,
    'owner_answer': owner_answers,
    'owner_answer_timestamp_datetime_utc': answer_dates,
    'answer_time': answer_time
})
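
The mock columns are drawn with `random`, so every run of the cell produces a different dataset. If reproducible mock data is wanted, a seeded generator is one option. The sketch below is a minimal illustration; `mock_ratings` and the seed value are assumptions.

```python
import random

def mock_ratings(n, seed=42):
    """Draw n star ratings between 1 and 5 from a private, seeded generator."""
    rng = random.Random(seed)  # leaves the global random state untouched
    return [rng.randint(1, 5) for _ in range(n)]

# Same seed, same data
print(mock_ratings(10) == mock_ratings(10))  # True
```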

### Showtime!

In [19]:
run_panel(sample_data)