In [2]:
import pandas as pd
import numpy as np

# In this file, we'll take the dialogue sample (10,000 dialogues) and label them with GPT
# Grab the sample, concatenate the text (save it), and then label with GPT
# Join the labels back to the original dataframe, and save that

In [6]:
# Load the CSV file into a DataFrame
df = pd.read_csv('data/10k-dialogues-sample.csv')

# Display the first few rows of the DataFrame
print(df.head())
print(df.shape)

   folder  dialogueID                      date      from        to  \
0       7  100000.tsv  2007-03-01T07:55:00.000Z     dyrne  martalli   
1       7  100000.tsv  2007-03-01T07:56:00.000Z  martalli       NaN   
2       7  100000.tsv  2007-03-01T07:57:00.000Z  martalli       NaN   
3       7  100000.tsv  2007-03-01T07:57:00.000Z     dyrne  martalli   
4       7  100000.tsv  2007-03-01T07:58:00.000Z  martalli       NaN   

                                                text  
0  could you just put a script in inittab with th...  
1  Well, actually, I was planning somehting like ...  
2                         What would it be for edgy?  
3  im not sure about new init.  i think edgy stil...  
4  Regarding sound cards, I found this one:http:/...  
(437653, 6)


In [4]:
# Let's group the data by dialogueID, separate by <from: , to: > and concatenate the text

# Group by dialogueID and aggregate the other columns
df_grouped = df.groupby('dialogueID').agg({
    'folder': 'first',
    'date': 'first',
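    # Look up `from`/`to` on the group's own rows (x.index); zipping the full
    # df columns would pair every dialogue with the first rows of the file
    'text': lambda x: ' || '.join(
        f"<from: {f}, to: {t}> {txt}"
        for f, t, txt in zip(df.loc[x.index, 'from'], df.loc[x.index, 'to'], x)
    )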
}).reset_index()

# Save the grouped DataFrame to a CSV file
df_grouped.to_csv('data/concatenated_dialogue.csv', index=False)

print("The concatenated dialogues have been saved to 'concatenated_dialogue.csv'")

# Optionally, you can verify the file contents
print("\nFirst few lines of the saved CSV file:")
print(pd.read_csv('data/concatenated_dialogue.csv', nrows=5).to_string())

The concatenated dialogues have been saved to 'concatenated_dialogue.csv'

First few lines of the saved CSV file:
   dialogueID  folder                      date

Sample Discussion 1:
<from: dyrne, to: martalli> could you just put a script in inittab with the respawn option and use mplayer with a playlist? || <from: martalli, to: nan> Well, actually, I was planning somehting like that, but looking for a pointer on what to look up first - I will check out inittab || <from: martalli, to: nan> What would it be for edgy? || <from: dyrne, to: martalli> im not sure about new init.  i think edgy still has /etc/inittab || <from: martalli, to: nan> Regarding sound cards, I found this one:http://www.newegg.com/Product/Product.asp?Item=N82E16829130001 || <from: martalli, to: nan> on newegg for 7.50 (abnout 13 after shipping) || <from: martalli, to: dyrne>  Thanks for the tip.  I can probably take it from there || <from: nevermind, to: nan> html || <from: nevermind, to: nan> !html || <from: nevermind, to: nan> !nvu || <from: nevermind, to: nan> !bluefish || <from: nevermind, to: nan> !quanta+ || <from: nevermind, to: nan> !screem || <from: nevermind, to: na
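
When each turn is tagged, the sender and recipient have to come from the same rows as that dialogue's text. The following is a minimal sketch on made-up rows, not the notebook's data. It iterates over the groups instead of using an `agg` lambda, and the helper name `concat_dialogue` is hypothetical.

```python
import pandas as pd

# Made-up rows in the same shape as the sample (values are illustrative only).
toy = pd.DataFrame({
    "dialogueID": ["a.tsv", "a.tsv", "b.tsv"],
    "from": ["alice", "bob", "carol"],
    "to": ["bob", None, "dave"],
    "text": ["hi", "hello", "help?"],
})

def concat_dialogue(group: pd.DataFrame) -> str:
    """Join one dialogue's turns, tagging each with its own sender and recipient."""
    return " || ".join(
        f"<from: {f}, to: {t}> {txt}"
        for f, t, txt in zip(group["from"], group["to"], group["text"])
    )

# Each group carries only its own rows, so speakers cannot leak across dialogues.
toy_grouped = {
    dialogue_id: concat_dialogue(group)
    for dialogue_id, group in toy.groupby("dialogueID")
}
print(toy_grouped["b.tsv"])  # <from: carol, to: dave> help?
```

Checking a dialogue other than the first one, as done here with `b.tsv`, is a quick way to confirm that the speaker tags belong to the right conversation.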

In [10]:
data_labeling_prompt = """
Data Labeling Task: Evaluating Conversation Success

Objective: Your task is to review a set of dialogue conversations and assign a success rating to each one based on how effectively the user's needs were addressed. The rating system ranges from 0 to 4 stars, where:

0 Stars: The conversation was unsuccessful.
4 Stars: The conversation was highly successful.
Instructions:

Identify the User Seeking Help:

Typically, the first person to post in the conversation is the user seeking help.
They are the ones initiating requests, asking questions, or seeking assistance.
Evaluate the Conversation:

Read the entire conversation thoroughly.
Determine the extent to which the user's needs or questions were addressed.
Look for indicators of satisfaction or frustration from the user.
Assign a Star Rating:

4 Stars (Excellent Success):

The user clearly received the help they were seeking.
The user's questions or issues were fully resolved.
The user expresses satisfaction, gratitude, or positive feedback.
Example indicators: "Thanks for the tip. I can probably take it from there."
3 Stars (Good Success):

The user received substantial help but may have minor unresolved issues.
The conversation was helpful but not exceptional.
The user seems mostly satisfied but doesn't explicitly express strong appreciation.
2 Stars (Moderate Success):

The user received some assistance but significant questions or issues remain.
Responses were partially helpful or only addressed part of the user's needs.
The user may seem uncertain or only partially satisfied.
1 Star (Minimal Success):

The user received little help.
The assistance was minimal, off-topic, or not useful.
The user may show signs of mild frustration or confusion.
0 Stars (No Success):

The user did not receive any help or solutions.
The user expresses clear frustration or dissatisfaction.
The user leaves the conversation early without resolution.
Example indicators: Repeated unanswered questions, expressions like "last warning...", "ok, move on".
Additional Considerations:

User Frustration: Pay attention to signs of frustration, such as repeated questions, abrupt comments, or negative language.
Conversation Flow: Consider whether the conversation stays on topic and progresses toward resolving the user's issue.
Incomplete Interactions: If the conversation ends abruptly without resolution, it likely warrants a lower rating.
Multiple Participants: Be aware that others may join the conversation; focus on whether the user's needs are being addressed.
Formatting Your Response:

For each conversation, provide the assigned star rating (0-4 stars).
Optionally, include a brief justification for your rating (1-2 sentences).
Examples:

Sample Conversation A:

Excerpt: "<from: etotheipi, to: jrib> I hadn't -- but that seems to have fixed it. thanks."
Rating: 4 Stars
Justification: The user received a solution that fixed the issue and expressed gratitude.
Sample Conversation B:

Excerpt: "<from: nevermind, to: nan> last warning... || <from: nevermind, to: nan> ok, move on"
Rating: 0 Stars
Justification: The user shows signs of frustration and did not receive the help needed.
Purpose: This labeling will help us assess the effectiveness of our support interactions and identify areas for improvement.
"""

In [13]:
from pydantic import BaseModel
from openai import OpenAI
import pandas as pd
from tqdm import tqdm
import concurrent.futures

class ConversationRating(BaseModel):
    # justification: str
    rating: int

def rate_conversation(args):
    conversation, client, data_labeling_prompt = args
    try:
        completion = client.beta.chat.completions.parse(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": data_labeling_prompt},
                {"role": "user", "content": f"Here is the conversation: {conversation}"}
            ],
            response_format=ConversationRating,
        )
        return completion.choices[0].message.parsed
    except Exception as e:
        print(f"Error processing conversation: {e}")
        return None

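# Initialize the OpenAI client; with no api_key argument it reads
# OPENAI_API_KEY from the environment, so the key stays out of the notebook
client = OpenAI()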

# Load the dataset
df = pd.read_csv('data/concatenated_dialogue.csv')

# Prepare arguments for parallel processing
args_list = [(conversation, client, data_labeling_prompt) for conversation in df['text']]

# Process conversations in parallel
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    results = list(tqdm(executor.map(rate_conversation, args_list), total=len(args_list), desc="Rating conversations"))

# Extract ratings (justifications are currently disabled); failed requests become None
ratings = []
# justifications = []
for result in results:
    if result:
        ratings.append(result.rating)
        # justifications.append(result.justification)
    else:
        ratings.append(None)
        # justifications.append(None)

# Add new columns to the DataFrame
df['rating'] = ratings
# df['justification'] = justifications

# Save the updated DataFrame
df.to_csv('data/concatenated_dialogues_labeled.csv', index=False)

print("Ratings have been added to the dataset and saved as 'data/concatenated_dialogues_labeled.csv'")

Rating conversations:  95%|█████████▌| 9535/10000 [23:59<01:12,  6.46it/s]  

Error processing conversation: Error code: 400 - {'error': {'message': "This model's maximum context length is 128000 tokens. However, your messages resulted in 129448 tokens (including 44 in the response_format schemas.). Please reduce the length of the messages or schemas.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
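
The error above means one concatenated dialogue was too long for the model, so it ends up without a rating. One possible workaround is to shorten over-long dialogues before sending them. The sketch below is an assumption-laden illustration, not what this notebook ran: `truncate_conversation` is a hypothetical helper, `max_chars` is a character budget used as a rough stand-in for the token limit (it is not a token count), and keeping the opening and closing turns is a design guess based on the prompt, which identifies the help-seeker from the first turns and looks for signs of satisfaction or frustration.

```python
def _take(turns, budget, sep):
    """Take turns from the front until the character budget is used up."""
    taken, used = [], 0
    for turn in turns:
        cost = len(turn) + len(sep)
        if used + cost > budget:
            break
        taken.append(turn)
        used += cost
    return taken

def truncate_conversation(text, max_chars=200_000, sep=" || ",
                          marker="[... turns omitted ...]"):
    """Keep the first and last turns of an over-long dialogue.

    max_chars is an assumed placeholder budget; the result can exceed it
    slightly because of the omission marker.
    """
    if len(text) <= max_chars:
        return text
    turns = text.split(sep)
    budget = max_chars // 2
    head = _take(turns, budget, sep)
    # Walk the remaining turns from the end, then restore their order.
    tail = _take(turns[len(head):][::-1], budget, sep)[::-1]
    return sep.join(head + [marker] + tail)

demo = " || ".join(f"turn {i}" for i in range(10))
print(truncate_conversation(demo, max_chars=40))
# turn 0 || turn 1 || [... turns omitted ...] || turn 8 || turn 9
```

Applied to the `text` column before building `args_list`, a helper like this would trade some context in the middle of very long dialogues for a rating on every row.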


Rating conversations: 100%|██████████| 10000/10000 [25:05<00:00,  6.64it/s]


Ratings and justifications have been added to the dataset and saved as 'rated_dialogues.csv'
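
The prompt defines a scale from 0 to 4 stars, but the `ConversationRating` schema only requires `rating` to be an int, so an out-of-range value would pass through unnoticed. A minimal sketch of a range check follows; `clean_rating` is a hypothetical helper that is not part of the notebook.

```python
from typing import Optional

MIN_STARS, MAX_STARS = 0, 4  # scale stated in the labeling prompt

def clean_rating(value: object) -> Optional[int]:
    """Return the rating if it is an int on the 0-4 scale, otherwise None."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return None
    return value if MIN_STARS <= value <= MAX_STARS else None

print([clean_rating(v) for v in [4, 0, 7, -1, None]])
# [4, 0, None, None, None]
```

Mapping invalid values to None keeps them in the same bucket as failed requests, so both can be counted or retried together.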


In [8]:
# Join these labels back to the original dataframe
df = pd.read_csv('data/10k-dialogues-sample.csv')
df_concatenated_labeled = pd.read_csv('data/concatenated_dialogues_labeled.csv')
df_concatenated_labeled = df_concatenated_labeled[['dialogueID', 'rating']]
df = df.merge(df_concatenated_labeled, on='dialogueID', how='left')
df.to_csv('data/10k-dialogues-sample-labeled.csv', index=False)

# Display the first few rows of the labeled DataFrame
print(df.head().to_string())

   folder  dialogueID                      date      from        to                                                                                                                               text  rating
0       7  100000.tsv  2007-03-01T07:55:00.000Z     dyrne  martalli                                    could you just put a script in inittab with the respawn option and use mplayer with a playlist?     4.0
1       7  100000.tsv  2007-03-01T07:56:00.000Z  martalli       NaN  Well, actually, I was planning somehting like that, but looking for a pointer on what to look up first - I will check out inittab     4.0
2       7  100000.tsv  2007-03-01T07:57:00.000Z  martalli       NaN                                                                                                         What would it be for edgy?     4.0
3       7  100000.tsv  2007-03-01T07:57:00.000Z     dyrne  martalli                                                                   im not sure about new init.  i think e
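
Because failed requests are stored as None, some dialogues can come out of the left merge with a missing rating, which is presumably also why the `rating` column is shown as a float. A minimal sketch of listing those dialogues, using made-up rows in place of the merged DataFrame:

```python
import pandas as pd

# Made-up rows standing in for the merged, per-message DataFrame.
merged = pd.DataFrame({
    "dialogueID": ["a.tsv", "a.tsv", "b.tsv", "c.tsv"],
    "rating": [4.0, 4.0, None, 1.0],
})

# Reduce to one row per dialogue, then collect the IDs with no rating.
per_dialogue = merged.drop_duplicates("dialogueID")
unrated = per_dialogue.loc[per_dialogue["rating"].isna(), "dialogueID"].tolist()
print(unrated)  # ['b.tsv']
```

Running the same two lines on the real `df` would show which dialogues still need a label, for example the one that exceeded the context length.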