# In the Loop Labeling

In-the-loop AI-assisted labeling: lets users review and correct the initial AI's labels. The corrections are collected and used as new training data.
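One way the review step could look, as a minimal sketch: the helper name `apply_corrections`, the `AI_Label` column, and the example rows are assumptions made for illustration and are not part of the notebook. The `Text`/`Label` column names match the DataFrame built later on.

```python
import pandas as pd

VALID_LABELS = {"OTR", "PRS", "REP", "NEU"}

def apply_corrections(df, corrections):
    """Apply {row_index: new_label} corrections; return (corrected_df, changed_rows)."""
    corrected = df.copy()
    corrected["AI_Label"] = corrected["Label"]  # keep the model's original guess
    for idx, new_label in corrections.items():
        if new_label not in VALID_LABELS:
            raise ValueError(f"Unknown label: {new_label!r}")
        corrected.loc[idx, "Label"] = new_label
    # Rows where the user disagreed with the model become new training data
    changed = corrected[corrected["Label"] != corrected["AI_Label"]]
    return corrected, changed

# Hypothetical example: the first utterance was mislabeled as NEU
example_df = pd.DataFrame(
    [("Can anyone give me an example of a noun?", "NEU"),
     ("That's right, 'dog' is a noun because it is a thing.", "PRS")],
    columns=["Text", "Label"],
)
corrected_df, new_training_data = apply_corrections(example_df, {0: "OTR"})
```

Keeping the original prediction in a separate column makes it possible to compute accuracy later without storing a second table.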

New training data: used to re-train (fine-tune) the initial model. This can be a basic retraining loop or a more complex ML pipeline.

Iterative improvement: repeat the process above (generate labels, gather the corrections as new training data, retrain the model on them); the model should become more accurate with each iteration.
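The loop could be sketched roughly as follows. This is only an outline under stated assumptions: `run_labeling_loop`, `label_batch`, and `review_batch` are hypothetical names, and the two stub functions stand in for the assistant call and the human reviewer so that the sketch runs without an API key.

```python
def run_labeling_loop(batches, label_batch, review_batch):
    """Label each batch, collect the reviewed labels, and reuse them in later rounds."""
    examples = []  # accumulated (text, reviewed_label) pairs
    history = []   # accuracy of the model's labels per iteration
    for batch in batches:
        if not batch:
            continue  # nothing to label in an empty batch
        predicted = label_batch(batch, examples)
        reviewed = review_batch(batch, predicted)
        correct = sum(p == r for p, r in zip(predicted, reviewed))
        history.append(correct / len(batch))
        examples.extend(zip(batch, reviewed))
    return examples, history

# Stubs for illustration only: the "model" answers NEU unless it has seen the text
truth = {
    "Can anyone give me an example of a noun?": "OTR",
    "A noun is a word that represents a person, place, thing, or idea.": "NEU",
}

def stub_label_batch(batch, examples):
    known = dict(examples)
    return [known.get(text, "NEU") for text in batch]

def stub_review_batch(batch, predicted):
    return [truth[text] for text in batch]

texts = list(truth)
examples, history = run_labeling_loop([texts, texts], stub_label_batch, stub_review_batch)
```

In the stubbed run the second pass is perfect only because it repeats the same texts; with real data the improvement would depend on how well the corrections generalize.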

Visualizations: we need to visualize the improvement in accuracy over iterations. The accuracy data can be saved to a CSV file when we collect the new training data. The user also needs to see the accuracy each time the model is used, so we compute it, print it, and export it to the CSV file in the same step.
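A minimal sketch of the print-and-log step, assuming accuracy means the share of AI labels the user left unchanged. The function name `log_accuracy`, the default file name `accuracy_log.csv`, and the column names are placeholders, not something the notebook defines.

```python
import csv
import os

def log_accuracy(predicted, corrected, iteration, path="accuracy_log.csv"):
    """Print the accuracy for one iteration and append it to a CSV log."""
    if len(predicted) != len(corrected):
        raise ValueError("predicted and corrected must have the same length")
    if not predicted:
        raise ValueError("cannot compute accuracy for an empty batch")
    accuracy = sum(p == c for p, c in zip(predicted, corrected)) / len(predicted)
    print(f"Iteration {iteration}: accuracy {accuracy:.1%}")

    # Write the header only when the log file is first created
    write_header = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(["iteration", "accuracy", "n_labels"])
        writer.writerow([iteration, accuracy, len(predicted)])
    return accuracy
```

The resulting file could then be loaded with `pd.read_csv` and plotted, for example with `DataFrame.plot(x="iteration", y="accuracy")`, which requires matplotlib to be installed.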

Export functionality: lets users export the newly labeled data back to CSV.
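The export could be as small as the sketch below. `export_labels` is a hypothetical helper name, and the `Text`/`Label` columns are assumed to match the DataFrame used in this notebook.

```python
import pandas as pd

def export_labels(df, path):
    """Write the reviewed labels to CSV without the DataFrame index column."""
    df[["Text", "Label"]].to_csv(path, index=False)
```

Passing `index=False` matters here: without it, pandas writes the index as an unnamed first column, which shows up as `Unnamed: 0` when the file is read back with `pd.read_csv`.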

In [21]:
# Importing necessary libraries 
import os
import getpass
import pandas as pd
from openai import OpenAI

In [22]:
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

In [23]:
import time

In [24]:
# OpenAI is already imported above; the client reads OPENAI_API_KEY from the environment
client = OpenAI()

In [25]:
from ai_assisted_coding_final import assistant

In [26]:
assistant_manager = assistant.OpenAIAssistantManager(client)

# Create an assistant
assistant_manager.create_assistant()

asst_X3eHMR3axI9V8LIxFaax63gu


Assistant(id='asst_X3eHMR3axI9V8LIxFaax63gu', created_at=1702038098, description='A tool for classifying teacher utterances into categories like OTR, PRS, REP, NEU.', file_ids=[], instructions='You are the co-founder of an ed-tech startup training an automated teacher feedback tool to classify utterances made. I am going to provide several sentences. \n                                            Please classify each sentence as one of the following: OTR (opportunity to respond), PRS (praise), REP (reprimand), or NEU (neutral)\n        \n                                            user: Can someone give me an example of a pronoun?\n                                            assistant: OTR\n                                            user: That\'s right, \'he\' is a pronoun because it can take the place of a noun.\n                                            assistant: PRS\n                                            user: "You need to keep quiet while someone else is reading."\n       

In [27]:
def read_csv(file_path):
    df = pd.read_csv(file_path)
    return df['Text'].tolist()


In [28]:
def process_lines(lines, assistant_manager):
    """Label each line with the assistant and return a list of (text, label) tuples."""
    data = []
    for line in lines:
        # Start a thread for this utterance and run the assistant on it
        assistant_manager.create_thread_and_run(line)

        # Wait for the response and get it
        response_page = assistant_manager.get_response()

        # Collect all messages from the response page
        messages = list(response_page)

        if messages:
            # Assuming the last message is from the assistant and contains the label
            assistant_message = messages[-1].content[0].text.value
            label = assistant_message.split()[-1]  # The label is the last word of the reply
            data.append((line, label))

    return data




In [29]:
# Load the utterances to label from the 'Text' column of the CSV
lines = read_csv('data/009-1.csv')

In [30]:
lines

['Good morning class, today we are going to learn about nouns.',
 'A noun is a word that represents a person, place, thing, or idea.',
 'Can anyone give me an example of a noun?',
 "That's right, 'dog' is a noun because it is a thing.",
 "Let's write down some nouns in our notebooks.",
 "Now, let's talk about verbs. Does anyone know what a verb is?",
 'A verb is a word that describes an action, occurrence, or state of being.',
 'Can someone give me an example of a verb?',
 "Great example, 'run' is a verb because it is an action.",
 "Now, let's write down some verbs in our notebooks.",
 'Next, we are going to learn about adjectives.',
 'An adjective is a word that describes a noun.',
 'Can someone give me an example of an adjective?',
 "Exactly, 'beautiful' is an adjective because it describes a noun.",
 "Let's write down some adjectives in our notebooks.",
 'Now we are going to form sentences using nouns, verbs, and adjectives.',
 'A sentence is a group of words that expresses a comple

In [31]:
# TODO: feed in a batch of lines at a time, increasing the batch size as accuracy improves
messages = process_lines(lines[0:4], assistant_manager)
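One possible shape for the batching TODO above, as a sketch only: `next_batch_size`, the 0.9 threshold, the doubling factor, and the size cap are all assumptions, and the accuracy values in the demo are made up to show the mechanics rather than measured.

```python
def next_batch_size(current_size, accuracy, threshold=0.9, growth=2, max_size=64):
    """Grow the batch only when the last batch was labeled accurately enough."""
    if accuracy >= threshold:
        return min(current_size * growth, max_size)
    return current_size

# Demo with placeholder utterances and hypothetical per-batch accuracies
demo_lines = [f"utterance {i}" for i in range(20)]
observed_accuracies = [0.75, 1.0, 1.0]

position, size = 0, 4
batch_sizes = []
for accuracy in observed_accuracies:
    batch = demo_lines[position:position + size]
    batch_sizes.append(len(batch))
    position += len(batch)
    size = next_batch_size(size, accuracy)
```

In the notebook, each `batch` would be passed to `process_lines` and the accuracy would come from the user's corrections instead of a fixed list.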


In [32]:
messages

[('Good morning class, today we are going to learn about nouns.', 'NEU'),
 ('A noun is a word that represents a person, place, thing, or idea.', 'NEU'),
 ('Can anyone give me an example of a noun?', 'OTR'),
 ("That's right, 'dog' is a noun because it is a thing.", 'PRS')]

In [33]:
df = pd.DataFrame(messages, columns=["Text", "Label"])

In [34]:
# TODO: make this interactive so the user can correct the labels, then offer the result as a CSV download
df

,Text,Label
0,"Good morning class, today we are going to lear...",NEU
1,"A noun is a word that represents a person, pla...",NEU
2,Can anyone give me an example of a noun?,OTR
3,"That's right, 'dog' is a noun because it is a ...",PRS


In [None]:
# After the user has corrected the labels, send a message like this one WITH the new labels

""" 
Great. Here are some more examples of how to classify utterances:

user: Can someone give me an example of a pronoun?
assistant: OTR
user: That's right, 'he' is a pronoun because it can take the place of a noun.
assistant: PRS
user: "You need to keep quiet while someone else is reading."
assistant: REP
user: A pronoun is a word that can take the place of a noun.
assistant: NEU

I am going to provide several more sentences. Only answer with the following labels: OTR, PRS, REP, NEU
"""

# Then feed in the next batch of utterances and iterate
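The follow-up message above could be generated from the corrected labels instead of being written by hand. The sketch below assumes the corrections are available as `(text, label)` pairs; `build_feedback_message` is a hypothetical helper, not part of `ai_assisted_coding_final`.

```python
def build_feedback_message(corrected_pairs):
    """Format (text, label) pairs as a few-shot follow-up message for the assistant."""
    parts = ["Great. Here are some more examples of how to classify utterances:", ""]
    for text, label in corrected_pairs:
        parts.append(f"user: {text}")
        parts.append(f"assistant: {label}")
    parts.append("")
    parts.append(
        "I am going to provide several more sentences. "
        "Only answer with the following labels: OTR, PRS, REP, NEU"
    )
    return "\n".join(parts)

feedback = build_feedback_message([
    ("Can anyone give me an example of a noun?", "OTR"),
    ("That's right, 'dog' is a noun because it is a thing.", "PRS"),
])
```

How the message is delivered depends on the assistant manager. Passing it to `create_thread_and_run` the same way an utterance is passed might work, but that is an assumption about the module and has not been checked here.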