# OpenAI notebook

| Pedagogy Dimension | Metrics |
| --- | ----------- |
| Manage cognitive load | Stay on topic |
| Encourage active learning | Do not reveal the answer; guide towards the answer; promote active engagement |
| Deepen metacognition | Identify and address misconceptions |
| Motivate and stimulate curiosity | Communicate with positive tone; respond appropriately to explicit affect cues |
| Adapt to the learners’ goals and needs | Adapt to the learner’s level |

In [1]:
import os
import pickle
import re

DATA_DIR = "user-testing/data/"

def list_pickle_files(directory):
    return [f for f in os.listdir(directory) if f.endswith('.pkl')]

def load_conversation(filepath):
    with open(filepath, "rb") as f:
        data = pickle.load(f)
        return data
    
def read_questions_answers():
    """Read the True/False exam questions and their answers from the LaTeX source."""
    with open('exams/together2.tex', 'r', encoding='utf-8') as file:
        tex_content = file.read()

    questions, answers = extract_question_answer(tex_content)
    if len(questions) != len(answers):
        print("Warning: The number of questions and answers do not match!")

    # zip() stops at the shorter list, so unmatched trailing items are dropped
    exam_questions_TF = []
    exam_answers_TF = []
    for q, a in zip(questions, answers):
        exam_questions_TF.append(q)
        exam_answers_TF.append(a)
    return exam_questions_TF, exam_answers_TF

def extract_question_answer(tex_content):
    # Extract content within the enumerate environment
    enum_match = re.search(r'\\begin{enumerate}(.*?)\\end{enumerate}', tex_content, re.DOTALL)
    if not enum_match:
        return [], []
    enum_content = enum_match.group(1)

    # Find all questions (\item ... \begin{solutionorbox})
    question_blocks = re.findall(
        r'\\item(.*?)(?=\\begin{solutionorbox})',
        enum_content, re.DOTALL
    )

    # Find all answers (\begin{solutionorbox} ... \end{solutionorbox})
    answer_blocks = re.findall(
        r'\\begin{solutionorbox}\[[^\]]*\]\s*(.*?)\\end{solutionorbox}',
        enum_content, re.DOTALL
    )
    questions = [q.strip() for q in question_blocks]
    answers = [a.strip() for a in answer_blocks]
    return questions, answers

def remove_latex_formatting(question: str) -> str:
    # Convert the LaTeX bold words used in the exam to Markdown emphasis.
    # Only these specific words are handled; other \textbf{...} spans are left as-is.
    for word in ("always", "Every", "any", "rotation", "distinct"):
        question = question.replace(f"\\textbf{{{word}}}", f"*{word}*")

    return question
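
As a quick check of the regular expressions in `extract_question_answer`, the sketch below runs the same patterns on a small, made-up LaTeX snippet. The sample text and the `[1cm]` optional argument are assumptions for illustration only; the real `exams/together2.tex` may be laid out differently.

```python
import re

def extract_question_answer(tex_content):
    # Same patterns as in the cell above, repeated so this sketch runs on its own
    enum_match = re.search(r'\\begin{enumerate}(.*?)\\end{enumerate}', tex_content, re.DOTALL)
    if not enum_match:
        return [], []
    enum_content = enum_match.group(1)
    question_blocks = re.findall(
        r'\\item(.*?)(?=\\begin{solutionorbox})', enum_content, re.DOTALL
    )
    answer_blocks = re.findall(
        r'\\begin{solutionorbox}\[[^\]]*\]\s*(.*?)\\end{solutionorbox}',
        enum_content, re.DOTALL
    )
    return [q.strip() for q in question_blocks], [a.strip() for a in answer_blocks]

# Hypothetical exam fragment (not taken from the real exam file)
sample_tex = r"""
\begin{enumerate}
\item A real symmetric matrix \textbf{always} has real eigenvalues.
\begin{solutionorbox}[1cm]
True.
\end{solutionorbox}
\item \textbf{Every} square matrix is invertible.
\begin{solutionorbox}[1cm]
False.
\end{solutionorbox}
\end{enumerate}
"""

questions, answers = extract_question_answer(sample_tex)
for q, a in zip(questions, answers):
    print(q, "->", a)
```

Note that the answer pattern requires an optional argument in square brackets after `\begin{solutionorbox}`; a solution box without one would not be matched, which is one way the question and answer counts could diverge.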

In [2]:
from dotenv import load_dotenv
import os
from openai import OpenAI
from tqdm import tqdm
from collections import Counter

# Load environment variables from a .env file
load_dotenv()
OPENAI_API = os.getenv('OPENAI_API_KEY')
llm = OpenAI(api_key=OPENAI_API)
openai_model = "o4-mini"

# Use a context manager so the file handle is closed after loading
with open('exams/together2_llm_state.pkl', 'rb') as f:
    o4_llm_state = pickle.load(f)
exam_questions_TF, exam_answers_TF = read_questions_answers()

In [3]:
conversation_summary = """
You are an AI assistant that summarizes conversations between a Student and a Tutor AI model.
Your task is to analyze the conversation and provide a summary of the student's main difficulties.
You are not responsible for providing answers to the questions, but rather to identify the key areas where the student struggled or needed help.
You will be given a conversation in the form of a list of messages, where each message is
a dictionary with 'role' (either 'user' or 'assistant') and 'content' (the text of the message).
Your summary should be concise and focus on the student's challenges, misconceptions, or areas of confusion.
***The conversation:***
"""

In [4]:
question_summary = """
You are an AI assistant that merges the difficulties that students had during a conversation with a Tutor AI model.
You will be given a string with all the difficulties for the same (fixed) question.
Your task is to summarize/merge the different difficulties that students had during the conversation and provide a summary of the student's main difficulties.
Your summary should be concise and focus on the student's challenges, misconceptions, or areas of confusion.
***All difficulties:***
"""

In [5]:
overall_summary = """
You are an AI assistant that merges the difficulties that students had during a conversation with a Tutor AI model.
You will be given a string with all the difficulties for different questions.
Your task is to summarize/merge the different difficulties that students had during the conversation and provide a summary of the student's main difficulties.
Your summary should be concise and focus on the student's challenges, misconceptions, or areas of confusion.
***All difficulties over different questions:***
"""

In [6]:
def add_role_prefix_to_conversation(conversation):
    """
    Returns a new conversation list where each message's content is prefixed
    with 'Student:' or 'Tutor:' depending on the role.

    Args:
        conversation (list): List of dicts with 'role' and 'content'.

    Returns:
        list: New conversation list with prefixed content.
    """
    role_prefix = {'user': 'Student:', 'assistant': 'Tutor:'}
    new_conversation = []
    for msg in conversation:
        prefix = role_prefix.get(msg['role'], '')
        new_content = f"{prefix} {msg['content'].strip()}"
        new_conversation.append({'role': msg['role'], 'content': new_content})
    return new_conversation
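
A minimal usage sketch of the role-prefixing step, with the function repeated in compact form so the block runs on its own. The two sample messages are invented for illustration and are not taken from the user-testing data.

```python
def add_role_prefix_to_conversation(conversation):
    # Compact equivalent of the function defined in the cell above
    role_prefix = {'user': 'Student:', 'assistant': 'Tutor:'}
    return [
        {
            'role': msg['role'],
            'content': f"{role_prefix.get(msg['role'], '')} {msg['content'].strip()}",
        }
        for msg in conversation
    ]

# Hypothetical conversation
sample = [
    {'role': 'user', 'content': '  Is every orthogonal matrix symmetric? '},
    {'role': 'assistant', 'content': 'What does orthogonal mean for a matrix?'},
]

prefixed = add_role_prefix_to_conversation(sample)
for msg in prefixed:
    print(msg['content'])
```

The original list is left unchanged, and surrounding whitespace in each message is stripped. A role other than `user` or `assistant` gets an empty prefix, so its content would start with a single space.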

In [7]:
def query_llm(prompt: str, messages: list | str, openai_model="o4-mini"):
    """
    Query the LLM with the given prompt and messages.

    Args:
        prompt (str): The system prompt describing the task.
        messages (list | str): Either a list of message dicts with 'role' and
            'content', or a single string that is sent as one user message.
        openai_model (str): The OpenAI model to use for the query.

    Returns:
        str | None: The response from the LLM, or None if the request failed.
    """
    new_messages = []
    if isinstance(messages, list):
        new_messages.append({"role": "system", "content": prompt})

        for msg in messages:
            new_messages.append({
                "role": msg["role"],
                "content": msg["content"]
            })
    elif isinstance(messages, str):
        new_messages.append({"role": "system", "content": prompt})
        new_messages.append({"role": "user", "content": messages})
    new_messages.append({"role": "system", "content": "Request: perform your task."})
    try:
        response = llm.chat.completions.create(
            model=openai_model,
            messages=new_messages,
        )
        return response.choices[0].message.content
    except Exception as e:
        print(f"Error querying LLM: {e}")
        return None
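
To inspect the payload that `query_llm` sends without making an API call, the message-building logic can be pulled out into a small function. `build_messages` below is a hypothetical helper that is not part of the notebook; it is only a sketch that mirrors the branching above.

```python
def build_messages(prompt, messages):
    """Hypothetical helper mirroring how query_llm assembles its payload."""
    new_messages = []
    if isinstance(messages, list):
        # A conversation: system prompt first, then every message in order
        new_messages.append({"role": "system", "content": prompt})
        new_messages.extend(
            {"role": m["role"], "content": m["content"]} for m in messages
        )
    elif isinstance(messages, str):
        # A plain string: sent as a single user message after the system prompt
        new_messages.append({"role": "system", "content": prompt})
        new_messages.append({"role": "user", "content": messages})
    # The trailing instruction is appended in every case
    new_messages.append({"role": "system", "content": "Request: perform your task."})
    return new_messages

payload = build_messages(
    "Summarize the difficulties.",
    "Student: I mixed up eigenvalues and eigenvectors.",
)
roles = [m["role"] for m in payload]
print(roles)
```

If `messages` is neither a list nor a string, only the trailing instruction is sent and the task prompt is missing, so callers may want to validate the input type first.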

In [8]:
def get_report_ideas(openai_model="o4-mini"):
    files_questions = {}
    for filename in list_pickle_files(DATA_DIR):
        filepath = os.path.join(DATA_DIR, filename)
        conversation_data = load_conversation(filepath)

        # Get the selected question
        selected_question = int(conversation_data["selected_question"].strip("Q")) - 1

        # Store it in the dictionary
        if selected_question in files_questions:
            files_questions[selected_question].append(filepath)
        else:
            files_questions[selected_question] = [filepath]
            

    # Now, we can process each question with its associated files
    recommendations_per_file = {}
    for selected_question, files in tqdm(files_questions.items()):
        for file in tqdm(files):
            conversation_data = load_conversation(file)
            conversation = conversation_data["messages"]
            student_tutor_conversations = add_role_prefix_to_conversation(conversation)

            # Use LLM to get recommendations for each conversation
            recommendation = query_llm(conversation_summary, student_tutor_conversations, openai_model=openai_model)

            # Store it in the dictionary
            if selected_question in recommendations_per_file:
                recommendations_per_file[selected_question].append(recommendation)
            else:
                recommendations_per_file[selected_question] = [recommendation]

    # Merge the per-conversation summaries into a single summary per question
    recommendations_per_question = {}
    for selected_question in tqdm(recommendations_per_file.keys()):
        recommendations = recommendations_per_file[selected_question]
        # query_llm returns None on failure, so skip missing summaries before joining
        merged_recommendation = "\n\n".join(r for r in recommendations if r is not None)

        # Use LLM to get merged recommendation for each question.
        merged_recommendation = query_llm(question_summary, merged_recommendation, openai_model=openai_model)

        recommendations_per_question[selected_question] = merged_recommendation
    
    # Get overall recommendations (skip questions whose merge step returned None)
    merged_overall_recommendations = " ".join(
        r for r in recommendations_per_question.values() if r is not None
    )
    # Use LLM to get overall recommendations
    overall_recommendations = query_llm(overall_summary, merged_overall_recommendations, openai_model=openai_model)

    return recommendations_per_file, recommendations_per_question, overall_recommendations
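
The three-level structure of `get_report_ideas` (one summary per conversation, one merged summary per question, one overall summary) can be sketched without any API calls by passing in a stand-in summarizer. `summarize_hierarchically` and `fake_summarize` are hypothetical names used only for this illustration; in the notebook the summarizing role is played by `query_llm` with the three prompts defined earlier.

```python
from collections import defaultdict

def summarize_hierarchically(conversations, summarize):
    """Sketch of the conversation -> question -> overall pipeline.

    conversations: iterable of (question_index, conversation_text) pairs.
    summarize: callable(level, text) -> str, standing in for query_llm.
    """
    # Level 1: one summary per conversation, grouped by question
    per_file = defaultdict(list)
    for question, text in conversations:
        per_file[question].append(summarize("conversation", text))

    # Level 2: merge all summaries that belong to the same question
    per_question = {
        question: summarize("question", "\n\n".join(summaries))
        for question, summaries in per_file.items()
    }

    # Level 3: merge the per-question summaries into one overall summary
    overall = summarize("overall", " ".join(per_question.values()))
    return dict(per_file), per_question, overall

def fake_summarize(level, text):
    # Stand-in for the LLM: wraps the input so the nesting stays visible
    return f"{level}({text})"

per_file, per_question, overall = summarize_hierarchically(
    [(0, "abc"), (0, "de"), (1, "fghi")], fake_summarize
)
print(overall)
```

With the stand-in summarizer, the printed string shows which inputs reached each level, which makes it easier to check the grouping before spending API calls.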

In [9]:
#recommendations_per_file, recommendations_per_question, overall_recommendations = get_report_ideas()

In [10]:
files_questions = {}
for filename in list_pickle_files(DATA_DIR):
    filepath = os.path.join(DATA_DIR, filename)
    conversation_data = load_conversation(filepath)

    # Get the selected question
    selected_question = int(conversation_data["selected_question"].strip("Q")) - 1

    # Store it in the dictionary
    if selected_question in files_questions:
        files_questions[selected_question].append(filepath)
    else:
        files_questions[selected_question] = [filepath]
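
The grouping step above can also be written with `collections.defaultdict`, which removes the explicit membership check. The sketch below assumes, as the `.strip("Q")` call suggests, that `selected_question` holds labels such as `"Q3"`; the file names and records are made up for illustration.

```python
from collections import defaultdict

def group_by_question(records):
    """Group file paths by zero-based question index.

    records: iterable of (filepath, conversation_data) pairs, where
    conversation_data["selected_question"] is a label like "Q3" (assumed format).
    """
    grouped = defaultdict(list)
    for filepath, data in records:
        # "Q3" -> 3 -> zero-based index 2
        index = int(data["selected_question"].strip("Q")) - 1
        grouped[index].append(filepath)
    return dict(grouped)

# Hypothetical records standing in for the loaded pickle files
records = [
    ("a.pkl", {"selected_question": "Q1"}),
    ("b.pkl", {"selected_question": "Q3"}),
    ("c.pkl", {"selected_question": "Q1"}),
]
files_by_question = group_by_question(records)
print(files_by_question)
```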

In [11]:
# Now, we can process each question with its associated files
recommendations_per_file = {}
for selected_question, files in tqdm(files_questions.items()):
    for file in tqdm(files):
        conversation_data = load_conversation(file)
        conversation = conversation_data["messages"]
        student_tutor_conversations = add_role_prefix_to_conversation(conversation)

        # Use LLM to get recommendations for each conversation
        recommendation = query_llm(conversation_summary, student_tutor_conversations, openai_model=openai_model)

        # Store it in the dictionary
        if selected_question in recommendations_per_file:
            recommendations_per_file[selected_question].append(recommendation)
        else:
            recommendations_per_file[selected_question] = [recommendation]

100%|██████████| 4/4 [00:20<00:00,  5.13s/it]
100%|██████████| 6/6 [00:29<00:00,  4.90s/it]
100%|██████████| 3/3 [00:12<00:00,  4.18s/it]
100%|██████████| 6/6 [00:29<00:00,  4.84s/it]
100%|██████████| 10/10 [00:43<00:00,  4.32s/it]
100%|██████████| 5/5 [00:20<00:00,  4.15s/it]
100%|██████████| 1/1 [00:04<00:00,  4.86s/it]
100%|██████████| 7/7 [02:40<00:00, 22.91s/it]


In [12]:
# Merge the per-conversation summaries into a single summary per question
recommendations_per_question = {}
for selected_question in tqdm(recommendations_per_file.keys()):
    recommendations = recommendations_per_file[selected_question]
    # query_llm returns None on failure, so skip missing summaries before joining
    merged_recommendation = "\n\n".join(r for r in recommendations if r is not None)

    # Use LLM to get merged recommendation for each question.
    merged_recommendation = query_llm(question_summary, merged_recommendation, openai_model=openai_model)

    recommendations_per_question[selected_question] = merged_recommendation

100%|██████████| 7/7 [00:39<00:00,  5.58s/it]


In [13]:
# Get overall recommendations (skip questions whose merge step returned None)
merged_overall_recommendations = " ".join(
    r for r in recommendations_per_question.values() if r is not None
)
# Use LLM to get overall recommendations
overall_recommendations = query_llm(overall_summary, merged_overall_recommendations, openai_model=openai_model)

In [14]:
with open('recommendations_per_file.pkl', 'wb') as f:
    pickle.dump(recommendations_per_file, f)

with open('recommendations_per_question.pkl', 'wb') as f:
    pickle.dump(recommendations_per_question, f)

with open('overall_recommendations.pkl', 'wb') as f:
    pickle.dump(overall_recommendations, f)

In [1]:
import pickle

with open('recommendations_per_file.pkl', 'rb') as f:
    recommendations_per_file = pickle.load(f)

with open('recommendations_per_question.pkl', 'rb') as f:
    recommendations_per_question = pickle.load(f)

with open('overall_recommendations.pkl', 'rb') as f:
    overall_recommendations = pickle.load(f)

In [4]:
print(overall_recommendations)

The student’s struggles across several topics boil down to three interrelated themes:

1. Unclear or mixed‐up definitions  
   • Eigenvalues vs. eigenvectors (λ I, A v = λ v, characteristic polynomial)  
   • Symmetric matrix (A = Aᵀ, aᵢⱼ = aⱼᵢ) vs. identity or commuting factors  
   • Orthogonal vectors vs. orthogonal matrix (MᵀM = I) vs. linear independence  
   • Rotation matrix entries (linking cos θ, sin θ to geometric rotation)  

2. Gaps in algebraic mechanics and notation  
   • Mixing scalars with matrices (when and why λ → λ I)  
   • Rearranging/factoring to get (A – λ I)v = 0 and using det(A – λ I)=0  
   • Applying the transpose‐of‐a‐product rule ((AB)ᵀ = BᵀAᵀ) and det(Aᵀ)=det(A)  
   • Carrying out dot products correctly (sum vs. vector) and checking all pairwise products  
   • Tracking dimensions and correct order in matrix multiplication  

3. Difficulty linking abstract statements to concrete checks  
   • Translating definitions into index or component form (e.g. aᵢⱼ