In [36]:
!pip install openai



In [50]:
from openai import OpenAI
import ast
import csv
import pandas as pd

In [None]:
client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

We start with the Remember agent. <br>
Its input is the raw text from the slides. <br>
Its output is one Remember-level question. <br>

In [94]:
system_prompt_remember = """
You are an agent that generates Remember-level questions.
Your input is raw text from lecture slides, but you must NOT refer to the text or say "in the slide" or "mentioned."

Your task:
Create one factual recall question based on the anatomical facts found in the text.

Rules:
- The question must have one clear, factual answer.
- Do NOT ask the learner to explain, compare, or analyze.
- The question must sound like a normal test question.
- Do NOT mention or reference “the slide,” “the text,” or “the passage.”
- Keep the question short and specific.
- Use only information that actually appears in the input text.
- If the text is sparse, pick any clear fact and ask a recall question about it.

Output:
One Remember-level question.
"""

user_prompt = """
Histology is the study of microscopic anatomy, used
to study cells, tissues, and organs, and is foundational
for physiology, biochemistry, and pathology. Cells are the
basic units of the human body, with differentiation leading
to over 200 different cell types. Basic tissues consist of
collections of structurally and functionally similar cells
and their products. The four basic tissue types are
connective tissue, epithelia, muscle, and neural tissue.
These tissues combine to form organs with distinct structure
and function.
"""

questions = []

for _ in range(6):
    response = client.chat.completions.create(
        model="gpt-4.1",  # or "gpt-5", etc.
        messages=[
            {"role": "system", "content": system_prompt_remember},
            {"role": "user", "content": user_prompt},
        ],
    )

    question = response.choices[0].message.content
    print(question)
    questions.append(question)

What are the four basic tissue types in the human body?



In [95]:
system_prompt_apply = """
You are an agent that generates Apply-level questions.
Your input is raw text from lecture slides.

Your task:
Create one scenario-based question that requires the learner to apply information
from the text to a new but relevant situation.

Rules:
- The question must require using knowledge, not just recalling or explaining it.
- Do NOT analyze, compare, or reference “the slide” or “the text.”
- Keep the question concrete, clear, and solvable from the information provided.
- If the text is sparse, create a simple scenario where the fact must be applied.
- The question must have one correct, logically deducible answer.

Output:
One Apply-level scenario question.
"""

user_prompt = """
Histology is the study of microscopic anatomy, used
to study cells, tissues, and organs, and is foundational
for physiology, biochemistry, and pathology. Cells are the
basic units of the human body, with differentiation leading
to over 200 different cell types. Basic tissues consist of
collections of structurally and functionally similar cells
and their products. The four basic tissue types are
connective tissue, epithelia, muscle, and neural tissue.
These tissues combine to form organs with distinct structure
and function.
"""

questions = []

for _ in range(6):
    response = client.chat.completions.create(
        model="gpt-4.1",  # or "gpt-5", etc.
        messages=[
            {"role": "system", "content": system_prompt_apply},
            {"role": "user", "content": user_prompt},
        ],
    )

    question = response.choices[0].message.content
    print(question)
    questions.append(question)

A patient has suffered an injury that damaged the tissue responsible for transmitting electrical impulses in their arm. Based on your knowledge of basic tissue types, which type of tissue has likely been affected, and why?
A researcher is examining a sample from a patient’s liver under a microscope to determine the cause of abnormal function. Which scientific discipline is the researcher using, and which organizational level of the body are they primarily examining?
A patient has a disease that specifically affects connective tissue. Based on your understanding of the four basic tissue types and their roles, which structures or functions in the patient's body are most likely to be directly impacted by this disease?
A patient suffers an injury that impairs the function of tissues responsible for transmitting electrical impulses throughout the body. Based on the four basic tissue types, which tissue is most likely affected by the injury?
A medical researcher is investigating a disease th

Next, we load a list of raw text extracted from the slides.

In [40]:
# read in data
with open("slide_info.txt", "r") as f:
    data_str = f.read()

data = ast.literal_eval(data_str)

# Extract raw_text fields
raw_text_list = [topic["raw_text"] for topic in data["topics"]]
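The loading code above assumes that `slide_info.txt` holds a Python-literal dictionary with a `topics` list, where each entry carries a `raw_text` field. The real file is not shown here, so the sample below is a hypothetical miniature (the `title` key and the sample strings are made up for illustration). It only shows the shape that `ast.literal_eval` is expected to parse.

```python
import ast

# Hypothetical stand-in for the contents of slide_info.txt.
sample_str = """
{
    "topics": [
        {"title": "Intro", "raw_text": "Histology is the study of microscopic anatomy."},
        {"title": "Tissues", "raw_text": "Cells are the basic units of the human body."},
    ]
}
"""

# literal_eval only evaluates Python literals (dicts, lists, strings, numbers),
# so it does not execute arbitrary code the way eval() would.
sample_data = ast.literal_eval(sample_str.strip())

# Same extraction as in the cell above, applied to the sample.
sample_raw_text = [topic["raw_text"] for topic in sample_data["topics"]]
print(len(sample_raw_text))
```

If the file were valid JSON instead, `json.load` would be the more conventional choice. `ast.literal_eval` is only needed when the file uses Python-only syntax such as single-quoted strings.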

In [41]:
def generate_questions(raw_text_list, system_prompt, n=100, filename="questions.csv"):
    examples = []

    with open(filename, "w", newline="") as f:
        writer = csv.writer(f)

        for i in range(n):
            slide_text = raw_text_list[i % len(raw_text_list)]    # cycle through the slides, one slide's text per question

            response = client.chat.completions.create(
                model="gpt-4.1",
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": slide_text}
                ]
            )

            question = response.choices[0].message.content.strip()

            writer.writerow([question])
            examples.append(question)

            print(f"{i+1}/{n}: {question}")

    print(f"\nSaved {n} questions to {filename}")
    return examples


In [43]:
remember_questions = generate_questions(raw_text_list, system_prompt_remember, n=100, filename="remember_questions.csv")

1/100: What are the four basic tissue types in the human body?
2/100: What component makes up most of the volume in connective tissue?
3/100: What type of connective tissue protein fibre provides tensile strength?
4/100: What is the typical size range of normal lymph nodes?
5/100: What are the three common components shared by all connective tissues?
6/100: What type of connective tissue has parallel collagen fibers that provide tensile strength along one axis, as found in tendons and ligaments?
7/100: What type of cell is found embedded in the extracellular matrix of cartilage?
8/100: What type of bone houses red bone marrow, the site of hematopoiesis?
9/100: What are the four basic types of tissue in the human body?
10/100: What is the main component that determines the properties of connective tissue?
11/100: What is the main function of collagen fibers in connective tissue?
12/100: What is the normal size range of lymph nodes?
13/100: From which embryonic tissue do connective tissu
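Because `generate_questions` cycles through the slides with `i % len(raw_text_list)`, each slide is prompted several times, so repeated questions are likely. The sketch below shows one simple way to drop repeats. It assumes that a match ignoring case, whitespace, and trailing punctuation is good enough. `dedupe_questions` is a hypothetical helper, not part of the notebook. It catches only exact repeats after normalization, so paraphrases such as items 1 and 9 above would need a fuzzier comparison.

```python
import re

def dedupe_questions(questions):
    """Keep the first occurrence of each question, comparing normalized text."""
    seen = set()
    unique = []
    for q in questions:
        # Normalize: lowercase, collapse whitespace, drop trailing punctuation.
        key = re.sub(r"\s+", " ", q.strip().lower()).rstrip("?.! ")
        if key not in seen:
            seen.add(key)
            unique.append(q)
    return unique

sample = [
    "What are the four basic tissue types in the human body?",
    "what are the four basic  tissue types in the human body? ",
    "What are the four basic types of tissue in the human body?",
]
unique_sample = dedupe_questions(sample)
print(len(unique_sample))
```

In the notebook this could be applied to the returned list, for example `dedupe_questions(remember_questions)`, before the questions are passed to the evaluator.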

In [44]:
system_prompt_understand = """
You are an agent that generates Understand-level questions.
Your input is raw text from lecture slides, which may be short or incomplete.

Your task:
Create one question that requires the learner to demonstrate comprehension
of meaning, concepts, or relationships based on the input text.

Rules:
- The question must require the learner to explain, summarize, interpret, or describe.
- Do NOT ask the learner to apply, solve, or analyze.
- The question must sound like a normal test question.
- Do NOT mention or reference “the slide,” “the text,” or “this passage.”
- Keep the question clear and specific.
- Use only information that appears in the input text.
- If the text is sparse, choose any concept and ask for an explanation or interpretation.

Output:
One Understand-level question.
"""

In [45]:
understand_questions = generate_questions(raw_text_list, system_prompt_understand, n=100, filename="understand_questions.csv")

1/100: Explain how the four basic tissue types contribute to the structure and function of organs in the human body.
2/100: Explain how the composition and proportions of the extracellular matrix (ECM) influence the physical properties of different types of connective tissue.
3/100: Describe how the different components of connective tissue contribute to its overall structure and function.
4/100: Explain how lymph nodes help protect the body from disease and infection.
5/100: Explain how the components of connective tissue contribute to the differences in characteristics and functions among various connective tissue types.
6/100: Summarize the main differences between loose (areolar) connective tissue and dense connective tissue in terms of structure and function.
7/100: Describe how the structure and composition of cartilage enable it to provide support to soft tissues while withstanding compressive forces.
8/100: Explain how the organic and inorganic components of bone matrix contrib

In [46]:
system_prompt_apply = """
You are an agent that generates Apply-level questions.
Your input is raw text from lecture slides.

Your task:
Create one scenario-based question that requires the learner to apply information
from the text to a new but relevant situation.

Rules:
- The question must require using knowledge, not just recalling or explaining it.
- Do NOT analyze, compare, or reference “the slide” or “the text.”
- Keep the question concrete, clear, and solvable from the information provided.
- If the text is sparse, create a simple scenario where the fact must be applied.
- The question must have one correct, logically deducible answer.

Output:
One Apply-level scenario question.
"""

In [72]:
apply_questions = generate_questions(raw_text_list, system_prompt_apply, n=100, filename="apply_questions.csv")

1/100: A patient suffers an injury that damages only the muscle tissue in their arm, but leaves the other tissue types intact. Based on your understanding of basic tissue types, what types of cell structure and function would be primarily affected in this injury, and which three other basic tissue types would remain largely unaffected?
2/100: A patient presents with a connective tissue disorder where the ground substance of the extracellular matrix does not properly bind tissue fluid, leading to unusually loose and watery tissue. Based on your knowledge of connective tissue structure, which property or function of the tissue is most likely to be compromised in this patient, and why?
3/100: A patient suffers a connective tissue injury that results in decreased synthesis of collagen fibres but normal production of elastic fibres and ground substance. Predict how the mechanical properties of their connective tissue would be affected, and describe a likely physical symptom they might exper

Now let's create an evaluator using our anatomy data, which consists of questions labeled as remember, understand, or apply.
First, we read in the anatomy class data.

In [51]:
df = pd.read_csv("questions_labeled.csv", encoding="latin-1")

questions = df["question"].tolist()
labels = df["label"].tolist()

Now we create an evaluator agent.

In [53]:
system_prompt_evaluator = """
You are an agent that classifies a question as either:
- remember
- understand
- apply

Remember: asks for simple factual recall.
Understand: asks to explain, describe, or interpret a concept.
Apply: uses a scenario where knowledge must be used to solve something.

Read the question and output only one label:
"remember", "understand", or "apply".
"""

In [67]:
def classify_questions(question_list):
    predicted_questions = []
    predicted_labels = []

    for q in question_list:
        response = client.chat.completions.create(
            model="gpt-4.1",
            messages=[
                {"role": "system", "content": system_prompt_evaluator},
                {"role": "user", "content": q}
            ]
        )

        label = response.choices[0].message.content.strip().lower()

        predicted_questions.append(q)
        predicted_labels.append(label)

        # print(f"Question: {q}")
        # print(f" → {label}\n")

    return predicted_questions, predicted_labels
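`classify_questions` stores whatever text the model returns, lowercased and stripped of surrounding whitespace. The evaluator prompt shows the labels in quotation marks, so a reply such as `"apply"` or `Apply.` would not match the bare label in a later comparison. Whether that happened in the runs here is not known. The sketch below is a defensive normalizer under that assumption; `normalize_label`, `VALID_LABELS`, and the `"unknown"` fallback are hypothetical names introduced for illustration.

```python
VALID_LABELS = ("remember", "understand", "apply")

def normalize_label(raw):
    """Map a raw model reply to one of the valid labels, or 'unknown'."""
    # Strip whitespace, lowercase, then drop surrounding quotes and punctuation.
    cleaned = raw.strip().lower().strip("\"'`.:! ")
    if cleaned in VALID_LABELS:
        return cleaned
    # Fall back to a label only if exactly one valid label appears in the reply.
    found = [label for label in VALID_LABELS if label in cleaned]
    return found[0] if len(found) == 1 else "unknown"

print(normalize_label('"Apply".'))
print(normalize_label("This is an understand question"))
print(normalize_label("remember or apply"))
```

Counting how many replies end up as `"unknown"` gives a quick check on whether formatting, rather than the classification itself, is costing accuracy.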

First, let's run the classifier on our anatomy data.

In [58]:
questions, gpt_labels = classify_questions(questions)

Question: In which plane could ONLY the right lung be seen in a thorax section?
 → apply

Question: A radiologist wants to isolate the right lung in a thoracic image. Which imaging plane should they choose?
 → apply

Question: Which imaging study produces detailed soft-tissue images without ionizing radiation?
 → remember

Question: Describe why we would use an MRI overt a CT scan?
 → understand

Question: A patient needs detailed soft-tissue imaging but should avoid radiation. Which imaging test should be chosen for this situation?
 → apply

Question: What term describes the position of the arms relative to the thorax?
 → remember

Question: Which cell junction permits molecular and ionic movement between adjacent cells?
 → remember

Question: Describe how certain cell junctions allow communication or passage of molecules between neighboring cells.
 → understand

Question: What epithelium appears stratified but has every cell touching the basement membrane?
 → remember

Question: Expl

In [63]:
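def compute_accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels (case-insensitive)."""
    total = len(true_labels)
    if total == 0:
        raise ValueError("true_labels is empty; cannot compute accuracy")

    correct = 0
    for true_label, pred_label in zip(true_labels, predicted_labels):
        if true_label.strip().lower() == pred_label.strip().lower():
            correct += 1

    accuracy = correct / total
    print(f"Accuracy: {accuracy:.2%}")

    return accuracy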

Now we compute the accuracy of the GPT evaluator's labels against our hand-made labels for the anatomy test questions.

In [64]:
accuracy = compute_accuracy(labels, gpt_labels)

Accuracy: 79.13%


0.7913043478260869
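A single overall accuracy does not show which levels the evaluator confuses with each other. The sketch below breaks the result down per true label and counts each (true, predicted) pair. `per_label_report` is a hypothetical helper, and the sample labels are made up for illustration; they are not the notebook's data.

```python
from collections import Counter

def per_label_report(true_labels, predicted_labels):
    """Return (accuracy per true label, counts of (true, predicted) pairs)."""
    pairs = Counter()
    for t, p in zip(true_labels, predicted_labels):
        pairs[(t.strip().lower(), p.strip().lower())] += 1

    # Total number of questions for each true label.
    totals = Counter()
    for (t, _), count in pairs.items():
        totals[t] += count

    # Fraction of each true label that was predicted correctly.
    accuracy = {t: pairs[(t, t)] / totals[t] for t in totals}
    return accuracy, pairs

sample_true = ["remember", "remember", "apply", "apply", "understand"]
sample_pred = ["remember", "remember", "remember", "apply", "understand"]
sample_acc, sample_pairs = per_label_report(sample_true, sample_pred)
print(sample_acc)
print(sample_pairs[("apply", "remember")])
```

In the notebook, the call would be `per_label_report(labels, gpt_labels)`, and the pair counts would show, for instance, how many apply questions were labeled remember.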

Now we compute the accuracy for each of the remember, understand, and apply agents.

In [65]:
_, gpt_labels = classify_questions(remember_questions)
true_labels = ["remember"] * len(remember_questions)
accuracy = compute_accuracy(true_labels, gpt_labels)

Question: What are the four basic tissue types in the human body?
 → remember

Question: What component makes up most of the volume in connective tissue?
 → remember

Question: What type of connective tissue protein fibre provides tensile strength?
 → remember

Question: What is the typical size range of normal lymph nodes?
 → remember

Question: What are the three common components shared by all connective tissues?
 → remember

Question: What type of connective tissue has parallel collagen fibers that provide tensile strength along one axis, as found in tendons and ligaments?
 → remember

Question: What type of cell is found embedded in the extracellular matrix of cartilage?
 → remember

Question: What type of bone houses red bone marrow, the site of hematopoiesis?
 → remember

Question: What are the four basic types of tissue in the human body?
 → remember

Question: What is the main component that determines the properties of connective tissue?
 → remember

Question: What is the mai

In [68]:
_, gpt_labels = classify_questions(understand_questions)
true_labels = ["understand"] * len(understand_questions)
accuracy = compute_accuracy(true_labels, gpt_labels)

Accuracy: 100.00%


In [73]:
_, gpt_labels = classify_questions(apply_questions)
true_labels = ["apply"] * len(apply_questions)
accuracy = compute_accuracy(true_labels, gpt_labels)

Accuracy: 40.00%


Now let's try one central agent that produces remember, understand, and apply questions.

In [79]:
system_prompt_three_level = """
You are an agent that generates questions at a specified Bloom’s Taxonomy level.
You will receive two inputs:
1. Raw text extracted from lecture slides.
2. A level request: "remember", "understand", or "apply".

Your task:
Create ONE question at the requested level based only on the information in the slide text.

Rules by level:

remember:
- Ask for simple factual recall.
- One clear factual answer.
- No explanation, interpretation, or scenario.

understand:
- Ask the learner to explain, describe, summarize, or interpret a concept.
- No application or problem-solving.
- Question should test comprehension of meaning or relationships.

apply:
- Ask the learner to use knowledge in a new but relevant situation.
- Create a simple scenario where the information must be applied.
- One logically deducible correct answer.
- No analysis or multi-step inference.

General rules:
- Do NOT reference “the slide” or “the text.”
- Use only information found in the input text.
- Keep the question clear, concise, and natural.
- If the slide text is sparse, choose any fact present and build the question around it.

Output:
Only the question.
"""

In [80]:
def generate_all_levels(raw_text_list, iterations=30, filename="three_level_agent_questions_labels.csv"):
    levels = ["remember", "understand", "apply"]
    all_questions = []

    with open(filename, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["question", "label"])  # header

        for level in levels:

            for i in range(iterations):
                slide_text = raw_text_list[i % len(raw_text_list)]

                user_prompt = f"""
                  Slide text:
                  {slide_text}
                  Requested level: {level}
                  """

                response = client.chat.completions.create(
                    model="gpt-4.1",
                    messages=[
                        {"role": "system", "content": system_prompt_three_level},
                        {"role": "user", "content": user_prompt}
                    ]
                )

                question = response.choices[0].message.content.strip()

                writer.writerow([question, level])  # save immediately

                all_questions.append((question, level))

                print(f"[{level}] {i+1}/{iterations}: {question}")

    return all_questions
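The f-string that builds `user_prompt` in `generate_all_levels` is indented to match the surrounding code, so those leading spaces become part of the message sent to the model. Whether this affects the generated questions is not tested here. The sketch below shows one way to build the same message without the stray indentation and to reject an unexpected level early. `build_user_prompt` is a hypothetical helper, not part of the notebook.

```python
def build_user_prompt(slide_text, level):
    """Assemble the user message for the three-level agent without stray indentation."""
    allowed = ("remember", "understand", "apply")
    if level not in allowed:
        raise ValueError(f"level must be one of {allowed}, got {level!r}")
    # Join the parts line by line so no code indentation leaks into the prompt.
    return "\n".join([
        "Slide text:",
        slide_text.strip(),
        f"Requested level: {level}",
    ])

example_prompt = build_user_prompt("Cells are the basic units of the human body.", "apply")
print(example_prompt)
```

Inside the loop, `user_prompt = build_user_prompt(slide_text, level)` would replace the triple-quoted f-string.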

In [82]:
all_questions = generate_all_levels(raw_text_list, iterations=33, filename="three_level_agent_questions_labels.csv")

[remember] 1/33: What are the four basic tissue types in the human body?
[remember] 2/33: What component determines the properties of connective tissue?
[remember] 3/33: What type of protein fibre in connective tissue provides tensile strength?
[remember] 4/33: What is the typical size range of normal lymph nodes?
[remember] 5/33: What are the three common components shared by all connective tissues?
[remember] 6/33: What type of connective tissue is composed mostly of adipocytes and is used for energy storage?
[remember] 7/33: What type of cell is found embedded within cartilage?
[remember] 8/33: What is the primary function of red bone marrow housed in the spaces of spongy bone?
[remember] 9/33: What are the four basic tissue types in the human body?
[remember] 10/33: What is the main component that determines the properties of connective tissue?
[remember] 11/33: What is the main function of collagen fibres in connective tissue?
[remember] 12/33: What is the normal size range of lym

In [83]:
remember_questions  = [q for q, lbl in all_questions if lbl == "remember"]
understand_questions = [q for q, lbl in all_questions if lbl == "understand"]
apply_questions     = [q for q, lbl in all_questions if lbl == "apply"]

In [84]:
_, gpt_labels = classify_questions(remember_questions)
true_labels = ["remember"] * len(remember_questions)

remember_accuracy = compute_accuracy(true_labels, gpt_labels)
print(f"Remember Accuracy: {remember_accuracy:.2%}")

Accuracy: 100.00%
Remember Accuracy: 100.00%


In [85]:
_, gpt_labels = classify_questions(understand_questions)
true_labels = ["understand"] * len(understand_questions)

understand_accuracy = compute_accuracy(true_labels, gpt_labels)
print(f"Understand Accuracy: {understand_accuracy:.2%}")

Accuracy: 100.00%
Understand Accuracy: 100.00%


In [86]:
_, gpt_labels = classify_questions(apply_questions)
true_labels = ["apply"] * len(apply_questions)

apply_accuracy = compute_accuracy(true_labels, gpt_labels)
print(f"Apply Accuracy: {apply_accuracy:.2%}")

Accuracy: 51.52%
Apply Accuracy: 51.52%
