# Experiments

## Overview
This notebook contains preliminary experiments comparing different AI agent configurations. Each scenario will use identical study material input.

## Scenarios
There will be 4 scenarios:
1. **Single agent 0-shot**: One agent with no examples provided
2. **Single agent 1-shot**: One agent with one example provided
3. **Multi-agent 0-shot**: Two agents (question generator and evaluator) with no examples, using manual agent orchestration
4. **Multi-agent 1-shot**: Two agents with one example, using manual agent orchestration

## Methodology
- Each scenario will have same study material input
- Each scenario will be run once as this is a preliminary study
- The multi-agent scenarios will utilize the crewAI framework
- Results will be compared qualitatively rather than statistically


In [13]:
%pip install google-genai


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.2[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


## Base Setup

In [14]:
import os
from google import genai
from google.genai import types
import json

In [15]:
base_model = "gemini-2.0-flash"
api_key = os.environ.get("GEMINI_API_KEY")
client = genai.Client(
  api_key=api_key,
)

# setup study material files
files = [
  client.files.upload(file='section-loop.pdf')
]

## Scenario 1 & 2 Setup

In [16]:
question_generator_prompt = ""
with open('question-generator-prompt.txt', 'r') as file:
    question_generator_prompt = file.read()

def setup_question_generator_config(system_prompt):
    return types.GenerateContentConfig(
        response_mime_type="application/json",
        system_instruction=[
            types.Part.from_text(text=system_prompt),
        ],
    )

question_generator_config_0_shot = setup_question_generator_config(question_generator_prompt)

contents = [
    types.Content(
        role="user",
        parts=[
            types.Part.from_uri(
                file_uri=files[0].uri,
                mime_type=files[0].mime_type,
            ),
            types.Part.from_text(text="Generate exactly 5 MCQs based on the following context."),
        ]
    )
]

## Scenario 1

In [17]:
result = client.models.generate_content(
  model=base_model,
  contents=contents,
  config=question_generator_config_0_shot,
)

In [18]:
print(f">> result: {result.text}")

>> result: [
  {
    "question": "What is the primary function of a 'while' loop in Python?",
    "options": {
      "A": "To iterate through a sequence of elements a fixed number of times.",
      "B": "To execute a block of code repeatedly as long as a specified condition is true.",
      "C": "To define a function that can be called multiple times.",
      "D": "To handle exceptions that may occur during program execution."
    },
    "correct_option": "B"
  },
  {
    "question": "Which statement is used to immediately terminate a loop's execution and proceed to the next statement after the loop?",
    "options": {
      "A": "continue",
      "B": "exit",
      "C": "break",
      "D": "pass"
    },
    "correct_option": "C"
  },
  {
    "question": "What is a 'nested loop'?",
    "options": {
      "A": "A loop that contains only one type of statement.",
      "B": "A loop that is defined inside another loop.",
      "C": "A loop that only iterates through numbers.",
      "D": "

## Scenario 2

In [19]:
scenario_2_prompt = ""
with open('scenario-2-prompts.txt', 'r') as file:
    scenario_2_prompt = file.read()

question_generator_config_1_shot = setup_question_generator_config(scenario_2_prompt)

In [20]:
result_1_shot = client.models.generate_content(
  model=base_model,
  contents=contents,
  config=question_generator_config_1_shot,
)

In [21]:
print(f">> result_1_shot: {result_1_shot.text}")

>> result_1_shot: [
  {
    "question_text": "Which of the following loop types are discussed in the provided text?",
    "options": {
      "A": "do-while loop",
      "B": "for loop and while loop",
      "C": "repeat-until loop",
      "D": "if loop"
    },
    "correct_option": "B"
  },
  {
    "question_text": "What is the primary function of a 'break' statement within a loop?",
    "options": {
      "A": "To skip the current iteration and proceed to the next.",
      "B": "To terminate the loop's execution prematurely.",
      "C": "To execute the loop body at least once.",
      "D": "To define a function within the loop."
    },
    "correct_option": "B"
  },
  {
    "question_text": "In the context of loops, what is a 'container' in Python?",
    "options": {
      "A": "A keyword used to define a loop.",
      "B": "A variable that stores the loop's condition.",
      "C": "A data structure like a list, string, or range of numbers that can be iterated over.",
      "D": "A f

## Scenario 3 & 4 Setup

In [22]:
content_with_initial_mcqs_1 = contents
content_with_initial_mcqs_2 = contents

content_with_initial_mcqs_1.append(
  types.Content(
    role="model",
    parts=[
      types.Part.from_text(text=f"Generated MCQs: {result.text}"),
    ]
  )
)

content_with_initial_mcqs_2.append(
  types.Content(
    role="model",
    parts=[
      types.Part.from_text(text=f"Generated MCQs: {result_1_shot.text}"),
    ]
  )
)

## Scenario 3

In [39]:
evaluator_0_shot_prompt = ""
with open('evaluator-0-shot-prompt.txt', 'r') as file: 
    evaluator_0_shot_prompt = file.read()

evaluator_agent_config_1 = types.GenerateContentConfig(
    response_mime_type="application/json",
    system_instruction=[
        types.Part.from_text(text=evaluator_0_shot_prompt),
    ],
)

In [40]:
evaluator_result_1 = client.models.generate_content(
  model=base_model,
  contents=content_with_initial_mcqs_1,
  config=evaluator_agent_config_1,
)

In [41]:
print(f">> evaluator_result_1: {evaluator_result_1.text}")

>> evaluator_result_1: [
  {
    "question": "In a `while` loop, what happens if the loop condition is always true?",
    "options": {
      "A": "The loop will execute once and then stop.",
      "B": "The loop will not execute at all.",
      "C": "The loop will execute indefinitely, creating an infinite loop.",
      "D": "The program will crash with a syntax error."
    },
    "correct_option": "C"
  },
  {
    "question": "Which of the following control flow statements is primarily used to skip the rest of the current iteration of a loop?",
    "options": {
      "A": "break",
      "B": "pass",
      "C": "continue",
      "D": "return"
    },
    "correct_option": "C"
  },
  {
    "question": "Consider the scenario where a `for` loop iterates through a list of lists. What type of loop structure is this?",
    "options": {
      "A": "Recursive loop",
      "B": "Nested loop",
      "C": "Parallel loop",
      "D": "Sequential loop"
    },
    "correct_option": "B"
  },
  {
    "

In [42]:
content_with_initial_mcqs_1.append(
  types.Content(
    role="model",
    parts=[
      types.Part.from_text(text=f"Feedback from evaluator: {evaluator_result_1.text}"),
      types.Part.from_text(text=f"Regenerate MCQs based on the feedback."),
    ]
  )
)

In [43]:
# Send feedback to question generator
result_with_feedback_1 = client.models.generate_content(
  model=base_model,
  contents=content_with_initial_mcqs_1,
  config=question_generator_config_0_shot,
)

In [44]:
print(f">> result_with_feedback_1: {result_with_feedback_1.text}")

>> result_with_feedback_1: [
  {
    "question": "Which loop structure in Python is best suited for repeating a block of code as long as a specific condition remains true?",
    "options": {
      "A": "for loop",
      "B": "while loop",
      "C": "if-else statement",
      "D": "try-except block"
    },
    "correct_option": "B"
  },
  {
    "question": "What statement allows you to exit a loop prematurely, bypassing any remaining code within the loop's body?",
    "options": {
      "A": "pass",
      "B": "continue",
      "C": "break",
      "D": "return"
    },
    "correct_option": "C"
  },
  {
    "question": "What programming construct involves placing one loop structure (e.g., 'for' or 'while') inside another?",
    "options": {
      "A": "Loop recursion",
      "B": "Loop abstraction",
      "C": "Nested loops",
      "D": "Parallel processing"
    },
    "correct_option": "C"
  },
  {
    "question": "What is the key difference in functionality between the 'break' and 'co

## Scenario 4

In [45]:
evaluator_1_shot_prompt = ""
with open('evaluator-1-shot-prompt.txt', 'r') as file: 
    evaluator_1_shot_prompt = file.read()

evaluator_agent_config_2 = types.GenerateContentConfig(
    response_mime_type="application/json",
    system_instruction=[
        types.Part.from_text(text=evaluator_1_shot_prompt),
    ],
)

In [46]:
evaluator_result_2 = client.models.generate_content(
  model=base_model,
  contents=content_with_initial_mcqs_2,
  config=evaluator_agent_config_2,
)

In [47]:
print(f">> evaluator_result_2: {evaluator_result_2.text}")

>> evaluator_result_2: {
  "question_feedback": [
    {
      "question_evaluated": "In a `while` loop, what happens if the loop condition is always true?",
      "evaluation": {
        "relevance_score": 5,
        "clarity_score": 5,
        "distractor_plausibility_score": 4,
        "brief_comment": "Tests understanding of infinite loops. Options are clear and plausible."
      }
    },
    {
      "question_evaluated": "Which of the following control flow statements is primarily used to skip the rest of the current iteration of a loop?",
      "evaluation": {
        "relevance_score": 5,
        "clarity_score": 5,
        "distractor_plausibility_score": 5,
        "brief_comment": "Clearly asks about 'continue'. All distractors are valid control flow statements, increasing plausibility."
      }
    },
    {
      "question_evaluated": "Consider the scenario where a `for` loop iterates through a list of lists. What type of loop structure is this?",
      "evaluation": {
      

In [48]:
content_with_initial_mcqs_2.append(
  types.Content(
    role="model",
    parts=[
      types.Part.from_text(text=f"Feedback from evaluator: {evaluator_result_2.text}"),
      types.Part.from_text(text=f"Regenerate MCQs based on the feedback."),
    ]
  )
)

In [49]:
result_after_feedback_2 = client.models.generate_content(
  model=base_model,
  contents=content_with_initial_mcqs_2,
  config=question_generator_config_1_shot,
)

In [50]:
print(f">> result_after_feedback_2: {result_after_feedback_2.text}")

>> result_after_feedback_2: [
  {
    "question_text": "What is the term for a loop that continues to execute indefinitely because its termination condition is never met?",
    "options": {
      "A": "Finite loop",
      "B": "Recursive loop",
      "C": "Infinite loop",
      "D": "Conditional loop"
    },
    "correct_option": "C"
  },
  {
    "question_text": "Which statement, when used inside a loop, will skip the rest of the current iteration and jump to the next iteration?",
    "options": {
      "A": "pass",
      "B": "skip",
      "C": "break",
      "D": "continue"
    },
    "correct_option": "D"
  },
  {
    "question_text": "What programming structure involves placing one loop inside another?",
    "options": {
      "A": "Loop fusion",
      "B": "Loop unrolling",
      "C": "Nested loop",
      "D": "Parallel loop"
    },
    "correct_option": "C"
  },
  {
    "question_text": "What is the fundamental difference in functionality between the `break` and `continue` state