# SAT Words in Context question generation using Strands Agentic AI framework and OpenAI model

## Overview

My first [attempt](https://github.com/PragyanR/GenAI_for_SAT_Prep) at creating Words in Context questions, using Llama 3B on an A100, was fairly involved from a coding perspective. With the advent of agentic AI, I was able to generate such questions with under 50 lines of code and a prompt template.

The Strands agentic AI framework made it very simple to try out this use case.

Check out my site if you want to try out Words in Context questions that I generated: [Acesat.ai](https://www.acesat.ai/).


## Setup and prerequisites

### Prerequisites
* Python 3.10+
* Access to `gpt-4.1-mini` (an OpenAI API key)

Install the required packages for Strands Agents

In [9]:
# installing pre-requisites
!pip install -r requirements.txt


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.3.1[0m[39;49m -> [0m[32;49m25.3[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


### Importing dependency packages

Import the dependency packages

In [10]:
import os
from strands import Agent, tool
from strands.models.litellm import LiteLLMModel
import json

### Setting up OpenAI keys

Set the OpenAI API key

In [11]:
os.environ["OPENAI_API_KEY"] = "<Key goes here>"
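Hardcoding the key in a notebook makes it easy to commit by accident. A minimal sketch of an alternative, assuming the key has already been exported in the shell; `require_api_key` is a hypothetical helper, not part of Strands or LiteLLM:

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the key stored under `name`, or raise if it is missing."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    # Treat an empty value or a "<...>" placeholder as "not configured".
    if not key or key.startswith("<"):
        raise RuntimeError(f"{name} is not set; export it before running the notebook")
    return key
```

Calling `require_api_key()` before building the model fails early with a clear message instead of at the first agent call.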


### Setting up custom tools

I created four tools for the agent to call:

first, check_genre_exists, a tool for the agent to check whether a genre was already used for an earlier question.

second, store_paragraph, a tool for the agent to call for storing the generated paragraph.

third, store_answer_choice, a tool for the agent to call for storing each answer choice and its explanation.

fourth, set_correct_answer, a tool for the agent to call for recording which answer choice is correct.

In [12]:
# Dictionary for storing SAT inference questions (keyed by genre)
para = {}

@tool
def check_genre_exists(genre: str):
    '''
    Check if this genre or a very similar one was used already.
    Args:
        genre: SAT passage genre
    '''
    return genre in para


@tool
def store_paragraph(genre: str, paragraph: str):
    '''
    Store inference paragraph
    Args:
        genre: SAT passage genre
        paragraph: paragraph ending with "because ____"
    '''
    para[genre] = {
        "genre": genre,
        "question": paragraph,
        "ans_choices": [],
        "ans_choices_with_explanation": {},
        "correct_answer": None
    }


@tool
def store_answer_choice(choice: str, explanation: str, genre: str):
    '''
    Store answer choices and explanations
    Args:
        choice: answer choice text
        explanation: brief justification
        genre: SAT passage genre
    '''
    if choice not in para[genre]["ans_choices"]:
        para[genre]["ans_choices"].append(choice)

    para[genre]["ans_choices_with_explanation"][choice] = explanation


@tool
def set_correct_answer(choice: str, genre: str):
    '''
    Store the correct inference answer
    Args:
        choice: correct answer choice
        genre: SAT passage genre
    '''
    para[genre]["correct_answer"] = choice


### LLM model

The agent uses `gpt-4.1-mini` through LiteLLM.

In [13]:
model = "gpt-4.1-mini"
litellm_model = LiteLLMModel(
    model_id=model, params={"max_tokens": 32000, "temperature": 0.7}
)

### Generating Words in Context questions

In [14]:
# Prompt template for generating words in context question
prompt = '''
You are generating authentic SAT Inference / Logical Completion questions.
You MUST use the provided tools to store and validate content.

Rules (SAT Authenticity):
- Use an academic, neutral tone.
- The conclusion must be implied, not stated.
- There must be exactly ONE logically valid completion.
- Avoid opinions, exaggeration, or unsupported assumptions.

Process:
1) Create a unique random Genre for the text. It can range from history to science to architecture.
2) Use check_genre_exists(Genre).
   - If it returns True, the genre was already used: generate a new genre and restart.
   - Otherwise, continue.
3) Choose a medium-difficulty SAT inference focus (cause–effect, implied explanation, constraint-based reasoning).
4) Write a 70–90 word paragraph that ends with an incomplete sentence introduced by “because ____”.
   - Do NOT repeat earlier information.
   - The blank must require logical inference.
5) Use store_paragraph(Genre, paragraph).

Answer Choices:
6) Generate exactly four answer choices (A–D):
   - One correct logical completion
   - One tempting but unsupported inference
   - Two clearly incorrect inferences (scope shift, reversal, or extreme claim)
7) For each answer choice, call:
   store_answer_choice(choice, explanation, Genre)
   - Explanation must be one concise sentence justifying correctness or incorrectness.
8) Use set_correct_answer(correct_choice, Genre).

Output Rules:
- Do NOT reveal reasoning steps.
- Do NOT explain unless explicitly asked.
- Ensure no more than one answer is defensible; otherwise, regenerate.
'''
# Run one agent invocation that generates and stores a single question
def generate_question(prompt):
    system_prompt = "You are a simple agent that can generate a paragraph for a given word"
    agent = Agent(
        model=litellm_model,
        system_prompt=system_prompt,
        tools=[store_paragraph, store_answer_choice, check_genre_exists, set_correct_answer],
    )
    agent(prompt)
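A run of many agent calls can fail partway through on a network error or a rate limit. One possible way to keep a long loop going is a small retry wrapper. `run_with_retries` is a hypothetical helper, and the attempt count and delay are arbitrary assumptions:

```python
import time


def run_with_retries(task, attempts=3, delay_seconds=2.0):
    """Call `task()` until it succeeds or `attempts` calls have failed."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as error:  # broad on purpose: any failure triggers a retry
            last_error = error
            print(f"attempt {attempt} failed: {error}")
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error
```

In the generation loop this could be used as `run_with_retries(lambda: generate_question(prompt))`.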

In [15]:


# Specify number of questions to be generated
question_count = 50
para = {}

for i in range(question_count):
    generate_question(prompt)

# Print generated genres (one per question)
for genre in para.keys():
    print(genre)


Tool #1: check_genre_exists

Tool #2: store_paragraph

Tool #3: store_answer_choice

Tool #4: store_answer_choice

Tool #5: store_answer_choice

Tool #6: store_answer_choice

Tool #7: set_correct_answer
Genre: Environmental Science

Recent studies have shown that urban areas with a higher density of green spaces tend to have lower average temperatures during summer months. This phenomenon is observed because ____.

A) The vegetation in green spaces cools the air through transpiration.
B) People in greener areas prefer to use air conditioning less often.
C) Green spaces absorb more heat than urban structures, raising temperatures.
D) Urban areas with fewer green spaces have more cloud cover, reducing temperatures.
Tool #1: check_genre_exists

Tool #2: store_paragraph

Tool #3: store_answer_choice

Tool #4: store_answer_choice

Tool #5: store_answer_choice

Tool #6: store_answer_choice

Tool #7: set_correct_answer
Marine Biology

Marine biologists observed that certain coral species exh

In [16]:
# Print generated genres (one per question)
for genre in para.keys():
    print(genre)

Environmental Science
Marine Biology
Ancient Architecture
Oceanography
Renewable Energy Technology
Astrophysics
Geological Studies
Architectural History
Archaeology
Cultural Anthropology
Geothermal Energy
Astrobiology
Ancient Civilizations
Medieval Architecture
Cognitive Psychology
Ancient Mesopotamian Architecture
Astronomy
Geology
Paleontology
Historical Architecture
Neuroscience Advances
Neuroscience
Geological Sciences
Renewable Energy Technologies
Renaissance Art
Geological Formations
Crystallography
Quantum Physics
Atmospheric Science
Ethnomusicology
Astronomical Physics
Cognitive Neuroscience
History of Astronomy
Psychology of Memory
Geological Science
Astronomical Phenomena
Ancient Astronomy
Environmental Chemistry
Ancient Trade Routes
Astronomy and Space Exploration
Quantum Mechanics
Urban Planning
Medieval History
Linguistics
Quantum Computing
Botany
Paleoclimatology
Geological Processes
Classical Mythology
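
The list contains near-duplicates such as "Geological Science" and "Geological Sciences", because `check_genre_exists` only matches exact strings. Below is a sketch of a fuzzier check using the standard library's `difflib`. The helper name `find_similar_genre` and the 0.85 cutoff are assumptions that would need tuning:

```python
from difflib import get_close_matches


def find_similar_genre(genre, existing, cutoff=0.85):
    """Return an already-used genre that closely matches `genre`, or None."""
    # Compare case-insensitively, but return the genre as originally stored.
    lowered = {name.lower(): name for name in existing}
    matches = get_close_matches(genre.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None
```

Inside `check_genre_exists`, the return value could become `find_similar_genre(genre, para.keys()) is not None`. String similarity only catches spelling variants; genres that overlap in topic but are worded differently would still pass.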


In [28]:
for key in para:
    print(json.dumps(para[key]))

{"genre": "Marine Biology", "question": "Marine biologists have observed that certain coral species are more resilient to temperature changes than others. This resilience is not solely due to genetic factors but is also influenced by the coral's symbiotic relationship with algae that can adjust their photosynthetic activity. These algae provide essential nutrients to the coral, helping it survive stressful conditions. Therefore, some corals can better withstand warming oceans because ____", "ans_choices": ["the algae associated with these corals can alter their photosynthesis to support coral metabolism under heat stress.", "these corals are genetically superior in all environmental conditions.", "the corals' resilience causes the algae to improve their photosynthesis.", "all marine organisms adapt equally well to temperature changes in the ocean."], "ans_choices_with_explanation": {"the algae associated with these corals can alter their photosynthesis to support coral metabolism under

In [17]:
# Write the questions to a JSON Lines file (one JSON object per line)
with open("Main_Idea_output.jsonl", "w") as f:
    for key in para:
        print(key)
        f.write(json.dumps(para[key]) + "\n")

Environmental Science
Marine Biology
Ancient Architecture
Oceanography
Renewable Energy Technology
Astrophysics
Geological Studies
Architectural History
Archaeology
Cultural Anthropology
Geothermal Energy
Astrobiology
Ancient Civilizations
Medieval Architecture
Cognitive Psychology
Ancient Mesopotamian Architecture
Astronomy
Geology
Paleontology
Historical Architecture
Neuroscience Advances
Neuroscience
Geological Sciences
Renewable Energy Technologies
Renaissance Art
Geological Formations
Crystallography
Quantum Physics
Atmospheric Science
Ethnomusicology
Astronomical Physics
Cognitive Neuroscience
History of Astronomy
Psychology of Memory
Geological Science
Astronomical Phenomena
Ancient Astronomy
Environmental Chemistry
Ancient Trade Routes
Astronomy and Space Exploration
Quantum Mechanics
Urban Planning
Medieval History
Linguistics
Quantum Computing
Botany
Paleoclimatology
Geological Processes
Classical Mythology
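
Because the agent decides when to call each tool, some records may end up incomplete. The following sketch checks each line of the output file. It assumes that `correct_answer` holds the same text as one of `ans_choices` and that every question contains the `____` blank; `find_problems` and `check_jsonl` are hypothetical helper names:

```python
import json

REQUIRED_KEYS = {
    "genre", "question", "ans_choices",
    "ans_choices_with_explanation", "correct_answer",
}


def find_problems(record):
    """Return a list of human-readable problems for one question record."""
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        return [f"missing keys: {sorted(missing)}"]
    problems = []
    if len(record["ans_choices"]) != 4:
        problems.append(f"expected 4 answer choices, found {len(record['ans_choices'])}")
    if record["correct_answer"] not in record["ans_choices"]:
        problems.append("correct_answer is not one of ans_choices")
    if "____" not in record["question"]:
        problems.append("question has no blank")
    return problems


def check_jsonl(lines):
    """Map 1-based line numbers to their problems, skipping blank lines."""
    report = {}
    for number, line in enumerate(lines, start=1):
        if line.strip():
            problems = find_problems(json.loads(line))
            if problems:
                report[number] = problems
    return report
```

With the file written above, `check_jsonl(open("Main_Idea_output.jsonl"))` returns an empty dictionary when every record passes.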
