# SAT Words in Context question generation using the Strands Agent framework and an OpenAI model

## Overview

My first [attempt](https://github.com/PragyanR/GenAI_for_SAT_Prep) at creating Words in Context questions, using Llama 3B on an A100, was fairly involved from a coding perspective. With the advent of agentic AI, I was able to generate such questions in under 50 lines of code plus a prompt template.

The Strands agentic framework made it very simple to try out this use case.

Check out my site if you want to try out Words in Context questions that I generate: [Acesat.ai](https://www.acesat.ai/).


## Setup and prerequisites

### Prerequisites
* Python 3.10+
* gpt-4.1-mini access

Let's now install the required packages for our Strands agent:

In [None]:
# installing pre-requisites
!pip install -r requirements.txt

### Importing dependency packages

Now let's import the dependency packages

In [16]:
import os
from strands import Agent, tool
from strands.models.litellm import LiteLLMModel
import json

### Setting up OpenAI keys

Let's now set up the OpenAI API key:

In [26]:
os.environ["OPENAI_API_KEY"] = "<Your OpenAI key goes here>"
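Pasting the key directly into a cell makes it easy to leak when the notebook is shared. As a minimal sketch, assuming the key may already be exported in your shell, the helper below reads it from the environment and only prompts for it when it is missing. `resolve_api_key` is a hypothetical helper written for this notebook; it is not part of Strands, LiteLLM, or the OpenAI SDK.

```python
import os
from getpass import getpass
from typing import Callable, MutableMapping


def resolve_api_key(
    env: MutableMapping[str, str] = os.environ,
    name: str = "OPENAI_API_KEY",
    ask: Callable[[str], str] = getpass,
) -> str:
    """Return the key from `env`, prompting for it only when it is missing."""
    key = env.get(name, "").strip()
    if not key:
        # getpass does not echo the typed key into the notebook output.
        key = ask(f"Enter {name}: ").strip()
        if not key:
            raise ValueError(f"{name} was not provided")
        env[name] = key  # expose it to libraries that read the environment
    return key
```

Calling `resolve_api_key()` once, before the model is created, would take the place of the hardcoded assignment.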

### Setting up custom tools

I created three tools for the agent to call:

* `check_word_exists`: checks whether a word has already been used to create a Words in Context question.
* `store_paragraph`: stores the generated paragraph.
* `store_answer_choice`: stores an answer choice together with its definition.

In [20]:
# Dictionary for storing words in context questions
para = {}

@tool
def check_word_exists(word: str) -> bool:
    '''Check if a word was used already
    Args:
        word: a SAT word
    '''
    return word in para

@tool
def store_paragraph(word: str, paragraph: str):
    '''Store paragraph in the dictionary, para
    Args:
        word: a SAT word
        paragraph: paragraph that was generated by the agent
    '''
    para[word] = {"word": word}
    paragraph = paragraph.replace(word,"__________________")
    para[word]["question"] = paragraph
    para[word]["ans_choices"] = []
    para[word]["ans_choices_with_def"] = {}

@tool
def store_answer_choice(choice: str, meaning: str, word: str):
    '''Store answer choices
    Args:
        choice: answer choice
        meaning: answer choice definition/meaning
        word: a SAT word
    '''
    if choice not in para[word]["ans_choices"]:
        para[word]["ans_choices"].append(choice)
        
    if choice not in para[word]["ans_choices_with_def"]:
        para[word]["ans_choices_with_def"][choice] = meaning
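One thing to watch in `store_paragraph`: `str.replace` is case-sensitive and matches substrings, so a capitalized use of the word is left visible and a longer word containing it is partly blanked. The sketch below shows one possible alternative using a whole-word, case-insensitive regular expression. `blank_word` and `BLANK` are hypothetical names introduced here for illustration, not part of the original tools.

```python
import re

BLANK = "__________________"  # same placeholder that store_paragraph uses


def blank_word(paragraph: str, word: str, blank: str = BLANK) -> str:
    """Replace whole-word, case-insensitive occurrences of `word` with `blank`."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", flags=re.IGNORECASE)
    # A function replacement avoids any escape handling in `blank`.
    return pattern.sub(lambda _match: blank, paragraph)


text = "Prosaic tasks are prosaically done."
print(text.replace("prosaic", BLANK))  # blanks part of "prosaically", misses "Prosaic"
print(blank_word(text, "prosaic"))     # blanks only the standalone word
```

If this behavior is preferable, the `paragraph.replace(...)` line inside `store_paragraph` could call `blank_word(paragraph, word)` instead.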

### LLM model

The agent uses `gpt-4.1-mini` through LiteLLM.

In [21]:
model = "gpt-4.1-mini"
litellm_model = LiteLLMModel(
    model_id=model, params={"max_tokens": 32000, "temperature": 0.7}
)

### Generating Words in Context questions

In [22]:
# Prompt template for generating words in context question
prompt = '''
Always use chain of thought prompting.
1)come up with a medium difficult SAT vocab word. This is the word that will be used to create a question
2)come up with the genre that best fits the word.
3)check if word exists already, if yes, start again
4)generate a 75-word paragraph on the genre with the word without describing how you came up with the genre. The word should be used only once in the paragraph.
5)after generating, check if the word is used properly or not. if not, generate a new paragraph
6)after generating, check if the word is used more than once. if so, generate a new paragraph
7)store just the paragraph and nothing else.
8)then get 3 words in comma separated format: a difficult antonym of the word and two other difficult words that do not have a similar meaning to the word.
9)add the word to the comma separated list
10)get meaning for each answer choice and store the answer choice with meaning
'''
# Function for generating one question
def generate_question(prompt):
    system_prompt = "You are a simple agent that can generate a paragraph for a given word"
    agent = Agent(
        model=litellm_model,
        system_prompt=system_prompt,
        tools=[store_paragraph, store_answer_choice, check_word_exists],
    )
    agent(prompt)
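Steps 4 to 6 of the prompt ask the model to check its own paragraph. Those checks can also be done deterministically in Python before the paragraph is stored. The sketch below is one way to do that; the helper names and the length tolerance are assumptions made for illustration, not part of the original notebook.

```python
import re


def count_word_uses(paragraph: str, word: str) -> int:
    """Count whole-word, case-insensitive occurrences of `word` in `paragraph`."""
    return len(re.findall(rf"\b{re.escape(word)}\b", paragraph, flags=re.IGNORECASE))


def is_valid_paragraph(
    paragraph: str, word: str, target_words: int = 75, tolerance: int = 15
) -> bool:
    """True when `word` appears exactly once and the length is near the target."""
    length = len(paragraph.split())  # rough word count based on whitespace
    used_once = count_word_uses(paragraph, word) == 1
    return used_once and abs(length - target_words) <= tolerance
```

A check like this could run at the top of `store_paragraph`, returning an error message to the agent when it fails so that the agent regenerates the paragraph.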

In [23]:
# Specify number of questions to be generated
question_count = 1
for i in range(question_count):
    generate_question(prompt)
    for key in para:
        print(key)

Alright, let's start with step 1: coming up with a medium difficult SAT vocab word. A good word would be "prosaic."

[...] fits the word "prosaic" would be a literary or descriptive genre, since "prosaic" refers to something that is commonplace or dull, often used to describe writing or expression.

Step 3: I will check if the word "prosaic" exists already.
Tool #1: check_word_exists
The word "prosaic" does not exist already. Now, for step 4, I will generate a 75-word paragraph on the literary genre using the word "prosaic" only once.
Tool #2: store_paragraph
The paragraph was generated using the word "prosaic" once, and it appears to be used properly in context. Now, moving to step 8: I will get 3 words in comma-separated format: a difficult antonym of "prosaic," and two other difficult words that are not similar in meaning to "prosaic," plus add the word "prosaic" itself to the list. Then, I will get the meanings for each answer choice and store them.
Tool #3: store_answer_choice

Tool #4
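The loop above calls `generate_question` a fixed number of times. A run may finish without adding a new entry to `para`, for example if the agent stops before storing anything, so the number of stored questions can end up lower than `question_count`. A minimal sketch of a loop that keeps going until the target is reached, with an upper bound on attempts, is shown below. `generate_until` and its parameters are hypothetical names.

```python
from typing import Callable, Dict


def generate_until(
    target: int,
    generate: Callable[[], None],
    store: Dict[str, dict],
    max_attempts: int = 10,
) -> int:
    """Call `generate` until `store` holds `target` entries or attempts run out.

    Returns the number of attempts actually made.
    """
    attempts = 0
    while len(store) < target and attempts < max_attempts:
        generate()  # expected to add at most one new entry to `store`
        attempts += 1
    return attempts
```

In this notebook it could be called as `generate_until(question_count, lambda: generate_question(prompt), para)`. The cap on attempts keeps API costs bounded if the agent repeatedly fails to store a question.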

In [25]:
for key in para:
    print(json.dumps(para[key]))

{"word": "prosaic", "question": "In the world of literature, some narratives captivate us with extraordinary events and vivid imagery, while others remain grounded in the __________________ details of everyday life. These straightforward stories, though lacking in dramatic flair, often reveal the subtle beauty and complexity within ordinary experiences. By focusing on the mundane, writers can highlight the universal truths that resonate deeply with readers, proving that even the simplest tales can hold profound meaning and emotional impact.", "ans_choices": ["extraordinary", "abstruse", "obfuscate", "prosaic"], "ans_choices_with_def": {"extraordinary": "Very unusual or remarkable; beyond what is usual or ordinary.", "abstruse": "Difficult to understand; obscure.", "obfuscate": "To deliberately make something unclear or difficult to understand.", "prosaic": "Commonplace or dull; lacking in imagination."}}
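Because the record is assembled from several independent tool calls, it is worth checking its shape before using it. The sketch below assumes the record layout shown above (`word`, `question`, `ans_choices`, `ans_choices_with_def`); `validate_question` is a hypothetical helper, and the rule of four answer choices comes from the prompt (three distractors plus the word itself).

```python
from typing import Dict, List

BLANK = "__________________"  # must match the placeholder used in store_paragraph


def validate_question(record: Dict, expected_choices: int = 4) -> List[str]:
    """Return a list of problems found in one stored question (empty if none)."""
    problems = []
    word = record.get("word", "")
    choices = record.get("ans_choices", [])
    definitions = record.get("ans_choices_with_def", {})

    if record.get("question", "").count(BLANK) != 1:
        problems.append("question must contain exactly one blank")
    if len(choices) != expected_choices or len(set(choices)) != len(choices):
        problems.append(f"expected {expected_choices} distinct answer choices")
    if word not in choices:
        problems.append("correct word is missing from the answer choices")
    missing = [c for c in choices if not definitions.get(c)]
    if missing:
        problems.append(f"missing definitions for: {', '.join(missing)}")
    return problems
```

Running `validate_question(para[key])` for each key and discarding records that return a non-empty list is one simple way to filter out incomplete generations.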


In [None]:
# write the questions to a JSON Lines file
with open("output.jsonl", "w") as f:
    for key in para:
        print(key)
        f.write(json.dumps(para[key]) + "\n")
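To use the questions later, the file can be read back one JSON object per line. Note that the prompt appends the correct word to the list last (step 9), so in the sample above the answer is the final choice; shuffling before display avoids giving that away. The helpers below are a minimal sketch with hypothetical names, assuming the record layout written above.

```python
import json
import random
from typing import Dict, Iterable, List, Optional


def load_questions(lines: Iterable[str]) -> List[Dict]:
    """Parse JSON Lines text, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def shuffled_choices(record: Dict, seed: Optional[int] = None) -> List[str]:
    """Return a shuffled copy of the answer choices, leaving the record untouched."""
    choices = list(record["ans_choices"])
    random.Random(seed).shuffle(choices)  # a fixed seed gives a repeatable order
    return choices
```

An open file handle is an iterable of lines, so `load_questions(open("output.jsonl"))` works directly. Pass a `seed` only when a reproducible order is needed, for example in tests.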