A demonstration of the agentic approach to problem solving: an LLM generates code, runs it through an interpreter, receives a traceback on failure, and self-corrects — until all tests pass.
┌─────────────────────────────┐
│ Agentic Loop │
│ │
┌──────────┐ (1) │ ┌─────────┐ ┌─────────┐ │
│ Task │ ────▶│ │ LLM │──▶│ exec() │ │
└──────────┘ │ │streaming│ │ Python │ │
│ └─────────┘ └────┬────┘ │
│ ▲ pass │ fail │
│ (3) │ ▼ │
│ traceback ┌─────────────┐ │
│ → history │ Validator │ │
│ └─────────────┘ │
└─────────────────────────────┘
Three steps per iteration:
- Reasoning & Action — the task and error history are sent to the LLM; the response streams token-by-token directly into the terminal
- Tool Execution — the generated code runs in an isolated namespace via exec(code, {}, local_vars)
- Validator Feedback — on failure, the full traceback is appended to the context (history) and the agent rewrites the code
The loop repeats until the first success or MAX_ATTEMPTS is reached.
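The three steps above can be sketched as a single function. This is a minimal illustration, not the project's actual code: the `generate_code` and `validate` callables, and the return values, are assumptions standing in for the real LLM call and assertion suite.

```python
import traceback

MAX_ATTEMPTS = 3  # assumed value; the demo defines its own limit

def run_agent(task, generate_code, validate):
    """Core loop: generate -> exec -> validate -> feed the traceback back."""
    history = []  # grows with each failed attempt's traceback
    for attempt in range(1, MAX_ATTEMPTS + 1):
        code = generate_code(task, history)       # (1) Reasoning & Action
        local_vars = {}
        try:
            exec(code, {}, local_vars)            # (2) isolated execution
            validate(local_vars)                  # raises AssertionError on failure
            return attempt, local_vars            # first success ends the loop
        except Exception:
            history.append(traceback.format_exc())  # (3) traceback -> context
    raise RuntimeError(f"no passing solution in {MAX_ATTEMPTS} attempts")
```

With a stub generator that fails once and then succeeds, `run_agent` returns on the second attempt, mirroring the self-correction cycle the demo showcases.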
top_words(text, n) — return the n most frequent words in a text, handling case, punctuation, and alphabetical tie-breaking. The task is intentionally chosen so that the first attempt fails on an edge-case assertion, showcasing the full self-correction cycle.
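A plausible final solution the agent converges on might look like the sketch below. The exact edge case the first attempt trips on is not specified here, so this is only one reasonable reading of the spec (lowercase words, punctuation stripped, ties broken alphabetically):

```python
import re
from collections import Counter

def top_words(text, n):
    """Return the n most frequent words; ties are broken alphabetically."""
    words = re.findall(r"[a-z]+", text.lower())  # lowercase, drop punctuation
    counts = Counter(words)
    # sort by descending frequency, then alphabetically among equal counts
    return sorted(counts, key=lambda w: (-counts[w], w))[:n]
```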
Each attempt displays:
- code generated in real time (streaming + syntax highlighting)
- request stats: tokens (prompt / completion) and response time
- traceback on error and the growing context size
- a final summary: attempt number, total tokens, total time
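The per-attempt stats can be accumulated with a small helper along these lines. The field names and the `record`/`summary` methods are hypothetical, not the demo's actual API; in practice the token counts would come from the model response's usage data.

```python
import time
from dataclasses import dataclass, field

@dataclass
class RunStats:
    """Accumulates per-attempt token and timing stats for the final summary."""
    attempts: int = 0
    prompt_tokens: int = 0
    completion_tokens: int = 0
    started: float = field(default_factory=time.monotonic)

    def record(self, prompt_toks, completion_toks):
        self.attempts += 1
        self.prompt_tokens += prompt_toks
        self.completion_tokens += completion_toks

    def summary(self):
        elapsed = time.monotonic() - self.started
        total = self.prompt_tokens + self.completion_tokens
        return f"attempts={self.attempts} tokens={total} time={elapsed:.1f}s"
```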
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install openai rich
export OPENAI_API_KEY="sk-..."
python main.py