Learn AI Engineering Through Play
A submission for the Gemini 3 Hackathon by Google DeepMind.
The AI Engineering course market is booming, and for good reason: demand is massive, yet the industry is still figuring out the fundamentals. Building reliable agentic systems, orchestrating multi-step pipelines, evaluating AI outputs - these are skills companies need now, and even experienced engineers are learning them in real time.
At the same time, Generative AI for image and video has made extraordinary progress in the past year. Tools like Veo, Nano Banana, and others can now produce stunning visuals that were unthinkable just months ago.
I wanted to connect these two worlds: use the power of generative media to create an immersive learning experience for AI Engineering. A game where you play a character and learn by doing, with an AI mentor that adapts to you in real time.
| Resource | Link |
|---|---|
| Demo Video | Watch on YouTube |
| Live App | app.blasto.xyz |
| Landing Page | blasto.xyz |
Each biome represents a different professional context for AI work, with its own aesthetic, storyline, and challenge style:
| Biome | Setting | Learning Focus |
|---|---|---|
| Miragora | Startup | Rapid prototyping, fighting for every token |
| Strandis | Corporation | Designing systems ready for massive scale |
| Orbium Halls | University | Implementing methodologies from research papers |
| Emberfield | NGO | Working with local models and optimization |
Your character defines the difficulty curve and challenge style. Characters live in their biome and accumulate energy - the in-game resource that powers code execution.
Inside the Mission Cockpit, three panels work together:
- Code Editor (Monaco) - write, run, and submit AI solutions directly in the browser. Code executes in an isolated Docker sandbox on a dedicated server.
- Course Panel - animated explanations built with Manim (3Blue1Brown's library).
- Mentor Chat - a real-time AI mentor (Sandstorm) that sees your code, execution output, progress history, and conversation. It decides what to teach, how to teach it, and what not to reveal.
Running code costs energy. Explaining concepts well earns it back. You cannot brute-force missions by spamming code runs - you need to actually understand what you are building.
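As a rough sketch of that resource loop (the cost and reward values here are assumed for illustration, not the game's actual tuning):

```python
# Hypothetical sketch of the energy economy; real values are defined server-side.
RUN_COST = 10            # assumed cost per code execution
EXPLANATION_REWARD = 25  # assumed reward for a passing explanation

def run_code(energy: int) -> int:
    """Spend energy to execute code; refuse if the balance is too low."""
    if energy < RUN_COST:
        raise ValueError("Not enough energy - explain a concept to earn more")
    return energy - RUN_COST

def reward_explanation(energy: int, score: int, passing: int = 70) -> int:
    """Award energy only when the explanation meets the passing score."""
    return energy + EXPLANATION_REWARD if score >= passing else energy
```

The point of the design is visible in the code: the only way out of a low-energy state is the reward path, i.e. demonstrating understanding.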
The core of Blasto: 7 specialized PydanticAI agents, all powered by Gemini 3 Flash (gemini-3-flash-preview), collaborating to deliver adaptive, personalized instruction. Every agent returns validated Pydantic models - not free-form text. A single shared model builder (_build_gemini_model()) configures all agents, making it trivial to switch Gemini versions from one config value.
When a student runs or submits code, a 4-agent analysis pipeline fires. Each agent produces a typed Pydantic model that feeds into the next. The final output is a TeachingPlan that the mentor uses to craft its response.
```
Student writes / runs code
        |
        v
+---------------------------------------------------+
|             4-AGENT TEACHING PIPELINE             |
|                                                   |
| [1] Code Analyst                                  |
|     Analyzes code against mission requirements.   |
|     Tool: fetch_previous_attempts (Supabase)      |
|     Output: TrajectoryTrace                       |
|         |                                         |
|         v                                         |
| [2] Reflector                                     |
|     Classifies learning velocity and proposes     |
|     playbook deltas.                              |
|     Tool: fetch_playbook (Supabase)               |
|     Output: ReflectionInsights                    |
|         |                                         |
|         v                                         |
| [3] Curator                                       |
|     Merges deltas, deduplicates, caps at 30       |
|     entries, persists to database.                |
|     Output: CuratedPlaybook                       |
|         |                                         |
|         v                                         |
| [4] Teaching Strategist                           |
|     Designs optimal intervention: type, tone,     |
|     constraints, and focus concept.               |
|     Tool: fetch_recent_chat_topics (Supabase)     |
|     Output: TeachingPlan                          |
+---------------------------------------------------+
        |
        v
[5] Mentor Agent (Sandstorm)
    Receives TeachingPlan as dynamic instructions.
    Streams response to student via SSE.
    Tools:
      - evaluate_explanation   -> triggers Agent 6
      - award_energy           -> Supabase RPC
      - check_student_progress
      - generate_visual
        |
        v
[6] Feynman Evaluator
    Scores student explanations (0-100).
    Triggered by the mentor's tool call.

[7] Title Agent
    Generates concise conversation titles
    from the first user-mentor exchange.
```
Every pipeline agent produces a validated Pydantic model. The TrajectoryTrace captures intent, approach, errors, struggle points, and positive signals. The ReflectionInsights classifies learning velocity (fast, steady, struggling, regressing) and proposes playbook deltas. The TeachingPlan specifies:
- Intervention type - hint, question, explanation, encouragement, redirect, or visual
- Focus concept - the single concept to address
- What to say - key points for the response
- What NOT to say - concepts the student should discover on their own
- Tone - encouraging, challenging, celebratory, or serious
- Max sentences - keeps responses concise and targeted
This makes the mentor's behavior deterministic and auditable, not a vague prompt hoping for the best.
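A plan with those fields could be modeled roughly like this (field names are illustrative reconstructions from the list above, not the project's actual schema):

```python
# Illustrative TeachingPlan model; field names are assumed, not copied from the source.
from dataclasses import dataclass
from typing import Literal

InterventionType = Literal["hint", "question", "explanation",
                           "encouragement", "redirect", "visual"]
Tone = Literal["encouraging", "challenging", "celebratory", "serious"]

@dataclass
class TeachingPlan:
    intervention_type: InterventionType  # how the mentor should intervene
    focus_concept: str                   # the single concept to address
    key_points: list[str]                # what to say
    forbidden_concepts: list[str]        # what NOT to reveal yet
    tone: Tone
    max_sentences: int = 4               # keeps responses concise and targeted

plan = TeachingPlan(
    intervention_type="hint",
    focus_concept="prompt token budgeting",
    key_points=["Point at the oversized system prompt"],
    forbidden_concepts=["the exact fix"],
    tone="encouraging",
)
```

Because the mentor consumes a typed object rather than free text, every behavioral choice (tone, length cap, withheld concepts) is inspectable after the fact.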
After submitting working code, the mentor challenges the student to explain the concept in their own words. The Feynman Evaluator scores on a strict rubric:
| Score Range | Meaning |
|---|---|
| 0 - 10 | Off-topic or nonsensical |
| 11 - 30 | Copy-pasted or textbook definitions |
| 50 - 69 | Correct but shallow |
| 70 - 84 | Genuine understanding with own reasoning (pass) |
| 85 - 100 | Deep insight with examples and trade-offs |
Passing (>= 70) earns energy, reinforcing the resource loop. The evaluator also returns feedback and a list of missing concepts to guide the student toward a better explanation.
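The evaluator's result might have a shape like the following sketch (field names are assumptions; only the 0-100 scale and the >= 70 pass threshold come from the rubric):

```python
# Sketch of a Feynman evaluation result; field names are hypothetical.
from dataclasses import dataclass, field

PASSING_SCORE = 70  # rubric: >= 70 counts as genuine understanding

@dataclass
class FeynmanResult:
    score: int                     # 0-100 per the rubric
    feedback: str                  # guidance toward a better explanation
    missing_concepts: list[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return self.score >= PASSING_SCORE

result = FeynmanResult(
    score=78,
    feedback="Good reasoning - now add a trade-off.",
    missing_concepts=["context window limits"],
)
```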
Each student accumulates a persistent learning playbook - a structured profile containing concept mastery, concept gaps, behavior patterns, effective and ineffective teaching approaches, and recurring error types.
The Reflector reads it, proposes changes, and the Curator merges them. The system remembers what worked for each student across missions and adapts accordingly.
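The Curator's merge rule (apply deltas, deduplicate, cap at 30 entries) could be sketched as follows; the function and its list-of-strings playbook representation are simplifications, not the actual implementation:

```python
# Sketch of the Curator's merge step: apply deltas, dedupe, cap at 30 entries.
MAX_ENTRIES = 30  # cap stated in the pipeline description

def merge_playbook(existing: list[str], deltas: list[str]) -> list[str]:
    """Append proposed entries, drop duplicates (first occurrence wins),
    and keep only the most recent MAX_ENTRIES."""
    merged: list[str] = []
    for entry in existing + deltas:
        if entry not in merged:
            merged.append(entry)
    return merged[-MAX_ENTRIES:]
```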
PydanticAI makes this multi-agent system practical:
- Structured outputs - every agent returns a validated Pydantic model. The pipeline passes typed objects (`TrajectoryTrace` -> `ReflectionInsights` -> `CuratedPlaybook` -> `TeachingPlan`) between agents.
- Dynamic instructions - `@agent.instructions` injects per-request context without brittle format strings in system prompts.
- Tool definitions - `@agent.tool` with `RunContext[Deps]` gives agents database access and cross-agent calls while keeping dependencies explicit.
- Streaming - `agent.run_stream()` enables token-by-token SSE delivery for real-time mentor chat.
- Model abstraction - all 7 agents share `_build_gemini_model()`. Switching Gemini versions is a one-line config change.
Each agent can fail independently. If the Reflector fails, the Curator uses the existing playbook. If the entire pipeline fails, the mentor responds without a teaching plan - still useful, just less targeted. No single agent failure breaks the experience.
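That degradation policy boils down to per-stage fallbacks, roughly like this (the agent runners here are hypothetical stand-ins for the real calls):

```python
# Sketch of graceful degradation; run_reflector / run_pipeline are stand-ins.
def reflect_with_fallback(run_reflector, existing_playbook):
    """If the Reflector fails, the Curator keeps working from the existing playbook."""
    try:
        return run_reflector()
    except Exception:
        return existing_playbook  # no new deltas, but the pipeline continues

def mentor_respond(run_pipeline, respond):
    """If the whole pipeline fails, the mentor answers without a teaching plan."""
    try:
        plan = run_pipeline()
    except Exception:
        plan = None  # still useful, just less targeted
    return respond(plan)
```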
Chat uses Server-Sent Events (SSE) with three event types:

| Event | Purpose |
|---|---|
| `status` | Pipeline phase updates ("Analyzing code...", "Identifying patterns...") |
| `data` | Streamed text chunks from the mentor |
| `done` | Completion signal with remaining message quota |
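On the wire, each of these is a standard SSE frame (an `event:` line, a `data:` line, a blank line). A minimal formatter sketch, not the actual backend code:

```python
# Minimal SSE frame formatter for the three event types; illustrative only.
import json

def sse_event(event: str, data) -> str:
    """Format one Server-Sent Events frame for the mentor chat stream."""
    assert event in {"status", "data", "done"}, "unknown event type"
    payload = data if isinstance(data, str) else json.dumps(data)
    return f"event: {event}\ndata: {payload}\n\n"
```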
| Layer | Technology |
|---|---|
| AI Engine | Google Gemini 3 Flash via PydanticAI |
| Backend | Python 3.12, FastAPI, Pydantic v2 |
| Frontend | React 19, TypeScript, Vite, Tailwind CSS, Zustand |
| Database | Supabase (PostgreSQL) |
| Code Sandbox | Isolated Docker containers on dedicated Hetzner server |
| Deployment | Docker Compose + Nginx on Hetzner |
| Landing Page | Next.js on Netlify |
| Assets and Demo | Veo 3.1, Nano Banana Pro |
| Course Animations | Manim (3Blue1Brown's library) |
| Development | Claude Code |
```
blasto/
  backend/            FastAPI application, PydanticAI agents, Supabase integration
    app/
      routers/        API endpoints (auth, missions, chat, characters,
                      environments, profile, leaderboard)
      services/       Business logic and AI agents
        pipeline/     4-agent teaching pipeline
      schemas/        Pydantic request/response models
      dependencies/   Auth middleware (JWT validation)
      middleware/     Rate limiting (SlowAPI)
  frontend/           React SPA
    src/
      api/            API client with SSE streaming
      components/     UI components (mission cockpit, mentor chat,
                      code editor, tutorial)
      pages/          Route-level components (lazy-loaded)
      stores/         Zustand state management
      hooks/          React hooks (auth context)
  deploy/             Docker Compose, Nginx configs, systemd service
```
- Python 3.12+ and uv
- Node.js 20+ and pnpm
- A Supabase project (for database and auth)
- A Google AI API key (for Gemini)
```bash
cd backend
cp .env.example .env
# Fill in: SUPABASE_URL, SUPABASE_ANON_KEY, SUPABASE_SERVICE_ROLE_KEY, GOOGLE_AI_API_KEY
uv sync                                # Install dependencies
uv run uvicorn app.main:app --reload   # Start dev server on :8000
```

See backend/README.md for detailed API documentation.
```bash
cd frontend
cp .env.local.example .env.local
# Fill in: VITE_API_URL, VITE_SUPABASE_URL, VITE_SUPABASE_ANON_KEY
pnpm install   # Install dependencies
pnpm dev       # Start dev server on :5173
```

To run the full stack with Docker:

```bash
docker-compose -f deploy/docker-compose.yml up --build
```

- Google Gemini - powers all 7 AI agents (Gemini 3 Flash)
- PydanticAI - agent framework with structured outputs, tools, and streaming
- Veo 3.1 - AI-generated video for demo and environment visuals
- Nano Banana Pro - AI-generated images for characters and assets
- Manim - animated course explanations (3Blue1Brown's library)
- Claude Code - AI-assisted development