An AI-powered voice tutoring platform that teaches any topic through real-time conversation, flashcards, and quizzes — powered by Gemini 2.5 Flash Native Audio and Google ADK.
Gemini Live Agent Challenge · Live Agents Category
Tink is a voice-first AI tutor that breaks the "text box" paradigm. Instead of typing prompts and reading responses, students talk to Tink naturally — and Tink talks back.
Pick any topic (Python, Japanese, Biology, Economics...), choose your level, and Tink generates a personalized curriculum. Then you start a voice lesson where Tink:
- Teaches concepts one-by-one through conversation
- Shows flashcards to reinforce key terms visually
- Quizzes you after every few concepts
- Adapts to your questions — ask anything mid-lesson and Tink explains before moving on
- Tracks progress across lessons with saved notes and quiz history
You can interrupt Tink anytime (barge-in), ask follow-up questions, or request re-explanations — just like a real tutor.
- Python 3.11+
- Node.js 20+
- A Google AI Studio API key (Gemini)
```bash
git clone https://github.com/hashtagemy/tink.git
cd tink
cd tink/backend

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env and add your GOOGLE_API_KEY

# Start the server
python main.py
```

The backend runs on `http://localhost:8000`.
```bash
cd tink/frontend

# Install dependencies
npm install

# Start the dev server
npm run dev
```

The frontend runs on `http://localhost:3000` and connects to the backend automatically.
- Open `http://localhost:3000` in your browser
- Enter your name and choose a topic
- Select a difficulty level — Tink generates a curriculum
- Click a lesson to start a voice session
- Allow microphone access and start learning!
Backend (`tink/backend/.env`):

| Variable | Description | Required |
|---|---|---|
| `GOOGLE_API_KEY` | Gemini API key from AI Studio | Yes |
| `GOOGLE_GENAI_USE_VERTEXAI` | Set to `TRUE` for Vertex AI | No (default: `FALSE`) |
| `FRONTEND_URL` | Frontend origin for CORS | No (default: `http://localhost:3000`) |
| `PORT` | Backend port | No (default: `8000`) |
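For reference, a filled-in `tink/backend/.env` using the defaults above might look like this (the API key value is a placeholder):

```
GOOGLE_API_KEY=your-ai-studio-key
GOOGLE_GENAI_USE_VERTEXAI=FALSE
FRONTEND_URL=http://localhost:3000
PORT=8000
```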
Frontend (optional `tink/frontend/.env.local`):

| Variable | Description | Required |
|---|---|---|
| `NEXT_PUBLIC_API_URL` | Backend API URL | No (default: `http://localhost:8000`) |
To deploy the backend to Google Cloud Run (requires the Google Cloud SDK):

```bash
./deploy.sh
```

To point the frontend to Cloud Run, create `tink/frontend/.env.local`:

```bash
NEXT_PUBLIC_API_URL=https://your-cloud-run-url.run.app
```
```mermaid
sequenceDiagram
    participant S as 🧑🎓 Student
    participant B as 🌐 Browser
    participant F as ⚡ FastAPI
    participant A as 🔧 ADK Agent
    participant G as 🤖 Gemini Live
    participant GS as 🔍 Google Search

    Note over S,GS: 1. Session Setup (with Grounding)
    S->>B: Select topic & lesson
    B->>F: POST /api/curriculum/generate
    F->>G: Research topic (Text API + Grounding)
    G->>GS: Search for reliable sources
    GS-->>G: Authoritative results
    G-->>F: Grounded research summary
    F->>G: Generate curriculum from research
    G-->>F: Source-backed lesson plan JSON
    F-->>B: Curriculum response

    Note over S,G: 2. Voice Connection
    B->>F: WebSocket /api/live/{session_id}
    F->>A: Create ADK session
    A->>G: Open audio stream
    G-->>B: ready

    Note over S,G: 3. Teaching Loop
    S->>B: 🎤 Speaks
    B->>F: Audio chunks (PCM 16kHz)
    F->>A: Forward audio
    A->>G: Stream to Gemini
    G-->>A: Audio response + tool calls
    A-->>F: Events
    F-->>B: 🔊 Audio + 🃏 Flashcards + ❓ Quizzes

    Note over S,G: 4. Lesson Complete
    A->>F: lesson_complete(summary)
    F-->>B: Save notes & quiz history
```
| Layer | Technology |
|---|---|
| AI Model | Gemini 2.5 Flash Native Audio (voice), Gemini 2.5 Flash (text) |
| Agent Framework | Google ADK (Agent Development Kit) |
| Backend | Python, FastAPI, WebSockets |
| Frontend | Next.js 16, React 19, TypeScript |
| State Management | Zustand (persisted to localStorage) |
| UI | Tailwind CSS, Framer Motion |
| Audio | Web Audio API, AudioWorklet (16kHz PCM) |
| Cloud | Google Cloud Run |
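The AudioWorklet converts the browser's float mic samples to the 16-bit PCM the backend expects. A Python equivalent of that conversion (useful for testing the backend without a browser; the function name is ours, not from the repo) looks like:

```python
import struct

def float32_to_pcm16(samples):
    """Convert float samples in [-1.0, 1.0] to 16-bit little-endian PCM bytes."""
    clamped = [max(-1.0, min(1.0, s)) for s in samples]
    return struct.pack("<%dh" % len(clamped), *(int(s * 32767) for s in clamped))
```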
- Voice-first interaction — Talk naturally, get spoken responses in real-time
- Barge-in support — Interrupt Tink anytime, just like a real conversation
- AI-generated curriculum — Structured lessons with concepts, organized by difficulty
- Interactive flashcards — Visual cards shown during voice lessons
- Adaptive quizzes — Multiple-choice questions after every few concepts
- Progress tracking — Lesson completion, notes, and quiz history saved locally
- Auto-reconnect — Seamless session recovery with context preservation
- Live transcript — Real-time captions of the conversation
- Three difficulty tiers — Beginner, Intermediate, Advanced with progressive unlock
- Notes review — Flip through all flashcards and quiz answers after lessons
```
tink/
├── README.md
├── Tink.svg                      # Logo
├── deploy.sh                     # Cloud Run deploy script
│
└── tink/
    ├── backend/
    │   ├── main.py               # FastAPI entry point
    │   ├── config.py             # Environment & model config
    │   ├── requirements.txt
    │   ├── agents/tutor/
    │   │   ├── agent.py          # ADK agent definition & system prompt
    │   │   └── tools.py          # show_flashcard, quiz_student, lesson_complete
    │   ├── models/
    │   │   └── schemas.py        # Pydantic request/response models
    │   ├── routers/
    │   │   ├── session.py        # REST: curriculum generation, session CRUD
    │   │   └── live.py           # WebSocket: voice streaming & ADK bridge
    │   └── skill_quest/
    │       ├── genai_client.py   # Lazy Gemini client singleton
    │       ├── data/curriculum.py    # AI curriculum generator
    │       └── tools/game_state.py   # In-memory session store
    │
    └── frontend/
        ├── app/
        │   ├── page.tsx          # Home — name & topic selection
        │   ├── learn/page.tsx    # Voice lesson interface
        │   ├── roadmap/[topicId]/    # Curriculum & lesson roadmap
        │   └── notes/[id]/       # Flashcard & quiz review
        ├── components/
        │   └── learn/            # MascotOrb, Waveform, FlashCard, QuizCard...
        ├── hooks/
        │   └── useVoiceConnection.ts # WebSocket + audio I/O lifecycle
        └── lib/
            ├── learnStore.ts     # Zustand: per-session lesson state
            ├── roadmapStore.ts   # Zustand: persisted curriculum & progress
            ├── api.ts            # REST API client
            └── types.ts          # TypeScript interfaces
```
- Topic Selection — Student enters a topic and difficulty level
- Curriculum Generation — Gemini generates a structured lesson plan using Google Search Grounding for reliable sources
- Voice Session — Browser connects via WebSocket; mic audio (16kHz PCM) streams to backend
- ADK Agent — Backend runs a Google ADK agent with the Gemini Live model, forwarding audio bidirectionally
- Tool Calls — The agent calls `show_flashcard` to display concepts, `quiz_student` to test knowledge, and `lesson_complete` to end the lesson
- Progress Saved — Flashcards, quiz results, and lesson summaries persist in the browser for review
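The tool names above come from the repo's `tools.py`, but their signatures aren't shown in this README. As a hypothetical sketch, ADK-style tools are plain Python functions whose structured return values the frontend renders as cards — something like:

```python
def show_flashcard(term: str, definition: str) -> dict:
    """Display a flashcard in the student's browser (signature is an assumption)."""
    return {"type": "flashcard", "term": term, "definition": definition}

def quiz_student(question: str, options: list, answer_index: int) -> dict:
    """Present a multiple-choice question (signature is an assumption)."""
    return {
        "type": "quiz",
        "question": question,
        "options": options,
        "answer_index": answer_index,
    }
```

The model decides when to invoke these during the voice stream; the backend relays each tool result to the browser as a JSON event alongside the audio.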
Apache License 2.0 — see LICENSE for details.