<a href="https://colab.research.google.com/github/angelatyk/tinytutor/blob/dev/notebooks/00_master_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Install Dependencies
Install the Python packages needed for the AI models, text-to-speech (TTS), audio processing, and the Gradio UI.

In [54]:
!pip install -q google-adk google-generativeai python-dotenv
!pip install -q google-cloud-texttospeech pydub
!pip install -q gradio

print("✅ All libraries installed.")

✅ All libraries installed.


## Import Libraries

In [55]:
import os
import json
import asyncio
from pathlib import Path
from typing import List, Tuple

import google.generativeai as genai
from google.colab import userdata

from google.adk.agents import Agent
from google.adk.models.google_llm import Gemini
from google.adk.runners import InMemoryRunner
from google.adk.tools import google_search
from google.genai import types

from google.cloud import texttospeech
from pydub import AudioSegment

import gradio as gr

## Configure API Keys
Set up credentials for Gemini (the LLM) and the Google Cloud Text-to-Speech service.

In [56]:
# Gemini Key
GOOGLE_API_KEY = userdata.get("GOOGLE_API_KEY")
os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY
genai.configure(api_key=GOOGLE_API_KEY)
print("✅ Gemini API configured.")

# Google TTS Service Account JSON
SERVICE_ACCOUNT_JSON = userdata.get("GCP_VI_SERVICE_ACCOUNT_JSON")
if not SERVICE_ACCOUNT_JSON:
    raise RuntimeError("Upload GCP_VI_SERVICE_ACCOUNT_JSON to Colab Secrets!")

with open("tinytutor-tss-agent.json", "w") as f:
    f.write(SERVICE_ACCOUNT_JSON)

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "tinytutor-tss-agent.json"

tts_client = texttospeech.TextToSpeechClient()
print("✅ Google TTS configured.")

✅ Gemini API configured.
✅ Google TTS configured.
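
The cell above writes the secret to disk without checking its contents. As a minimal sketch, assuming the secret holds a standard service-account key file, the JSON could be validated before it is written. The helper name `write_service_account_file` and the `REQUIRED_FIELDS` tuple are hypothetical and not part of the notebook; the demo uses dummy values, not real credentials.

```python
import json
import os
import tempfile

# Assumption: these keys are present in a standard service-account key file.
REQUIRED_FIELDS = ("type", "client_email", "private_key")


def write_service_account_file(raw_json: str, path: str) -> str:
    """Validate a service-account JSON string, then write it to `path`."""
    try:
        info = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        raise ValueError("Secret is not valid JSON") from exc
    if not isinstance(info, dict):
        raise ValueError("Secret must be a JSON object")
    missing = [key for key in REQUIRED_FIELDS if key not in info]
    if missing:
        raise ValueError(f"Service-account JSON is missing fields: {missing}")
    if info["type"] != "service_account":
        raise ValueError("JSON 'type' field is not 'service_account'")
    with open(path, "w") as f:
        json.dump(info, f)
    return path


# Demo with dummy values written to a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo-key.json")
dummy_secret = json.dumps(
    {"type": "service_account", "client_email": "demo@example.com", "private_key": "not-a-real-key"}
)
written_path = write_service_account_file(dummy_secret, demo_path)
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by the TTS client.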


## Configure Retry Options
Set an HTTP retry policy so that API requests recover from transient errors such as rate limits (429) and server-side failures (500, 503, 504).

In [57]:
retry_config = types.HttpRetryOptions(
    attempts=5,
    exp_base=7,
    initial_delay=1,
    http_status_codes=[429, 500, 503, 504]
)
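
To get a feel for what `exp_base=7` implies, here is a rough sketch of the nominal wait before each retry. It assumes plain exponential growth (`initial_delay * exp_base ** n`) and that `attempts` counts the original request; the real client may add jitter and cap the delay, so treat these numbers as an upper-level illustration rather than the library's exact schedule. `nominal_backoff_delays` is a hypothetical helper, not part of `google.genai`.

```python
def nominal_backoff_delays(attempts: int, initial_delay: float, exp_base: float) -> list:
    """Return the nominal wait in seconds before each retry (no jitter, no cap)."""
    # Assumption: `attempts` includes the first request, so there are attempts - 1 retries.
    return [initial_delay * exp_base ** n for n in range(attempts - 1)]


delays = nominal_backoff_delays(attempts=5, initial_delay=1, exp_base=7)
print(delays)  # [1, 7, 49, 343]
```

Under these assumptions the waits grow quickly, so a smaller `exp_base` may be preferable if fast feedback in the UI matters more than riding out long outages.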

## PedagogyAgent
Agent that simplifies a topic into an "Explain Like I'm 5" style explanation.

In [65]:
pedagogy_agent = Agent(
    name="PedagogyAgent",
    model=Gemini(model="gemini-2.5-flash-lite", retry_options=retry_config),
    description="Explains topics in simple ELI5 style.",
    instruction="Explain the topic like I'm 5. Use google_search if needed.",
    tools=[google_search],
)

runner = InMemoryRunner(agent=pedagogy_agent)

In [59]:
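async def run_pedagogy_async(topic: str) -> str:
    """Run the PedagogyAgent on a topic and return its explanation text."""
    events = await runner.run_debug(topic)
    # Take the last event that carries text; earlier events may be tool calls.
    for event in reversed(events):
        if event.content and event.content.parts and event.content.parts[0].text:
            return event.content.parts[0].text
    return ""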

## AudioScriptWriterAgent
Converts the simplified explanation into a child-friendly teaching script suitable for TTS.

In [60]:
SCRIPTWRITER_SYSTEM_PROMPT = """
You are a Teacher.

Your role is to take a simplified explanation created by the Pedagogy Agent and turn it into a clear, friendly teaching script suitable for a young child around the age of 5.
The script you produce will be used by a Text-to-Speech (TTS) system, so write in a way that sounds natural when spoken aloud.

Follow these steps:

1. Read the simplified explanation provided by the Pedagogy Agent.
2. Transform it into a spoken-style teaching script that:
   - Uses short, clear sentences.
   - Uses warm, encouraging language.
   - Keeps a playful, curious tone suitable for a young child.
   - Avoids complex words unless they were already explained.
   - Includes gentle teacher-like transitions (“Let’s imagine…”, “Did you know…?”, “Now let’s think about…”).
   - **Do NOT use sound effects or onomatopoeia (e.g., “boing,” “zoom,” “pow”).**
   - **Do NOT repeat words for dramatic effect (e.g., “straight, straight, straight”).**
   - Keep playfulness through ideas and imagery, not noises.
3. Add exactly 2 learning questions inside the story to spark curiosity.
   - The questions must feel natural within the flow of the explanation.
   - They should be simple, open-ended questions a young child can think about.
   - Do NOT place both questions back-to-back.
4. Make sure the script is vivid and engaging:
   - Use simple imagery.
   - Ask simple rhetorical questions.
   - Use examples familiar to young children.
5. Avoid:
   - Any reference to agents, prompts, or system instructions.
   - Visual descriptions that don't make sense in audio (“look at this picture”).
   - Overly long paragraphs; keep pacing steady for TTS.
6. Output only the final teaching script, nothing else. No labels, no titles, no markdown.
"""

def run_scriptwriter(explanation: str) -> str:
    model = genai.GenerativeModel(
        model_name="gemini-2.5-flash",
        system_instruction=SCRIPTWRITER_SYSTEM_PROMPT
    )

    response = model.generate_content(
        f"Write a children's story based on this:\n{explanation}",
        generation_config=genai.GenerationConfig(
            temperature=0.9,
            max_output_tokens=4096
        )
    )

    # Safest extraction
    try:
        return response.text
    except Exception:
        pass

    # Fallback
    try:
        return response.candidates[0].content.parts[0].text
    except Exception:
        pass

    return "⚠️ ScriptWriter failed."

## AudioGeneratorAgent
Functions to convert story text into audio, handling long texts in chunks.

In [61]:
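def chunk_text(text, max_chars=4500):
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    text = text.strip()
    if len(text) <= max_chars:
        return [text]
    chunks = []
    while len(text) > max_chars:
        # Cut after the last ". " that fits inside the limit.
        cut = text.rfind(". ", 0, max_chars)
        if cut == -1:
            # No sentence boundary found: hard cut so the chunk stays within max_chars.
            cut = max_chars - 1
        chunks.append(text[:cut+1])
        text = text[cut+1:].strip()
    chunks.append(text)
    return chunks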


def tts_segment(text):
    synthesis_input = texttospeech.SynthesisInput(text=text)
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US",
        name="en-US-Journey-F"
    )
    audio_cfg = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3,
        speaking_rate=0.94,
        pitch=0.0,
        volume_gain_db=0.0
    )
    response = tts_client.synthesize_speech(
        input=synthesis_input,
        voice=voice,
        audio_config=audio_cfg
    )
    return response.audio_content


def audio_writer(script_text: str, out="story.mp3"):
    chunks = chunk_text(script_text)
    audio = AudioSegment.silent(200)
    for i, chunk in enumerate(chunks, 1):
        path = f"seg_{i}.mp3"
        with open(path, "wb") as f:
            f.write(tts_segment(chunk))
        audio += AudioSegment.from_mp3(path)
        audio += AudioSegment.silent(150)
    audio.export(out, format="mp3")
    return out
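
The sentence-boundary splitting used by `chunk_text` is easier to see with a small limit. The following is a standalone sketch of the same idea, not the notebook's function: `split_at_sentences` is a hypothetical name, and the tiny `max_chars` value is chosen only to make the behaviour visible.

```python
def split_at_sentences(text: str, max_chars: int) -> list:
    """Split text into pieces of at most max_chars, cutting after '. ' when possible."""
    text = text.strip()
    chunks = []
    while len(text) > max_chars:
        # Look for the last sentence end that fits inside the limit.
        cut = text.rfind(". ", 0, max_chars)
        if cut == -1:
            # No sentence end available: fall back to a hard cut at the limit.
            cut = max_chars - 1
        chunks.append(text[:cut + 1])
        text = text[cut + 1:].strip()
    if text:
        chunks.append(text)
    return chunks


sample = "The sky is blue. Light is made of colors. Blue light scatters the most."
pieces = split_at_sentences(sample, max_chars=45)
for piece in pieces:
    print(repr(piece))
```

With this sample, the first two sentences fit together under the 45-character limit, and the third sentence becomes its own piece.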


## Gradio User Interface
Build a simple web interface that takes a topic and returns the ELI5 explanation, the teaching script, and the generated audio.

In [62]:
async def full_pipeline(topic: str):
    eli5 = await run_pedagogy_async(topic)
    script = run_scriptwriter(eli5)
    audio_path = audio_writer(script, "story.mp3")
    return eli5, script, audio_path
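
`full_pipeline` is a coroutine, but the script and audio steps it calls are ordinary blocking functions, so they hold up the event loop while they run. One possible refinement, sketched here with hypothetical stand-in functions (`explain` and `slow_script_step` are not part of the notebook), is to hand blocking work to a worker thread with `asyncio.to_thread` (Python 3.9+).

```python
import asyncio
import time


def slow_script_step(explanation: str) -> str:
    # Hypothetical stand-in for a blocking call such as an LLM or TTS request.
    time.sleep(0.1)
    return f"script for: {explanation}"


async def explain(topic: str) -> str:
    # Hypothetical stand-in for the async agent call.
    await asyncio.sleep(0)
    return f"simple explanation of {topic}"


async def pipeline_sketch(topic: str) -> tuple:
    explanation = await explain(topic)
    # Run the blocking step in a worker thread so the event loop stays free.
    script = await asyncio.to_thread(slow_script_step, explanation)
    return explanation, script


# In a plain script; inside a notebook with a running loop, use
# `await pipeline_sketch("rainbows")` instead of asyncio.run.
result = asyncio.run(pipeline_sketch("rainbows"))
print(result)
```

Whether this is worth doing depends on how many users share the app at once; for a single-user demo the simpler blocking version may be enough.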

In [67]:
app = gr.Interface(
    fn=full_pipeline,
    inputs=gr.Textbox(label="Your Topic"),
    outputs=[
        gr.Textbox(label="ELI5 Explanation", lines=8),
        gr.Textbox(label="Generated Story Script", lines=20),
        gr.Audio(label="Generated Audio")
    ],
    title="🎧 TinyTutor - Full Pipeline",
    css=".gradio-container { min-height: 1200px !important; }"
)

app.launch(debug=True)




 ### Continue session: debug_session_id

User > why the sky is blue?
PedagogyAgent > The sky is blue because of how sunlight interacts with the air around our planet!

Sunlight looks white, but it's actually made up of all the colors of the rainbow. When sunlight enters Earth's atmosphere, it bumps into tiny molecules in the air, like nitrogen and oxygen.

These molecules scatter the sunlight in all directions. Blue light has shorter, smaller waves, so it gets scattered more than other colors like red and yellow, which have longer waves.

Think of it like this: imagine tiny little balls in the air. When light hits them, the blue light bounces off in every direction, filling up the sky. The other colors don't bounce around as much and mostly travel straight through.

So, when you look up, you see all that scattered blue light, making the sky appear blue!


