# Audio Generation Experiments

Prompt for TinyTutor Notebook 2: 02_audio_generation_experiments.ipynb
# TinyTutor Capstone Notebook 2: Audio Generation Experiments
## Objective
Generate runnable Python code that uses the Agent Development Kit (ADK) to define a custom Text-to-Speech (TTS) function and wrap it as an ADK `FunctionTool`. The tool simulates the core multimodal capability required by the TinyTutor project: converting the textual story script generated in Notebook 1 (`final_script`) into an audio artifact.

## Implementation Requirements (Adhering to Tool Best Practices)

1.  **Setup:** Include standard imports (ADK Agents, Runners, Gemini Model), API key setup, and `retry_config`.
2.  **Tool Definition (`generate_tts_audio`):** Define a standard Python function `generate_tts_audio(script: str)` that accepts the final story script text.
3.  **Documentation (MCP Best Practice):** The function **must** include a clear, detailed docstring that serves as the instruction manual for the LLM. The docstring should explicitly describe the tool's purpose (**converting the provided narrative script into a lesson audio file**) and clearly state its required input parameter (`script`).
4.  **Simulated Output (Concise Output Rule):** The function should simulate a successful external API call by returning a **concise, structured output** (a JSON-serializable dictionary) that references the location of the generated audio file (e.g., `{'status': 'success', 'audio_uri': 'data/samples/lesson_audio.wav'}`). It should *not* return the raw audio file contents.
5.  **Tool Registration:** Wrap the Python function using `FunctionTool` so the agent can discover and use it.
6.  **Test Agent (`AudioTestAgent`):** Create a simple `LlmAgent` named `AudioTestAgent` and equip it solely with the wrapped `FunctionTool`.
7.  **Execution:** Run the `AudioTestAgent` with a prompt that requires it to call the tool using a sample script string (simulating receipt of the `final_script` from the previous notebook). The output must confirm successful function calling and structured result parsing by the LLM.
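
Requirements 2 to 4 can be sketched in plain Python before any ADK wiring. This is a minimal illustration, not a definitive implementation: the error branch for an empty script and the exact docstring wording are assumptions added here, and no audio file is actually written.

```python
def generate_tts_audio(script: str) -> dict:
    """Convert the provided narrative script into a lesson audio file.

    Use this tool once the final story script is ready and needs to be
    turned into spoken audio for the lesson.

    Args:
        script: The full text of the story script to narrate. Required;
            must be a non-empty string.

    Returns:
        A dictionary describing the outcome.
        On success: {"status": "success", "audio_uri": "<path to audio file>"}
        On failure: {"status": "error", "error_message": "<what went wrong>"}
    """
    # Assumption: an empty script is reported as a structured error
    # rather than raised, so the LLM can read and react to it.
    if not script or not script.strip():
        return {"status": "error", "error_message": "The script is empty."}

    # Simulation only: a real implementation would call a TTS service here
    # and write the audio to this location. No file is created.
    return {"status": "success", "audio_uri": "data/samples/lesson_audio.wav"}


result = generate_tts_audio("Once upon a time, a tiny robot learned to count.")
print(result)
```

The function returns only a reference to the audio file, so the model's context receives a few dozen characters instead of the audio payload.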

## Generation Prompt
"Generate the runnable Python code for the '02_audio_generation_experiments.ipynb' notebook for the **TinyTutor** project. The solution must define the `generate_tts_audio` function with an ADK-compliant docstring and wrap it as a `FunctionTool`. Define and execute an `AudioTestAgent` that successfully calls this tool, demonstrating how the agent uses its 'hands' to convert text input into a structured reference to a multimodal artifact (audio URI)."
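
As a rough picture of what this prompt asks for, the sketch below wraps a simulated `generate_tts_audio` in a `FunctionTool` and attaches it to an `LlmAgent` named `AudioTestAgent`. Several details are assumptions rather than requirements from this document: the helper `build_audio_test_agent`, the `ADK_AVAILABLE` flag, the instruction text, and the model name `gemini-2.5-flash-lite`. The import guard only exists so the snippet loads on machines without the `google-adk` package.

```python
try:
    from google.adk.agents import LlmAgent
    from google.adk.tools import FunctionTool
    ADK_AVAILABLE = True
except ImportError:  # lets the sketch load even without google-adk installed
    ADK_AVAILABLE = False


def generate_tts_audio(script: str) -> dict:
    """Convert the provided narrative script into a lesson audio file.

    Args:
        script: The final story script text to narrate.

    Returns:
        A dict with "status" and, on success, "audio_uri".
    """
    # Simulated result; no audio is synthesized.
    return {"status": "success", "audio_uri": "data/samples/lesson_audio.wav"}


def build_audio_test_agent(model: str = "gemini-2.5-flash-lite"):
    """Hypothetical helper: build an agent whose only tool is the TTS function."""
    if not ADK_AVAILABLE:
        raise RuntimeError("google-adk is not installed; run `pip install google-adk`.")

    # Wrapping the function exposes its signature and docstring to the agent.
    tts_tool = FunctionTool(func=generate_tts_audio)

    return LlmAgent(
        name="AudioTestAgent",
        model=model,  # assumed model name; substitute the one the project uses
        instruction=(
            "When given a story script, call generate_tts_audio with the "
            "script text and report the returned status and audio_uri."
        ),
        tools=[tts_tool],
    )
```

Actually running the agent additionally needs an ADK runner and a configured API key, which the notebook's setup step is expected to provide.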

### 2. `02_audio_generation_experiments.ipynb` - Final Checklist

| Category | Requirement |
| :--- | :--- |
| **Core Concept** | Implementing **Action Execution** capabilities ("Hands"). |
| **Goal** | Define and wrap the custom **Text-to-Speech (TTS)** capability for audio generation. |
| **Dependencies** | Requires the successful output of Notebook 1 (`final_script`). |
| **Required Tools** | **Custom Tool:** `generate_tts_audio` (Function Tool). **Simulated or Real:** Google Cloud TTS. |
| **Architecture** | Define a Python function and wrap it using ADK's `FunctionTool`. |
| **Good Practices** | **Documentation is Paramount:** Use detailed Python docstrings to define the tool's contract (inputs/outputs) for the LLM. |
| **Good Practices** | **Concise Output:** The tool must return a structured dictionary/JSON referencing the file path (URI) and **not** dump the raw audio data into the context window. |
| **Good Practices** | **Granularity:** The tool should encapsulate a single, high-level task (converting text to audio) rather than exposing raw API calls. |
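
The Concise Output practice can be checked mechanically while experimenting. The sketch below is one possible guard, not part of ADK: the helper `is_concise_tool_output` and the 512-byte budget are hypothetical choices made for illustration.

```python
import json

MAX_TOOL_OUTPUT_BYTES = 512  # hypothetical budget; adjust to the project's needs


def is_concise_tool_output(result: object, max_bytes: int = MAX_TOOL_OUTPUT_BYTES) -> bool:
    """Return True if result is a small, JSON-serializable dict with a status field."""
    if not isinstance(result, dict) or "status" not in result:
        return False
    try:
        encoded = json.dumps(result).encode("utf-8")
    except (TypeError, ValueError):
        # Raw bytes (for example audio data) are not JSON-serializable.
        return False
    return len(encoded) <= max_bytes


# A reference to the artifact passes the check.
good = {"status": "success", "audio_uri": "data/samples/lesson_audio.wav"}
# Embedding the audio payload itself does not.
bad = {"status": "success", "audio": b"RIFF" + b"\x00" * 10_000}

print(is_concise_tool_output(good))
print(is_concise_tool_output(bad))
```

A check like this fits naturally in a notebook cell right after the tool definition, so an accidental switch to returning audio data is caught before the agent ever sees it.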