# Audio Generation Experiments

---

## Final System Prompt for `02_audio_generation_experiments.ipynb`

### Notebook Title:
**TinyTutor Capstone Notebook 02: Audio Generation Experiments**

### Objective:
Define and test the **Text-to-Speech (TTS)** and **Speech-to-Text (STT)** tools as ADK `FunctionTool` wrappers with production-ready interfaces. The tools simulate TinyTutor’s multimodal capability: converting a child-friendly story script into audio and, optionally, transcribing user speech into text. The notebook must demonstrate tool reliability, clarity, and adherence to ADK best practices.

---

### System Prompt:
> Generate runnable Python code for `02_audio_generation_experiments.ipynb` that defines and tests TinyTutor’s audio tools. Implement the following:
>
> 1. **TTS Tool (`generate_child_voiceover`)**:
>     - Define a Python function `generate_child_voiceover(script: str, voice_profile: str) -> dict`
>     - Include a detailed docstring describing its purpose: converting a child-friendly story script into a lesson audio file using a specified voice profile.
>     - Simulate a successful output by returning a structured dictionary (e.g., `{'status': 'success', 'audio_uri': 'data/audio/lesson.wav'}`).
>     - Wrap the function as an ADK `FunctionTool`.
>
> 2. **STT Tool (`transcribe_user_audio`)**:
>     - Define a Python function `transcribe_user_audio(file_path: str) -> str`
>     - Include a docstring describing its purpose: transcribing user speech into text.
>     - Simulate output by returning a mock transcription string.
>     - Wrap the function as an ADK `FunctionTool`.
>
> 3. **Test Agent (`AudioTestAgent`)**:
>     - Create an `LlmAgent` equipped with both tools.
>     - Run a test prompt that requires the agent to:
>         - Transcribe a sample audio file
>         - Convert a sample story script into audio
>     - Confirm that the agent successfully calls both tools and parses their outputs.
>
> 4. **Best Practices**:
>     - Use type hints and structured return formats
>     - Avoid returning raw audio data
>     - Log tool usage and parameters
>     - Include inline comments and Markdown to explain tool design and Capstone relevance
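
The two tool definitions requested above might look like the following minimal sketch. It assumes the `google-adk` package exposes `FunctionTool` under `google.adk.tools` and guards the import so the functions can still be exercised without it. The logger name, the empty-script error branch, and the mock transcription text are illustrative assumptions, not part of the specification.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("tinytutor.audio")  # hypothetical logger name


def generate_child_voiceover(script: str, voice_profile: str) -> dict:
    """Convert a child-friendly story script into a lesson audio file.

    Args:
        script: The story text to narrate.
        voice_profile: Name of the voice to use for narration.

    Returns:
        A dict with 'status' plus 'audio_uri' on success,
        or 'status' plus 'error_message' on failure.
    """
    logger.info(
        "generate_child_voiceover: script_chars=%d voice_profile=%r",
        len(script),
        voice_profile,
    )
    if not script.strip():
        # Assumption: an empty script is reported as a structured error.
        return {"status": "error", "error_message": "script is empty"}
    # Simulated result: no audio is synthesized; only a reference is returned.
    return {"status": "success", "audio_uri": "data/audio/lesson.wav"}


def transcribe_user_audio(file_path: str) -> str:
    """Transcribe user speech from an audio file into text (simulated).

    Args:
        file_path: Path to the recorded user audio.

    Returns:
        The transcribed text.
    """
    logger.info("transcribe_user_audio: file_path=%r", file_path)
    # Mock transcription (placeholder text); the file is not actually read.
    return "Can you tell me a story about the moon?"


try:
    # Assumes google-adk is installed; otherwise the wrappers stay unset.
    from google.adk.tools import FunctionTool

    tts_tool = FunctionTool(func=generate_child_voiceover)
    stt_tool = FunctionTool(func=transcribe_user_audio)
except ImportError:
    tts_tool = stt_tool = None
```

Returning a URI rather than audio bytes keeps the tool result small enough to pass back through the agent's context.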

---

## Final Checklist for `02_audio_generation_experiments.ipynb`

| **Category**         | **Requirement**                                                                                                                                       | **Source/Justification**                                                                 |
|----------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------|
| **Core Concept**      | Implementing **Action Execution** capabilities (“Hands”)                                                                                             | Capstone Level 2 architecture                                                             |
| **Goal**              | Define and wrap custom **TTS** and **STT** tools for multimodal interaction                                                                          | Enables audio input/output for TinyTutor                                                  |
| **Dependencies**      | Requires `final_script` from Notebook 01 and simulated audio input                                                                                   | Demonstrates tool chaining and multimodal readiness                                       |
| **Required Tools**    | - `generate_child_voiceover(script, voice_profile)` <br> - `transcribe_user_audio(file_path)` <br> - ADK `FunctionTool` wrappers                     | Ensures modularity and agent discoverability                                              |
| **Tool Design**       | - Clear docstrings <br> - Type hints <br> - Structured outputs (URI or transcription string)                                                         | Aligns with ADK and MCP best practices                                                    |
| **Output Format**     | - TTS returns: `{'status': 'success', 'audio_uri': '...'}` <br> - STT returns: `'transcribed text'`                                                  | Prevents context bloat and ensures clarity                                                |
| **Agent Execution**   | - `AudioTestAgent` must call both tools <br> - Confirm tool invocation and output parsing                                                            | Validates tool integration and agent orchestration                                        |
| **Architecture**      | - FunctionTool wrappers <br> - LlmAgent with tool access                                                                                             | Mirrors production-ready ADK design                                                       |
| **Good Practices**    | - Avoid raw audio data <br> - Use concise artifact references <br> - Log tool usage                                                                 | Ensures scalability and traceability                                                      |
| **Documentation**     | - Inline comments <br> - Markdown explanations                                                                                                       | Supports Capstone reviewers and future collaborators                                      |
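
The "Good Practices" and "Output Format" rows can be illustrated with a small, stdlib-only sketch: a decorator that logs each tool call and a check that a TTS result carries a URI string rather than raw audio bytes. The names `log_tool_call`, `is_valid_tts_result`, and `fake_tts` are hypothetical helpers introduced for illustration.

```python
import functools
import logging
from typing import Any, Callable

logger = logging.getLogger("tinytutor.tools")  # hypothetical logger name


def log_tool_call(func: Callable[..., Any]) -> Callable[..., Any]:
    """Log a tool's name, its arguments, and its result on every call."""

    @functools.wraps(func)  # keeps the original name and docstring
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        logger.info("tool=%s args=%r kwargs=%r", func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        logger.info("tool=%s result=%r", func.__name__, result)
        return result

    return wrapper


def is_valid_tts_result(result: Any) -> bool:
    """Check the TTS contract: a success dict with a string URI, no raw bytes."""
    if not isinstance(result, dict):
        return False
    if any(isinstance(value, (bytes, bytearray)) for value in result.values()):
        return False  # raw audio data must not be returned
    return result.get("status") == "success" and isinstance(
        result.get("audio_uri"), str
    )


@log_tool_call
def fake_tts(script: str, voice_profile: str) -> dict:
    """Stand-in TTS tool used only to exercise the helpers above."""
    return {"status": "success", "audio_uri": "data/audio/lesson.wav"}
```

Because `functools.wraps` preserves the function name and docstring, a decorated function should remain describable to a wrapper that reads those attributes, though this is worth confirming against the ADK version in use.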

---

### What We’ll Have When This Code Is Done

- Two production-ready audio tools: one for TTS, one for STT
- ADK-compliant FunctionTool wrappers with clear docstrings and structured outputs
- A test agent that demonstrates tool invocation and output parsing
- Simulated audio and transcription artifacts for downstream use
- Clear documentation and inline logic to support Capstone delivery and debugging
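
One way the test agent could be assembled is sketched below, assuming `google-adk` provides `LlmAgent` in `google.adk.agents` and `FunctionTool` in `google.adk.tools`. The model name, instruction text, sample file path, and voice profile are placeholders. Actually running the agent additionally requires an ADK runner and model credentials, which are out of scope here, so the sketch also includes an offline check that calls both tools directly.

```python
def transcribe_user_audio(file_path: str) -> str:
    """Transcribe user speech into text (simulated)."""
    return "mock transcription"


def generate_child_voiceover(script: str, voice_profile: str) -> dict:
    """Convert a child-friendly story script into lesson audio (simulated)."""
    return {"status": "success", "audio_uri": "data/audio/lesson.wav"}


def build_audio_test_agent(model: str = "gemini-2.0-flash"):
    """Build AudioTestAgent with both tools attached.

    Imports are local so this module still loads without google-adk.
    The default model name is an assumption; substitute the one in use.
    """
    from google.adk.agents import LlmAgent
    from google.adk.tools import FunctionTool

    return LlmAgent(
        name="AudioTestAgent",
        model=model,
        instruction=(
            "First transcribe the sample audio file, then convert the "
            "provided story script into audio. Report both tool results."
        ),
        tools=[
            FunctionTool(func=transcribe_user_audio),
            FunctionTool(func=generate_child_voiceover),
        ],
    )


def offline_smoke_test() -> dict:
    """Call both tools directly and parse their outputs, without a model."""
    # Hypothetical sample inputs; neither file nor profile needs to exist.
    transcript = transcribe_user_audio("data/audio/sample_question.wav")
    audio = generate_child_voiceover("Once upon a time...", "gentle_narrator")
    return {
        "transcript": transcript,
        "audio_uri": audio.get("audio_uri"),
        "ok": audio.get("status") == "success" and bool(transcript),
    }
```

The offline check only confirms that the tools honor their return contracts; confirming that the agent itself selects and calls both tools still needs a live run.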

---


In [None]:
# code here