## Example Notebook: Using VideoAgent

### 🛠️ Setup Instructions

Before running this notebook:

- Ensure you have created a `.env` file in the **same directory** as this notebook. It must contain all required environment variables (e.g., API keys or configuration values expected by `VideoAgent`).
- Make sure all required libraries are installed by running:

  ```bash
  pip install -r requirements.txt
  ```
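
Before running the cells below, it can help to confirm that the `.env` file actually defines the variables you expect. The following is a minimal sketch using only the standard library; it assumes a simple `KEY=VALUE` file format, and the helper names `read_env_keys` and `missing_env_keys` as well as the variable name `EXAMPLE_API_KEY` are hypothetical, not part of `VideoAgent` or MMCT.

```python
from pathlib import Path


def read_env_keys(path):
    """Return the set of variable names defined in a simple KEY=VALUE .env file."""
    keys = set()
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Tolerate an optional leading "export ".
        if line.startswith("export "):
            line = line[len("export "):]
        name, sep, _value = line.partition("=")
        # Lines without "=" are ignored rather than treated as errors.
        if sep and name.strip():
            keys.add(name.strip())
    return keys


def missing_env_keys(path, required):
    """Return the required names that the .env file does not define, sorted."""
    return sorted(set(required) - read_env_keys(path))
```

For example, `missing_env_keys(".env", ["EXAMPLE_API_KEY"])` returns an empty list when the placeholder name is defined. Replace the placeholder with the variable names your MMCT configuration actually requires.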

### About

**VideoAgent** uses the **Multi-Modal Critical Thinking (MMCT)** framework ([arxiv.org/abs/2405.18358](https://arxiv.org/abs/2405.18358)) for video question answering. MMCT involves two agents:

- **Planner**: Drives the reasoning process using a structured toolchain, generating an initial response.
- **Critic (optional)**: Analyzes the planner's output and, if needed, provides feedback that prompts an improved final answer.

> **Note:** The critic agent is enabled by default. You can disable it by setting `use_critic_agent=False` during initialization.  
> **Disabling the critic agent skips the feedback loop and may reduce the accuracy of the final response.**
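
The planner/critic interaction described above can be illustrated with a toy control-flow sketch. This is not the MMCT implementation: the function `answer_with_optional_critic` and the two stand-in agents are hypothetical, and a single revision round is an assumption made to keep the example short.

```python
def answer_with_optional_critic(planner, critic, query, use_critic_agent=True):
    """Run a planner, then optionally let a critic request one revision."""
    draft = planner(query, feedback=None)
    if not use_critic_agent:
        # No feedback loop: the planner's first answer is final.
        return draft
    feedback = critic(query, draft)
    if feedback is None:
        # The critic accepted the draft as is.
        return draft
    # The critic objected: ask the planner again with the feedback attached.
    return planner(query, feedback=feedback)


# Toy stand-ins for the two agents (hypothetical; real agents would call an LLM).
def toy_planner(query, feedback=None):
    if feedback is None:
        return "a dog"
    return "a brown dog"


def toy_critic(query, draft):
    # Return None to accept the draft, or a feedback string to request a revision.
    return None if "brown" in draft else "mention the colour"


question = "What is in the video?"
print(answer_with_optional_critic(toy_planner, toy_critic, question))
print(answer_with_optional_critic(toy_planner, toy_critic, question, use_critic_agent=False))
```

With the critic enabled, the toy planner is asked to revise its first draft; with `use_critic_agent=False`, the first draft is returned unchanged.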

---

### Tool Workflow

**VideoAgent uses a fixed pipeline** of tools that work collaboratively during the QA stage. These tools are automatically orchestrated by the planner:

- `GET_CONTEXT` – Extracts relevant transcript and visual-summary chunks.
- `GET_RELEVANT_FRAMES` – Returns keyframes that are semantically similar to the query, based on CLIP embeddings.
- `QUERY_FRAME` – Queries specific video keyframes to extract detailed information, giving the planner additional visual context.

> The Critic agent helps validate and refine answers and make them more faithful, improving reasoning depth.

All tools work together in a coordinated pipeline to provide comprehensive video analysis and question answering.
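
To give a feel for embedding-based keyframe retrieval of the kind `GET_RELEVANT_FRAMES` performs, here is a minimal sketch that ranks frames by cosine similarity to a query vector. It assumes cosine similarity as the scoring function and uses tiny hand-written vectors in place of real CLIP embeddings; the function names and frame ids are hypothetical and do not reflect how the tool is implemented in MMCT.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_frames(query_embedding, frame_embeddings, k=2):
    """Return the ids of the k frames most similar to the query, best first."""
    scored = [
        (cosine_similarity(query_embedding, embedding), frame_id)
        for frame_id, embedding in frame_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [frame_id for _score, frame_id in scored[:k]]


# Toy 3-dimensional vectors standing in for real image/text embeddings.
frames = {
    "frame_001": [1.0, 0.0, 0.0],
    "frame_002": [0.9, 0.1, 0.0],
    "frame_003": [0.0, 1.0, 0.0],
}
print(top_k_frames([1.0, 0.0, 0.0], frames, k=2))  # ['frame_001', 'frame_002']
```

In a real pipeline the query text and the keyframes would be embedded by the same model so that their vectors are comparable.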

### Importing Libraries


In [None]:
from mmct.video_pipeline import VideoAgent
import nest_asyncio
nest_asyncio.apply()

In [None]:
# Test the configuration first
try:
    from mmct.config.settings import MMCTConfig
    config = MMCTConfig()
    print(f"LLM Provider: {config.llm.provider}")
    print(f"LLM Endpoint: {config.llm.endpoint}")
    print(f"LLM Deployment: {config.llm.deployment_name}")
    print(f"Embedding Provider: {config.embedding.provider}")
    print(f"Embedding Endpoint: {config.embedding.endpoint}")
    print(f"Embedding Deployment: {config.embedding.deployment_name}")
    print("✅ Configuration loaded successfully")
except Exception as e:
    print(f"❌ Configuration failed: {e}")
    import traceback
    traceback.print_exc()

# Create VideoAgent instance
video_agent = VideoAgent(
    query="user-query",  # Replace with your question
    index_name="relevant-index-name",  # Replace with your index name
    video_id=None,  # Optional: specify video ID
    url=None,  # Optional: URL used to filter the documents
    use_critic_agent=True,  # Enable critic agent
    stream=True,  # Stream response
    use_graph_rag=False,  # Optional: use graph RAG
    cache=False  # Optional: enable caching
)

# Run the agent
response = await video_agent()
print("VideoAgent executed successfully!")

In [None]:
# Display the response
print(response)