# pdf_agent: analyze_all demo

This notebook demonstrates the `analyze_all` mode, which runs a batched, comprehensive analysis across the indexed corpus. The notebook then writes the long-form result to `outputs/analyze_all_result.txt`.

Notes:
- This mode can produce long outputs and take significant time on larger corpora.
- Ensure the index is populated before running.

In [None]:
# Setup: imports and agent initialization
from pathlib import Path
import json
import os

# Import the agent (uses repo code)
from agents.pdf_agent import PDFAgent

# Ensure outputs directory exists
Path('outputs').mkdir(parents=True, exist_ok=True)

# Instantiate the agent (may print logs)
agent = PDFAgent()
print('PDFAgent initialized. Use `agent` to run ingestion, retrieval and analyze_all.')

## Comprehensive analysis (`analyze_all`)

The `analyze_all` mode analyzes the entire indexed corpus in batches, so its output is long and its runtime grows with corpus size. Keep the query short. The cell below saves the final text to `outputs/analyze_all_result.txt`.

In [None]:
# Analyze-all demo (may take significant time on large corpora)
query = 'Summarize recent trends in agentic AI'
print('Starting analyze_all; this may take a while depending on corpus size...')
result = agent.search(query, mode='analyze_all')
answer = result.get('answer', '(no answer)')
out_file = Path('outputs/analyze_all_result.txt')
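# Write with an explicit encoding so non-ASCII text does not depend on the platform default
out_file.write_text(answer, encoding='utf-8')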
print(f'Analyze_all output saved to: {out_file.resolve()}')
print(f'Documents analyzed: {len(result.get("sources", []))}')
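
The run-and-save step above can be wrapped in a small helper that also times the call. This is a minimal sketch and not part of the repo: `run_and_save` is a hypothetical name, and it assumes only what the cell above shows, namely that the search callable accepts `(query, mode=...)` and returns a dict with optional `answer` and `sources` keys.

```python
import time
from pathlib import Path


def run_and_save(search_fn, query, out_path, mode="analyze_all"):
    """Run a search callable, save its answer as UTF-8 text, and return basic stats.

    Hypothetical helper: `search_fn` is any callable shaped like
    `agent.search(query, mode=...)` in the cell above.
    """
    start = time.perf_counter()
    result = search_fn(query, mode=mode)
    elapsed = time.perf_counter() - start

    # Fall back to placeholders when keys are missing, as the notebook cell does.
    answer = result.get("answer", "(no answer)")
    sources = result.get("sources", [])

    out_file = Path(out_path)
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text(answer, encoding="utf-8")

    return {
        "path": out_file,
        "chars": len(answer),
        "num_sources": len(sources),
        "seconds": elapsed,
    }
```

In this notebook it could be called as `stats = run_and_save(agent.search, query, 'outputs/analyze_all_result.txt')`, after which `stats['seconds']` gives a rough idea of how runtime scales with the corpus.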

## Next steps

- If you plan to use cloud providers (Azure/Poe), verify `system_config.json` and your environment variables before ingesting large corpora.
- For privacy-sensitive data, consider running Ollama locally and selecting it as the embedding/LLM provider in settings, so that document contents are not sent to an external service.
- Use the `examples/` CLI scripts if you prefer non-interactive runs.
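
The configuration check in the first item can be partly automated before a long ingestion run. The sketch below is an assumption-laden illustration, not repo code: `preflight` is a hypothetical helper, it assumes `system_config.json` sits in the working directory and is plain JSON, and the environment variable name is a placeholder because this notebook does not state which variables the providers read.

```python
import json
import os
from pathlib import Path


def preflight(config_path="system_config.json", required_env=()):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []

    path = Path(config_path)
    if not path.is_file():
        problems.append(f"config file not found: {path}")
    else:
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"config file is not valid JSON: {exc}")

    # Variable names are supplied by the caller; this sketch does not know
    # which ones the repo actually reads.
    for name in required_env:
        if not os.environ.get(name):
            problems.append(f"environment variable not set: {name}")

    return problems


# Placeholder variable name, for illustration only.
for problem in preflight(required_env=("EXAMPLE_PROVIDER_API_KEY",)):
    print("WARNING:", problem)
```

Returning a list rather than raising on the first failure lets one run report every missing piece at once.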