v2.8.0b2
Pre-release
This release focuses on refactoring and hardening the Ollama web search integration introduced in v2.8.0b1, with improvements to rate limiting, dependency management, and prompt quality.
New Features:
- In-Memory Caching for Ollama Search: Added a cache to `OllamaWebSearchDeps` that stores search results keyed by a normalized query string (punctuation removed, case-folded). This avoids redundant API calls for equivalent queries like `"Hello, world!"` and `"hello world"`, reducing latency and API costs.
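
The normalization described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `normalize_query`, `SearchCache`, and `fake_search` are hypothetical names, and the real cache in `OllamaWebSearchDeps` may normalize or store results differently.

```python
import string
from typing import Callable

# Translation table that deletes every ASCII punctuation character.
_PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)


def normalize_query(query: str) -> str:
    """Remove punctuation, case-fold, and collapse runs of whitespace."""
    stripped = query.translate(_PUNCTUATION_TABLE)
    return " ".join(stripped.casefold().split())


class SearchCache:
    """Hypothetical in-memory cache keyed by the normalized query."""

    def __init__(self, search: Callable[[str], list[str]]) -> None:
        self._search = search
        self._results: dict[str, list[str]] = {}
        self.api_calls = 0  # counts how often the backend was actually hit

    def lookup(self, query: str) -> list[str]:
        key = normalize_query(query)
        if key not in self._results:
            self.api_calls += 1
            self._results[key] = self._search(query)
        return self._results[key]


def fake_search(query: str) -> list[str]:
    # Stand-in for the real web search call (assumption for this sketch).
    return [f"result for {query}"]


cache = SearchCache(fake_search)
first = cache.lookup("Hello, world!")
second = cache.lookup("hello world")  # served from the cache
```

Both lookups map to the same key, so the backend is called only once in this sketch.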
Refactoring & Improvements:
- AgentDeps Architecture: Introduced a new `AgentDeps` data model to centralize dependencies passed to Pydantic AI agents. This replaces the loose `Any` type with a typed, extensible container holding `video_duration_ms` and optional `ollama_search` fields.
  - Updated `RateLimitedAgentWrapper` to use typed `AgentDeps` instead of `Any`.
  - Refactored `ollama_web_search` functions to extract dependencies from `ctx.deps`.
  - Updated `main.py` to create and configure `AgentDeps` with Ollama search dependencies.
- Rate Limiting Migration: Implemented hook-based rate limiting.
  - Moved rate limiting logic from the `run()` method to model request hooks via a `_rate_limit` hook function.
  - Added token calculation helper `_calculate_tokens()` for accurate rate limit tracking.
  - Improved error message formatting to show human-readable durations.
- Deferred Response Validation: Refactored response handling to store AI responses in a local variable, validate against video duration, and only assign to `job.response` if validation passes. This prevents storing potentially invalid responses on the job object.
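
To illustrate how a typed dependency container and deferred validation might fit together, here is a small sketch. It uses standard-library dataclasses rather than the project's actual models, and everything except the field names `video_duration_ms`, `ollama_search`, and `job.response` is an assumption: `LyricLine`, `Job`, `is_within_duration`, and `store_if_valid` are hypothetical, as is the exact validation rule.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class AgentDeps:
    """Typed dependency container; the real model may differ."""
    video_duration_ms: int
    ollama_search: Optional[Any] = None  # stand-in for OllamaWebSearchDeps


@dataclass
class LyricLine:
    """Hypothetical response item with a start timestamp."""
    start_ms: int
    text: str


@dataclass
class Job:
    """Hypothetical job object that holds the accepted response."""
    response: Optional[list[LyricLine]] = None


def is_within_duration(lines: list[LyricLine], deps: AgentDeps) -> bool:
    """Return True when every timestamp falls inside the video (assumed rule)."""
    return all(0 <= line.start_ms <= deps.video_duration_ms for line in lines)


def store_if_valid(job: Job, response: list[LyricLine], deps: AgentDeps) -> bool:
    """Keep the response local and assign it to the job only if it validates."""
    if not is_within_duration(response, deps):
        return False
    job.response = response
    return True


deps = AgentDeps(video_duration_ms=60_000)
job = Job()
rejected = store_if_valid(job, [LyricLine(90_000, "too late")], deps)
response_after_rejection = job.response  # still None: nothing was stored
accepted = store_if_valid(job, [LyricLine(1_000, "hello")], deps)
```

Because the assignment happens only after the check, a rejected response never reaches the job object in this sketch.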
Fixes & Improvements:
- Ollama Search Prompt: Updated `LYRICS_PROMPT_VERSION` to 6 with refined instructions to prevent excessive search queries. Added explicit rules against query spamming and restrictive quotation marks, plus guidance to use native titles for better search results.
- Developer Experience: Added the Ollama API key registration link (`https://ollama.com/settings/keys`) to validation error messages, making it easier for users to configure their API keys.
- Documentation: Fixed README to reference the correct split setting (`max-seconds` vs `max_minutes`).
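
As a rough illustration of the Developer Experience change above, a validator might embed the registration link in its error text like this. `validate_ollama_api_key` and its exact rule and wording are hypothetical; only the URL comes from the release notes.

```python
OLLAMA_KEYS_URL = "https://ollama.com/settings/keys"


def validate_ollama_api_key(api_key: str) -> str:
    """Hypothetical validator: reject an empty key and point to the keys page."""
    cleaned = api_key.strip()
    if not cleaned:
        raise ValueError(
            f"Ollama API key is missing. Create one at {OLLAMA_KEYS_URL}"
        )
    return cleaned


try:
    validate_ollama_api_key("   ")
except ValueError as exc:
    message = str(exc)  # the error text carries the registration link
```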
Full Changelog: v2.8.0b1...v2.8.0b2