Releases: lcy362/agnes-video-generator
v2.2 — Image-to-Image End Frames + Stability Enhancements
Release v2.2 — Image-to-Image End Frames + Stability Enhancements
Release date: 2026-06-19
Overview
v2.2 introduces the i2i (Image-to-Image) end frame pipeline, enabling visual consistency across creative video scenes. This release also delivers comprehensive stability fixes from the second code review batch, a global rate limiter, unified API retry logic, and i18n improvements.
i2i End Frame Pipeline
Six-batch feature implementation for visual consistency across scenes:
- Batch 1+2: Image model unified to `agnes-image-2.1-flash`, i2i array API, character reference image size normalization
- Batch 3: Character appearance persistence across scenes, programmatic prompt injection
- Batch 4: Prompt structure optimization, facial detail requirements in character reference prompts
- Batch 5: Multi-image guided i2i end frames, visual chain linking across scenes
- Batch 6: Keyframes fallback branch synchronization, full 6/6 batches complete
- Creative videos now default to i2i end frames enabled and narrator subtitles disabled
Stability & Bug Fixes
Code Review Batch 2 Fixes (P1-P13)
| ID | Fix |
|---|---|
| P1 | Video concatenation sync blocking → async |
| P2 | active_pipelines concurrent race condition |
| P3 | Custom end frame not applied |
| P4 | Manuscript step key alignment |
| P5 | chat_json robustness |
| P6 | Resource leaks |
| P7 | Parameter validation improvements |
| P8 | Prompt injection protection |
| P9 | SilentTTS return code handling |
| P10 | Subtitle silent degradation on failure |
| P11 | LLM retry logic |
| P12 | URL cache expiry |
| P13 | Temp filename uniqueness |
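As a rough illustration of the retry fixes (P11, and the 429/5xx backoff listed under Other Fixes), here is a minimal sketch of retry with exponential backoff. `HTTPStatusError`, `call_with_backoff`, and all default values are hypothetical names and numbers chosen for this example; they are not taken from the project's API modules.

```python
import random
import time


class HTTPStatusError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(status):
    # Retry only on rate limiting (429) and server-side failures (5xx).
    return status == 429 or 500 <= status < 600


def call_with_backoff(fn, max_attempts=4, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except HTTPStatusError as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(exc.status):
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so concurrent callers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.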
Other Fixes
- Resume crash: `_upload_image_to_host` method name error, `_run_pipeline` `task_id` undefined, `load()` creating empty directories
- Concatenator `AttributeError`: video concatenation failure path
- Global rate limiter: token bucket (16 req/min) shared across Chat + Image + Video APIs
- API retry: exponential backoff for 429/5xx errors across all three API modules
- Regression runner: 404 polling detection, `--quick` manifest mode, resume enhancements
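For readers unfamiliar with the token bucket technique, the following is a minimal sketch of a shared limiter at 16 requests per minute. The class name, method names, burst capacity, and polling interval are assumptions made for this example; the actual `core/api/rate_limiter.py` may be structured differently.

```python
import threading
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `rate_per_min` tokens per minute."""

    def __init__(self, rate_per_min=16, capacity=16, clock=time.monotonic):
        self.rate = rate_per_min / 60.0  # tokens per second
        self.capacity = float(capacity)
        self.tokens = float(capacity)
        self.clock = clock
        self.updated = clock()
        self.lock = threading.Lock()

    def try_acquire(self):
        """Take one token if available; return False instead of blocking."""
        with self.lock:
            now = self.clock()
            elapsed = now - self.updated
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self.updated = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True
            return False

    def acquire(self, sleep=time.sleep):
        """Block until a token is available."""
        while not self.try_acquire():
            sleep(1.0 / self.rate / 4)  # poll a few times per refill interval


# One shared instance, so Chat, Image, and Video calls draw from the same budget.
GLOBAL_LIMITER = TokenBucket()
```

Each API wrapper would call `GLOBAL_LIMITER.acquire()` before sending a request, which is what makes the limit global rather than per module.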
i18n Improvements
- Duration parsing now supports all 7 languages (zh/en/ru/ja/ko/ms/id)
- User requirements and visual style defaults localized per language
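To make the duration parsing item concrete, here is a small Python sketch of multilingual duration extraction. It illustrates the idea only: the project's parser lives in the frontend, and the unit vocabulary below is an assumption for this example, not the project's actual word list.

```python
import re

# Hypothetical unit stems for zh/en/ru/ja/ko/ms/id, mapped to seconds.
# Stems match as prefixes, so "minute" also covers "minutes".
UNIT_SECONDS = {
    "second": 1, "秒": 1, "секунд": 1, "초": 1, "saat": 1, "detik": 1,
    "minute": 60, "分": 60, "минут": 60, "분": 60, "minit": 60, "menit": 60,
}

_PATTERN = re.compile(
    r"(\d+(?:\.\d+)?)\s*(" + "|".join(map(re.escape, UNIT_SECONDS)) + ")",
    re.IGNORECASE,
)


def parse_duration_seconds(text):
    """Sum every '<number><unit>' pair found in text; return None if none is found."""
    matches = _PATTERN.findall(text)
    if not matches:
        return None
    return sum(float(amount) * UNIT_SECONDS[unit.lower()] for amount, unit in matches)
```

For example, `parse_duration_seconds("1分30秒")` adds one minute and thirty seconds.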
Documentation
- `docs/bug_fix_plan.md`: comprehensive bug fix plan (added)
- `docs/regression_test_plan.md`: updated scenarios and flow rules
- `AGENTS.md`: synced rate limiter architecture, runner resume strategy
- `docs/release-notes/`: v2.0 and v2.1 release notes (added)
- Fixed official website link label (not "Live Demo")
Stats
23 files changed, 1,189 insertions(+), 479 deletions(-)
Key Files
| File | Description |
|---|---|
| `core/api/rate_limiter.py` | New: global token bucket rate limiter |
| `core/api/agnes_chat.py` | LLM retry + JSON mode improvements |
| `core/api/agnes_image.py` | i2i array API + ref image support |
| `core/api/agnes_video.py` | Retry logic + 429 handling |
| `core/pipelines/creative_video.py` | i2i end frame pipeline integration |
| `core/screenwriter.py` | Character appearance persistence |
| `core/compositor/concatenator.py` | Async refactor + bug fixes |
| `core/task_manager.py` | Resume crash fixes + backward compat |
| `server.py` | Rate limiter integration + endpoint fixes |
| `static/index.html` | i18n duration parse + style defaults |
| `scripts/regression_runner.py` | Resume + quick-verify enhancements |
| `docs/bug_fix_plan.md` | New: bug fix tracking |
Upgrade Notes
From v2.1:
    git pull
    ./start.sh

2.1 version release
Release v2.1 — Code Review Fixes + Regression Test Framework + Quality Improvements
Release date: 2026-06-16
Overview
v2.1 focuses on code quality and engineering robustness. All 24 issues from the full code review have been fixed, and an automated regression test framework has been introduced to ensure long-term stability.
Code Review Fixes
Based on docs/code_review_report.md, all 24 issues resolved:
High Severity (H1-H6)
- H1: API Key hardcoded in `agnes_chat.py` → unified read from `config.py`
- H2: Path traversal in `server.py` file upload → safe path join with `os.path.basename`
- H3: Missing font fallback in `concatenator.py` subtitle overlay → `resolve_font_path` CJK fallback
- H4: Shell injection in `processor.py` → list arguments instead of `shell=True`
- H5: moviepy `write_videofile` log leakage in `subtitle.py` → redirect to `devnull`
- H6: JSON parse failure in `screenwriter.py` → LLM retry with fallback parsing
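The H2 and H4 fixes follow two well-known patterns, sketched below. The helper names `safe_upload_path` and `run_tool` are hypothetical and exist only for this example; the code in `server.py` and `processor.py` may look different.

```python
import os
import subprocess


def safe_upload_path(upload_dir, client_filename):
    """Drop directory components from a client-supplied name before joining (H2)."""
    # Normalise Windows separators so "..\\x" is handled the same way as "../x".
    name = os.path.basename(client_filename.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError("invalid upload filename")
    return os.path.join(upload_dir, name)


def run_tool(args):
    """Run a command from an argument list, so no shell parses the arguments (H4)."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout
```

With list arguments, a value such as `"a; echo injected"` reaches the child process as one literal argument instead of being interpreted by a shell.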
Medium Severity (M1-M10)
- Index / bounds safety (M1-M3)
- Overly broad exception handling → granular catch (M4-M5)
- Task directory path normalization (M6)
- Unified HTTP timeouts (M7)
- Task state race condition (M8)
- TTS file handle leak (M9)
- Frontend i18n variable shadowing (M10)
Low Severity (L1-L8)
- Automated unit test framework (L1)
- Typo fixes (L2-L3)
- Redundant documentation cleanup (L4-L5)
- AGENTS.md alignment with code (L6)
- Dead file cleanup (L7-L8)
Regression Test Framework
- Concurrent execution of 9 scenarios (3 simple + 4 creative + 2 manuscript)
- Weighted semaphore for parallelism control (total weight ≤ 10, 50% API headroom)
- Incremental JSON report + Markdown readable report
- Resume / quick-verify modes
- `--cleanup` safe artifact removal
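The weighted semaphore idea can be sketched as follows, assuming an asyncio-based runner. The `WeightedSemaphore` class and the per-scenario weights (simple 1, creative 3, manuscript 2) are assumptions for this example; only the total limit of 10 and the 3/4/2 scenario mix come from the notes above.

```python
import asyncio


class WeightedSemaphore:
    """Admits tasks while the sum of their weights stays within `capacity`."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._used = 0
        self._cond = asyncio.Condition()

    @property
    def used(self):
        return self._used

    async def acquire(self, weight):
        if weight > self._capacity:
            raise ValueError("weight exceeds capacity")
        async with self._cond:
            await self._cond.wait_for(lambda: self._used + weight <= self._capacity)
            self._used += weight

    async def release(self, weight):
        async with self._cond:
            self._used -= weight
            self._cond.notify_all()  # wake waiters so they can re-check capacity


async def run_scenario(sem, weight, log):
    await sem.acquire(weight)
    try:
        log.append(sem.used)       # total weight in flight while this one runs
        await asyncio.sleep(0.01)  # stand-in for the real scenario
    finally:
        await sem.release(weight)


async def main():
    sem = WeightedSemaphore(10)
    weights = [1] * 3 + [3] * 4 + [2] * 2  # hypothetical weights per scenario type
    log = []
    await asyncio.gather(*(run_scenario(sem, w, log) for w in weights))
    return log


if __name__ == "__main__":
    print(max(asyncio.run(main())))
```

Heavier scenarios consume more of the budget, so a few creative runs can occupy the slots that many simple runs would otherwise share.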
Endpoint Verification (E1-E9)
All 9 endpoints auto-verified: homepage, config, three task creation endpoints, task query, resume, stop
Artifact Verification (F1-F7, R1-R10)
- `final_video.mp4` existence + non-empty + duration + resolution
- Audio track + whisper ASR speech content matching
- SRT subtitle entry validation
- Resume checkpoint completeness
Other Improvements
- Subtitle multi-line wrapping: dynamic `max_chars_per_line`, CJK punctuation break priority, `method="caption"` rendering
- TTS: auto 2.5x volume boost, edge case error handling
- Concatenator: single-video shortcut optimization, subtitle overlay failure degradation (non-blocking)
- start.sh: auto venv creation, dependency install, macOS browser auto-open
- Requirements: pinned `edge_tts>=7.0.0`, `srt>=3.5.0`, `moviepy>=2.0.0`
- Config: API Key clear functionality, enhanced font path fallback
- Static analysis integration: each `Taskfile` includes `ruff` + `mypy` checks
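The subtitle wrapping item can be illustrated with a short sketch: derive a per-line character budget from the frame width, then prefer breaking after CJK punctuation. The function names, the 10% margin, and the punctuation set are assumptions for this example, not the project's actual values.

```python
CJK_BREAK_PUNCT = "，。！？；：、"


def max_chars_for(width_px, font_px):
    """Hypothetical dynamic limit: characters that fit in 90% of the frame width."""
    usable = width_px * 9 // 10
    return max(1, usable // font_px)


def wrap_cjk(text, max_chars_per_line):
    """Break after the last punctuation mark that fits, else hard-break at the limit."""
    lines = []
    while len(text) > max_chars_per_line:
        window = text[:max_chars_per_line]
        cut = max(window.rfind(p) for p in CJK_BREAK_PUNCT) + 1
        if cut <= 0:
            cut = max_chars_per_line  # no punctuation in range: hard break
        lines.append(text[:cut])
        text = text[cut:]
    if text:
        lines.append(text)
    return lines
```

Breaking after punctuation keeps each clause on one line where possible, which reads more naturally than a fixed-width split.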
Stats
26 files changed, 1,611 insertions(+), 235 deletions(-)
New / Deleted Files
| File | Action | Description |
|---|---|---|
| `docs/code_review_report.md` | +added | 24 code review issues documented |
| `docs/release-notes/release_notes_v2.0.md` | +added | v2.0 release notes |
| `docs/release-notes/release_notes_v2.1.md` | +added | v2.1 release notes |
| `tests/test_core.py` | +added | 428-line automated unit test suite |
| `test_ref.png` / `test_end.png` | +added | Regression test assets |
| `_test_reset.py` | -deleted | Deprecated test script |
| `start.sh` | refactored | One-click startup with auto venv + deps + browser |
Upgrade Notes
From v2.0:
    git pull
    .venv/bin/pip install -r requirements.txt
    ./start.sh

Run regression tests:

    .venv/bin/python scripts/regression_runner.py --auto-start

2.0 version release
Release v2.0 — Three-Pipeline Architecture + Multilingual Web UI
Release date: 2026-06-15
Overview
v2.0 is a complete architectural refactor from a single-file script to an engineered application with three distinct video generation pipelines, a four-layer backend, WebSocket real-time progress, and a 7-language internationalized frontend.
Features
Three Task Types
- Simple Video — Single prompt → single video, exposing all 9 Agnes API parameters (t2v/i2v/ti2vid/keyframes)
- Creative Video — AI screenwriter → storyboards → per-scene videos → edge_tts narration → fine-grained subtitles → concatenation
- Manuscript Video — Long text splitting → AI scene prompt → per-paragraph videos → unified TTS+subtitles → concatenation
Architecture
- `core/api/`: Agnes Chat / Image / Video API wrappers with retry and polling
- `core/audio/`: edge_tts engine (word-level timestamps) + SRT subtitle generation + moviepy overlay
- `core/compositor/`: Video concatenation, scaling, frame extraction, silent audio generation
- `core/pipelines/`: Three pipeline implementations (simple / creative / manuscript)
- `models/`: Pydantic v2 data models with persistent task state serialization
Web UI
- Three-tab frontend (Simple / Creative / Manuscript), Tailwind CDN single-page
- 7 languages: 中文 / English / Русский / 日本語 / 한국어 / Bahasa Melayu / Bahasa Indonesia
- WebSocket real-time progress push
- Task pause, resume, and stop
Subtitle System
- edge_tts word-level timestamps → fine-grained SRT grouping
- CJK multi-line wrapping (break at punctuation)
- `method="caption"` rendering, supports stroke / background / position customization
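As an illustration of turning word-level timestamps into fine-grained SRT cues, here is a minimal sketch. It assumes the words arrive as `(text, start_seconds, end_seconds)` tuples and groups them by a character budget; both the input shape and the grouping rule are assumptions for this example, not the project's actual logic.

```python
def fmt_ts(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt(words, max_chars=20):
    """Group (text, start, end) words into cues of at most max_chars characters."""
    cues, current = [], []
    for word in words:
        # Length of the current cue if joined with single spaces, plus one more space.
        length = sum(len(w[0]) for w in current) + len(current)
        if current and length + len(word[0]) > max_chars:
            cues.append(current)
            current = []
        current.append(word)
    if current:
        cues.append(current)

    blocks = []
    for index, cue in enumerate(cues, 1):
        text = " ".join(w[0] for w in cue)
        # A cue spans from its first word's start to its last word's end.
        blocks.append(f"{index}\n{fmt_ts(cue[0][1])} --> {fmt_ts(cue[-1][2])}\n{text}\n")
    return "\n".join(blocks)
```

Because each cue takes its start and end times from the words it contains, subtitles stay aligned with the narration even when cues are short.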
Other
- One-click startup script `start.sh`
- `docs/system_design.md` system design document
- 3 demo videos embedded in README
Stats
40 files changed, 11,268 insertions(+), 2,792 deletions(-)
New Files
| File | Description |
|---|---|
| `core/pipelines/` | Three pipeline types (simple / creative / manuscript) |
| `core/api/` | Agnes API wrapper layer |
| `core/audio/` | TTS + subtitle engine |
| `core/compositor/` | Video compositing / processing |
| `models/task.py` | Three task subtype data models |
| `scripts/regression_runner.py` | Regression test script |
| `docs/system_design.md` | System design document |
| `docs/regression_test_plan.md` | Test plan |
Upgrade Notes
- Python 3.10+ required
- New dependencies: `edge_tts>=6.1.0`, `srt>=3.5.0`
- Run `./start.sh` for one-click startup, or `.venv/bin/pip install -r requirements.txt && .venv/bin/python server.py`