19 Jun 03:59

lcy362

2be8781

v2.2 — Image-to-Image End Frames + Stability Enhancements Latest


Release v2.2 — Image-to-Image End Frames + Stability Enhancements

Release date: 2026-06-19

Overview

v2.2 introduces the i2i (Image-to-Image) end frame pipeline, enabling visual consistency across creative video scenes. This release also delivers comprehensive stability fixes from the second code review batch, a global rate limiter, unified API retry logic, and i18n improvements.

i2i End Frame Pipeline

Six-batch feature implementation for visual consistency across scenes:

Batch 1+2 — Image model unified to agnes-image-2.1-flash, i2i array API, character reference image size normalization
Batch 3 — Character appearance persistence across scenes, programmatic prompt injection
Batch 4 — Prompt structure optimization, facial detail requirements in character reference prompts
Batch 5 — Multi-image guided i2i end frames, visual chain linking across scenes
Batch 6 — Keyframes fallback branch synchronization; all 6 batches now complete
Creative videos now default to i2i end frames enabled and narrator subtitles disabled

Stability & Bug Fixes

Code Review Batch 2 Fixes (P1-P13)

| ID | Fix |
| --- | --- |
| P1 | Video concatenation sync blocking → async |
| P2 | `active_pipelines` concurrent race condition |
| P3 | Custom end frame not applied |
| P4 | Manuscript step key alignment |
| P5 | `chat_json` robustness |
| P6 | Resource leaks |
| P7 | Parameter validation improvements |
| P8 | Prompt injection protection |
| P9 | SilentTTS return code handling |
| P10 | Subtitle silent degradation on failure |
| P11 | LLM retry logic |
| P12 | URL cache expiry |
| P13 | Temp filename uniqueness |
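As a rough illustration of the kind of change P1 describes (moving a blocking concatenation call off the event loop), here is a minimal sketch. It is not the project's code: `concatenate_blocking` and `concatenate_async` are hypothetical names, and the sleep merely stands in for the real video work.

```python
import asyncio
import time


def concatenate_blocking(paths: list[str]) -> str:
    """Hypothetical stand-in for a slow, synchronous concatenation step."""
    time.sleep(0.2)  # simulates blocking work such as an external encoder call
    return "+".join(paths)


async def concatenate_async(paths: list[str]) -> str:
    # Run the blocking function in a worker thread so the event loop stays responsive.
    return await asyncio.to_thread(concatenate_blocking, paths)


async def main() -> tuple[str, int]:
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the loop gets to run while concatenation is in progress.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.02)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    result = await concatenate_async(["a.mp4", "b.mp4"])
    hb.cancel()
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If the heartbeat counter advances while the concatenation runs, other coroutines (for example progress pushes) are not being starved by the blocking call.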

Other Fixes

Resume crash — _upload_image_to_host method name error, _run_pipeline task_id undefined, load() creating empty directories
Concatenator AttributeError — video concatenation failure path
Global rate limiter — token bucket (16 req/min) shared across Chat + Image + Video APIs
API retry — exponential backoff for 429/5xx errors across all three API modules
Regression runner — 404 polling detection, --quick manifest mode, resume enhancements
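The global rate limiter listed above is described as a token bucket at 16 requests per minute. The following is only a sketch of that general technique, not the contents of `core/api/rate_limiter.py`: the class name, the `clock` parameter, and the choice of a burst capacity equal to the per-minute rate are all assumptions made for illustration.

```python
import threading
import time
from typing import Callable, Optional


class TokenBucket:
    """Token bucket: holds up to `capacity` tokens, refilled continuously."""

    def __init__(
        self,
        rate_per_min: float = 16,
        capacity: Optional[float] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_min / 60.0  # tokens added per second
        # Assumption: allow a burst of up to one minute's worth of requests.
        self.capacity = float(rate_per_min if capacity is None else capacity)
        self.tokens = self.capacity
        self.clock = clock
        self.updated = clock()
        self.lock = threading.Lock()

    def _refill(self) -> None:
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self) -> bool:
        """Take one token if available; return False instead of blocking."""
        with self.lock:
            self._refill()
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

    def wait_time(self) -> float:
        """Seconds until the next token becomes available (0.0 if one is ready)."""
        with self.lock:
            self._refill()
            return max(0.0, (1 - self.tokens) / self.rate)


# Hypothetical shared instance: one bucket used by all API wrappers.
GLOBAL_LIMITER = TokenBucket(rate_per_min=16)
```

Sharing a single instance is what makes the limit global: a request from any of the Chat, Image, or Video wrappers draws from the same budget.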

i18n Improvements

Duration parsing now supports all 7 languages (zh/en/ru/ja/ko/ms/id)
User requirements and visual style defaults localized per language

Documentation

docs/bug_fix_plan.md — comprehensive bug fix plan (added)
docs/regression_test_plan.md — updated scenarios and flow rules
AGENTS.md — synced rate limiter architecture, runner resume strategy
docs/release-notes/ — v2.0 and v2.1 release notes (added)
Corrected the official website link label, which should not read "Live Demo"

Stats

23 files changed, 1,189 insertions(+), 479 deletions(-)

Key Files

| File | Description |
| --- | --- |
| `core/api/rate_limiter.py` | New — global token bucket rate limiter |
| `core/api/agnes_chat.py` | LLM retry + JSON mode improvements |
| `core/api/agnes_image.py` | i2i array API + ref image support |
| `core/api/agnes_video.py` | Retry logic + 429 handling |
| `core/pipelines/creative_video.py` | i2i end frame pipeline integration |
| `core/screenwriter.py` | Character appearance persistence |
| `core/compositor/concatenator.py` | Async refactor + bug fixes |
| `core/task_manager.py` | Resume crash fixes + backward compat |
| `server.py` | Rate limiter integration + endpoint fixes |
| `static/index.html` | i18n duration parse + style defaults |
| `scripts/regression_runner.py` | Resume + quick-verify enhancements |
| `docs/bug_fix_plan.md` | New — bug fix tracking |
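For readers unfamiliar with the retry behavior mentioned for the API modules (exponential backoff on 429/5xx responses), a generic sketch follows. The function names, the default delays, and the 10% jitter are illustrative assumptions and are not taken from the Agnes wrappers.

```python
import random
import time
from typing import Callable, Tuple


def is_retryable(status: int) -> bool:
    # Retry only on rate limiting (429) and server-side errors (5xx).
    return status == 429 or 500 <= status <= 599


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out.

    send() is a hypothetical callable returning (status_code, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if not is_retryable(status) or attempt == max_attempts:
            return status, body
        # Delay doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** (attempt - 1))
        # Small random jitter so concurrent clients do not retry in lockstep.
        sleep(delay + random.uniform(0, 0.1 * delay))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function testable without real waiting; client errors such as 404 are returned immediately rather than retried.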

Upgrade Notes

From v2.1:

git pull
./start.sh


16 Jun 10:46

lcy362

v2.1

9a80c97

2.1 version release

Release v2.1 — Code Review Fixes + Regression Test Framework + Quality Improvements

Release date: 2026-06-16

Overview

v2.1 focuses on code quality and engineering robustness. All 24 issues from the full code review have been fixed, and an automated regression test framework has been introduced to ensure long-term stability.

Code Review Fixes

Based on docs/code_review_report.md, all 24 issues resolved:

High Severity (H1-H6)

H1 — API Key hardcoded in agnes_chat.py → unified read from config.py
H2 — Path traversal in server.py file upload → safe path join with os.path.basename
H3 — Missing font fallback in concatenator.py subtitle overlay → resolve_font_path CJK fallback
H4 — Shell injection in processor.py → list arguments instead of shell=True
H5 — moviepy write_videofile log leakage in subtitle.py → redirect to devnull
H6 — JSON parse failure in screenwriter.py → LLM retry with fallback parsing
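To make the H2 fix more concrete, here is one way a safe path join based on `os.path.basename` can be written. This is a sketch under assumptions: `safe_upload_path` is a hypothetical helper, and the extra containment check is an added precaution that the release notes do not mention.

```python
import os


def safe_upload_path(upload_dir: str, client_filename: str) -> str:
    """Return a path inside upload_dir for a client-supplied filename."""
    # Treat backslashes as separators too, then keep only the last component.
    name = os.path.basename(client_filename.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError("invalid filename")
    root = os.path.realpath(upload_dir)
    target = os.path.realpath(os.path.join(root, name))
    # Defense in depth: the resolved target must still live under the root.
    if os.path.commonpath([root, target]) != root:
        raise ValueError("path escapes upload directory")
    return target
```

With this helper, a filename such as `../../etc/passwd` is reduced to `passwd` inside the upload directory instead of escaping it.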

Medium Severity (M1-M10)

Index / bounds safety (M1-M3)
Overly broad exception handling → granular catch (M4-M5)
Task directory path normalization (M6)
Unified HTTP timeouts (M7)
Task state race condition (M8)
TTS file handle leak (M9)
Frontend i18n variable shadowing (M10)

Low Severity (L1-L8)

Automated unit test framework (L1)
Typo fixes (L2-L3)
Redundant documentation cleanup (L4-L5)
AGENTS.md alignment with code (L6)
Dead file cleanup (L7-L8)

Regression Test Framework

Concurrent execution of 9 scenarios (3 simple + 4 creative + 2 manuscript)
Weighted semaphore for parallelism control (total weight ≤ 10, 50% API headroom)
Incremental JSON report plus a human-readable Markdown report
Resume / quick-verify modes
--cleanup safe artifact removal
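The weighted semaphore idea above can be sketched with `asyncio.Condition`. Only the total weight limit of 10 comes from the notes; the class, the per-scenario weights (1, 3, 2), and the `stats` bookkeeping are hypothetical and exist only to make the example self-contained.

```python
import asyncio


class WeightedSemaphore:
    """Admits tasks while the sum of their weights stays within capacity."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._in_use = 0
        self._cond = asyncio.Condition()

    async def acquire(self, weight: int) -> None:
        if weight > self._capacity:
            raise ValueError("weight exceeds capacity")
        async with self._cond:
            # Wait until adding this weight would not exceed the capacity.
            await self._cond.wait_for(lambda: self._in_use + weight <= self._capacity)
            self._in_use += weight

    async def release(self, weight: int) -> None:
        async with self._cond:
            self._in_use -= weight
            self._cond.notify_all()  # wake waiters so they can re-check


async def run_scenario(sem: WeightedSemaphore, weight: int, stats: dict) -> None:
    await sem.acquire(weight)
    try:
        stats["current"] += weight
        stats["peak"] = max(stats["peak"], stats["current"])
        await asyncio.sleep(0.01)  # placeholder for the real scenario run
    finally:
        stats["current"] -= weight
        await sem.release(weight)


async def main() -> int:
    sem = WeightedSemaphore(10)  # created inside the running event loop
    stats = {"current": 0, "peak": 0}
    # Assumed weights: 3 simple (1 each), 4 creative (3 each), 2 manuscript (2 each).
    weights = [1] * 3 + [3] * 4 + [2] * 2
    await asyncio.gather(*(run_scenario(sem, w, stats) for w in weights))
    return stats["peak"]


if __name__ == "__main__":
    print("peak weight in flight:", asyncio.run(main()))
```

Compared with a plain counting semaphore, weights let a heavy scenario consume more of the shared budget than a light one, so the total load stays bounded whatever the mix.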

Endpoint Verification (E1-E9)

All 9 endpoints auto-verified: homepage, config, three task creation endpoints, task query, resume, stop

Artifact Verification (F1-F7, R1-R10)

final_video.mp4 existence + non-empty + duration + resolution
Audio track + whisper ASR speech content matching
SRT subtitle entry validation
Resume checkpoint completeness

Other Improvements

Subtitle multi-line wrapping — dynamic max_chars_per_line, CJK punctuation break priority, method="caption" rendering
TTS — auto 2.5x volume boost, edge case error handling
Concatenator — single-video shortcut optimization, subtitle overlay failure degradation (non-blocking)
start.sh — auto venv creation, dependency install, macOS browser auto-open
Requirements — minimum versions set to edge_tts>=7.0.0, srt>=3.5.0, moviepy>=2.0.0
Config — API Key clear functionality, enhanced font path fallback
Static analysis integration — each Taskfile includes ruff + mypy checks
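The subtitle wrapping item above mentions giving CJK punctuation priority as a break point. A small sketch of that idea follows; the punctuation set, the function name, and the fixed `max_chars` argument (the notes describe it as computed dynamically) are assumptions, not the project's implementation.

```python
# Assumed set of full-width punctuation marks that make good line breaks.
CJK_PUNCT = "，。！？；：、"


def wrap_cjk(text: str, max_chars: int) -> list[str]:
    """Split text into lines of at most max_chars, breaking after punctuation when possible."""
    if max_chars < 1:
        raise ValueError("max_chars must be positive")
    lines = []
    while len(text) > max_chars:
        window = text[:max_chars]
        # Position just after the last punctuation mark inside the window (0 if none).
        cut = max(window.rfind(p) for p in CJK_PUNCT) + 1
        if cut <= 0:
            cut = max_chars  # no punctuation found: hard break at the limit
        lines.append(text[:cut])
        text = text[cut:]
    if text:
        lines.append(text)
    return lines


if __name__ == "__main__":
    for line in wrap_cjk("今天天气很好，我们去公园散步吧。", 9):
        print(line)
```

In the example call, the first line ends at the comma rather than being filled to nine characters, which keeps each phrase together.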

Stats

26 files changed, 1,611 insertions(+), 235 deletions(-)

New / Deleted Files

| File | Action | Description |
| --- | --- | --- |
| `docs/code_review_report.md` | +added | 24 code review issues documented |
| `docs/release-notes/release_notes_v2.0.md` | +added | v2.0 release notes |
| `docs/release-notes/release_notes_v2.1.md` | +added | v2.1 release notes |
| `tests/test_core.py` | +added | 428-line automated unit test suite |
| `test_ref.png` / `test_end.png` | +added | Regression test assets |
| `_test_reset.py` | -deleted | Deprecated test script |
| `start.sh` | refactored | One-click startup with auto venv + deps + browser |

Upgrade Notes

From v2.0:

git pull
.venv/bin/pip install -r requirements.txt
./start.sh

Run regression tests:

.venv/bin/python scripts/regression_runner.py --auto-start


16 Jun 10:46

lcy362

v2.0

79e9974

2.0 version release

Release v2.0 — Three-Pipeline Architecture + Multilingual Web UI

Release date: 2026-06-15

Overview

v2.0 is a complete architectural refactor from a single-file script to an engineered application with three distinct video generation pipelines, a four-layer backend, WebSocket real-time progress, and a 7-language internationalized frontend.

Features

Three Task Types

Simple Video — Single prompt → single video, exposing all 9 Agnes API parameters (t2v/i2v/ti2vid/keyframes)
Creative Video — AI screenwriter → storyboards → per-scene videos → edge_tts narration → fine-grained subtitles → concatenation
Manuscript Video — Long text splitting → AI scene prompt → per-paragraph videos → unified TTS+subtitles → concatenation

Architecture

core/api/ — Agnes Chat / Image / Video API wrappers with retry and polling
core/audio/ — edge_tts engine (word-level timestamps) + SRT subtitle generation + moviepy overlay
core/compositor/ — Video concatenation, scaling, frame extraction, silent audio generation
core/pipelines/ — Three pipeline implementations (simple / creative / manuscript)
models/ — Pydantic v2 data models with persistent task state serialization

Web UI

Three-tab frontend (Simple / Creative / Manuscript), Tailwind CDN single-page
7 languages: 中文 / English / Русский / 日本語 / 한국어 / Bahasa Melayu / Bahasa Indonesia
WebSocket real-time progress push
Task pause, resume, and stop

Subtitle System

edge_tts word-level timestamps → fine-grained SRT grouping
CJK multi-line wrapping (break at punctuation)
method="caption" rendering, supports stroke / background / position customization
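To illustrate how word-level timestamps can be grouped into SRT entries, here is a standard-library sketch. It assumes timestamps are already in seconds and groups purely by character count; the `Word` type, the function names, and the grouping rule are hypothetical and may differ from what the subtitle engine does.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds (assumed unit)
    end: float    # seconds (assumed unit)


def fmt(t: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(t * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def group_words(words: list[Word], max_chars: int = 20, sep: str = " ") -> list[list[Word]]:
    """Start a new cue whenever adding the next word would exceed max_chars."""
    cues: list[list[Word]] = []
    current: list[Word] = []
    for w in words:
        candidate = sep.join(x.text for x in current + [w])
        if current and len(candidate) > max_chars:
            cues.append(current)
            current = []
        current.append(w)
    if current:
        cues.append(current)
    return cues


def to_srt(words: list[Word], max_chars: int = 20, sep: str = " ") -> str:
    blocks = []
    for i, cue in enumerate(group_words(words, max_chars, sep), start=1):
        text = sep.join(w.text for w in cue)
        # Each cue spans from its first word's start to its last word's end.
        blocks.append(f"{i}\n{fmt(cue[0].start)} --> {fmt(cue[-1].end)}\n{text}\n")
    return "\n".join(blocks)
```

For CJK text, passing `sep=""` joins the words without spaces, so the character budget applies to the visible text only.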

Other

One-click startup script start.sh
docs/system_design.md system design document
3 demo videos embedded in README

Stats

40 files changed, 11,268 insertions(+), 2,792 deletions(-)

New Files

| File | Description |
| --- | --- |
| `core/pipelines/` | Three pipeline types (simple / creative / manuscript) |
| `core/api/` | Agnes API wrapper layer |
| `core/audio/` | TTS + subtitle engine |
| `core/compositor/` | Video compositing / processing |
| `models/task.py` | Three task subtype data models |
| `scripts/regression_runner.py` | Regression test script |
| `docs/system_design.md` | System design document |
| `docs/regression_test_plan.md` | Test plan |

Upgrade Notes

Python 3.10+ required
New dependencies: edge_tts>=6.1.0, srt>=3.5.0
Run ./start.sh for one-click startup, or .venv/bin/pip install -r requirements.txt && .venv/bin/python server.py


Releases: lcy362/agnes-video-generator
