feat(cli/init): scaffold tests/Dockerfile and simplify run command #581
…n agent.py` invocation
- Scaffold a `tests/` dir with an example using `vision_agents.testing` (`TestSession`, `LLMJudge`, multi-turn). `pytest` config moves into `[tool.pytest.ini_options]`; add `pytest` + `pytest-asyncio` to a `[dependency-groups] dev` group.
- Expose `INSTRUCTIONS`, `MODEL`, and `create_llm()` from `agent.py` so the test imports the same setup as production (production uses `gemini.Realtime`; `create_llm` is the text-mode factory tests use).
- Add a `Dockerfile` (python:3.12-slim + uv, layered for cache, runs `serve --host 0.0.0.0`) and `.dockerignore`.
- Switch README and the init "next steps" output from `uv run vision-agents agent run` to `uv run agent.py run`: same flow, but it reduces the onboarding cognitive load (Django-style `manage.py`).
- Teach the scaffolder to render nested template paths.
📝 Walkthrough
This PR enhances the Vision Agents init scaffold: the agent template exports `INSTRUCTIONS` and `create_agent` uses it; CLI examples and the post-scaffold message now use …
Actionable comments posted: 2
🧹 Nitpick comments (1)
agents-core/vision_agents/cli/init/templates/Dockerfile.j2 (1)
3-3: ⚡ Quick win
Pin the `uv` source image to a concrete tag or digest.
`ghcr.io/astral-sh/uv:latest` makes scaffolded builds non-reproducible and can break unexpectedly.
Proposed fix
-COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
+COPY --from=ghcr.io/astral-sh/uv:0.5.30 /uv /uvx /bin/
Use a digest (`@sha256:...`) for strongest reproducibility.
📒 Files selected for processing (9)
- agents-core/vision_agents/cli/init/command.py
- agents-core/vision_agents/cli/init/scaffold.py
- agents-core/vision_agents/cli/init/templates/Dockerfile.j2
- agents-core/vision_agents/cli/init/templates/README.md.j2
- agents-core/vision_agents/cli/init/templates/agent.py.j2
- agents-core/vision_agents/cli/init/templates/dockerignore.j2
- agents-core/vision_agents/cli/init/templates/pyproject.toml.j2
- agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2
- tests/test_cli/test_cli.py
- Dockerfile: run as a non-root `app` user.
- Tests: mark with `pytest.mark.integration`; declare the marker in `[tool.pytest.ini_options]`.
Production agent uses `gemini.Realtime()` directly; tests use `gemini.LLM(MODEL)` so they don't open a Realtime WebSocket.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2 (1)
53-57: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Add a guard before indexing `response.chat_messages[-1]` for clearer failures.
Without an explicit assertion, empty `chat_messages` fails as `IndexError`, which obscures the real failure mode.
Proposed fix
 async with TestSession(llm=gemini.LLM(MODEL), instructions=INSTRUCTIONS) as session:
     response = await session.simple_response("Tell me about yourself.")
+    assert response.chat_messages, "No assistant message captured"
     verdict = await judge.evaluate(
         response.chat_messages[-1],
@@
 async with TestSession(llm=gemini.LLM(MODEL), instructions=INSTRUCTIONS) as session:
     await session.simple_response("My name is Alex.")
     response = await session.simple_response("What is my name?")
+    assert response.chat_messages, "No assistant message captured"
     verdict = await judge.evaluate(
         response.chat_messages[-1],
Also applies to: 66-72
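The difference in failure output can be shown with a minimal, self-contained sketch. `FakeResponse`, `last_message`, and `failure_text` are hypothetical stand-ins written for illustration and are not part of `vision_agents.testing`; only the `chat_messages` attribute name and the assertion message come from the review comment.

```python
from dataclasses import dataclass, field


@dataclass
class FakeResponse:
    """Hypothetical stand-in for the object returned by simple_response()."""

    chat_messages: list[str] = field(default_factory=list)


def last_message(response: FakeResponse) -> str:
    # Guard first: an empty list now fails with a message that names the problem,
    # instead of a bare "IndexError: list index out of range".
    assert response.chat_messages, "No assistant message captured"
    return response.chat_messages[-1]


def failure_text(response: FakeResponse) -> str:
    """Return the assertion message raised for this response, or "" if none."""
    try:
        last_message(response)
    except AssertionError as exc:
        return str(exc)
    return ""
```

With the guard in place, an empty response reports what went wrong (nothing was captured) rather than where the indexing happened.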
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2`:
- Line 12: The tests import only INSTRUCTIONS and currently hardcode Gemini LLM
setup; update them to use the shared factory from agent.py by importing MODEL
and/or create_llm() and replacing any direct Gemini instantiation with calls to
create_llm() (or using MODEL to obtain the provider) so tests exercise the same
model/provider setup as runtime; update all occurrences referenced (lines around
the current import and the other occurrences at ~28, ~35, ~53, ~66) to call
create_llm() and use that LLM in assertions.
---
Outside diff comments:
In `@agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2`:
- Around line 53-57: Add an explicit guard that the response chat list is
non-empty before indexing response.chat_messages[-1] so failures produce a clear
assertion message: after calling session.simple_response(...) check e.g. that
response.chat_messages is truthy or len(response.chat_messages) > 0 and
raise/assert with a descriptive message (mentioning TestSession/simple_response
and judge.evaluate) before calling judge.evaluate; apply the same guard for the
second occurrence around the block handling lines 66-72.
📒 Files selected for processing (2)
- agents-core/vision_agents/cli/init/templates/agent.py.j2
- agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2
💤 Files with no reviewable changes (1)
- agents-core/vision_agents/cli/init/templates/agent.py.j2
#573 (Python 3.14 cap) and #582 (Python 3.14 support + smart_turn/vogent requires-python bump) both happened between v0.6.1 and v0.6.2. The net user-visible change in v0.6.2 is "Python 3.14 now works", so the intermediate cap is dropped from the changelog and the #582 entries are folded in instead: the feature under New Features, the smart_turn/vogent metadata fix under Bug Fixes alongside the packaging fix.

Also picks up entries that landed on main and were missing from this PR:
- #581 (Richer `vision-agents init` scaffold): tests/, Dockerfile, and the simpler `uv run agent.py run` invocation.
- #583 (Gemini default model bump): `gemini-3.1-flash-lite-preview` was decommissioned; replaced with `gemini-flash-lite-latest`.
Why
`vision-agents init` produced a minimal project that left two onboarding gaps:
- No example of `vision_agents.testing`, so the first time most agents got exercised was inside a live call.
- `uv run vision-agents agent run` was the suggested invocation. It works, but it adds an extra concept (the CLI dispatches to `[tool.vision-agents.agent].entrypoint`) right when the user is trying to understand what `agent.py` is.

This PR ships a small, opinionated set of files so a freshly-generated project is something you can run, test, and deploy without leaving the directory.
Changes
- Scaffold a `tests/` directory with an example built on `vision_agents.testing` (`TestSession`, `LLMJudge`, multi-turn). `pytest` is configured under `[tool.pytest.ini_options]` in `pyproject.toml`; `pytest` + `pytest-asyncio` land in a `[dependency-groups] dev` group.
- Expose `INSTRUCTIONS`, `MODEL`, and `create_llm()` from `agent.py` so tests import the same setup as production. Production keeps `gemini.Realtime` for audio; `create_llm` is the text-mode factory tests use.
- Add a `Dockerfile` (python:3.12-slim + `uv`, layered for cache, runs `serve --host 0.0.0.0`) and a `.dockerignore`.
- Switch README and the init "next steps" output from `uv run vision-agents agent run` to `uv run agent.py run`: same behavior, but the command now mirrors what's in `agent.py` (`if __name__ == "__main__": runner.cli()`). Django's `python manage.py` was the inspiration.
- Teach the scaffolder to render nested template paths so `tests/` works.
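For readers curious what "render nested template paths" involves, the following is a minimal sketch and not the code in `scaffold.py`. `render_tree` is a hypothetical helper, and it uses the standard library's `string.Template` (with `$name` placeholders) as a stand-in for whatever engine renders the real `.j2` files. The point it illustrates is creating parent directories such as `tests/` before writing each file.

```python
from pathlib import Path
from string import Template


def render_tree(templates: dict[str, str], dest: Path, context: dict[str, str]) -> list[Path]:
    """Render {relative template path: template text} into dest, keeping subdirectories."""
    written = []
    for rel_path, text in templates.items():
        # "tests/test_agent.py.j2" -> "tests/test_agent.py"
        target = dest / rel_path.removesuffix(".j2")
        # The step a flat scaffolder lacks: create parent directories such as tests/.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(Template(text).substitute(context))
        written.append(target)
    return written
```

A call such as `render_tree({"tests/test_agent.py.j2": "# tests for $name\n"}, Path("my-agent"), {"name": "demo"})` would create `my-agent/tests/test_agent.py`, including the intermediate directory.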