feat(cli/init): scaffold tests/Dockerfile and simplify run command #581

Merged
aliev merged 3 commits into main from cli-init-tests-dockerfile on May 25, 2026

@aliev aliev commented May 23, 2026

Why

vision-agents init produced a minimal project that left two onboarding gaps:

  1. No test scaffolding. Nothing pointed new users at vision_agents.testing, so the first time most agents got exercised was inside a live call.
  2. uv run vision-agents agent run was the suggested invocation. It works, but it adds an extra concept (the CLI dispatches to [tool.vision-agents.agent].entrypoint) right when the user is trying to understand what agent.py is.

This PR ships a small, opinionated set of files so a freshly-generated project is something you can run, test, and deploy without leaving the directory.

Changes

  • Scaffold a tests/ directory with an example built on vision_agents.testing (TestSession, LLMJudge, multi-turn). pytest is configured under [tool.pytest.ini_options] in pyproject.toml; pytest + pytest-asyncio land in a [dependency-groups] dev group.
  • Expose INSTRUCTIONS, MODEL, and create_llm() from agent.py so tests import the same setup as production. Production keeps gemini.Realtime for audio; create_llm is the text-mode factory tests use.
  • Add a Dockerfile (python:3.12-slim + uv, layered for cache, runs serve --host 0.0.0.0) and a .dockerignore.
  • Switch the README and the init "next steps" output from uv run vision-agents agent run to uv run agent.py run — same behavior, but the command now mirrors what's in agent.py (if __name__ == "__main__": runner.cli()). Django's python manage.py was the inspiration.
  • Teach the scaffolder to render nested template paths so tests/ works.
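
A minimal sketch of what rendering nested template paths can look like, assuming the scaffolder ends up with a mapping from relative output paths to rendered text. `write_rendered` and the mapping shape are hypothetical names for illustration, not the actual `scaffold.py` code; the only part taken from this PR is the idea of creating parent directories before writing.

```python
import tempfile
from pathlib import Path


def write_rendered(project_dir: Path, rendered: dict[str, str]) -> list[Path]:
    """Write rendered template text under project_dir (hypothetical helper)."""
    written = []
    for relative_path, text in rendered.items():
        destination = project_dir / relative_path
        # Without this, writing "tests/test_agent.py" raises FileNotFoundError
        # because the "tests" directory does not exist yet.
        destination.parent.mkdir(parents=True, exist_ok=True)
        destination.write_text(text, encoding="utf-8")
        written.append(destination)
    return written


def demo() -> list[str]:
    """Scaffold a flat file and a nested file into a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        paths = write_rendered(
            root,
            {"agent.py": "# agent", "tests/test_agent.py": "# tests"},
        )
        return sorted(p.relative_to(root).as_posix() for p in paths)
```

`mkdir(parents=True, exist_ok=True)` is a no-op for top-level files, so flat and nested templates can share one code path.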

Summary by CodeRabbit

  • New Features

    • Docker containerization support added for scaffolded projects with Python 3.12 slim containers and two-stage dependency caching.
    • Test infrastructure and examples included in generated projects using pytest and LLM-based evaluation.
  • Documentation

    • Updated CLI command syntax examples in generated templates and README.
    • Added Docker ignore configuration to newly scaffolded projects.

…n agent.py` invocation

- Scaffold a `tests/` dir with an example using `vision_agents.testing`
  (`TestSession`, `LLMJudge`, multi-turn). `pytest` config moves into
  `[tool.pytest.ini_options]`; add `pytest` + `pytest-asyncio` to a
  `[dependency-groups] dev` group.
- Expose `INSTRUCTIONS`, `MODEL`, and `create_llm()` from `agent.py` so
  the test imports the same setup as production (production uses
  `gemini.Realtime`; `create_llm` is the text-mode factory tests use).
- Add a `Dockerfile` (python:3.12-slim + uv, layered for cache, runs
  `serve --host 0.0.0.0`) and `.dockerignore`.
- Switch README and the init "next steps" output from
  `uv run vision-agents agent run` to `uv run agent.py run` — same
  flow, but reduces the onboarding cognitive load (Django-style
  `manage.py`).
- Teach the scaffolder to render nested template paths.

coderabbitai Bot commented May 23, 2026

Walkthrough

This PR enhances the Vision Agents init scaffold: agent template exports INSTRUCTIONS and create_agent uses it; CLI examples and the post-scaffold message now use uv run agent.py; adds Dockerfile and .dockerignore templates and ensures rendered files are written with parent directories; adds pytest config and a tests template with three async integration tests; updates the scaffolding unit test to expect the new files.

Agent scaffolding updates

| Layer | File(s) | Summary |
| --- | --- | --- |
| Agent template: INSTRUCTIONS constant | agents-core/vision_agents/cli/init/templates/agent.py.j2 | Adds module-level INSTRUCTIONS and passes it into create_agent. |
| README and CLI invocation updates | agents-core/vision_agents/cli/init/command.py, agents-core/vision_agents/cli/init/templates/README.md.j2 | Changes printed "Next steps" and README examples to use uv run agent.py run/serve and updates README header. |
| Docker templates and scaffold mapping/writing | agents-core/vision_agents/cli/init/scaffold.py, agents-core/vision_agents/cli/init/templates/Dockerfile.j2, agents-core/vision_agents/cli/init/templates/dockerignore.j2 | Adds Dockerfile and .dockerignore templates, extends TEMPLATE_FILES, and ensures renderer creates parent directories before writing outputs. |
| Pyproject and test templates | agents-core/vision_agents/cli/init/templates/pyproject.toml.j2, agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2 | Adds a dev dependency group and pytest config; adds templated tests/test_agent.py with dotenv loading, pytest guards, MODEL, and three async integration tests using TestSession and LLMJudge. |
| Scaffolding test expectation update | tests/test_cli/test_cli.py | Updates unit test to expect tests/test_agent.py, .dockerignore, and Dockerfile among scaffolded outputs. |

Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • dangusev

"A rabbit hops through templates bright and cheery,
Scaffolding files like carrots, neat and merry.
INSTRUCTIONS tucked in, Docker seeds sown,
Tests sprout green where examples have grown.
Run the agent — the rabbit cheers, 'Now go!' 🐇"

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main changes: scaffolding tests/Dockerfile and simplifying the run command from 'vision-agents agent run' to direct agent.py invocation. |
| Description check | ✅ Passed | The PR description comprehensively covers the 'Why' and 'Changes' sections, clearly articulating the onboarding gaps being addressed and detailing all modifications to scaffolding, configuration, and command invocation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
agents-core/vision_agents/cli/init/templates/Dockerfile.j2 (1)

3-3: ⚡ Quick win

Pin the uv source image to a concrete tag or digest.

ghcr.io/astral-sh/uv:latest makes scaffolded builds non-reproducible and can break unexpectedly.

Proposed fix
-COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
+COPY --from=ghcr.io/astral-sh/uv:0.5.30 /uv /uvx /bin/

Use a digest (@sha256:...) for strongest reproducibility.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 78185a70-8bd0-418f-a86b-5c164a126017

📥 Commits

Reviewing files that changed from the base of the PR and between 3ea19df and 54baef8.

📒 Files selected for processing (9)
  • agents-core/vision_agents/cli/init/command.py
  • agents-core/vision_agents/cli/init/scaffold.py
  • agents-core/vision_agents/cli/init/templates/Dockerfile.j2
  • agents-core/vision_agents/cli/init/templates/README.md.j2
  • agents-core/vision_agents/cli/init/templates/agent.py.j2
  • agents-core/vision_agents/cli/init/templates/dockerignore.j2
  • agents-core/vision_agents/cli/init/templates/pyproject.toml.j2
  • agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2
  • tests/test_cli/test_cli.py

Comment thread agents-core/vision_agents/cli/init/templates/Dockerfile.j2
Comment thread agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2 Outdated
- Dockerfile: run as a non-root `app` user.
- Tests: mark with `pytest.mark.integration`; declare the marker in
  `[tool.pytest.ini_options]`.
@aliev aliev marked this pull request as ready for review May 24, 2026 21:30
Comment thread agents-core/vision_agents/cli/init/templates/agent.py.j2 Outdated
Production agent uses `gemini.Realtime()` directly; tests use
`gemini.LLM(MODEL)` so they don't open a Realtime WebSocket.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2 (1)

53-57: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a guard before indexing response.chat_messages[-1] for clearer failures.

Without an explicit assertion, empty chat_messages fails as IndexError, which obscures the real failure mode.

Proposed fix
     async with TestSession(llm=gemini.LLM(MODEL), instructions=INSTRUCTIONS) as session:
         response = await session.simple_response("Tell me about yourself.")
+        assert response.chat_messages, "No assistant message captured"
         verdict = await judge.evaluate(
             response.chat_messages[-1],
@@
     async with TestSession(llm=gemini.LLM(MODEL), instructions=INSTRUCTIONS) as session:
         await session.simple_response("My name is Alex.")
         response = await session.simple_response("What is my name?")
+        assert response.chat_messages, "No assistant message captured"
 
         verdict = await judge.evaluate(
             response.chat_messages[-1],

Also applies to: 66-72
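
To illustrate why the guard gives a clearer failure, here is a small self-contained sketch. `FakeResponse` is a hypothetical stand-in for whatever `session.simple_response(...)` returns; only the `chat_messages` attribute and the assertion message are taken from the proposed fix above, and the helper names are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FakeResponse:
    """Hypothetical stand-in for the object returned by simple_response()."""

    chat_messages: list[str] = field(default_factory=list)


def last_message_unguarded(response: FakeResponse) -> str:
    # An empty list fails here with a generic IndexError.
    return response.chat_messages[-1]


def last_message_guarded(response: FakeResponse) -> str:
    # The assertion names the real problem before any indexing happens.
    assert response.chat_messages, "No assistant message captured"
    return response.chat_messages[-1]


def failure_kind(func, response: FakeResponse) -> str:
    """Return 'ok' or a short description of the exception func raised."""
    try:
        func(response)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return "ok"
```

With an empty `FakeResponse()`, the unguarded helper reports an `IndexError`, while the guarded one reports an `AssertionError` carrying the descriptive message, which is the line pytest would show at the top of the failure.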

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2` around
lines 53 - 57, Add an explicit guard that the response chat list is non-empty
before indexing response.chat_messages[-1] so failures produce a clear assertion
message: after calling session.simple_response(...) check e.g. that
response.chat_messages is truthy or len(response.chat_messages) > 0 and
raise/assert with a descriptive message (mentioning TestSession/simple_response
and judge.evaluate) before calling judge.evaluate; apply the same guard for the
second occurrence around the block handling lines 66-72.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2`:
- Line 12: The tests import only INSTRUCTIONS and currently hardcode Gemini LLM
setup; update them to use the shared factory from agent.py by importing MODEL
and/or create_llm() and replacing any direct Gemini instantiation with calls to
create_llm() (or using MODEL to obtain the provider) so tests exercise the same
model/provider setup as runtime; update all occurrences referenced (lines around
the current import and the other occurrences at ~28, ~35, ~53, ~66) to call
create_llm() and use that LLM in assertions.

---

Outside diff comments:
In `@agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2`:
- Around line 53-57: Add an explicit guard that the response chat list is
non-empty before indexing response.chat_messages[-1] so failures produce a clear
assertion message: after calling session.simple_response(...) check e.g. that
response.chat_messages is truthy or len(response.chat_messages) > 0 and
raise/assert with a descriptive message (mentioning TestSession/simple_response
and judge.evaluate) before calling judge.evaluate; apply the same guard for the
second occurrence around the block handling lines 66-72.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4cbd28d8-9192-4192-b6ca-2630c536852d

📥 Commits

Reviewing files that changed from the base of the PR and between 7e2fe36 and c6c5130.

📒 Files selected for processing (2)
  • agents-core/vision_agents/cli/init/templates/agent.py.j2
  • agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2
💤 Files with no reviewable changes (1)
  • agents-core/vision_agents/cli/init/templates/agent.py.j2

Comment thread agents-core/vision_agents/cli/init/templates/tests/test_agent.py.j2
@aliev aliev merged commit 7748f0a into main May 25, 2026
6 checks passed
@aliev aliev deleted the cli-init-tests-dockerfile branch May 25, 2026 13:14
aliev added a commit that referenced this pull request May 25, 2026
#573 (Python 3.14 cap) and #582 (Python 3.14 support + smart_turn/vogent
requires-python bump) both happened between v0.6.1 and v0.6.2. The net
user-visible change in v0.6.2 is "Python 3.14 now works", so the
intermediate cap is dropped from the changelog and the #582 entries are
folded in instead — feature under New Features, the smart_turn/vogent
metadata fix under Bug Fixes alongside the packaging fix.

Also picks up entries that landed on main and were missing from this
PR:

- #581 (Richer `vision-agents init` scaffold) — tests/, Dockerfile, and
  the simpler `uv run agent.py run` invocation.
- #583 (Gemini default model bump) — `gemini-3.1-flash-lite-preview`
  was decommissioned; replaced with `gemini-flash-lite-latest`.
2 participants