Vision Agent Lab

Research-grade multi-agent workflow for machine vision experiments.

Vision Agent Lab is a compact framework for organizing the full loop of visual recognition research: paper tracking, PyTorch experiment planning, ablation design, log analysis, and manuscript support. The project is designed for graduate-level computer vision work where each iteration needs to be traceable, reproducible, and easy to hand off to an LLM-powered coding agent.

Why This Exists

Modern vision research is no longer a single training script. A typical week can involve reading new papers, adapting model code, running multiple ablations, comparing metrics, writing notes, and preparing figures. This repository defines a structured agent workflow so that AI coding assistants can contribute without losing experiment context.

Core Workflow

flowchart LR
    A["Research Agent<br/>paper scan and method summary"] --> B["Code Agent<br/>implementation and refactor"]
    B --> C["Experiment Agent<br/>training plan and ablations"]
    C --> D["Evaluation Agent<br/>metrics, tables, figures"]
    D --> E["Writing Agent<br/>paper draft and rebuttal notes"]
    E --> A
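
The loop in the flowchart can be sketched as a small cyclic handoff table. This is only an illustration of the ordering shown above; the role keys and the function names `next_agent` and `handoff_sequence` are assumptions and are not taken from the repository's `agents.py`.

```python
from __future__ import annotations

# Role order as drawn in the flowchart: A -> B -> C -> D -> E -> A.
# The short string keys are hypothetical labels chosen for this sketch.
WORKFLOW = ["research", "code", "experiment", "evaluation", "writing"]


def next_agent(current: str) -> str:
    """Return the role that receives the handoff from `current`."""
    index = WORKFLOW.index(current)  # raises ValueError for an unknown role
    return WORKFLOW[(index + 1) % len(WORKFLOW)]


def handoff_sequence(start: str, steps: int) -> list[str]:
    """List `steps` consecutive roles, beginning with `start`."""
    if steps < 1:
        return []
    sequence = [start]
    for _ in range(steps - 1):
        sequence.append(next_agent(sequence[-1]))
    return sequence
```

The modulo step is what closes the loop, so a handoff from the Writing Agent returns to the Research Agent, as in the diagram.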

Features

Multi-agent role definitions for machine vision research.
Experiment manifest format for datasets, model variants, metrics, and ablation groups.
CLI utilities for generating research run plans and compact progress reports.
Reproducible folder layout for logs, configs, notes, and paper assets.
Minimal Python implementation with no mandatory heavy dependencies.
Unit tests for the planning and reporting logic.
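
As a rough sketch of what an experiment manifest covering datasets, model variants, metrics, and ablation groups might look like in code, the dataclass below loads one from JSON using only the standard library. The class name `ExperimentManifest` and every field name are assumptions made for illustration; the actual schema in `examples/experiment_manifest.json` may differ.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class ExperimentManifest:
    """Hypothetical manifest shape; field names are assumed, not the real schema."""

    datasets: list[str]
    model_variants: list[str]
    metrics: list[str]
    # Maps an ablation group name to the variants compared within it.
    ablation_groups: dict[str, list[str]] = field(default_factory=dict)

    @classmethod
    def from_json(cls, text: str) -> "ExperimentManifest":
        data = json.loads(text)
        required = ("datasets", "model_variants", "metrics")
        missing = [key for key in required if key not in data]
        if missing:
            raise ValueError(f"manifest is missing required keys: {missing}")
        groups = data.get("ablation_groups", {})
        return cls(
            datasets=list(data["datasets"]),
            model_variants=list(data["model_variants"]),
            metrics=list(data["metrics"]),
            ablation_groups={name: list(items) for name, items in groups.items()},
        )
```

Failing early on missing keys keeps a malformed manifest from producing a partial run plan.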

Repository Layout

.
├── configs/
│   └── research_agents.json
├── docs/
│   ├── application_note.md
│   └── experiment_protocol.md
├── examples/
│   └── experiment_manifest.json
├── prompts/
│   ├── code_agent.md
│   ├── evaluation_agent.md
│   ├── research_agent.md
│   └── writing_agent.md
├── src/
│   └── vision_agent_lab/
│       ├── agents.py
│       ├── cli.py
│       ├── experiment.py
│       └── reporting.py
└── tests/
    └── test_workflow.py

Quick Start

python -m vision_agent_lab.cli plan examples/experiment_manifest.json
python -m vision_agent_lab.cli report examples/experiment_manifest.json
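
The two commands above suggest a subcommand-style interface that takes a manifest path. A minimal sketch of such a parser with `argparse` is shown here; it is an assumption about the general shape, not the contents of the repository's `cli.py`, and the printed summary line is a placeholder for whatever the real commands output.

```python
import argparse
import json
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with `plan` and `report` subcommands (assumed layout)."""
    parser = argparse.ArgumentParser(prog="vision_agent_lab.cli")
    subcommands = parser.add_subparsers(dest="command", required=True)
    for name, help_text in (
        ("plan", "print a research run plan"),
        ("report", "print a compact progress report"),
    ):
        command = subcommands.add_parser(name, help=help_text)
        command.add_argument("manifest", type=Path, help="path to a manifest JSON file")
    return parser


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    manifest = json.loads(args.manifest.read_text(encoding="utf-8"))
    # Placeholder output: the real commands presumably print richer content.
    print(f"{args.command}: manifest has {len(manifest)} top-level entries")
    return 0
```

Passing `argv` explicitly to `main` makes the entry point easy to call from unit tests without touching `sys.argv`.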

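For local development, install the package in editable mode and run the unit tests. Because the code lives under src/, the editable install is also what makes vision_agent_lab importable for the Quick Start commands: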

python -m pip install -e .
python -m unittest discover -s tests

Example Use Case

The default example describes a high-resolution visual defect detection project:

Baseline: ConvNeXt-Tiny detector with multi-scale training.
Research target: robust recognition under low contrast, occlusion, and domain shift.
Ablations: resolution policy, feature fusion, loss weighting, augmentation strategy.
Metrics: mAP, F1, latency, parameter count, and failure case tags.
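
To make the example concrete, the sketch below expands it into a run list: one baseline run plus one run per ablation axis, each recording the same metrics. The baseline, ablation axes, and metrics are copied from the description above; the dictionary keys, the `plan_runs` helper, and the run-naming scheme are assumptions for illustration.

```python
# Values copied from the example use case; the dictionary layout is assumed.
EXAMPLE = {
    "baseline": "ConvNeXt-Tiny detector with multi-scale training",
    "ablations": [
        "resolution policy",
        "feature fusion",
        "loss weighting",
        "augmentation strategy",
    ],
    "metrics": ["mAP", "F1", "latency", "parameter count", "failure case tags"],
}


def plan_runs(manifest: dict) -> list:
    """Return the baseline run followed by one run per ablation axis."""
    metrics = list(manifest["metrics"])
    runs = [{"run": "baseline", "changed": None, "metrics": metrics}]
    for axis in manifest["ablations"]:
        runs.append(
            {
                # Hypothetical naming scheme: "feature fusion" -> "ablate-feature-fusion"
                "run": "ablate-" + axis.replace(" ", "-"),
                "changed": axis,
                "metrics": metrics,
            }
        )
    return runs


if __name__ == "__main__":
    for run in plan_runs(EXAMPLE):
        print(run["run"], "| changed:", run["changed"])
```

Changing one axis per run keeps each result attributable to a single difference from the baseline.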

Agent Roles

Agent	Responsibility	Output
Research Agent	Track papers, summarize methods, extract experiment settings	Literature brief
Code Agent	Implement model changes, clean training scripts, isolate bugs	Patch plan
Experiment Agent	Build ablation matrix, schedule runs, record assumptions	Run queue
Evaluation Agent	Parse metrics, compare baselines, flag regressions	Result table
Writing Agent	Draft method section, limitations, rebuttal notes	Manuscript notes

The prompts/ directory contains reusable handoff prompts for each role. They are written as strict output contracts so a large model can produce artifacts that are easy to review, diff, and turn into follow-up tasks.
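
The role table can also be held as data, which makes it easy to check that each role has a matching prompt file. The sketch below is an illustration only: the `AgentRole` class and the rule that maps a role name to a file under `prompts/` are assumptions, while the responsibilities, outputs, and listed prompt files are taken from this README.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class AgentRole:
    name: str
    responsibility: str
    output: str

    @property
    def prompt_path(self) -> Path:
        # Assumed naming rule: "Research Agent" -> prompts/research_agent.md
        return Path("prompts") / (self.name.lower().replace(" ", "_") + ".md")


ROLES = [
    AgentRole("Research Agent", "Track papers, summarize methods, extract experiment settings", "Literature brief"),
    AgentRole("Code Agent", "Implement model changes, clean training scripts, isolate bugs", "Patch plan"),
    AgentRole("Experiment Agent", "Build ablation matrix, schedule runs, record assumptions", "Run queue"),
    AgentRole("Evaluation Agent", "Parse metrics, compare baselines, flag regressions", "Result table"),
    AgentRole("Writing Agent", "Draft method section, limitations, rebuttal notes", "Manuscript notes"),
]

# Prompt files shown under prompts/ in the Repository Layout section.
LISTED_PROMPTS = {
    "code_agent.md",
    "evaluation_agent.md",
    "research_agent.md",
    "writing_agent.md",
}


def roles_without_listed_prompt(roles=ROLES) -> list:
    """Names of roles whose assumed prompt file is not in the listed layout."""
    return [role.name for role in roles if role.prompt_path.name not in LISTED_PROMPTS]
```

Against the layout listed in this README, the check reports the Experiment Agent as the one role without a listed prompt file.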

Design Principles

Every experiment must have a written hypothesis.
Every ablation must map to one model or data change.
Every result should keep enough metadata to be reproducible.
Agent outputs should be short, structured, and directly actionable.
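
The first two principles lend themselves to an automated check. The following is a minimal sketch under assumed field names (`hypothesis`, `ablations`, `name`, `changes`); it is not the validation logic shipped in the repository.

```python
def check_principles(experiment: dict) -> list:
    """Return human-readable problems; an empty list means both checks pass."""
    problems = []

    # Principle 1: every experiment must have a written hypothesis.
    if not str(experiment.get("hypothesis", "")).strip():
        problems.append("experiment has no written hypothesis")

    # Principle 2: every ablation must map to exactly one model or data change.
    for ablation in experiment.get("ablations", []):
        changes = ablation.get("changes", [])
        if len(changes) != 1:
            name = ablation.get("name", "?")
            problems.append(
                f"ablation {name!r} maps to {len(changes)} changes, expected 1"
            )
    return problems


if __name__ == "__main__":
    # Hypothetical draft experiment used only to exercise the checks.
    draft = {
        "hypothesis": "",
        "ablations": [
            {"name": "fusion", "changes": ["feature fusion"]},
            {"name": "mixed", "changes": ["loss weighting", "augmentation strategy"]},
        ],
    }
    for problem in check_principles(draft):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets a reviewer or an agent see every violation in a single pass.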

Status

This repository is an active research scaffold. It is intentionally lightweight so it can be adapted to object detection, semantic segmentation, image restoration, anomaly detection, and multimodal vision tasks.

