Research-grade multi-agent workflow for machine vision experiments.
Vision Agent Lab is a compact framework for organizing the full loop of visual recognition research: paper tracking, PyTorch experiment planning, ablation design, log analysis, and manuscript support. The project is designed for graduate-level computer vision work where each iteration needs to be traceable, reproducible, and easy to hand off to an LLM-powered coding agent.
Modern vision research is no longer a single training script. A typical week can involve reading new papers, adapting model code, running multiple ablations, comparing metrics, writing notes, and preparing figures. This repository defines a structured agent workflow so that AI coding assistants can contribute without losing experiment context.
```mermaid
flowchart LR
    A["Research Agent<br/>paper scan and method summary"] --> B["Code Agent<br/>implementation and refactor"]
    B --> C["Experiment Agent<br/>training plan and ablations"]
    C --> D["Evaluation Agent<br/>metrics, tables, figures"]
    D --> E["Writing Agent<br/>paper draft and rebuttal notes"]
    E --> A
```
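The handoff loop in the flowchart can be sketched in a few lines of Python. This is only an illustration: the agent names come from the diagram, while `HANDOFF_ORDER` and `next_agent` are hypothetical names and are not taken from `agents.py`.

```python
# Agent names follow the flowchart; the list and helper are illustrative only.
HANDOFF_ORDER = [
    "Research Agent",
    "Code Agent",
    "Experiment Agent",
    "Evaluation Agent",
    "Writing Agent",
]


def next_agent(current: str) -> str:
    """Return the agent that receives the handoff from `current`."""
    index = HANDOFF_ORDER.index(current)  # raises ValueError for unknown names
    # The modulo wraps the last agent back to the first, closing the loop.
    return HANDOFF_ORDER[(index + 1) % len(HANDOFF_ORDER)]


if __name__ == "__main__":
    agent = "Research Agent"
    for _ in range(len(HANDOFF_ORDER)):
        print(f"{agent} -> {next_agent(agent)}")
        agent = next_agent(agent)
```

Keeping the order in a single list means the cycle can be reordered or extended without touching the lookup logic.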
- Multi-agent role definitions for machine vision research.
- Experiment manifest format for datasets, model variants, metrics, and ablation groups.
- CLI utilities for generating research run plans and compact progress reports.
- Reproducible folder layout for logs, configs, notes, and paper assets.
- Minimal Python implementation with no mandatory heavy dependencies.
- Unit tests for the planning and reporting logic.
```text
.
├── configs/
│   └── research_agents.json
├── docs/
│   ├── application_note.md
│   └── experiment_protocol.md
├── examples/
│   └── experiment_manifest.json
├── prompts/
│   ├── code_agent.md
│   ├── evaluation_agent.md
│   ├── research_agent.md
│   └── writing_agent.md
├── src/
│   └── vision_agent_lab/
│       ├── agents.py
│       ├── cli.py
│       ├── experiment.py
│       └── reporting.py
└── tests/
    └── test_workflow.py
```
```bash
python -m vision_agent_lab.cli plan examples/experiment_manifest.json
python -m vision_agent_lab.cli report examples/experiment_manifest.json
```

For local development:

```bash
python -m pip install -e .
python -m unittest discover -s tests
```

The default example describes a high-resolution visual defect detection project:
- Baseline: ConvNeXt-Tiny detector with multi-scale training.
- Research target: robust recognition under low contrast, occlusion, and domain shift.
- Ablations: resolution policy, feature fusion, loss weighting, augmentation strategy.
- Metrics: mAP, F1, latency, parameter count, and failure case tags.
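To make the example project more concrete, here is a minimal sketch of how such a manifest might be expanded into a run plan. The key names (`baseline`, `metrics`, `ablations`), the placeholder option values, and `build_run_plan` are all assumptions made for illustration; the actual schema of `examples/experiment_manifest.json` may differ.

```python
from __future__ import annotations

import json

# Hypothetical manifest: key names and option values are placeholders and
# may not match examples/experiment_manifest.json.
MANIFEST = {
    "project": "high-resolution visual defect detection",
    "baseline": "ConvNeXt-Tiny detector with multi-scale training",
    "metrics": ["mAP", "F1", "latency", "parameter count", "failure case tags"],
    "ablations": {
        "resolution policy": ["option_a", "option_b"],
        "feature fusion": ["option_a", "option_b"],
        "loss weighting": ["option_a", "option_b"],
        "augmentation strategy": ["option_a", "option_b"],
    },
}


def build_run_plan(manifest: dict) -> list[dict]:
    """Return one baseline run plus one run per ablation option.

    Each ablation run changes exactly one factor relative to the baseline.
    """
    runs = [{"run_id": "run_000", "change": None}]  # the unmodified baseline
    for group, options in manifest["ablations"].items():
        for option in options:
            run_id = f"run_{len(runs):03d}"
            runs.append({"run_id": run_id, "change": {group: option}})
    return runs


if __name__ == "__main__":
    for run in build_run_plan(MANIFEST):
        print(json.dumps(run))
```

A one-factor-at-a-time plan grows linearly with the number of options, which keeps the run queue small compared with a full grid over all four ablation groups.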
| Agent | Responsibility | Output |
|---|---|---|
| Research Agent | Track papers, summarize methods, extract experiment settings | Literature brief |
| Code Agent | Implement model changes, clean training scripts, isolate bugs | Patch plan |
| Experiment Agent | Build ablation matrix, schedule runs, record assumptions | Run queue |
| Evaluation Agent | Parse metrics, compare baselines, flag regressions | Result table |
| Writing Agent | Draft method section, limitations, rebuttal notes | Manuscript notes |
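As one illustration of the Evaluation Agent row, the sketch below compares a candidate run against a baseline and flags metrics that got worse. The metric keys, the direction table, the `tolerance` parameter, and `flag_regressions` are hypothetical, and the numbers in the usage section are placeholders, not measured results.

```python
# Direction of improvement per metric. These keys are assumptions for this
# sketch, not the repository's actual metric names.
HIGHER_IS_BETTER = {
    "mAP": True,
    "F1": True,
    "latency_ms": False,
    "params_m": False,
}


def flag_regressions(baseline: dict, candidate: dict, tolerance: float = 0.0) -> list:
    """Return the metrics where `candidate` is worse than `baseline`.

    `tolerance` is an absolute margin; changes within it are not flagged.
    Metrics missing from either run are skipped.
    """
    flagged = []
    for metric, higher_is_better in HIGHER_IS_BETTER.items():
        if metric not in baseline or metric not in candidate:
            continue
        delta = candidate[metric] - baseline[metric]
        # Convert the change into "how much worse" regardless of direction.
        worse_by = -delta if higher_is_better else delta
        if worse_by > tolerance:
            flagged.append(metric)
    return flagged


if __name__ == "__main__":
    # Placeholder values for demonstration only, not real results.
    baseline = {"mAP": 0.50, "F1": 0.60, "latency_ms": 20.0, "params_m": 30.0}
    candidate = {"mAP": 0.52, "F1": 0.58, "latency_ms": 25.0, "params_m": 30.0}
    print(flag_regressions(baseline, candidate))
```

Recording the improvement direction explicitly avoids treating a latency increase as a gain when metrics are compared in bulk.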
The `prompts/` directory contains reusable handoff prompts for the agent roles. They are written as strict output contracts so that a large language model produces artifacts that are easy to review, diff, and turn into follow-up tasks.
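One way to enforce an output contract is to check an agent's reply for required sections before accepting it. The sketch below assumes replies are Markdown with `## ` headings; the section names in `REQUIRED_SECTIONS` and the `missing_sections` helper are invented for illustration and are not taken from the prompt files.

```python
import re

# Hypothetical contract: these section names are placeholders, not the
# headings actually required by the files in prompts/.
REQUIRED_SECTIONS = ("Summary", "Changes", "Open Questions")


def missing_sections(output: str, required=REQUIRED_SECTIONS) -> list:
    """Return required section titles that have no matching '## ' heading."""
    headings = {
        match.group(1).strip()
        for match in re.finditer(r"^##\s+(.+)$", output, flags=re.MULTILINE)
    }
    return [title for title in required if title not in headings]


if __name__ == "__main__":
    reply = "## Summary\nAdded a fusion block.\n\n## Changes\n- model.py\n"
    print(missing_sections(reply))
```

A reply that fails the check can be sent back to the agent with the list of missing sections instead of being reviewed by hand.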
- Every experiment must have a written hypothesis.
- Every ablation must map to one model or data change.
- Every result should keep enough metadata to be reproducible.
- Agent outputs should be short, structured, and directly actionable.
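The first three rules lend themselves to an automated check. The following is a minimal sketch under assumed field names (`hypothesis`, `ablations`, `changes`, `results`, `metadata`) and an assumed set of required metadata keys; none of these are confirmed by the repository's manifest format.

```python
# Hypothetical metadata keys treated as the minimum needed to rerun a result.
REQUIRED_METADATA = ("config", "seed", "commit")


def check_experiment(experiment: dict) -> list:
    """Return human-readable rule violations; an empty list means all passed."""
    problems = []

    # Rule 1: every experiment must have a written hypothesis.
    if not str(experiment.get("hypothesis", "")).strip():
        problems.append("missing hypothesis")

    # Rule 2: every ablation must map to exactly one model or data change.
    for ablation in experiment.get("ablations", []):
        name = ablation.get("name", "?")
        count = len(ablation.get("changes", []))
        if count != 1:
            problems.append(f"ablation {name}: {count} changes, expected 1")

    # Rule 3: every result should carry enough metadata to be reproduced.
    for result in experiment.get("results", []):
        run_id = result.get("run_id", "?")
        metadata = result.get("metadata", {})
        missing = [key for key in REQUIRED_METADATA if key not in metadata]
        if missing:
            problems.append(f"result {run_id}: missing metadata {missing}")

    return problems


if __name__ == "__main__":
    # Illustrative input only; field names and values are placeholders.
    experiment = {
        "hypothesis": "",
        "ablations": [{"name": "fusion", "changes": ["swap neck", "new loss"]}],
        "results": [{"run_id": "run_001", "metadata": {"seed": 0}}],
    }
    for problem in check_experiment(experiment):
        print(problem)
```

Returning a list of messages rather than raising on the first failure lets a single pass report every violation in a manifest.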
This repository is an active research scaffold. It is intentionally lightweight so it can be adapted to object detection, semantic segmentation, image restoration, anomaly detection, and multimodal vision tasks.