
DataEvolver

Autonomous Synthetic Data Construction via VLM-Guided Iterative Rendering

DataEvolver is a goal-driven data synthesis pipeline that generates high-quality training datasets through an automated loop of 3D rendering, VLM (Vision-Language Model) quality review, and intelligent parameter adjustment. Unlike traditional pipelines with rigid scoring rules, DataEvolver uses free-form VLM feedback to perceive, diagnose, and fix rendering issues — producing photorealistic, scene-aware training data without human intervention.

Website · Paper · Dataset: DataEvolver-Rotate


Key Features

  • Goal-Driven Loop Agents — VLM reviewer provides semantic feedback ("flat lighting", "floating object") → AI agent selects targeted actions → re-render → repeat until quality goals are met
  • 24 Atomic Actions — Structured action space across 5 groups (lighting, object placement, scene environment, material properties, and a reserved camera group), with anti-oscillation control and step-scale scheduling
  • Scene-Aware Rendering — Objects placed in real Blender scenes with HDRI environments, raycast ground detection, and preserved scene lighting
  • Multi-Modal Output — RGB, mask, depth, normal maps, and geometry metadata
  • End-to-End Automation — From natural language seed concept to training-ready dataset, zero human intervention

Pipeline Overview

Seed Concept ─→ T2I Generation ─→ Segmentation ─→ 3D Reconstruction ─→ Scene Rendering ─→ VLM Review Loop
  (Stage 1)       (Stage 2)       (Stage 2.5)        (Stage 3)           (Stage 4)          (Stage 5)
| Stage | What it does | Model / Tool |
| --- | --- | --- |
| 1. Text Expansion | LLM expands seed concept into detailed T2I prompt | Claude API (Anthropic) |
| 2. T2I Generation | Generate 1024×1024 object image | Qwen-Image-2512 |
| 2.5. Segmentation | Extract RGBA foreground, remove background | SAM3 |
| 3. 3D Reconstruction | Reconstruct textured mesh from single image | Hunyuan3D-2.1 |
| 4. Scene Rendering | Blender Cycles 512 spp scene-aware insertion | Blender 4.24 |
| 5. VLM Review Loop | Free-form review → agent action → re-render until "keep" | Qwen3.5-35B-A3B |

The VLM Review Loop (Stage 5)

The core innovation: a goal-driven loop agent that iteratively improves rendering quality.

┌─────────────┐     ┌──────────────┐     ┌──────────────────┐     ┌──────────────┐
│   Blender   │────→│  VLM Review  │────→│  AI Agent        │────→│ Quality Gate │
│   Render    │     │  (free-form  │     │  Decision        │     │ keep/revise  │
│             │     │   critique)  │     │  (select action) │     │              │
└─────────────┘     └──────────────┘     └──────────────────┘     └──────┬───────┘
       ↑                                                                  │
       └──────────────── loop until reviewer says "keep" ─────────────────┘
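
In code, Stage 5 reduces to a small control loop. The sketch below is a minimal illustration, not the repo's implementation: the four callables stand in for the actual entry points in stage4_scene_render.py, stage5_5_vlm_review.py, and stage5_6_feedback_apply.py, and the round budget is an assumed default.

from typing import Any, Callable, Dict

def review_loop(
    params: Dict[str, Any],
    render: Callable[[Dict[str, Any]], str],          # Stage 4: returns rendered image path
    review: Callable[[str], Dict[str, Any]],          # Stage 5: free-form VLM critique
    select_action: Callable[[str], Dict[str, Any]],   # agent: critique -> atomic action
    apply_action: Callable[[Dict[str, Any], Dict[str, Any], int], Dict[str, Any]],
    max_rounds: int = 8,                              # assumed budget, not from the repo
) -> Dict[str, Any]:
    """Render -> review -> act until the reviewer says "keep" or the budget runs out."""
    for round_idx in range(max_rounds):
        image_path = render(params)
        verdict = review(image_path)                  # e.g. {"verdict": "revise", "critique": "flat lighting"}
        if verdict.get("verdict") == "keep":          # quality gate
            break
        action = select_action(verdict.get("critique", ""))
        params = apply_action(params, action, round_idx)
    return params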

Anti-oscillation control prevents parameter thrashing:

  • Sign-flip tracking: freeze a parameter after 3 direction reversals
  • Step-scale scheduling: Round 0 → 100%, Round 1 → 70%, Round 2 → 50%, Round 3+ → 40%
  • Score-adaptive boost: ×1.2 when hybrid_score < 0.65
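
Put together, the three rules above fit in a few lines. The following is a minimal sketch using the numbers from this list; the class and method names are illustrative, not the repo's stage5_6_feedback_apply.py API.

STEP_SCALE = {0: 1.0, 1: 0.7, 2: 0.5}  # Round 3+ falls back to 0.4
FLIP_LIMIT = 3                          # freeze after 3 direction reversals

class AntiOscillation:
    def __init__(self) -> None:
        self.last_sign: dict = {}   # param -> sign of the previous delta
        self.flips: dict = {}       # param -> reversal count so far
        self.frozen: set = set()    # params no longer allowed to move

    def scaled_delta(self, param: str, delta: float, round_idx: int, hybrid_score: float) -> float:
        """Return the delta to actually apply, after freezing, scheduling, and boosting."""
        if param in self.frozen or delta == 0.0:
            return 0.0
        sign = 1 if delta > 0 else -1
        if self.last_sign.get(param, sign) != sign:       # sign-flip tracking
            self.flips[param] = self.flips.get(param, 0) + 1
            if self.flips[param] >= FLIP_LIMIT:           # freeze after 3 reversals
                self.frozen.add(param)
                return 0.0
        self.last_sign[param] = sign
        scale = STEP_SCALE.get(round_idx, 0.4)            # step-scale scheduling
        if hybrid_score < 0.65:                           # score-adaptive boost
            scale *= 1.2
        return delta * scale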

Prerequisites

  • OS: Linux (tested on Ubuntu 20.04+)
  • GPU: NVIDIA GPU with ≥24 GB VRAM for rendering; ≥80 GB for VLM inference
  • Python: 3.10+
  • Blender: 4.24
  • CUDA: Compatible with your PyTorch version

Required Models

| Model | Purpose | Approx. Size |
| --- | --- | --- |
| Qwen-Image-2512 | T2I generation | ~56 GB |
| SAM3 | Foreground segmentation | ~2 GB |
| Hunyuan3D-2.1 | Image-to-3D reconstruction | ~20 GB |
| Qwen3.5-35B-A3B | VLM quality reviewer | ~35 GB |
| Blender 4.24 | 3D rendering engine | ~300 MB |

Quick Start

# Clone the repo
git clone https://github.com/Kamisato520/DataEvolver.git
cd DataEvolver

# Configure model paths in each pipeline stage:
#   pipeline/stage1_text_expansion.py  → Anthropic API key
#   pipeline/stage2_t2i_generate.py    → MODEL_PATH (Qwen-Image-2512)
#   pipeline/stage2_5_sam2_segment.py  → SAM3_CKPT
#   pipeline/stage3_image_to_3d.py     → HUNYUAN3D_REPO, MODEL_HUB
#   pipeline/stage5_5_vlm_review.py    → Qwen3.5-35B model path
#   configs/scene_template.json        → blend_path, blender_binary

# Place your .blend scene file in assets/scene/
# Place HDRI environment maps in assets/hdri/

# Run the full pipeline
bash pipeline/run_all.sh

Project Structure

DataEvolver/
├── pipeline/                          # Core pipeline stages
│   ├── stage1_text_expansion.py             # LLM prompt generation
│   ├── stage2_t2i_generate.py               # Text-to-image (Qwen-Image-2512)
│   ├── stage2_5_sam2_segment.py             # SAM3 foreground extraction
│   ├── stage3_image_to_3d.py                # 3D mesh reconstruction (Hunyuan3D-2.1)
│   ├── stage4_scene_render.py               # Blender scene-aware rendering
│   ├── stage5_5_vlm_review.py               # VLM quality review (Qwen3.5-35B-A3B)
│   ├── stage5_6_feedback_apply.py           # Action selection & anti-oscillation
│   ├── asset_lifecycle.py                   # Asset lifecycle management
│   └── rotation_geomodal_dataset.py         # Training dataset loader
├── configs/
│   ├── scene_action_space.json              # 24 atomic actions definition
│   ├── scene_template.json                  # Blender scene template config
│   ├── vlm_review_schema.json               # VLM review output schema
│   ├── dataset_profiles/                    # Dataset configuration profiles
│   └── seed_concepts/                       # Seed object definitions (20/50 objects)
├── scripts/                           # Utility & build scripts
│   ├── run_scene_agent_monitor.py           # VLM loop agent monitor
│   ├── run_scene_agent_step.py              # Single-step agent execution
│   ├── export_rotation8_from_best_object_state.py  # Consistent rotation export
│   ├── build_rotation8_trainready_dataset.py       # Build training pairs
│   ├── build_object_split_for_rotation_dataset.py  # Object-disjoint split
│   ├── run_full_pipeline.py                 # Full pipeline orchestrator
│   ├── run_vlm_quality_gate_loop.py         # VLM quality gate loop
│   ├── feedback_loop/                       # Feedback loop utilities
│   └── ...                                  # Additional build & eval scripts
├── assets/
│   ├── hdri/                                # HDRI environment maps
│   └── scene/                               # Blender scene files (.blend)
├── paper/                             # Technical report (LaTeX source)
└── web/                               # Project website (GitHub Pages)

Action Space

The AI agent selects from 24 structured atomic actions organized in 5 groups:

| Group | Actions | Parameters |
| --- | --- | --- |
| Lighting (4) | Key light intensity ↑↓, key light yaw ±15° | Multiplicative ×1.2/×0.8 or additive, bounded |
| Object (6) | Elevation ±0.02, yaw ±15°, scale ×1.1/×0.9 | Bounded within safe ranges |
| Scene (5) | Env rotation ±30°, env intensity ↑↓, contact shadow | HDRI and environment controls |
| Material (9) | Saturation, value/brightness, hue offset, roughness, specular/sheen | Fine-grained material tuning |
| Camera (0) | Reserved for future use | |

Full action definitions: configs/scene_action_space.json
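
As an illustration of what applying one multiplicative, bounded action could look like (the dict layout and bounds below are assumptions for the sketch, not the actual schema of configs/scene_action_space.json):

# Hypothetical atomic action: "key light intensity up" (multiplicative ×1.2, bounded)
ACTION = {"target": "key_light_intensity", "op": "mul", "factor": 1.2, "bounds": [0.1, 10.0]}

def apply_action(state: dict, action: dict) -> dict:
    lo, hi = action["bounds"]
    new_state = dict(state)
    value = state[action["target"]]
    if action["op"] == "mul":
        value *= action["factor"]
    new_state[action["target"]] = min(hi, max(lo, value))  # clamp to the safe range
    return new_state

state = apply_action({"key_light_intensity": 2.0}, ACTION)  # -> {"key_light_intensity": 2.4}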


DataEvolver-Rotate

The first benchmark dataset produced by DataEvolver — for rotation-conditioned image editing.

| Metric | Value |
| --- | --- |
| Unique Objects | 50 |
| Rotation Angles | 8 (0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°) |
| Training Pairs | 350 (front → 7 target views) |
| Train / Val / Test | 245 / 49 / 56 pairs (object-disjoint, seed=42) |
| Modalities | RGB, mask, depth, normal |

Methodology

Each object uses a single canonical yaw-0° best state as the base. The object is then rotated while the scene, camera, lighting, and material remain fixed — ensuring cross-angle consistency.
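
The statistics above follow directly from this setup. Here is a short sketch of the arithmetic, with hypothetical object IDs; the repo's actual build scripts are build_rotation8_trainready_dataset.py and build_object_split_for_rotation_dataset.py, which may order things differently.

import random

ANGLES = [0, 45, 90, 135, 180, 225, 270, 315]
objects = [f"obj_{i:03d}" for i in range(50)]             # hypothetical IDs

# Front view (0°) paired with each of the 7 remaining angles: 50 × 7 = 350 pairs
pairs = [(obj, 0, angle) for obj in objects for angle in ANGLES[1:]]

# Object-disjoint split: all 7 pairs of an object land in exactly one split.
# 245 / 49 / 56 pairs corresponds to 35 / 7 / 8 objects.
rng = random.Random(42)
rng.shuffle(objects)
train_objs, val_objs, test_objs = objects[:35], objects[35:42], objects[42:]
train = [p for p in pairs if p[0] in set(train_objs)]     # 35 × 7 = 245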

Loading the Dataset

import json
from pathlib import Path
from PIL import Image

root = Path("path/to/dataset_split")
rows = []
# Each line of train_pairs.jsonl is one JSON record for a (source, target, instruction) pair
with (root / "pairs" / "train_pairs.jsonl").open("r") as f:
    for line in f:
        rows.append(json.loads(line))

row = rows[0]
source = Image.open(root / row["source_image"]).convert("RGB")
target = Image.open(root / row["target_image"]).convert("RGB")
instruction = row["instruction"]  # e.g., "Rotate the object 45 degrees clockwise"
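
If you train with PyTorch, the same fields wrap naturally into a Dataset. A minimal sketch, assuming torch is installed and that the val/test files follow the train_pairs.jsonl naming; the repo's own loader lives in pipeline/rotation_geomodal_dataset.py.

import json
from pathlib import Path
from PIL import Image
from torch.utils.data import Dataset

class RotatePairs(Dataset):
    """Minimal (source, target, instruction) loader over the pairs JSONL files."""

    def __init__(self, root: str, split: str = "train"):
        self.root = Path(root)
        pairs_file = self.root / "pairs" / f"{split}_pairs.jsonl"  # naming assumed for val/test
        with pairs_file.open("r") as f:
            self.rows = [json.loads(line) for line in f]

    def __len__(self) -> int:
        return len(self.rows)

    def __getitem__(self, idx: int):
        row = self.rows[idx]
        source = Image.open(self.root / row["source_image"]).convert("RGB")
        target = Image.open(self.root / row["target_image"]).convert("RGB")
        return source, target, row["instruction"]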

Using with Claude Code

DataEvolver is designed to work with Claude Code as an AI-powered development and operations assistant. Claude Code reads a project-level CLAUDE.md file to understand your environment, then helps you run pipelines, analyze results, and build datasets through natural language.

Step 1: Install Claude Code

npm install -g @anthropic-ai/claude-code

Step 2: Create your CLAUDE.md

Create a CLAUDE.md in the project root (it's gitignored — each user maintains their own). This file tells Claude Code about your specific environment:

# CLAUDE.md

## Remote Server
- SSH alias: `my-server`
- GPU: 3x A800 80GB (or your setup)
- Python: `/path/to/python3` (3.10+, with PyTorch)
- Blender: `/path/to/blender` (4.24)
- Code directory: `/path/to/DataEvolver`

## Model Paths
- Qwen-Image-2512: `/path/to/Qwen-Image-2512`
- SAM3 checkpoint: `/path/to/sam3/sam3.pt`
- Hunyuan3D-2.1 repo: `/path/to/Hunyuan3D-2.1`
- Hunyuan3D-2.1 weights: `/path/to/model_hub/Hunyuan3D-2.1`
- Qwen3.5-35B-A3B: `/path/to/Qwen3.5-35B-A3B`

## Scene Config
- Blender scene file: `/path/to/scene.blend`
- HDRI directory: `/path/to/hdri/`
- Render engine: CYCLES, 512 samples, 1024x1024

## Key Configs
- Action space: `configs/scene_action_space.json` (24 atomic actions)
- Scene template: `configs/scene_template.json`
- Dataset profiles: `configs/dataset_profiles/`

## Pipeline
Stage 1 (Text Expansion) → Stage 2 (T2I) → Stage 2.5 (SAM3)
→ Stage 3 (3D Reconstruction) → Stage 4 (Blender Render) → Stage 5 (VLM Loop)

## Working Rules
- Always test on the remote server, not locally
- Use tmux for long-running tasks (screen not available on all servers)
- Read trace.json free-form text for VLM results — not just agg.json scores
- Only stop VLM loop when reviewer explicitly says "keep"
- Check GPU usage before launching new jobs

Step 3: Create Claude Code Skills (Optional)

You can create reusable skills in a skills/ directory for common workflows:

mkdir -p skills/scene-agent-loop
# skills/scene-agent-loop/SKILL.md
---
name: scene-agent-loop
description: Manage the VLM review loop for scene rendering
---
# Scene Agent Loop
Monitor and continue the VLM review → render → agent decision loop.
Read trace.json to understand current state, then decide next action.

Step 4: Launch Claude Code

cd DataEvolver
claude

Claude Code will read your CLAUDE.md and understand the full project context. Example commands:

> Check GPU usage on the server
> Run the full pipeline for 10 new furniture objects
> Export rotation8 dataset from the latest best states
> Build train-ready dataset with object-disjoint split

Citation

@misc{dataevolver2026,
  title   = {DataEvolver: Autonomous Synthetic Data Construction
             via VLM-Guided Iterative Rendering},
  year    = {2026},
  url     = {https://github.com/Kamisato520/DataEvolver}
}

License

MIT
