
DVWorld Logo
Benchmarking Data Visualization Agents in Real-World Scenarios

  🌐 Website  |   📑 Paper  |   🤗 Dataset  |   🐥 Twitter  

📰 News

Overview

DV-World comprises three complementary task families:

DV-Sheet focuses on native spreadsheet visualization workflows. Instead of generating standalone plotting code, an agent must directly manipulate spreadsheet workbooks to create charts, repair broken visualizations, and assemble dashboards under realistic software constraints.

DV-Evolution targets cross-modal and cross-framework visualization adaptation. Given a reference visual artifact, a new dataset, and modification requirements, the agent must infer the original visual semantics and produce an updated executable visualization in a target framework such as Python, D3.js, Plotly.js, Vega-Lite, or Apache ECharts.

DV-Interact evaluates proactive clarification and intent alignment in ambiguous visualization tasks. The agent operates in a stateful environment and interacts with a user simulator, testing whether it can ask appropriate questions, resolve ambiguity through interaction, and avoid assumption-first execution.

DV-World overview figure

🔍 Installation

Set up the environment using the following commands:

conda create -n dvworld python=3.12
conda activate dvworld

pip install -r requirements.txt

🚀 Quick Access to the DV-World Dataset

The dataset is hosted at:

https://huggingface.co/datasets/DV-World/dvworld

After downloading, place the files into the corresponding gold and tasks folders:

  • dv-evolution/gold and dv-evolution/tasks
  • dv-interact/gold and dv-interact/tasks
  • dv-sheet/gold and dv-sheet/tasks
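Assuming the dataset repository mirrors the folder layout above (an assumption — check the Hugging Face page for the actual structure), the download can be sketched with the official `huggingface-cli`:

```shell
# Sketch: fetch the dataset and copy one task family into place.
# The intra-repository layout (dv-evolution/gold etc.) is an assumption;
# adjust the paths to match the actual release.
pip install -U "huggingface_hub[cli]"
huggingface-cli download DV-World/dvworld --repo-type dataset --local-dir ./dvworld-data
cp -r ./dvworld-data/dv-evolution/gold  dv-evolution/gold
cp -r ./dvworld-data/dv-evolution/tasks dv-evolution/tasks
```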

🚀 Quickstart

Each task family has its own baseline runner:

  • dv-evolution/dvworld-agent-evolution
  • dv-interact/dvworld-agent-interact
  • dv-sheet/dvworld_agent_sheet

The typical workflow is:

  1. Download the dataset into the task-specific gold and tasks folders.
  2. Configure the model in dvworld_agent_fcmode/agent/config.py inside the corresponding agent directory.
  3. Run the agent with run.py.
  4. Convert raw outputs into evaluation format with get_results.py.
  5. Evaluate the converted results with the matching script in evaluation_suite.
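For one task family, e.g. DV-Evolution, the workflow above might look like the following sketch. The exact command-line arguments of `run.py`, `get_results.py`, and `run_eval.py` are assumptions; consult each agent's usage guide for the real flags.

```shell
# Hypothetical end-to-end run for DV-Evolution; script arguments are
# omitted because they are agent-specific (see the usage guides).
cd dv-evolution/dvworld-agent-evolution
python run.py            # step 3: run the baseline agent on the tasks
python get_results.py    # step 4: convert raw outputs to evaluation format
cd ../../evaluation_suite
python dv_evolution/run_eval.py   # step 5: score the converted results
```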

Agent-specific usage guides:

⚖️ Evaluation

Evaluation is organized by task family inside evaluation_suite.

Converted candidate outputs are expected under:

evaluation_suite/results/<run_name>

Evaluation outputs are written to:

evaluation_suite/model_score/<run_name>

Task-specific evaluators:

  • evaluation_suite/dv_evolution/run_eval.py
  • evaluation_suite/dv_interact/run_eval.py
  • evaluation_suite/dvsheet_create/run_eval.py
  • evaluation_suite/dvsheet_dashboards/run_eval.py
  • evaluation_suite/dvsheet_fix/run_eval.py
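As an illustration of the directory convention, a single run might proceed as follows. The run name `my_run` is a placeholder, and whether each `run_eval.py` discovers runs automatically or takes the run name as an argument is an assumption.

```shell
# Illustrative only: place converted outputs under results/<run_name>,
# then run the matching evaluator; scores land in model_score/<run_name>.
mkdir -p evaluation_suite/results/my_run       # "my_run" is a placeholder
# ...copy converted candidate outputs into evaluation_suite/results/my_run...
python evaluation_suite/dv_evolution/run_eval.py
ls evaluation_suite/model_score/my_run         # evaluation outputs appear here
```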

Evaluation guides:

⚠️ Platform Notes

  • DV-Evolution and DV-Interact can be run in a standard Python environment.
  • DV-Sheet evaluation should be run on Windows, because dvsheet-create, dvsheet-dashboards, and dvsheet-fix rely on Excel-related workflows during evaluation.

📋 Leaderboard Submission

To submit your agent results to the leaderboard, please follow the instructions in DV-World Submission Guidelines.

✍️ Citation

If you find our work helpful, please cite it as follows:

@misc{meng2026dvworldbenchmarkingdatavisualization,
      title={DV-World: Benchmarking Data Visualization Agents in Real-World Scenarios}, 
      author={Jinxiang Meng and Shaoping Huang and Fangyu Lei and Jingyu Guo and Haoxiang Liu and Jiahao Su and Sihan Wang and Yao Wang and Enrui Wang and Ye Yang and Hongze Chai and Jinming Lv and Anbang Yu and Huangjing Zhang and Yitong Zhang and Yiming Huang and Zeyao Ma and Shizhu He and Jun Zhao and Kang Liu},
      year={2026},
      eprint={2604.25914},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2604.25914}, 
}
