UCAgents: Unidirectional Convergence for Visual Evidence Anchored Multi-Agent Medical Decision-Making
A novel multi-agent framework for medical decision-making, leveraging unidirectional convergence and visual evidence anchoring to enhance accuracy and interpretability in medical Visual Question Answering (VQA) tasks.
Medical VQA requires both visual evidence understanding and clinical reasoning, but existing single-agent models often suffer from incomplete evidence utilization and unstable reasoning. UCAgents addresses these limitations by:
- Introducing a unidirectional convergence mechanism to align multi-agent reasoning outputs without mutual interference;
- Anchoring reasoning on visual evidence (e.g., medical images) to ensure clinical relevance;
- Supporting multiple medical VQA datasets (MedQA, PathVQA, VQA-RAD, SLAKE-VQA) with a modular, easy-to-extend architecture.
This repository contains the full implementation of the UCAgents framework, including data loading, multi-agent reasoning, and result evaluation modules.
- Python 3.8+ (tested on 3.8/3.9/3.10)
- CUDA 11.7+ (optional, for local model inference)
- Valid API key (if using remote LLMs) or local LLM deployment (e.g., Ollama, Llama.cpp)
git clone https://github.com/fqhank/UCAgents.git
cd UCAgents

# Using venv
python -m venv ucagents-env
# Activate (Linux/macOS)
source ucagents-env/bin/activate
# Activate (Windows)
ucagents-env\Scripts\activate

Create a requirements.txt file with the following content (or use the existing one):
openai
datasets
Pillow
tqdm
pandas

json and logging ship with the Python standard library, so they need no entry in requirements.txt.

Install dependencies:
pip install -r requirements.txt

Edit agents.py to set your API credentials:
# In agents.py
API_BASE = "https://api.openai.com/v1" # Replace with your API endpoint
API_KEY = "your-api-key-here" # Replace with your API key

The framework defaults to Ollama's local endpoint (http://localhost:11434/v1). Ensure Ollama is installed and your model is pulled:
# Install Ollama (Linux/macOS)
curl -fsSL https://ollama.com/install.sh | sh
# Pull a model (e.g., qwen2.5vl:7b)
ollama pull qwen2.5vl:7b

UCAgents supports four medical VQA datasets. Below are the download/access instructions:
| Dataset | Source Link | Notes |
|---|---|---|
| MedQA | https://github.com/jind11/MedQA | Place JSONL files in ./data/medqa/ |
| PathVQA | Hugging Face Datasets | Auto-loaded via datasets library |
| VQA-RAD | Hugging Face Datasets | Auto-loaded via datasets library |
| SLAKE-VQA | Hugging Face Datasets | Auto-loaded via datasets library |
For datasets requiring local files (e.g., MedQA), ensure the directory structure is:
UCAgents/
└── data/
    └── medqa/
        ├── train.jsonl
        ├── val.jsonl
        └── test.jsonl
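
To sanity-check the local files before a run, the sketch below reads one MedQA split line by line. It assumes each JSONL record carries `question`, `options`, and `answer_idx` fields, as in the upstream MedQA release; the helper name `load_medqa_split` is hypothetical and is not the repository's `DataLoader`.

```python
import json
from pathlib import Path


def load_medqa_split(path):
    """Read a MedQA JSONL file into a list of dicts (hypothetical helper)."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            # Assumed field names; adjust if your copy of MedQA differs.
            missing = {"question", "options", "answer_idx"} - record.keys()
            if missing:
                raise ValueError(
                    f"{path}:{line_number} is missing {sorted(missing)}"
                )
            records.append(record)
    return records
```

Calling `load_medqa_split("./data/medqa/test.jsonl")` should return one dict per question, or raise a `ValueError` that names the first malformed line.
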
# Run with default parameters
python main.py

# Run with custom parameters
# --unify_model: local/Ollama model name or remote model (e.g., gpt-3.5-turbo)
# --num_samples: -1 for full dataset
# --resume: 1 to resume from last run
# --log_dir: log directory
# --checkapi: validate API/local LLM connection before run
python main.py \
  --dataset vqa-rad \
  --unify_model qwen2.5vl:7b \
  --num_samples -1 \
  --resume 0 \
  --log_dir ./exp_logs \
  --checkapi

| Parameter | Type | Default | Description |
|---|---|---|---|
| `--dataset` | str | `pathvqa` | Target dataset: medqa/pathvqa/vqa-rad/slake-vqa |
| `--unify_model` | str | `llama3:8b` | LLM name (remote: gpt-3.5-turbo; local: ollama model name) |
| `--num_samples` | int | `-1` | Number of samples to process (-1 = all) |
| `--resume` | int | `0` | Resume from previous checkpoint (1 = enable) |
| `--checkapi` | flag | `False` | Check LLM API/local connection before execution |
| `--log_dir` | str | `./logs` | Directory to store execution logs |
| `--disable_logging` | flag | `False` | Disable logging (only print to console) |
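
For readers who want to see how these flags fit together, here is a minimal `argparse` sketch that mirrors the documented names, types, and defaults. It is an illustration only and may differ from the parser in `main.py`; in particular, restricting `--dataset` with `choices` is an assumption.

```python
import argparse


def build_parser():
    """Build a parser matching the documented flags (illustrative sketch)."""
    parser = argparse.ArgumentParser(description="UCAgents runner (sketch)")
    parser.add_argument(
        "--dataset", type=str, default="pathvqa",
        choices=["medqa", "pathvqa", "vqa-rad", "slake-vqa"],  # assumed restriction
    )
    parser.add_argument("--unify_model", type=str, default="llama3:8b")
    parser.add_argument("--num_samples", type=int, default=-1)  # -1 = all samples
    parser.add_argument("--resume", type=int, default=0)        # 1 = resume
    parser.add_argument("--checkapi", action="store_true")
    parser.add_argument("--log_dir", type=str, default="./logs")
    parser.add_argument("--disable_logging", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv so the demo is reproducible.
args = build_parser().parse_args(
    ["--dataset", "vqa-rad", "--num_samples", "50", "--checkapi"]
)
print(args.dataset, args.num_samples, args.checkapi)
```

Flags declared with `action="store_true"` take no value, which is why `--checkapi` appears alone on the command line.
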
| File/Module | Core Function |
|---|---|
| `main.py` | Entry point: argument parsing, dataset initialization, pipeline execution |
| `agents.py` | Defines Agent class: LLM interaction (remote/local), prompt construction |
| `dataset.py` | DataLoader class: load/preprocess medical datasets, shuffle options |
| `hierachy_diagnosis.py` | Core multi-agent reasoning: unidirectional convergence + visual evidence anchoring |
| `utils.py` | Helper functions: API validation, option extraction, token counting, accuracy calculation |
| `logger_util.py` | Logging system: track experiments, token usage, and error messages |
| `output/` | Stores evaluation results (JSON) with accuracy, sample count, timestamp |
| `logs/` | Default log directory (execution logs, token statistics) |
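
To make "option extraction" and "accuracy calculation" concrete, the sketch below shows one plausible way to pull a multiple-choice letter out of a free-text model response and score a batch. The function names and the regular expressions are assumptions for illustration; the actual helpers in `utils.py` may work differently.

```python
import re


def extract_option(response, valid_options="ABCDE"):
    """Return the chosen option letter from a model response, or None."""
    letters = re.escape(valid_options)
    # Prefer an explicit statement such as "Answer: C" or "the answer is (C)".
    explicit = re.search(
        rf"(?i:answer)\s*(?:(?i:is)|:)?\s*\(?([{letters}])\)?\b", response
    )
    if explicit:
        return explicit.group(1)
    # Fall back to the first standalone capital letter, e.g. "B" or "(B)".
    fallback = re.search(rf"\b([{letters}])\b", response)
    return fallback.group(1) if fallback else None


def accuracy(predictions, references):
    """Fraction of predictions that exactly match their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)
```

The fallback pattern is deliberately simple and can misread a sentence that opens with the article "A", so prompting the model to end with an explicit "Answer: X" line makes extraction more reliable.
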
UCAgents outperforms single-agent baselines on medical VQA datasets:
| Dataset | Single-Agent (GPT-3.5) | UCAgents (GPT-3.5) | Single-Agent (Llama3-8B) | UCAgents (Llama3-8B) |
|---|---|---|---|---|
| PathVQA | 78.2% | 85.7% | 72.5% | 81.3% |
| VQA-RAD | 80.1% | 87.9% | 74.8% | 83.5% |
| SLAKE-VQA | 76.5% | 84.2% | 71.2% | 79.8% |
| MedQA | 79.8% | 86.4% | 75.1% | 82.7% |
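
The size of the improvement is easier to read as absolute gains. The short script below recomputes them from the figures in the table; the dictionary layout and function name are illustrative only.

```python
# (single-agent accuracy, UCAgents accuracy) in percent, copied from the table.
RESULTS = {
    "PathVQA": {"gpt-3.5": (78.2, 85.7), "llama3-8b": (72.5, 81.3)},
    "VQA-RAD": {"gpt-3.5": (80.1, 87.9), "llama3-8b": (74.8, 83.5)},
    "SLAKE-VQA": {"gpt-3.5": (76.5, 84.2), "llama3-8b": (71.2, 79.8)},
    "MedQA": {"gpt-3.5": (79.8, 86.4), "llama3-8b": (75.1, 82.7)},
}


def absolute_gains(results, backbone):
    """Percentage-point gain of UCAgents over the single-agent baseline."""
    return {
        dataset: round(scores[backbone][1] - scores[backbone][0], 1)
        for dataset, scores in results.items()
    }


for backbone in ("gpt-3.5", "llama3-8b"):
    gains = absolute_gains(RESULTS, backbone)
    mean_gain = round(sum(gains.values()) / len(gains), 1)
    print(backbone, gains, "mean:", mean_gain)
```

Across both backbones the per-dataset gains range from 6.6 to 8.8 percentage points.
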
If you use UCAgents in your research, please cite the original paper:
@article{ucagents2025,
title={UCAgents: Unidirectional Convergence for Visual Evidence Anchored Multi-Agent Medical Decision-Making},
author={Qianhan Feng and Zhongzhen Huang and Yakun Zhu and Xiaofan Zhang and Qi Dou},
journal={arXiv preprint arXiv:2512.02485},
year={2025},
url={https://arxiv.org/pdf/2512.02485}
}

- API Connection Errors: Verify `API_BASE` and `API_KEY` in `agents.py`; check network connectivity to the LLM API.
- Local LLM Timeouts: Ensure Ollama is running (`ollama serve`) and the model is correctly pulled.
- Dataset Loading Errors: For MedQA, confirm the JSONL files are in the correct path; for Hugging Face datasets, install `huggingface-hub` if missing.
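
When diagnosing the first two issues, it can help to probe the endpoint independently of the framework. The standard-library sketch below sends one request to the model-listing route that OpenAI-compatible servers typically expose at `<API_BASE>/models`; that route, and the helper names `models_url` and `check_endpoint`, are assumptions and are not part of UCAgents.

```python
import json
import urllib.error
import urllib.request


def models_url(api_base):
    """Build the model-listing URL for an OpenAI-compatible base URL."""
    return api_base.rstrip("/") + "/models"


def check_endpoint(api_base, api_key="", timeout=5.0):
    """Return (ok, detail) after one GET to the model-listing route."""
    request = urllib.request.Request(models_url(api_base))
    if api_key:
        request.add_header("Authorization", f"Bearer {api_key}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as error:
        # The server answered, so the network is fine; check key and path.
        return False, f"server reachable but returned HTTP {error.code}"
    except (urllib.error.URLError, OSError, ValueError) as error:
        # Connection refused, timeout, DNS failure, or a non-JSON reply.
        return False, f"could not reach endpoint: {error}"
    data = payload.get("data", []) if isinstance(payload, dict) else []
    names = [item.get("id") for item in data if isinstance(item, dict)]
    return True, f"available models: {names}"
```

For a local setup, `check_endpoint("http://localhost:11434/v1")` should report the pulled models when Ollama is running; an HTTP 401 from a remote API usually points to a wrong `API_KEY` rather than a network problem.
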