UE-DPO: code implementation for the CVPR 2026 paper “Uncertainty-Aware Exploratory Direct Preference Optimization for Multimodal Large Language Models”.