Solve Visual Understanding with Reinforced VLMs
Updated May 11, 2025 - Python
Explore the Multimodal “Aha Moment” on a 2B Model
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
Proposes a fuzzy reward model with GRPO to improve VLMs' abilities on the crowd counting task.
simpleR1: A Simple Framework for Training R1-like Models