We present SSL-R1, a generic self-supervised RL post-training framework that derives intrinsically verifiable rewards directly from input images. SSL-R1 is vision-centric, cost-effective, and scalable, requiring neither human annotation nor supervision from external models.
🤩 Key Properties
👀 For more results, please refer to our paper.
- [04/2026] 🔥 SSL-R1 is released on arXiv.
SSL-R1 is a generic self-supervised RL-based post-training framework. We repurpose five self-supervised tasks widely used in the vision literature as verifiable reward signals within an RLVR framework. The tasks target different aspects of visual information and together provide comprehensive coverage of vision-centric reasoning capabilities.
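The five tasks are not enumerated in this excerpt, so as a hypothetical illustration consider rotation prediction, a standard self-supervised task: the rotation angle applied to the input image is free supervision, and the model's answer can be verified without any human labels or external models. All function names below are our own, not from the SSL-R1 codebase; the image is modeled as a simple 2D grid to keep the sketch self-contained.

```python
import random

def rotate90(grid, times):
    """Rotate a 2D grid (list of rows) counter-clockwise by 90 degrees, `times` times."""
    for _ in range(times % 4):
        grid = [list(row) for row in zip(*grid)][::-1]
    return grid

def make_rotation_task(grid, rng=random):
    """Create one self-supervised example: the ground-truth angle is
    derived purely from the input image, with no human annotation."""
    k = rng.choice([0, 1, 2, 3])          # 0 / 90 / 180 / 270 degrees
    return rotate90(grid, k), 90 * k

def rotation_reward(predicted_deg, true_deg):
    """Intrinsically verifiable binary reward for RLVR-style post-training:
    1.0 iff the model's predicted rotation matches the applied rotation."""
    return 1.0 if predicted_deg % 360 == true_deg % 360 else 0.0
```

Because the reward is a deterministic check against supervision mined from the image itself, it plugs into any RLVR loop (e.g., GRPO-style policy optimization) at no labeling cost.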
We provide qualitative examples comparing the baseline model (Qwen2.5-VL-7B) with SSL-R1 on three types of vision-centric multimodal benchmarks.
If you find this work useful for your research, please consider citing our paper:
@article{xie2026ssl,
  title   = {SSL-R1: Self-Supervised Visual Reinforcement Post-Training for Multimodal Large Language Models},
  author  = {Xie, Jiahao and Tonioni, Alessio and Rauschmayr, Nathalie and Tombari, Federico and Schiele, Bernt},
  journal = {arXiv preprint arXiv:2604.20705},
  year    = {2026}
}

