SSL-R1: Self-Supervised Visual Reinforcement Post-Training for Multimodal Large Language Models

¹Max Planck Institute for Informatics  ²VIA Research Center  ³Google

We present SSL-R1, a generic self-supervised RL post-training framework that derives intrinsically verifiable rewards from input images. SSL-R1 is vision-centric, cost-effective, and scalable, requiring neither human nor external model supervision.

🀩 Key Properties

  • Covers multiple SSL tasks
  • Supports one-time and one-stage training
  • Transferable to broad downstream tasks
  • πŸ“– For more results, please refer to our paper


    πŸ“£ News

    • [04/2026] πŸ”₯ SSL-R1 is released on arXiv.

    🌟 Method

    SSL-R1 is a generic self-supervised RL-based post-training framework. We re-purpose five self-supervised tasks widely used in the vision literature as verifiable reward sources within an RLVR framework; each task targets a different aspect of visual information, and together they provide comprehensive coverage of vision-centric reasoning capabilities.
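    To make the idea of an intrinsically verifiable reward concrete, here is a minimal, hypothetical sketch (not the paper's actual implementation) using rotation prediction, a classic self-supervised vision task: the image is rotated by a known multiple of 90°, the model is asked to name the rotation, and the reward is checkable directly from the input with no human or external-model supervision. All function and variable names below are illustrative assumptions.

    ```python
    import random

    ROTATIONS = [0, 90, 180, 270]  # candidate rotation angles in degrees

    def make_rotation_example(rng=random):
        """Sample a ground-truth rotation label for one training image.

        The image tensor itself is omitted here; in practice the image
        would be rotated by `angle` before being shown to the model.
        """
        angle = rng.choice(ROTATIONS)
        prompt = "By how many degrees (0, 90, 180, or 270) has this image been rotated?"
        return prompt, angle

    def rotation_reward(model_answer: str, true_angle: int) -> float:
        """Binary verifiable reward: 1.0 iff the predicted angle matches.

        The ground truth comes from the augmentation itself, so no human
        annotation or external judge model is needed to score the answer.
        """
        try:
            predicted = int(model_answer.strip().rstrip("°"))
        except ValueError:
            return 0.0  # unparsable answers earn no reward
        return 1.0 if predicted == true_angle else 0.0

    # A correct and an incorrect prediction:
    print(rotation_reward("180", 180))   # 1.0
    print(rotation_reward("90°", 180))   # 0.0
    ```

    The same recipe, with a different pretext-task label generator and checker, extends to the other self-supervised tasks the framework covers.
    
    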

    πŸ₯° Qualitative Examples

    We provide some qualitative examples of the baseline model (Qwen2.5-VL-7B) vs. our SSL-R1 on three types of vision-centric multimodal benchmarks.

    πŸ“˜ Citation

    If you find this work useful for your research, please consider citing our paper:

    @article{xie2026ssl,
      title = {SSL-R1: Self-Supervised Visual Reinforcement Post-Training for Multimodal Large Language Models},
      author = {Xie, Jiahao and Tonioni, Alessio and Rauschmayr, Nathalie and Tombari, Federico and Schiele, Bernt},
      journal = {arXiv preprint arXiv:2604.20705},
      year = {2026}
    }
