This repository provides a valuable reference for researchers in the field of multimodality. Start your exploration of RL-based Reasoning MLLMs here!

516 30 Updated Mar 28, 2025

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 14,621 1,690 Updated Mar 29, 2025

MM-EUREKA: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning

Python 458 16 Updated Mar 28, 2025

A project to improve skills of large language models

Python 260 55 Updated Mar 29, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 1,761 114 Updated Mar 27, 2025

Democratizing Reinforcement Learning for LLMs

Python 2,155 186 Updated Feb 16, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.

Python 94 5 Updated Mar 10, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,947 230 Updated Mar 4, 2025

🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org

Python 11,247 1,170 Updated Mar 29, 2025

Official Repo for Open-Reasoner-Zero

Python 1,685 80 Updated Mar 5, 2025

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

Python 659 38 Updated Mar 28, 2025

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 1,694 101 Updated Mar 7, 2025

🙌 OpenHands: Code Less, Make More

Python 51,515 5,713 Updated Mar 29, 2025

A series of technical reports on Slow Thinking with LLM

Python 600 33 Updated Mar 28, 2025

Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.

TypeScript 37,390 3,880 Updated Mar 29, 2025
Python 572 20 Updated Mar 14, 2025
Python 493 46 Updated Mar 25, 2025

Simple RL training for reasoning

Python 3,325 247 Updated Mar 26, 2025

Repo of "Quantification of Large Language Model Distillation"

Python 73 4 Updated Mar 26, 2025

The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention

Python 2,422 178 Updated Mar 18, 2025

A visualization tool for deeper understanding and easier debugging of RLHF training.

Python 179 6 Updated Feb 20, 2025

Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS

Python 1,127 100 Updated Mar 28, 2025

Retrieval and Retrieval-augmented LLMs

Python 9,142 657 Updated Mar 20, 2025

My learning notes/codes for ML SYS.

Python 1,604 91 Updated Mar 29, 2025

Recipes to train reward model for RLHF.

Python 1,260 92 Updated Feb 9, 2025

An Open Large Reasoning Model for Real-World Solutions

Python 1,477 76 Updated Mar 4, 2025

Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" (ICLR 2024)

Jupyter Notebook 364 49 Updated Aug 25, 2024

Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering

58 2 Updated Dec 5, 2024