hijkzzz

551 followers · 52 following

Achievements

x4 x2

Stars

Sun-Haoyuan23 / Awesome-RL-based-Reasoning-MLLMs

This repository provides a valuable reference for researchers in the field of multimodality. Please start your exploratory journey in RL-based Reasoning MLLMs!

516 30 Updated Mar 28, 2025

camel-ai / owl

🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

Python 14,621 1,690 Updated Mar 29, 2025

ModalMinds / MM-EUREKA

MM-EUREKA: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning

Python 458 16 Updated Mar 28, 2025

NVIDIA / NeMo-Skills

A project to improve skills of large language models

Python 260 55 Updated Mar 29, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 1,761 114 Updated Mar 27, 2025

agentica-project / deepscaler

Democratizing Reinforcement Learning for LLMs

Python 2,155 186 Updated Feb 16, 2025

OpenRLHF / OpenRLHF-M

An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.

Python 94 5 Updated Mar 10, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,947 230 Updated Mar 4, 2025

camel-ai / camel

🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org

Python 11,247 1,170 Updated Mar 29, 2025

Open-Reasoner-Zero / Open-Reasoner-Zero

Official Repo for Open-Reasoner-Zero

Python 1,685 80 Updated Mar 5, 2025

TideDra / lmm-r1

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

Python 659 38 Updated Mar 28, 2025

MoonshotAI / MoBA

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 1,694 101 Updated Mar 7, 2025

All-Hands-AI / OpenHands

🙌 OpenHands: Code Less, Make More

Python 51,515 5,713 Updated Mar 29, 2025

RUCAIBox / Slow_Thinking_with_LLMs

A series of technical reports on Slow Thinking with LLMs

Python 600 33 Updated Mar 28, 2025

GuanghaoYe / Emergence-of-Thinking

Forked from OpenRLHF/OpenRLHF

Python 48 4 Updated Feb 11, 2025

cline / cline

Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.

TypeScript 37,390 3,880 Updated Mar 29, 2025