w5688414

Deep learning, hashtag recommendation, reinforcement learning, natural language processing

81 followers · 71 following

https://blog.csdn.net/w5688414

Starred repositories

naver / goal-co

Python 25 Updated Feb 17, 2025

microsoft / Tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation

Python 790 95 Updated Mar 24, 2025

SUFE-AIFLM-Lab / Fin-R1

385 51 Updated Mar 27, 2025

facebookresearch / sweet_rl

Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks"

Python 134 8 Updated Mar 28, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 5,827 584 Updated Mar 28, 2025

willccbb / verifiers

Verifiers for LLM Reinforcement Learning

Python 724 75 Updated Mar 23, 2025

0russwest0 / Agent-R1

Python 215 12 Updated Mar 27, 2025

RUCAIBox / R1-Searcher

R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

Python 391 25 Updated Mar 23, 2025

LightChen233 / Awesome-Long-Chain-of-Thought-Reasoning

137 10 Updated Mar 13, 2025

Hub-Tian / UAVs_Meet_LLMs

189 18 Updated Mar 26, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 1,758 114 Updated Mar 27, 2025

PeterGriffinJin / Search-R1

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 1,453 103 Updated Mar 27, 2025

hzy312 / knowledge-r1

Knowledge-Reasoning Synergy Reinforcement Learning.

Python 34 1 Updated Mar 1, 2025

browser-use / browser-use

Make websites accessible for AI agents

Python 49,933 5,228 Updated Mar 29, 2025

modelscope / ms-swift

Use PEFT or Full-parameter to finetune 500+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Llama3.2-Vi…

Python 6,616 567 Updated Mar 28, 2025

xingyaoww / code-act

Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji.

Python 952 77 Updated May 23, 2024

RAGEN-AI / RAGEN

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 1,261 87 Updated Mar 26, 2025

OpenManus / OpenManus-RL

A live-streamed development of RL tuning for LLM agents

Python 2,003 266 Updated Mar 28, 2025

mannaandpoem / OpenManus

No fortress, purely open ground. OpenManus is Coming.

Python 40,544 6,793 Updated Mar 27, 2025

szqwu / Motion-Agent

Official repo of "Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs"

Python 49 2 Updated Mar 2, 2025

pavanjava / mixture_of_workflows

This repository implements a mixture of workflows, a concept inspired by the mixture of agents.

Python 13 Updated Aug 19, 2024

RUCAIBox / Slow_Thinking_with_LLMs

A series of technical reports on slow thinking with LLMs

Python 600 33 Updated Mar 28, 2025

THU-KEG / Agentic-Reward-Modeling

Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems

Python 76 2 Updated Mar 7, 2025

Open-Reasoner-Zero / Open-Reasoner-Zero

Official Repo for Open-Reasoner-Zero

Python 1,685 80 Updated Mar 5, 2025

microsoft / qlib

Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas…

Python 17,845 2,991 Updated Mar 24, 2025

pkargupta / tree-of-debate

Tree-of-Debate converts scientific papers into LLM personas that debate their respective novelties. To emphasize structured, critical reasoning rather than focusing solely on outcomes, Tree-of-Deba…

Python 6 1 Updated Feb 23, 2025

yxbian23 / aLLM4TS

[ICML2024] Official repo for paper "Multi-Patch Prediction: Adapting LLMs for Time Series Representation Learning"

Python 65 8 Updated May 14, 2024

IAAR-Shanghai / SurveyX

Academic Survey Paper Generation.

TeX 792 66 Updated Mar 19, 2025

qiqiApink / MotionGPT

The official PyTorch implementation of the paper "MotionGPT: Finetuned LLMs are General-Purpose Motion Generators"

Python 221 15 Updated Dec 28, 2023

brendanhogan / DeepSeekRL-Extended

Exploring Applications of GRPO

Python 111 9 Updated Mar 28, 2025

Starred repositories

table-detection

hashtag-recommendation