sdpkjc

🐢

Focusing

Yanxiao Zhao sdpkjc

🧑‍🎓 CS PhD Student @ UCAS | 🤖 Reinforcement Learning | 🏄‍♂️ Research Intern @zai-org | 🦶 Ex-Intern @ LiAuto @SenseTime @ ZeronTruck.com

64 followers · 162 following

University of Chinese Academy of Sciences
Beijing, China
sdpkjc.me
https://orcid.org/0000-0001-9842-4706
@sdpkjc_adam

Pinned

SATQuest (Public)

SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs

Python · 3 stars
vwxyzjn/cleanrl (Public)

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python · 7.9k stars · 843 forks
openrlbenchmark/openrlbenchmark (Public)

Python · 237 stars · 14 forks
abcdrl (Public)

Modular Single-file Reinforcement Learning Algorithms Library

Python · 38 stars · 1 fork
xlang-ai/OSWorld (Public)

[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Python · 2.1k stars · 293 forks
sgl-project/sglang (Public)

SGLang is a fast serving framework for large language models and vision language models.

Python · 17.9k stars · 2.9k forks