Simon Yu simonucl

ChenmienTan/RL2 Public

Python 195 17
spiral-rl/spiral Public

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Python 99 9
LeonGuertler/TextArena Public

A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning

Python 209 44
Cohere-Labs-Community/iterative-data-selection Public

Python 28 5
hanxuhu/SeqIns Public

Repository for the project "Fine-tuning Large Language Models with Sequential Instructions". The code base comes from open-instruct and LAVIS.

Jupyter Notebook 29 2
HJCL Public

Python 15 3