Pinned
- spiral-rl/spiral (Public): SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
- LeonGuertler/TextArena (Public): A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning
- Cohere-Labs-Community/iterative-data-selection (Public)
- hanxuhu/SeqIns (Public): Repository for the project "Fine-tuning Large Language Models with Sequential Instructions"; the code base comes from open-instruct and LAVIS