The official implementation of Self-Play Preference Optimization (SPPO)
Updated Jul 6, 2024 - Python
OpenDILab Decision AI Engine
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
A very fast implementation of AlphaZero, applied to games like Splendor, Santorini, The Little Prince, … Browser version available
Backgammon OpenAI Gym
The official implementation of Self-Play Fine-Tuning (SPIN)
An artificial intelligence platform for StarCraft II with large-scale distributed training and grand-master agents.
AI agents for the Bavarian card game Schafkopf trained with reinforcement learning
Recreating Bill Seiler's 1985 version of Space War and training RL agents with Self-Play
Play Bor-Bor Zan strategically!
Code base for Social Robot Tree Search (SoRTS).
A clean and easy implementation of MuZero, AlphaZero and Self-Play reinforcement learning algorithms for any game.
Emulator and AI of Shadowverse
TD-Gammon implementation
Donald Michie's MENACE approach to an unbeatable self-learning Tic-Tac-Toe AI game
Implementation of RPPO (Risk-sensitive PPO) and RPBT (Population-based self-play with RPPO)
A distributed reinforcement learning framework based on PyTorch
Using a DQN agent trained on Tic-Tac-Toe and Connect Four as a base for a dynamically balanced opponent. (Student Project)
A Massively Parallel Large Scale Self-Play Framework
Implementation of AlphaGo Zero - Reinforcement Learning Project, COL870 @iit-delhi