On-policy MCTS combined with deep learning to train an actor-critic neural network that plays Hex (Con-tac-tix).
Updated Jan 12, 2024 - Python
My coursework for the CS294 Deep Reinforcement Learning course, taught by Sergey Levine at UC Berkeley.
PyTorch implementation of V-MPO
Monte Carlo Tree Search for training a shared actor-critic network on the game Hex 🏋️
Reinforcement learning, Policy Gradient, Actor-Critic, AC, Agent-based Simulation, Simple-world
Deep reinforcement learning using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO)
Deep reinforcement learning using Truly Proximal Policy Optimization in TensorFlow 2 and PyTorch
Clean baseline implementation of PPO using an episodic TransformerXL memory