Not pure RL #286

itschenxi · 2022-12-17T10:40:37Z

It seems your model is a mixture of supervised learning and reinforcement learning, not pure RL as described in the AlphaZero paper.

suragnair · 2022-12-17T12:16:11Z

Quoting the README: “A simplified, highly flexible, commented and (hopefully) easy to understand implementation of self-play based reinforcement learning based on the AlphaGo Zero paper (Silver et al)”

suragnair closed this as completed Dec 17, 2022
