In this project, I implemented an agent that learns to control a cartpole using the C51 algorithm, introduced in "A Distributional Perspective on Reinforcement Learning". I also used double Q-learning instead of standard Q-learning to keep training stable.
It achieved an average score of 195.19 over 100 episodes!
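
The core of C51 with double Q-learning can be sketched roughly as follows. This is a minimal NumPy illustration, not the code used in this project: the hyperparameters (`N_ATOMS`, `V_MIN`, `V_MAX`, `GAMMA`) and the function names `double_q_actions` and `project_target` are assumptions chosen for the example. The online network's distributions pick the next action, the target network's distribution for that action is shifted by the Bellman update, and the result is projected back onto the fixed set of atoms.

```python
import numpy as np

# Assumed hyperparameters for illustration only (not taken from this project).
N_ATOMS = 51
V_MIN, V_MAX = 0.0, 200.0
GAMMA = 0.99
DELTA_Z = (V_MAX - V_MIN) / (N_ATOMS - 1)
SUPPORT = np.linspace(V_MIN, V_MAX, N_ATOMS)  # atom values z_i


def double_q_actions(online_probs):
    """Pick greedy next actions using the online network (double Q-learning).

    online_probs: (batch, n_actions, N_ATOMS) probabilities for next states.
    """
    q_values = (online_probs * SUPPORT).sum(axis=-1)  # expected return per action
    return q_values.argmax(axis=-1)


def project_target(target_probs, next_actions, rewards, dones):
    """Project the Bellman-updated distribution back onto SUPPORT.

    target_probs: (batch, n_actions, N_ATOMS) from the target network.
    next_actions: (batch,) indices chosen by the online network.
    rewards, dones: (batch,) float arrays; dones is 1.0 at terminal steps.
    """
    batch = rewards.shape[0]
    next_dist = target_probs[np.arange(batch), next_actions]  # (batch, N_ATOMS)

    # Shift and shrink every atom: Tz = r + gamma * z, clipped to the support.
    tz = rewards[:, None] + GAMMA * (1.0 - dones[:, None]) * SUPPORT[None, :]
    tz = np.clip(tz, V_MIN, V_MAX)

    # Fractional index of each shifted atom and its two neighbouring atoms.
    b = np.clip((tz - V_MIN) / DELTA_Z, 0, N_ATOMS - 1)
    lower = np.floor(b).astype(int)
    upper = np.ceil(b).astype(int)

    # If b lands exactly on an atom, lower == upper: give it the full mass.
    lower_w = np.where(lower == upper, 1.0, upper - b)
    upper_w = b - lower

    projected = np.zeros_like(next_dist)
    rows = np.repeat(np.arange(batch)[:, None], N_ATOMS, axis=1)
    # np.add.at accumulates correctly when several atoms map to the same index.
    np.add.at(projected, (rows, lower), next_dist * lower_w)
    np.add.at(projected, (rows, upper), next_dist * upper_w)
    return projected
```

In training, the projected distribution would serve as the target in a cross-entropy loss against the online network's predicted distribution for the action actually taken. Selecting the action with one network and evaluating it with the other is what distinguishes the double Q-learning variant from the original C51 target.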