Firstly, thanks for the great collection of code and articles. The articles were very useful in understanding DQN and implementing it.
However, my agent learns very poorly, and I am not sure what is wrong with my code. I am using DDQN (Double DQN) and assigning rewards based on several different criteria. The state is just a normalized version of the board itself.
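To make the setup concrete, here is a minimal sketch of the two pieces described above. It is not copied from the repo. The function names `normalize_board` and `ddqn_targets` are hypothetical, and the log2 scaling with a divisor of 16 is only an assumed way of normalizing a 2048 board. The target computation follows the standard Double DQN rule: the online network picks the next action and the target network evaluates it.

```python
import math
from typing import List, Sequence


def normalize_board(board: Sequence[Sequence[int]], max_exp: int = 16) -> List[float]:
    """Flatten a 2048 board and scale each tile to [0, 1].

    Assumption: tiles are 0 (empty) or powers of two, and log2(tile) / max_exp
    is used as the normalized value. The actual repo may normalize differently.
    """
    return [0.0 if tile == 0 else math.log2(tile) / max_exp
            for row in board for tile in row]


def ddqn_targets(rewards: Sequence[float],
                 dones: Sequence[bool],
                 q_online_next: Sequence[Sequence[float]],
                 q_target_next: Sequence[Sequence[float]],
                 gamma: float = 0.99) -> List[float]:
    """Compute Double DQN targets for a batch of transitions.

    q_online_next[i] and q_target_next[i] are the Q-values of the next state
    of transition i, from the online and the target network respectively.
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        if done:
            # Terminal transition: no bootstrapping from the next state.
            targets.append(float(r))
            continue
        # Online network selects the action, target network evaluates it.
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        targets.append(r + gamma * q_tg[best_action])
    return targets
```

If the real code selects and evaluates the next action with the same network, or bootstraps on terminal states, that alone could explain part of the poor learning.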
My code repo is here: https://github.com/codetiger/MachineLearning-2048
Please let me know if you can review it and help me understand why my code does not learn anything, even after 1000 episodes.