This is my solution for RLCOMP2020, which reached the top 10. We tried many approaches and techniques, including variations of deep Q-learning and imitation learning. After a month of testing and research, I found that the key component of the initial agent is the use of a CNN: because the maps differ from one another, a 2D state representation gives robust performance and generalizes well across maps. Secondly, I tested many smaller combinations of the Rainbow algorithm's components, and the combination of double deep Q-learning, prioritized experience replay, and n-step learning gave the most promising metrics. I also wrote code that implements the learning-from-human-demonstrations paper. My teammates coded a heuristic bot based on the A* algorithm, and I used it as a demonstration generator in place of a human. However, due to a lack of memory, this approach has not been evaluated yet, though I consider it very promising.
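To illustrate how double deep Q-learning, n-step learning, and prioritized replay fit together, here is a minimal pure-Python sketch. It is not the code from this repository: the function names, the discount `gamma`, and the prioritization exponent `alpha` are all assumptions chosen for illustration, and the Q-values are passed in as plain lists rather than computed by a network.

```python
from typing import List, Sequence


def n_step_double_dqn_target(
    rewards: Sequence[float],
    online_q_next: Sequence[float],
    target_q_next: Sequence[float],
    done: bool,
    gamma: float = 0.99,  # assumed discount factor
) -> float:
    """Return an n-step target with a double-DQN bootstrap.

    rewards       -- the n rewards collected after the starting state
    online_q_next -- online-network Q-values at the state reached after n steps
    target_q_next -- target-network Q-values at that same state
    done          -- True if the episode ended within the n steps
    """
    n = len(rewards)
    # Discounted sum of the n observed rewards.
    n_step_return = sum((gamma ** k) * r for k, r in enumerate(rewards))
    if done:
        # No bootstrap term once the episode has terminated.
        return n_step_return
    # Double DQN: the online network selects the action,
    # the target network evaluates it.
    best_action = max(range(len(online_q_next)), key=lambda a: online_q_next[a])
    return n_step_return + (gamma ** n) * target_q_next[best_action]


def priorities_to_probabilities(
    td_errors: Sequence[float],
    alpha: float = 0.6,  # assumed prioritization exponent
    eps: float = 1e-6,   # keeps zero-error transitions sampleable
) -> List[float]:
    """Turn absolute TD errors into replay sampling probabilities."""
    scaled = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(scaled)
    return [s / total for s in scaled]
```

In a training loop, the target would be compared with the online network's Q-value for the stored action, and the resulting TD error would be written back to the replay buffer as that transition's new priority.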
forked from xphongvn/rlcomp2020
huyphan168/RLCOMP-2020
Languages
- Jupyter Notebook 89.8%
- Python 10.2%