Here it is: the simplest deep reinforcement learning algorithm ever written in PyTorch.
- Pick your model and create your agent
- Play a lot of sessions
- Select the most successful sessions in terms of reward
- Train your agent on those sessions
- Repeat
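
The selection step above can be sketched in plain Python. This is a minimal illustration and not the code from this repository: the function name `select_elites`, the `(states, actions, total_reward)` session layout, and the default percentile of 70 are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical session layout: (states, actions, total_reward).
# States are plain ints here only to keep the sketch self-contained.
Session = Tuple[List[int], List[int], float]


def select_elites(
    sessions: List[Session], percentile: float = 70.0
) -> Tuple[List[int], List[int], float]:
    """Keep the state/action pairs of sessions at or above a reward percentile."""
    if not sessions:
        raise ValueError("at least one session is required")

    # Find the reward threshold by rank in the sorted rewards.
    rewards = sorted(total_reward for _, _, total_reward in sessions)
    index = min(len(rewards) - 1, int(len(rewards) * percentile / 100))
    threshold = rewards[index]

    # Flatten the surviving sessions into one list of states and one of actions.
    elite_states: List[int] = []
    elite_actions: List[int] = []
    for states, actions, total_reward in sessions:
        if total_reward >= threshold:
            elite_states.extend(states)
            elite_actions.extend(actions)
    return elite_states, elite_actions, threshold
```

The returned states and actions could then serve as a supervised dataset (states as inputs, actions as targets) for one training step of the model, after which new sessions are played and the cycle repeats.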
This is essentially a fancy genetic algorithm: the model learns only from the best training data the sessions produce. The drawback is that unsuccessful sessions are discarded, so the agent never learns from its failures.
The algorithm is sample-inefficient, because many sessions are simply discarded, but it does reach a good result in the end.
This project is under the MIT License; use it however you want.
I have a lot of other fun projects, check them out:
- https://github.com/HectorPulido/Amazon-QLDB-Login-Example
- https://github.com/HectorPulido/Decentralized-Twitter-with-blockchain-as-base
- https://github.com/HectorPulido/Machine-learning-Framework-Csharp
- https://github.com/HectorPulido/Evolutionary-Neural-Networks-on-unity-for-bots
- https://github.com/HectorPulido/Imitation-learning-in-unity
- https://github.com/HectorPulido/Chatbot-seq2seq-C-
- Twitter: https://twitter.com/Hector_Pulido_
- Youtube: http://youtube.com/c/hectorandrespulidopalmar
- Twitch: https://www.twitch.tv/hector_pulido_