Behaviour Cloning on OpenAI Environment
- Train the behavioural policy network on a dataset D_collection gathered from a human or any perfect system. (We used a pre-trained neural network model of Lunar Lander as our master network.)
- Re-run the behavioural policy network on a new simulation and collect the data.
- Correctly label the generated data.
- Aggregate the generated and corrected data with D_collection and retrain.
- Repeat steps 2-5 and measure the performance.
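The steps above follow the pattern commonly called dataset aggregation (DAgger). A minimal sketch of that loop is given below. Everything in it is an illustrative assumption rather than this project's code: a toy one-dimensional environment stands in for Lunar Lander, the hypothetical `expert_action` function stands in for the pre-trained master network, and a lookup table stands in for the behavioural policy network.

```python
import random
from collections import Counter, defaultdict


def expert_action(state):
    """Hypothetical master policy: step toward the origin."""
    return -1 if state > 0 else 1


def fit_policy(dataset):
    """'Train' a tabular policy: majority label for every state in the dataset."""
    votes = defaultdict(Counter)
    for state, action in dataset:
        votes[state][action] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    # States never seen in training fall back to a fixed default action.
    return lambda state: table.get(state, 1)


def rollout(policy, rng, horizon=20):
    """Run a policy in a fresh toy episode and return the states it visits."""
    state = rng.randint(-5, 5)
    visited = []
    for _ in range(horizon):
        visited.append(state)
        state += policy(state)
    return visited


def agreement(policy, states):
    """Fraction of the given states where the policy matches the expert."""
    states = list(states)
    return sum(policy(s) == expert_action(s) for s in states) / len(states)


def dagger(iterations=5, seed=0):
    rng = random.Random(seed)
    # Initial D_collection: states visited by the expert, with expert labels.
    d_collection = [(s, expert_action(s)) for s in rollout(expert_action, rng)]
    policy = fit_policy(d_collection)
    for _ in range(iterations):
        states = rollout(policy, rng)                        # re-run the learner
        labelled = [(s, expert_action(s)) for s in states]   # relabel with the expert
        d_collection.extend(labelled)                        # aggregate
        policy = fit_policy(d_collection)                    # retrain
    return policy, d_collection


if __name__ == "__main__":
    policy, d_collection = dagger()
    print("dataset size:", len(d_collection))
    print("agreement on start states:", agreement(policy, range(-5, 6)))
```

The key point of the sketch is that the learner, not the expert, chooses which states are visited after the first round, while the expert supplies the labels. In a real setup the rollout would step a Gym environment and `fit_policy` would be a supervised training pass over the network.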
Special thanks to @nikhilbarhate99 for his PPO model.