inzva AI Projects #2 - Game Playing with Reinforcement Learning Project
This repository provides code, exercises and solutions for popular Reinforcement Learning algorithms.
We aim to learn RL algorithms and apply them to RL game environments.
- Cartpole-v0 with VPG: The pendulum starts upright, and the goal is to prevent it from falling over.
- Pong
- The Taxi: In this lab, you will train a taxi to pick up and drop off passengers.
- MountainCarContinuous-v0 with Actor Critic/PPO
- BipedalWalker-v2 with Deep Deterministic Policy Gradients (DDPG)/Genetic Algorithms
** Cross-Entropy Method
The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective. The method approximates the optimal importance sampling estimator by repeating two phases:
- Draw a sample from a probability distribution.
- Minimize the cross-entropy between this distribution and a target distribution to produce a better sample in the next iteration.
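
The two phases above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code used in this repository: it assumes a diagonal Gaussian sampling distribution, for which the cross-entropy-minimizing update reduces to refitting the mean and standard deviation on the best-scoring ("elite") samples. The names `cross_entropy_method`, `toy_objective`, `elite_frac` and the default values are hypothetical choices made for the example.

```python
import numpy as np


def cross_entropy_method(objective, mean, std, n_samples=50, elite_frac=0.2,
                         n_iters=30, seed=0):
    """Maximize `objective` with a diagonal-Gaussian cross-entropy method (sketch)."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    n_elite = max(1, int(n_samples * elite_frac))

    for _ in range(n_iters):
        # Phase 1: draw a sample from the current distribution.
        samples = rng.normal(mean, std, size=(n_samples, mean.size))
        scores = np.array([objective(x) for x in samples])

        # Phase 2: refit the distribution to the elite samples.
        # For a Gaussian family this is the cross-entropy-minimizing update.
        elite = samples[np.argsort(scores)[-n_elite:]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6  # small floor so sampling never fully collapses

    return mean


def toy_objective(x):
    # Hypothetical objective for illustration; its maximum is at (2, -1).
    return -np.sum((x - np.array([2.0, -1.0])) ** 2)


best = cross_entropy_method(toy_objective, mean=[0.0, 0.0], std=[2.0, 2.0])
print(best)
```

In a reinforcement learning setting, the sampled vector would typically hold the parameters of a policy and `objective` would return the total reward of one or more episodes played with that policy; that wiring is left out here to keep the sketch independent of any particular environment API.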