inzva AI Projects #2 - Game Playing with Reinforcement Learning Project


This repository provides code, exercises and solutions for popular Reinforcement Learning algorithms.


We aim to learn RL algorithms and apply them to RL game environments.

Discrete Space

  • Cartpole-v0 with VPG: The pendulum starts upright, and the goal is to prevent it from falling over.
  • Pong
  • The Taxi: In this lab, you will train a taxi to pick up and drop off passengers.

Continuous Space

  • MountainCarContinuous-v0 with Actor Critic/PPO
  • BipedalWalker-v2 with Deep Deterministic Policy Gradients (DDPG)/Genetic Algorithms

RL Algorithms

Cross-Entropy Method

The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective. The method approximates the optimal importance sampling estimator by repeating two phases:

  • Draw a sample from a probability distribution.

  • Minimize the cross-entropy between this distribution and a target distribution to produce a better sample in the next iteration.
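
The two phases above can be sketched for a simple optimization problem. This is a minimal illustration and not the code from this repository: it assumes a one-dimensional Gaussian sampling distribution, for which minimizing the cross-entropy reduces to refitting the mean and standard deviation to the best-scoring ("elite") samples. The function name `cross_entropy_maximize`, its parameters and defaults, and `toy_objective` are all hypothetical.

```python
import random
import statistics


def cross_entropy_maximize(objective, mu=0.0, sigma=5.0, n_samples=100,
                           n_elite=10, n_iters=50, seed=0):
    """Maximize a 1-D objective with a Gaussian sampling distribution.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    for _ in range(n_iters):
        # Phase 1: draw a sample from the current distribution N(mu, sigma^2).
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # Phase 2: refit the distribution to the elite samples. For a
        # Gaussian family, the cross-entropy minimizer is the elite sample
        # mean and standard deviation (the maximum-likelihood fit).
        elite = sorted(samples, key=objective, reverse=True)[:n_elite]
        mu = statistics.fmean(elite)
        sigma = statistics.pstdev(elite)
        if sigma < 1e-6:  # the distribution has collapsed onto a point
            break
    return mu


def toy_objective(x):
    # Hypothetical objective with a single maximum at x = 2.
    return -(x - 2.0) ** 2


best = cross_entropy_maximize(toy_objective)
print(f"estimated maximizer: {best:.3f}")
```

In an RL setting the sampled values would typically be policy parameters and the objective an episode return, but that mapping is an assumption here and depends on the environment.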
