This project applies basic RL concepts to various games with PyTorch.
More games (e.g., a baseball game) and RL methods (e.g., Double DQN, Dueling DQN) will be added soon.
The project requires Python 3.x, PyTorch, and NumPy.
Depending on the game, gym and other libraries are also needed.
Additional libraries per game:
- Dots - none
- CartPole - gym
- LunarLander - gym, box2d
Install them as follows:
# 1) gym and Box2D
pip install gym Box2D
- Dots
- CartPole
- LunarLander
- Baseball game (Coming Soon)
For the OpenAI games, the goal of each game is described here, and you can find more OpenAI games here.
The agent (blue) moves to maximize the score!
There are 4 kinds of blocks: blue, red, green, and white.
Blue - Agent. This is the only block we can move.
Red - Obstacle. The score decreases when the agent meets an obstacle.
Green - Item. The score increases when the agent meets an item.
White - Edge. It marks the border of the frame; the score also decreases when the agent tries to move into the edge.
States are given as a colored map, and there are 4 actions (Up: 0, Down: 1, Left: 2, Right: 3).
You can control the size of the map; the default is 5x5 (7x7 in total including the frame).
The original code of the game is here: https://github.com/awjuliani/DeepRL-Agents/blob/master/gridworld.py
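The rules above can be sketched in a few lines of plain Python. This is only an illustration, not the game code used in this project: the cell symbols, the helper names `make_map` and `step`, and the reward values of +1 and -1 are assumptions, and a collected item is simply removed from the map.

```python
# Hypothetical sketch of the Dots rules; names and reward values are assumptions.
AGENT, OBSTACLE, ITEM, EDGE, EMPTY = "B", "R", "G", "W", "."

# Action ids as listed above: Up: 0, Down: 1, Left: 2, Right: 3.
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}


def make_map(size=5):
    """Build a (size + 2) x (size + 2) grid whose outer ring is the white edge."""
    total = size + 2
    grid = [[EDGE] * total for _ in range(total)]
    for r in range(1, total - 1):
        for c in range(1, total - 1):
            grid[r][c] = EMPTY
    return grid


def step(grid, agent_pos, action):
    """Apply one action and return (new_position, reward)."""
    dr, dc = MOVES[action]
    r, c = agent_pos[0] + dr, agent_pos[1] + dc
    target = grid[r][c]
    if target == EDGE:
        # The agent stays in place and is penalized (penalty value assumed).
        return agent_pos, -1.0
    reward = 0.0
    if target == OBSTACLE:
        reward = -1.0  # assumed penalty
    elif target == ITEM:
        reward = 1.0   # assumed bonus
    grid[agent_pos[0]][agent_pos[1]] = EMPTY
    grid[r][c] = AGENT
    return (r, c), reward
```

For example, with the agent at `(1, 1)` and an item at `(1, 2)`, `step(grid, (1, 1), 3)` moves the agent right onto the item and returns a positive reward.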
We move the cart to keep the pole standing upright within the frame!
States are given as 4 values, and there are 2 actions (left, right).
You can easily use this game through the gym library:
env = gym.make('CartPole-v1')
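Once an environment is created, a random rollout is a quick way to check that it works. The helper below is a minimal sketch, not part of this project: `run_random_episode` is a hypothetical name, and it assumes a discrete action space with `n_actions` choices. It accepts both gym return conventions (older releases return 4 values from `step`, newer ones return 5 and a tuple from `reset`).

```python
import random


def run_random_episode(env, n_actions, max_steps=500):
    """Play one episode with uniformly random discrete actions; return the total reward."""
    out = env.reset()
    # Newer gym releases return (observation, info); older ones return only the observation.
    obs = out[0] if isinstance(out, tuple) else out
    total = 0.0
    for _ in range(max_steps):
        action = random.randrange(n_actions)
        result = env.step(action)
        if len(result) == 5:
            # Newer API: separate "terminated" and "truncated" flags.
            obs, reward, terminated, truncated, _info = result
            done = terminated or truncated
        else:
            # Older API: a single "done" flag.
            obs, reward, done, _info = result
        total += reward
        if done:
            break
    return total
```

With gym installed, `run_random_episode(gym.make('CartPole-v1'), n_actions=2)` should return the length of a random episode, since CartPole gives a reward of 1 per step.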
We aim to land the agent between the two flags on the moon's surface.
env = gym.make('LunarLander-v2') # discrete
or
env = gym.make('LunarLanderContinuous-v2') # continuous
Discrete version: the state is given as 8 float values and the action is one of 4 integer values.
Continuous version: the state is given as 8 float values and the action is 2 float values (each in -1 ~ +1).
On this game, the agent was trained with vanilla policy gradient and deep Q-learning.
Basic Q-learning (without memory buffer)
Actor-Critic with Advantage function
Soft Actor-Critic
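To make the first two methods in this list concrete, here is a minimal sketch of their core update rules. It is a simplification under stated assumptions, not the code of this project: it uses a tabular Q function stored in a dict instead of a PyTorch network, the names `q_update` and `td_advantage` are hypothetical, and the one-step TD advantage shown is only one common choice of advantage estimate.

```python
def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """One-step Q-learning update on the latest transition (no memory buffer)."""
    # No bootstrapping from the next state once the episode has ended.
    best_next = 0.0 if done else max(q[next_state])
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def td_advantage(reward, value, next_value, done, gamma=0.99):
    """One-step TD advantage estimate: r + gamma * V(s') - V(s)."""
    bootstrap = 0.0 if done else next_value
    return reward + gamma * bootstrap - value
```

Without a memory buffer, each transition is used for exactly one update right after it is observed, so consecutive updates are strongly correlated. That correlation is the usual motivation for adding replay memory in deep Q-learning.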