Skip to content
in progress
Python Shell
Branch: master
Clone or download
carpedm20 Merge pull request #3 from reiinakano/reiinakano-patch-2
Do not jump back to 0 during a jump
Latest commit 2cca458 Mar 22, 2019
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
assets
envs
models
utils
.gitignore
LICENSE
README.md
agent.py
config.py
install.sh
main.py
replay.py
requirements.txt
rl_utils.py
run.py
trainer.py

README.md

SPIRAL in TensorFlow (in progress)

TensorFlow implementation of Synthesizing Programs for Images using Reinforced Adversarial Learning (SPIRAL).

model

SPIRAL is an adversarially trained agent that generates a program which is executed by a graphics engine to interpret and sample images. This agent is trained to fool a discriminator with a distributed reinforcement learning without any supervision.

In short, Distributed RL + GAN + Program synthesis.

Prerequisites

Usage

Install prerequisites with:

./install.sh
pip install -r requirements.txt

To debug a SPIARL model:

python run.py --num_workers 8 --env simple --episode_length=1 \
              --location_size=8 --conditional=True \
              --loss=l2 --policy_batch_size=1

To train a SPIARL model:

python run.py --num_workers 16 --env simple_mnist --episode_length=3 \
              --color_channel=1 --location_size=32 --loss=gan --num_gpu=1 \
              --disc_dim=8 --conditional=False \
              --mnist_nums=1,7 --jump=False --curve=False

python run.py --num_workers 24 --env simple_mnist --episode_length=6 \
              --color_channel=1 --location_size=32 --loss=gan --num_gpu=2 \
              --disc_dim=64 --conditional=False \
              --mnist_nums=0,1,2,3,4,5,6,7,8,9 --jump=True

python run.py --num_workers 12 --env simple_mnist --episode_length=2 \
              --color_channel=1 --location_size=32 --conditional=True \
              --mnist_nums=1 --loss=gan

python run.py --num_workers 24 --env simple_mnist --episode_length=3 \
              --color_channel=1 --location_size=32 --conditional=True \
              --mnist_nums=1,2,7 --loss=l2

python run.py --num_workers 24 --env simple_mnist --episode_length=3 \
              --color_channel=1 --location_size=32 --conditional=True \
              --mnist_nums=1,2,7 --loss=gan --num_gpu=2

python run.py --num_workers 24 --env simple_mnist --episode_length=5 \
              --color_channel=1 --location_size=32 --conditional=True \
              --mnist_nums=0,1,2,7 --loss=gan --num_gpu=2

Results

(in progress)

Random generated samples at early stage:

model

Incorrectly converged samples at early stage:

model

Tensorboard:

model

To-do

  • IMPALA A2C
  • IMPALA V-trace
  • Simple environment (debugging)
  • Find a correct libmypaint setting
  • MNIST environment
  • ReplayThread (--loss=gan)
  • --num_gpu=2 test
  • --conditional=True (need more details)
  • Replay memory needs more detailed information
  • Population Based Training (to be honest, I don't have any plan for this)

References

This code is heavily based on openai/universe-starter-agent.

Author

Taehoon Kim / @carpedm20

You can’t perform that action at this time.