Jiminy is an imitation learning library that uses VNC as an interface and is meant to train agents on any environment, starting with World of Bits tasks.
Minimum requirements:
- Python >=3.5 (preferably in virtualenv or conda)
- Golang >=1.10
- numpy
- docker-compose (install on Linux using `./utils/install_compose.sh`)
Run the remote environment container:

```
docker run -it -p 5900:5900 -p 15900:15900 -p 90:6080 --ipc host --privileged --cap-add SYS_ADMIN sibeshkar/jiminywob:latest
```
You can view the environment at `HOSTNAME:90` in your browser.
Then clone and install the repository, and run the example agent:

```
git clone https://github.com/sibeshkar/jiminy
cd jiminy
pip install -e .
cd examples
pip install -r requirements.txt
./wob_remotes_new.py
```
Wait a few moments for the remote environment to reset to the sample environment that the agent uses: `wob.mini.BisectAngle-v0`. Check the `docker logs` of the remote environment container if the agent fails to connect. The agent interacts with the environment inside the remote container and receives a tuple of the form `(observation, reward, is_done, info)` with every interaction.
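For reference, the interaction follows a familiar Gym-style step loop. A minimal sketch, assuming the standard `gym.make`/`env.step` interface (the exact setup calls in `wob_remotes_new.py` may differ):

```python
import gym
import jiminy  # assumed import name; registers the wob.mini.* environments

env = gym.make('wob.mini.BisectAngle-v0')
observation = env.reset()
for _ in range(100):
    action = env.action_space.sample()  # random action, for illustration only
    observation, reward, is_done, info = env.step(action)
    if is_done:
        observation = env.reset()
```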
The core Jiminy infrastructure allows agents to train in parallel environments, as required by asynchronous RL methods such as A3C. The remote docker image starts a TigerVNC server and boots a Python control server, which uses Selenium to open a Chrome browser to an in-container page that loads MiniWoB environments. The `utils` folder contains helpful bash files to manage this architecture.
Follow these instructions to use it:
- Change directory to `utils`.
- Start 8 remote environments in docker containers, along with the dashboard: `docker-compose -f docker-compose-remotes.yml --compatibility up -d`
- View the environments while training at `HOSTNAME:80` in the browser.
- After completion, clean up using `docker-compose -f docker-compose-remotes.yml down`
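Once the remotes are up, an agent can attach to them over VNC. A hedged sketch, assuming Jiminy keeps the OpenAI Universe convention it derives from, where a remote is addressed as `vnc://HOST:VNC_PORT+REWARDER_PORT` and per-container ports increment from 5900/15900 (both assumptions; check the compose file for the actual port mappings):

```python
import gym
import jiminy  # assumed import name

env = gym.make('wob.mini.ClickTest2-v0')
# Hypothetical addressing: one vnc://host:vnc_port+rewarder_port entry per
# container, with ports assumed to increment per remote (verify against
# the docker-compose-remotes file).
remotes = ','.join('vnc://localhost:%d+%d' % (5900 + i, 15900 + i) for i in range(8))
env.configure(remotes=remotes)
observation_n = env.reset()  # one observation per remote
```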
Jiminy contains an example of training agents with A3C from scratch. Follow these steps to reproduce it:
- Clone and install this repository (preferably in a virtualenv or conda environment): `pip install -e .`
- Start 8 remote environments after changing into the `utils` directory: `docker-compose -f docker-compose-remotes.yaml --compatibility up -d`
- [OPTIONAL] Open the VNC viewer dashboard in the browser: `HOSTNAME:80`
- Move into the examples directory: `cd examples`
- Install requirements for the agent: `pip install -r requirements.txt`
- Train the agent: `./wob_click_train.py -n t1 --cuda` (`t1` is the name of the iteration)
All runs are stored in the `./examples/runs/` directory, and the best models are stored in `./examples/saves/`. You can inspect the training by starting `tensorboard --logdir=runs` in a separate terminal.
On a GTX 1050Ti, the above takes about one hour, i.e. 100K-150K frames, to reach a mean reward of 0.9.
If you just want to see how Jiminy handles arrays of tuples of the form `(observation_n, reward_n, done_n, info)` from the parallel environments, just run `./wob`
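For illustration, this is the shape of a batched step result from `n` parallel environments. The values below are made up, and the per-environment `info['n']` layout follows the Universe convention Jiminy is based on (an assumption):

```python
# Illustrative values only; a real step returns live observations.
n = 2
observation_n = [None] * n   # one observation per environment; None while a remote is resetting
reward_n = [0.0, 1.0]        # scalar reward per environment
done_n = [False, True]       # episode-finished flag per environment
info = {'n': [{}, {}]}       # per-environment info dicts under the 'n' key (assumed Universe convention)

for i in range(n):
    if done_n[i]:
        print('environment %d finished with reward %.1f' % (i, reward_n[i]))
```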
Start the environment `wob.mini.ClickTest2-v0` in a container. This exposes ports for the rewarder and the VNC proxy.
- Change directory to `utils`.
- Set the environment variable: `export ENV_NAME=wob.mini.ClickTest2-v0`
- Run `docker-compose -f docker-compose-demo.yaml --compatibility up -d`
- To record demonstrations, visit the noVNC client at `HOSTNAME:6080` and connect using the password `boxware`.
- After recording demonstrations, disconnect noVNC using the panel or close the tab.
All recorded demonstrations are stored inside `/tmp/completed-demos` inside the container, and will be automatically transferred to the `examples/completed-demos` directory on your machine.
This example lets you play a few games of TicTacToe and have an A3C agent imitate your demonstrations (and then optimize beyond them).
- Set `export ENV_NAME=wob.mini.TicTacToe-v0`
- Record demonstrations by following the instructions above. This stores the demonstrations in `examples/completed-demos`.
- Change directory to `examples` and start the training process by running: `./wob_click_train.py -n t1 --env wob.mini.TicTacToe --demo completed-demos/ --host localhost --cuda`