Jiminy is an imitation learning library that uses VNC as an interface and is meant to train agents on any environment, starting with World of Bits tasks.
Minimum requirements:
- Python >=3.5 (preferably in virtualenv or conda)
- Golang >=1.10
- numpy
- docker-compose (install on Linux using `./utils/install_compose.sh`)
Run the remote environment container:

```
docker run -it -p 5900:5900 -p 15900:15900 -p 90:6080 --ipc host --privileged --cap-add SYS_ADMIN sibeshkar/jiminywob:latest
```
You can view the environment at `HOSTNAME:90` in your browser.
Then clone and install the repository, and run the example agent:

```
git clone https://github.com/sibeshkar/jiminy
cd jiminy
pip install -e .
cd examples
pip install -r requirements.txt
./wob_remotes_new.py
```
Wait a few moments for the remote environment to reset to the sample environment that the agent uses: `wob.mini.BisectAngle-v0`. Check the `docker logs` of the remote environment container if the agent fails to connect. The agent interacts with the environment inside the remote container and receives a tuple of the form `(observation, reward, is_done, info)` with every interaction.
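For reference, the interaction follows a familiar Gym-style step loop. A minimal sketch, assuming the standard `gym.make`/`env.step` interface (the exact setup calls in `wob_remotes_new.py` may differ):

```python
import gym
import jiminy  # assumed import name; registers the wob.mini.* environments

env = gym.make('wob.mini.BisectAngle-v0')
observation = env.reset()
for _ in range(100):
    action = env.action_space.sample()  # random action, for illustration only
    observation, reward, is_done, info = env.step(action)
    if is_done:
        observation = env.reset()
```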
The core Jiminy infrastructure allows agents to train in parallel environments, as required by asynchronous RL methods such as A3C. The remote docker image starts a TigerVNC server and boots a Python control server, which uses Selenium to open a Chrome browser to an in-container page that loads MiniWoB environments. The `utils` folder contains helpful bash files to manage this architecture.
Follow these instructions to use it:
- Change directory to `utils`.
- Start 8 remote environments in docker containers, along with the dashboard: `docker-compose -f docker-compose-remotes.yml --compatibility up -d`
- View the environments while training at `HOSTNAME:80` in the browser.
- After completion, clean up using `docker-compose -f docker-compose-remotes.yml down`
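Once the remotes are up, an agent can attach to them over VNC. A hedged sketch, assuming Jiminy keeps the OpenAI Universe convention it derives from, where a remote is addressed as `vnc://HOST:VNC_PORT+REWARDER_PORT` and per-container ports increment from 5900/15900 (both assumptions; check the compose file for the actual port mappings):

```python
import gym
import jiminy  # assumed import name

env = gym.make('wob.mini.ClickTest2-v0')
# Hypothetical addressing: one vnc://host:vnc_port+rewarder_port entry per
# container, with ports assumed to increment per remote (verify against
# the docker-compose-remotes file).
remotes = ','.join('vnc://localhost:%d+%d' % (5900 + i, 15900 + i) for i in range(8))
env.configure(remotes=remotes)
observation_n = env.reset()  # one observation per remote
```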
Jiminy contains an example of training agents with A3C from scratch. Follow these steps to reproduce it:
- Clone and install this repository (preferably in a virtualenv or conda environment): `pip install -e .`
- Start 8 remote environments after changing into the `utils` directory: `docker-compose -f docker-compose-remotes.yaml --compatibility up -d`
- [OPTIONAL] Open the VNC viewer dashboard in the browser: `HOSTNAME:80`
- Move into the examples directory: `cd examples`
- Install requirements for the agent: `pip install -r requirements.txt`
- Train the agent: `./wob_click_train.py -n t1 --cuda` (`t1` is the name of the iteration)
All runs are stored in the `./examples/runs/` directory, and the best models are stored in `./examples/saves/`. You can inspect the training by starting `tensorboard --logdir=runs` in a separate terminal.
On a GTX 1050Ti, the above takes about one hour, i.e. 100K-150K frames, to reach a mean reward of 0.9.
If you just want to see how Jiminy handles arrays of tuples of the form `(observation_n, reward_n, done_n, info)` from the parallel environments, just run `./wob`
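For illustration, this is the shape of a batched step result from `n` parallel environments. The values below are made up, and the per-environment `info['n']` layout follows the Universe convention Jiminy is based on (an assumption):

```python
# Illustrative values only; a real step returns live observations.
n = 2
observation_n = [None] * n   # one observation per environment; None while a remote is resetting
reward_n = [0.0, 1.0]        # scalar reward per environment
done_n = [False, True]       # episode-finished flag per environment
info = {'n': [{}, {}]}       # per-environment info dicts under the 'n' key (assumed Universe convention)

for i in range(n):
    if done_n[i]:
        print('environment %d finished with reward %.1f' % (i, reward_n[i]))
```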
Start the environment `wob.mini.ClickTest2-v0` in a container. This exposes ports for the rewarder and the VNC proxy.
- Change directory to `utils`.
- Set the environment variable: `export ENV_NAME=wob.mini.ClickTest2-v0`
- Run `docker-compose -f docker-compose-demo.yaml --compatibility up -d`
- To record demonstrations, visit the noVNC client at `HOSTNAME:6080` and connect using the password `boxware`.
- After recording demonstrations, disconnect noVNC using the panel or close the tab.
All recorded demonstrations are stored inside `/tmp/completed-demos` inside the container, and will be automatically transferred to the `examples/completed-demos` directory on your machine.
This example lets you play a few games of TicTacToe and have an A3C agent imitate your demonstrations (and then optimize beyond them).
- Set `export ENV_NAME=wob.mini.TicTacToe-v0`
- Record demonstrations by following the instructions above. This stores the demonstrations in `examples/completed-demos`.
- Change directory to `examples` and start the training process by running: `./wob_click_train.py -n t1 --env wob.mini.TicTacToe --demo completed-demos/ --host localhost --cuda`