# MiniWoB intro


Mini World of Bits (MiniWoB) is an RL benchmark introduced by OpenAI researchers in 2017.

The original paper: http://proceedings.mlr.press/v70/shi17a/shi17a.pdf

The core idea is to create a set of browser-based tasks to be solved using RL methods. Every task is a small dynamic webpage that the agent interacts with using a mouse or keyboard. A reward is given for executing the correct sequence of actions. A description of the goal is included in the webpage itself.

In total, the benchmark introduced 80 problems of varying complexity, from trivial ones such as clicking a form button to very challenging ones such as booking a flight that satisfies given criteria.

The problems are available here: https://stanfordnlp.github.io/miniwob-plusplus/

Unfortunately, OpenAI discontinued the MiniWoB project, so it hasn't gained the popularity it deserves. After the OpenAI paper in 2017, MiniWoB was used in several research papers. The most notable ones are:

* [1802.08802 Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration](https://arxiv.org/abs/1802.08802)
* [1812.09195 Learning to Navigate the Web](https://arxiv.org/abs/1812.09195v1)
* [1902.07257v1 DOM-Q-NET: Grounded RL on Structured Language](https://arxiv.org/abs/1902.07257v1)

That is almost nothing compared to the popularity of Atari games. To help fix this, let's play with MiniWoB in this tutorial.

This tutorial uses the original MiniWoB. There is also an improved version from Stanford researchers, called [MiniWoB++](https://stanfordnlp.github.io/miniwob-plusplus/).


## Architecture

MiniWoB is implemented as a part of [OpenAI Universe](https://github.com/openai/universe) (another frozen OpenAI project). The idea of Universe is to use the VNC protocol to connect an RL agent with GUI applications. Since VNC is a cross-platform protocol that humans use to work with remote GUI applications, an RL agent gains the same ability (whether the agent is smart enough to use it is a different question).

MiniWoB is the part of Universe in which the GUI application is a browser with dynamic webpages loaded.

The overall architecture of Universe is shown below:

![Arch](images/arch.png)

The original MiniWoB Docker image is available [on quay.io](https://quay.io/repository/openai/universe.world-of-bits), but I suggest using my version, which fixes several stability issues. The fixed version is available [on dockerhub](https://cloud.docker.com/u/shmuma/repository/docker/shmuma/miniwob). If you want to build your own version of the fixed image, you can follow the [instructions here](https://github.com/PacktPublishing/Deep-Reinforcement-Learning-Hands-On/tree/master/Chapter13/wob_fixes).


## Starting the container

If you have Docker installed, you can start a single container with the following command:

`docker run -d -p 5900:5900 -p 15900:15900 --privileged --ipc host --cap-add SYS_ADMIN shmuma/miniwob run`

The options are:
* `-d` detaches the container from the terminal and keeps it running in the background
* `-p 5900:5900` forwards the VNC port to the host machine
* `-p 15900:15900` forwards the rewarder port
* `--privileged` gives extended privileges to the container (I'm not sure whether this is needed; it comes from the OpenAI manual)
* `--ipc host` uses the host's IPC namespace
* `--cap-add SYS_ADMIN` extends the container's privileges
* `shmuma/miniwob` is the name of the container image to start; you can use `quay.io/openai/universe.world-of-bits:0.20.0` to start the original MiniWoB image
* `run` is the command to start inside the container

During training, you may want to start several containers (to decrease the correlation between training samples). To simplify this, the repo includes two scripts: `containers_run.sh` and `containers_stop.sh`. The first starts the requested number of containers (given on the command line). The second stops all running containers (be careful: it stops ALL running containers, not only those started by `containers_run.sh`).

In [1]:
!cat containers_run.sh

#!/usr/bin/env bash
IMAGE_NAME=shmuma/miniwob

count=`docker ps -q | wc -l`

if test $count -ne 0 ; then
    echo You already have $count containers running, are you sure you want more?
    exit
fi

for i in `seq 1 ${1:-1}`; do
    echo Starting container $i
    P1=$((5900+$i-1))
    P2=$((15900+$i-1))
    docker run -d -p $P1:5900 -p $P2:15900 --privileged --ipc host --cap-add SYS_ADMIN $IMAGE_NAME run
done
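
The port arithmetic in `containers_run.sh` is easy to mirror on the Python side when you need to address several containers at once. The sketch below is only an illustration: the helper names `container_ports` and `remotes_url` are hypothetical, and the `vnc://host:vnc_port+rewarder_port` string is my assumption about the remotes format Universe expects, so check it against your Universe version.

```python
def container_ports(count, vnc_base=5900, rewarder_base=15900):
    """Return (vnc_port, rewarder_port) pairs for containers 1..count.

    Mirrors P1=$((5900+$i-1)) and P2=$((15900+$i-1)) from containers_run.sh.
    """
    return [(vnc_base + i, rewarder_base + i) for i in range(count)]


def remotes_url(count, hostname="localhost"):
    """Build one remotes string covering `count` containers.

    The "vnc://host:vnc+rewarder,host:vnc+rewarder" layout is an assumption.
    """
    hosts = ["%s:%d+%d" % (hostname, vnc, rew)
             for vnc, rew in container_ports(count)]
    return "vnc://" + ",".join(hosts)
```

For example, `container_ports(2)` gives the host ports for the first two containers, and `remotes_url(2)` joins them into a single string that could be passed to the environment configuration.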


In [2]:
!cat containers_stop.sh

#!/usr/bin/env bash

docker stop `docker ps -q`
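
Because `containers_stop.sh` stops every running container, a narrower variant may be safer on a machine that also runs unrelated containers. The sketch below is one possible approach and is not part of the repo: it relies on Docker's `ancestor` filter to list only containers created from a given image, then stops just those. The function names and the injectable `run` parameter are hypothetical and exist only to keep the example easy to test.

```python
import subprocess


def list_command(image):
    """Command that prints the IDs of running containers created from `image`."""
    return ["docker", "ps", "-q", "--filter", "ancestor=" + image]


def stop_image_containers(image="shmuma/miniwob", run=subprocess.run):
    """Stop only the running containers started from `image`; return their IDs."""
    result = run(list_command(image), capture_output=True, text=True, check=True)
    ids = result.stdout.split()
    if ids:  # `docker stop` without arguments is an error, so skip it
        run(["docker", "stop"] + ids, capture_output=True, text=True, check=True)
    return ids
```

Calling `stop_image_containers()` from a notebook cell should leave containers from other images untouched.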


In [4]:
!./containers_run.sh 1

Starting container 1
af325a2823471cf62f88a0f8a750da09b283faaf698a66de5befff5bd6e66884


In [6]:
!docker ps

CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                                                        NAMES
af325a282347        shmuma/miniwob      "/app/universe-envs/…"   20 seconds ago      Up 19 seconds       0.0.0.0:5900->5900/tcp, 5899/tcp, 0.0.0.0:15900->15900/tcp   quirky_elgamal


After this, you can connect to the running container using any of the many VNC clients available. If you're using macOS, a VNC client is already included in the OS: in Finder, press `Command+K` and connect to `vnc://localhost:5900`. The connection password is `openai`.

![](images/vnc.png)
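
If the VNC client cannot connect, it can help to check from the notebook whether the server is listening at all. A VNC (RFB) server opens every connection by sending a 12-byte version string such as `RFB 003.008`, so reading those bytes is a cheap liveness check. This is a minimal sketch: the function name `probe_vnc` is hypothetical, and the default host and port simply match the `docker run` command used earlier.

```python
import socket


def probe_vnc(host="localhost", port=5900, timeout=5.0):
    """Return the RFB version banner sent by the server, or None if unavailable."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            banner = b""
            while len(banner) < 12:  # the RFB version message is 12 bytes long
                chunk = sock.recv(12 - len(banner))
                if not chunk:  # server closed the connection early
                    break
                banner += chunk
    except OSError:  # refused, unreachable, or timed out
        return None
    if not banner.startswith(b"RFB "):
        return None
    return banner.decode("ascii", "replace").strip()
```

A non-`None` result only means that something speaking the RFB protocol answered on that port. It does not check the password or the state of the browser inside the container.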

Do not forget to stop the containers when you are done, as they are quite CPU-hungry.

In [7]:
!./containers_stop.sh

af325a282347


## Gym actions/observations