# Xmen

## The Python API

Experiments can be defined as Python classes. Parameters are treated as special attributes and the `run` method defines experiment execution:

```python
from xmen.experiment import Experiment, experiment_parser
import os
import time


class AnExperiment(Experiment):
    def __init__(self):
        """A basic python experiment demonstrating the features of the xmen api."""
        super(AnExperiment, self).__init__()
        # Let's define some parameters. This is done using the @p identifier.
        # By using Python typing you are able to define the name, type, default
        # and help for a parameter all in a single line. The TypedMeta metaclass
        # automatically takes care of identifying parameters and managing
        # documentation even before the class is instantiated!
        self.a: str = 'h'  # @p A parameter
        self.b: int = 17  # @p Another parameter

        # Normal attributes are still allowed
        self.c: int = 5  # This is not a parameter

    def run(self):
        # The execution of an experiment is defined by overriding the run method.
        print(f'a = {self.a}, b = {self.b}')

        # In order to be executed, an experiment must be linked with a directory
        # in which a params.yml file is created. This file is designed to allow the
        # execution environment to be reproduced. The experiment's status is updated
        # to running while the run method executes.
        print(f'The experiment state inside run is {self.status}')

        # The ``message`` method of the experiment class facilitates experiment
        # communication. Here the time is added to the _messages dictionary in params.yml.
        self.message({'time': time.time()})

        # Each experiment has its own unique directory. You are encouraged to
        # write out data accumulated through the execution (snapshots, logs etc.)
        # to this directory.
        with open(os.path.join(self.directory, 'logs.txt'), 'w') as f:
            f.write('This was written from a running experiment')


if __name__ == '__main__':
    # Now we expose the command line interface
    exp = AnExperiment()
    args = exp.parse_args()
    exp.main(args)
```

The above is a copy of the `experiment.py` module defined in this repo. The `TypedMeta` metaclass is responsible for identifying parameters and managing documentation. Because the module exposes a command line interface, we are then able to call:

In [1]:
!python3 experiment.py --help

usage: experiment.py [-h] [--update YAML_STRING] [--execute PARAMS]
                     [--to_root DIR] [--to_defaults DIR]
                     [--register ROOT NAME] [--debug DEBUG] [--name]

A basic python experiment demonstrating the features of the xmen api.

optional arguments:
  -h, --help            show this help message and exit
  --update YAML_STRING  Update the parameters given by a yaml string. Note this will be called beforeother flags and can be used in combination with --to_root, --to_defaults,and --register.
  --execute PARAMS      Execute the experiment from the given params.yml file. Cannot be called with update.
  --to_root DIR         Generate a run script and defaults.yml file for interfacing with the experiment manager. If the directory does not exist then it is first created.
  --to_defaults DIR     Generate a defaults.yml file from the experiment defaults in the given directory
  --register ROOT NAME  Register an experiment at root (1st positional) name (2ndpo

The help information is also available using `help`:

In [2]:
# Documentation is automatically added to the doc string of the class
from experiment import AnExperiment
help(AnExperiment)

Help on class AnExperiment in module experiment:

class AnExperiment(xmen.experiment.Experiment)
 |  Parameters:
 |      a (str): A parameter (default='h')
 |      b (int): Another parameter (default=17)
 |  
 |  Method resolution order:
 |      AnExperiment
 |      xmen.experiment.Experiment
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self)
 |      A basic python experiment demonstrating the features of the xmen api.
 |      
 |      Parameters:
 |          a (str): A parameter (default='h')
 |          b (int): Another parameter (default=17)
 |  
 |  run(self)
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from xmen.experiment.Experiment:
 |  
 |  __call__(self, *args, **kwargs)
 |      Used to run experiment. Upon entering the experiment status is updated to ``'running`` before ``args`` and
 |      ``kwargs`` are passed to ``run()``. If ``run()`` is successful the experiment ``status`` is updated to
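
The parameter documentation shown by `help` above follows a regular pattern, which hints at how `# @p` comments might be turned into documentation. The following is only a rough sketch of that idea, not xmen's actual `TypedMeta` implementation: the regular expression and the names `PARAM_LINE`, `find_parameters` and `format_docs` are all hypothetical.

```python
import re

# Hypothetical pattern for lines such as:  self.a: str = 'h'  # @p A parameter
PARAM_LINE = re.compile(
    r"self\.(?P<name>\w+)\s*:\s*(?P<type>[^=]+?)\s*=\s*(?P<default>.+?)"
    r"\s*#\s*@p\s*(?P<help>.*)$"
)


def find_parameters(source):
    """Return one dict per attribute line tagged with '# @p'."""
    params = []
    for line in source.splitlines():
        match = PARAM_LINE.search(line)
        if match:
            params.append({k: v.strip() for k, v in match.groupdict().items()})
    return params


def format_docs(params):
    """Format parameters in the style shown by help(AnExperiment)."""
    return [
        f"{p['name']} ({p['type']}): {p['help']} (default={p['default']})"
        for p in params
    ]


source = """
self.a: str = 'h'  # @p A parameter
self.b: int = 17  # @p Another parameter
self.c: int = 5  # This is not a parameter
"""
docs = format_docs(find_parameters(source))
print("\n".join(docs))
```

Only the two lines tagged with `# @p` are picked up; `self.c` is ignored because its comment carries no tag.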


## The Command Line Interface

The `xmen` command line tool is used to quickly set up and manage experiments. First, some setup:

In [3]:
# Clear previous examples
%cd ~
%rm -rf /private/tmp/experiments/set_a 
# These need to be set in order to work with a jupyter notebook
%alias xmen ~/xmen/python/xmen/main.py
%xmen config --disable_prompt

/Users/robweston


The tool is accessed through the `xmen` command:

In [4]:
%xmen --help

usage: xmen [-h]
            {config,note,init,register,list,run,reset,clean,rm,unlink,relink}
            ...

A helper module for the quick setup and management of experiments

positional arguments:
  {config,note,init,register,list,run,reset,clean,rm,unlink,relink}
    init                Initialise an experiment set.
    register            Register a set of experiments.
    list                List (all) experiments to screen
    run                 Run experiments matching glob in experiment set that
                        have not yetbeen run.
    reset               Reset an experiment to registered status
    clean               (DESTRUCTIVE) Remove unlinked experiments
    rm                  (DESTRUCTIVE) Remove an experiment set.
    unlink              Unlink experiments from experiment set
    relink              Relink experiments to global configuration or to a set
                        root

optional arguments:
  -h, --help            show this help message and exit

### Initialising an experiment set
Let's create a new set of experiments. First we add the experiment we defined earlier to the global configuration:

In [5]:
# Let's add the experiment we just defined to the local manager
%xmen config --add_path ~/xmen/python
%xmen config --add ~/xmen/examples/experiment.py
%xmen config -H ''

Let's create a new folder for a set of experiments:

In [6]:
%mkdir -p /tmp/experiments/set_a
%cd /tmp/experiments/set_a

/private/tmp/experiments/set_a


Now let's initialise an experiment set from the class we defined earlier:

In [7]:
%xmen init -n AnExperiment

AnExperiment
#!/bin/bash
# File generated on the 04:42PM March 09, 2020
# GIT:
# - repo /Users/robweston/xmen
# - branch master
# - remote ssh://git@mrgbuild.robots.ox.ac.uk:7999/~robw/xmen.git
# - commit bbe75180debc80b03c520c5574f4d7e88d262909

export PYTHONPATH="${PYTHONPATH}:/Users/robweston/xmen/examples"
python3 /Users/robweston/xmen/examples/experiment.py --execute ${1}
AnExperiment
Experiment root created at /private/tmp/experiments/set_a


An experiment set shares default parameters and a run script, `script.sh`:

In [8]:
!ls /private/tmp/experiments/set_a

defaults.yml   experiment.yml script.sh


The `defaults.yml` file is generated automatically from ``AnExperiment``. Parameters, their defaults and their help strings are all recorded. So too are the version and system information:

In [9]:
!echo "defaults.yml"
!echo "------------"
%cat defaults.yml
!echo ""
!echo "script.sh"
!echo "---------"
%cat script.sh

defaults.yml
------------
_created: 2020-03-09-16-42-13  #  The date the experiment was created (default=now_time)
_version: #  Experiment version information. See `get_version` (default=None)
  module: /Users/robweston/xmen/examples/experiment.py
  class: AnExperiment
  git:
    local: /Users/robweston/xmen
    remote: ssh://git@mrgbuild.robots.ox.ac.uk:7999/~robw/xmen.git
    commit: bbe75180debc80b03c520c5574f4d7e88d262909
    branch: master
_meta: #  The global configuration for the experiment manager (default=None)
  mac: '0x6c96cfdb71b9'
  host: dhcp45.robots.ox.ac.uk
  user: robweston
  home: /Users/robweston
a: h #  A parameter (default='h')
b: 17 #  Another parameter (default=17)

script.sh
---------
#!/bin/bash
# File generated on the 04:42PM March 09, 2020
# GIT:
# - repo /Users/robweston/xmen
# - branch master
# - remote ssh://git@mrgbuild.robots.ox.ac.uk:7999/~robw/xmen.git
# - commit bbe75180debc80b03c520c5574f4d7e88d262909

export PYTHONPATH="${PYTHONPATH}:/Users/robwest

### Registering Experiments

There are currently no experiments registered within the set:

In [10]:
%xmen list

No experiments found which match glob pattern /private/tmp/experiments/set_a. With parameter filter = None and type filter = None.


Let's register some:

In [11]:
# Let's register some experiments. The '|' operator acts like an or
%xmen register "{a: 1.0 | 2.0, b: None | cat}"

The parameters passed as a YAML dictionary override the defaults for each experiment. The `|` acts as an or operator. Experiments for all possible combinations of parameters will be generated:

In [12]:
%xmen list -v -p ".*"

    root           name              created          type purpose             mac                    host       user              home      status                                    commit    a     b
0  set_a  a=1.0__b=None  2020-03-09-16-42-25  AnExperiment          0x6c96cfdb71b9  dhcp45.robots.ox.ac.uk  robweston  /Users/robweston  registered  bbe75180debc80b03c520c5574f4d7e88d262909  1.0  None
1  set_a   a=1.0__b=cat  2020-03-09-16-42-25  AnExperiment          0x6c96cfdb71b9  dhcp45.robots.ox.ac.uk  robweston  /Users/robweston  registered  bbe75180debc80b03c520c5574f4d7e88d262909  1.0   cat
2  set_a  a=2.0__b=None  2020-03-09-16-42-25  AnExperiment          0x6c96cfdb71b9  dhcp45.robots.ox.ac.uk  robweston  /Users/robweston  registered  bbe75180debc80b03c520c5574f4d7e88d262909  2.0  None
3  set_a   a=2.0__b=cat  2020-03-09-16-42-25  AnExperiment          0x6c96cfdb71b9  dhcp45.robots.ox.ac.uk  robweston  /Users/robweston  registered  bbe75180debc80b03c520c5574f4d7e88d262909  2.0  
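
The way `|` expands into every combination of parameters can be pictured as a Cartesian product. The sketch below is an illustration only and not how xmen parses the YAML string: the `expand` helper is hypothetical, the options are written out by hand as lists, and values are kept as strings for simplicity.

```python
from itertools import product


def expand(options):
    """Yield (name, params) for every combination of the given options."""
    keys = list(options)
    for values in product(*(options[k] for k in keys)):
        params = dict(zip(keys, values))
        # Experiment names in the listing above join key=value pairs with '__'
        name = "__".join(f"{k}={v}" for k, v in params.items())
        yield name, params


# Hand-written equivalent of "{a: 1.0 | 2.0, b: None | cat}"
options = {"a": ["1.0", "2.0"], "b": ["None", "cat"]}
names = [name for name, _ in expand(options)]
print(names)
```

Two options for `a` and two for `b` give four experiments, matching the four rows listed above.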

### Note Taking

Notes can be added to the experiment for consultation at a later date:

In [13]:
%xmen note "A test of the xmen experiment suite"
%xmen note "Another random thought worth mentioning"
%xmen list -l

 0 /private/tmp/experiments/set_a
     |- a=1.0__b=None
     |- a=1.0__b=cat
     |- a=2.0__b=None
     |- a=2.0__b=cat
     Purpose: 
     Created: 2020-03-09-16-42-13
     Type: AnExperiment
     Notes: 
       A test of the xmen experiment suite
       Another random thought worth mentioning


### Running experiments

Now let's run the experiments:

In [14]:
%xmen list -sm

    root           name      status
0  set_a  a=1.0__b=None  registered
1  set_a   a=1.0__b=cat  registered
2  set_a  a=2.0__b=None  registered
3  set_a   a=2.0__b=cat  registered

Roots relative to: /private/tmp/experiments


In [15]:
xmen run "*" sh


Running: sh /private/tmp/experiments/set_a/a=1.0__b=cat/run.sh
a = 1.0, b = cat
The experiment state inside run is running
AnExperiment

Running: sh /private/tmp/experiments/set_a/a=1.0__b=None/run.sh
a = 1.0, b = None
The experiment state inside run is running
AnExperiment

Running: sh /private/tmp/experiments/set_a/a=2.0__b=cat/run.sh
a = 2.0, b = cat
The experiment state inside run is running
AnExperiment

Running: sh /private/tmp/experiments/set_a/a=2.0__b=None/run.sh
a = 2.0, b = None
The experiment state inside run is running
AnExperiment
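
The glob passed to `xmen run` selects which experiments are executed, and the second argument names the program used to launch each `run.sh`. As a loose sketch of that selection step, assuming for illustration that the glob is matched against experiment names (xmen may match against full paths instead), the hypothetical `build_commands` helper below builds one command per matching experiment without executing anything:

```python
from fnmatch import fnmatch


def build_commands(root, names, pattern, runner):
    """Return one argument list per experiment whose name matches the glob."""
    return [
        [runner, f"{root}/{name}/run.sh"]
        for name in names
        if fnmatch(name, pattern)
    ]


names = ["a=1.0__b=None", "a=1.0__b=cat", "a=2.0__b=None", "a=2.0__b=cat"]
commands = build_commands("/private/tmp/experiments/set_a", names, "a=1.0*", "sh")
for command in commands:
    print("Running:", " ".join(command))
```

Each argument list could then be handed to something like `subprocess.run`. Swapping `"sh"` for another launcher changes only the first element of each command.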


The experiments have now finished. Any messages sent during execution are recorded:

In [16]:
%xmen list -m

    root           name          time
0  set_a  a=1.0__b=None  1.583772e+09
1  set_a   a=1.0__b=cat  1.583772e+09
2  set_a  a=2.0__b=None  1.583772e+09
3  set_a   a=2.0__b=cat  1.583772e+09

Roots relative to: /private/tmp/experiments


Xmen allows the user to define how experiments should be run. In this case we used `sh`, but `docker`, `sbatch` and `screen` are equally valid, for example. For experiments that require additional configuration, a header can be added to the global config, which is then prepended to each `script.sh` generated by the experiment manager. For example, an sbatch header might look something like:

In [17]:
!cat ~/xmen/examples/header.txt

#SBATCH --nodes=1
#SBATCH --job-name=single_job
#SBATCH --time=1-00:00:00
#SBATCH --gres=gpu:rtx:1
#SBATCH --partition=htc-nova
#SBATCH --cpus-per-task=2
#SBATCH --mail-user=robw@robots.ox.ac.uk
#SBATCH --mail-type=ALL
#SBATCH --account=engs-a2i
# Author: Rob Weston
# Email: robw@robots.ox.ac.uk

# Let's activate an environment
conda activate xmen


# Maybe we want to set some environment variables
PATH1="..."
PATH2="..."
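
Prepending a header such as the one above to a generated script could be sketched as follows. This is an assumption-laden illustration rather than xmen's own code: the `with_header` helper is hypothetical, and placing the header directly after the shebang line is a choice made here so that `#SBATCH` directives come before any commands.

```python
def with_header(header, script):
    """Insert header lines after the shebang of a shell script."""
    lines = script.splitlines()
    if lines and lines[0].startswith("#!"):
        shebang, body = lines[:1], lines[1:]
    else:
        shebang, body = [], lines
    return "\n".join(shebang + header.splitlines() + body) + "\n"


header = "#SBATCH --nodes=1\n#SBATCH --time=1-00:00:00"
script = "#!/bin/bash\npython3 experiment.py --execute ${1}\n"
result = with_header(header, script)
print(result)
```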


### Scaling Up
Xmen allows experiments to be run remotely and linked locally. For example, locally the command:

```bash
xmen list --csv -v "*"
```

gives the following:

In [18]:
import pandas as pd
pd.set_option('display.max_columns', 1000)
pd.set_option('display.max_rows', 30)
pd.read_csv('~/xmen/examples/all.csv', skipfooter=1, index_col=0, engine='python')

| Unnamed: 0 | root | name | created | type | purpose | origin | mac | host | user | home | status | commit | Unnamed: 13 | last_checkpoint | step |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=2.0__w_cycle_b=2.0 | 05:21PM February 12, 2020 | ThereAndBack | Another experimemt | /data/engs-a2i/kebl4674/experiments/there_and_... | 0x848f69fd4f1c | arcus-htc-login02.arcus-htc.arc.local | kebl4674 | /home/kebl4674 | registered | b2abee15800a6afd7de17bb8cac380c3a8b06765 | | | |
| 1 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=2.0__w_cycle_b=3.0 | 05:21PM February 12, 2020 | ThereAndBack | Another experimemt | /data/engs-a2i/kebl4674/experiments/there_and_... | 0x848f69fd4f1c | arcus-htc-login02.arcus-htc.arc.local | kebl4674 | /home/kebl4674 | registered | b2abee15800a6afd7de17bb8cac380c3a8b06765 | | | |
| 2 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=2.0__w_cycle_b=4.0 | 05:21PM February 12, 2020 | ThereAndBack | Another experimemt | /data/engs-a2i/kebl4674/experiments/there_and_... | 0x848f69fd4f1c | arcus-htc-login02.arcus-htc.arc.local | kebl4674 | /home/kebl4674 | registered | b2abee15800a6afd7de17bb8cac380c3a8b06765 | | | |
| 3 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=3.0__w_cycle_b=2.0 | 05:21PM February 12, 2020 | ThereAndBack | Another experimemt | /data/engs-a2i/kebl4674/experiments/there_and_... | 0x848f69fd4f1c | arcus-htc-login02.arcus-htc.arc.local | kebl4674 | /home/kebl4674 | registered | b2abee15800a6afd7de17bb8cac380c3a8b06765 | | | |
| 4 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=3.0__w_cycle_b=3.0 | 05:21PM February 12, 2020 | ThereAndBack | Another experimemt | /data/engs-a2i/kebl4674/experiments/there_and_... | 0x848f69fd4f1c | arcus-htc-login02.arcus-htc.arc.local | kebl4674 | /home/kebl4674 | registered | b2abee15800a6afd7de17bb8cac380c3a8b06765 | | | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 184 | touchstone/there_and_back/initial_test_3 | impl=cycle_gan | 03:01PM November 12, 2019 | ThereAndBackCycleGan | After discriminator bug fix was found | /home/robw/data/experiments/there_and_back/ini... | 0xac220bb0d28f | touchstone | robw | /home/robw | error | 3f09ce7ed8ff8411be8096a38f98e6c547f1e084 | | step 126000, epoch 2 | |
| 185 | touchstone/there_and_back/initial_test_4 | impl=there_and_back | 12:49PM November 13, 2019 | ThereAndBackCycleGan | To test new debugs | /home/robw/data/experiments/there_and_back/ini... | 0xac220bb0d28f | touchstone | robw | /home/robw | error | ada2a4e6deb7d5804f8acd42cb59790162ab9ccd | | step 275000, epoch 5 | |
| 186 | touchstone/there_and_back/initial_test_4 | impl=cycle_gan | 12:49PM November 13, 2019 | ThereAndBackCycleGan | To test new debugs | /home/robw/data/experiments/there_and_back/ini... | 0xac220bb0d28f | touchstone | robw | /home/robw | error | ada2a4e6deb7d5804f8acd42cb59790162ab9ccd | | step 275000, epoch 5 | |
| 187 | touchstone/there_and_back/original_working | impl=cycle_gan | 07:29PM November 11, 2019 | ThereAndBackCycleGan | To test that original models still work | /home/robw/data/experiments/there_and_back/ori... | 0xac220bb0d28f | touchstone | robw | /home/robw | error | 1768ec2065f5450d0cb640ecddfe7ef63d2f8dd3 | | step 93000, epoch 1 | |


It is also possible to search for experiments by name and parameters, for example:

```bash
xmen list --csv "*" -p "w_.*" -n "ThereAndBack"
```

produces...

In [19]:
import pandas as pd
pd.read_csv('~/xmen/examples/param_match.csv', skipfooter=1, index_col=0, engine='python')

| Unnamed: 0 | root | name | w_id | w_cycle_a | w_cycle_b | w_align_a | w_align_b |
|---|---|---|---|---|---|---|---|
| 0 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=2.0__w_cycle_b=2.0 | | 2.0 | 2.0 | | |
| 1 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=2.0__w_cycle_b=3.0 | | 2.0 | 3.0 | | |
| 2 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=2.0__w_cycle_b=4.0 | | 2.0 | 4.0 | | |
| 3 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=3.0__w_cycle_b=2.0 | | 3.0 | 2.0 | | |
| 4 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=3.0__w_cycle_b=3.0 | | 3.0 | 3.0 | | |
| 5 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=3.0__w_cycle_b=4.0 | | 3.0 | 4.0 | | |
| 6 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=4.0__w_cycle_b=2.0 | | 4.0 | 2.0 | | |
| 7 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=4.0__w_cycle_b=3.0 | | 4.0 | 3.0 | | |
| 8 | arc/there_and_back_2/7-02-2019/test_2 | w_cycle_a=4.0__w_cycle_b=4.0 | | 4.0 | 4.0 | | |
| 9 | arc/there_and_back_2/7-02-2019/rad2real | use_gan=True__w_align_a=1.0__w_align_b=1.0 | | | | 1.0 | 1.0 |
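
The effect of a parameter filter such as `-p "w_.*"` can be pictured as keeping only those parameters whose names match a regular expression. The snippet below is a small sketch of that idea and not xmen's implementation: `filter_parameters` is a hypothetical helper and the two experiment dictionaries are made up for illustration.

```python
import re


def filter_parameters(experiments, pattern):
    """Keep only the parameters whose names match the regular expression."""
    regex = re.compile(pattern)
    return [
        {key: value for key, value in params.items() if regex.match(key)}
        for params in experiments
    ]


# Made-up parameter sets, loosely modelled on the column names above
experiments = [
    {"w_cycle_a": 2.0, "w_cycle_b": 3.0, "impl": "there_and_back"},
    {"w_align_a": 1.0, "w_align_b": 1.0, "use_gan": True},
]
selected = filter_parameters(experiments, "w_.*")
print(selected)
```

Parameters that an experiment does not define are simply absent from its dictionary, which corresponds to the empty cells in the listing above.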
