# HTMap

This is a notebook to show how the prototype `htmap` library (https://github.com/htcondor/htmap) works.

If you've messed up your cache directory somehow, running `htmap.clean()` will delete everything so you can start fresh.
Note that each map call is given a `map_id`, which must be unique.
To run a new map with the same `map_id`, you must remove the old one.
Tools for managing existing maps are shown in the section on management, after going through the interface sections.
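
The `map_id` rule above can be pictured with a toy registry. This is only an illustration of the rule as stated here, not how `htmap` stores maps; `ToyMapRegistry` and its methods are hypothetical names invented for this sketch.

```python
class ToyMapRegistry:
    """Toy stand-in for a map cache: at most one entry per map_id (hypothetical)."""

    def __init__(self):
        self._maps = {}

    def create(self, map_id, inputs):
        # Reusing a map_id is refused until the old entry has been removed.
        if map_id in self._maps:
            raise ValueError(f"map_id {map_id!r} already exists; remove it first")
        self._maps[map_id] = list(inputs)

    def remove(self, map_id):
        # Frees the map_id so it can be used again.
        del self._maps[map_id]

    def clean(self):
        # Forget everything, loosely analogous to starting fresh.
        self._maps.clear()


registry = ToyMapRegistry()
registry.create('double', range(10))

try:
    registry.create('double', range(5))
    reused = True
except ValueError:
    reused = False

# After removing the old entry, the same map_id is available again.
registry.remove('double')
registry.create('double', range(5))
```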

In [1]:
import htmap

In [2]:
htmap.clean()

These cells control which delivery mechanism is used to get Python to the execute node. Whichever cell is run last wins!

In [3]:
# switch to 'assume' delivery
htmap.settings['PYTHON_DELIVERY'] = 'assume'

In [4]:
# switch to 'docker' delivery
htmap.settings['PYTHON_DELIVERY'] = 'docker'
htmap.settings['DOCKER.IMAGE'] = 'maventree/htmap:latest'

## Functional Interface (`map`-like)

`htmap` currently has two interfaces. The first is a very "functional", map-based interface.

In [5]:
def double(x):
    return 2 * x

Python's built-in `map` function works like this:

In [6]:
doubled = list(map(double, range(10)))
doubled

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

To do the same with `htmap`, we just use the `map` function it provides instead. Note that `htmap` has persistence for completed jobs, so if you get a `clusterid` of `None`, you already have the outputs for all of your inputs cached.

In [7]:
result = htmap.map('double', double, range(10))
result

<MapResult(map_id = double)>

That function returns a `MapResult` which we can use to get information about the running jobs.

We can get a snapshot of the map progress by using the `status()` method on the `MapResult`:

In [8]:
result.status()

Map double (10 inputs): Held = 0 | Idle = 6 | Run = 4 | Done = 0

We can wait on the results of the map with a progress bar.

In [9]:
result.wait(show_progress_bar = True)

double: 100%|################################| 10/10 [00:05<00:00,  1.85input/s]


datetime.timedelta(0, 5, 18984)
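
The value shown above is a standard-library `datetime.timedelta`, whose positional arguments are days, seconds, and microseconds. A small sketch of turning it into a plain number of seconds, using the value from the output above:

```python
import datetime

# Same value as the output above: 0 days, 5 seconds, 18984 microseconds.
elapsed = datetime.timedelta(0, 5, 18984)

# total_seconds() folds all three fields into one float.
elapsed_seconds = elapsed.total_seconds()
print(f"map finished in {elapsed_seconds:.2f} s")
```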

To see the results, we iterate over the `MapResult` (passing it into the `list` constructor does this internally).

In [10]:
doubled_htc = list(result)
doubled_htc

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

## Functional Interface w/ Decorator

This variant has the same functional flavor to it, but uses a decorator on the function itself.

For those who care, the plain `htmap.map` call above is doing the same thing, but just hides the decorator from you.

I'll also use a slightly more complicated function to show off some other features. This function has two arguments, and one of them is a keyword argument.

In [11]:
@htmap.htmap
def power(x, p = 1):
    return x ** p

power

<MappedFunction(func = <function power at 0x7fad7c10ac80>, map_options = {'request_memory': '100MB', 'request_disk': '1GB'})>

As you can see, `power` is not actually a function, but instead a `MappedFunction` which has a reference to the real function inside it. Because the wrapper object is itself callable, you can still call it like a normal function, running entirely locally:

In [12]:
power(5, 3)

125
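
For the curious, here is a minimal sketch of how a decorator can hand back an object that still behaves like the original function. It is an assumption about the general technique (a class with `__call__`), not a copy of the library's code; `ToyMappedFunction` and `toy_power` are hypothetical names.

```python
class ToyMappedFunction:
    """Hypothetical wrapper: keeps a reference to the real function."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        # Calling the wrapper just forwards to the wrapped function, locally.
        return self.func(*args, **kwargs)

    def __repr__(self):
        return f"<ToyMappedFunction(func = {self.func.__name__})>"


@ToyMappedFunction
def toy_power(x, p=1):
    return x ** p


# toy_power is now a ToyMappedFunction instance, but it can still be called.
local_value = toy_power(5, 3)
print(toy_power, local_value)
```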

We can't use `map` now because it only accepts a one-dimensional input. Instead, we'll use `starmap`. Both `map` and `starmap` are now methods of the `MappedFunction` object. That does mean we have to contort things a little so that we're passing a list of tuples (positional arguments) and a list of dictionaries (keyword arguments) to `starmap`, which looks a little weird.

In [13]:
xs = [(x,) for x in range(10)]
powers = [{'p': p} for p in range(10)]

power_result = power.starmap('power', xs, powers)
power_result

<MapResult(map_id = power)>

We can iterate over the result ourselves. By doing it this way, they'll come back in order as soon as possible. The outputs should be 0^0, 1^1, 2^2, 3^3, etc. We'll use the `iter_with_inputs` method to see how the inputs are mapped to the outputs.

In [14]:
for inp, out in power_result.iter_with_inputs():
    print(f'{inp} -> {out}')

((0,), {'p': 0}) -> 1
((1,), {'p': 1}) -> 1
((2,), {'p': 2}) -> 4
((3,), {'p': 3}) -> 27
((4,), {'p': 4}) -> 256
((5,), {'p': 5}) -> 3125
((6,), {'p': 6}) -> 46656
((7,), {'p': 7}) -> 823543
((8,), {'p': 8}) -> 16777216
((9,), {'p': 9}) -> 387420489
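
To make the pairing of tuples and dictionaries concrete, here is a purely local sketch of the same computation. `local_starmap` and `plain_power` are hypothetical helpers written for this example; they only mimic how each argument tuple is matched with the keyword dictionary at the same position.

```python
def plain_power(x, p=1):
    return x ** p


def local_starmap(func, args, kwargs):
    """Call func once per (args, kwargs) pair, in order (local stand-in)."""
    return [func(*a, **k) for a, k in zip(args, kwargs)]


local_xs = [(x,) for x in range(10)]           # positional arguments, one tuple per call
local_ps = [{'p': p} for p in range(10)]       # keyword arguments, one dict per call

local_powers = local_starmap(plain_power, local_xs, local_ps)
print(local_powers)
```

Note that Python evaluates `0 ** 0` as `1`, which is why the first output above is 1.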


## Looping Interface

The other interface is built to look like the looping constructs that people are probably already using before they start doing any HTC.

It relies on Python's `with` statement, which lets you run code before and after a block of code runs. It looks like this:

In [15]:
def triple(x):
    return 3 * x

In [16]:
with htmap.build_map('triple', triple) as map_builder:
    for x in range(10):
        map_builder(x)
        
triple_result = map_builder.result
triple_result

<MapResult(map_id = triple)>

Note that once we create the `MapBuilder`, stored in the variable `map_builder`, we can just call it as if it were the function we wanted to do a map on. The `MapBuilder` catches the calls and feeds them into the same backend that does the mapping above. I really like this because it's super-simple: you don't need to do anything weird with the arguments to fit them into the right shape for the map. If you can call your function normally, you can slap it in this `with` block, replace it with the `MapBuilder`, and do the map.
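
The "catch the calls, then run the map when the block ends" idea can be sketched with an ordinary context manager. This is a local toy under my own assumptions, not the library's `MapBuilder`; `ToyMapBuilder` and `triple_local` are hypothetical names, and the "map" here is just a list comprehension.

```python
class ToyMapBuilder:
    """Hypothetical builder: records calls inside the with block, runs them on exit."""

    def __init__(self, func):
        self.func = func
        self.calls = []      # recorded (args, kwargs) pairs
        self.result = None   # filled in when the with block exits cleanly

    def __call__(self, *args, **kwargs):
        # Nothing is computed yet; the call is only recorded.
        self.calls.append((args, kwargs))

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        # Only "submit" the recorded calls if the block finished without an error.
        if exc_type is None:
            self.result = [self.func(*a, **k) for a, k in self.calls]
        return False  # never swallow exceptions


def triple_local(x):
    return 3 * x


with ToyMapBuilder(triple_local) as toy_builder:
    for x in range(10):
        toy_builder(x)

toy_result = toy_builder.result
```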

This time we'll iterate in an unordered way, as jobs come back (the previous iterators went in order, as available).

In [17]:
for r in triple_result.iter_as_available():
    print(r)

0
3
6
9
12
15
18
21
24
27
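
The difference between "in order" and "as available" iteration is the same one the standard library draws between walking a list of futures and using `concurrent.futures.as_completed`. The sketch below is a local analogy with threads, not HTCondor jobs; `slow_triple` is a hypothetical function whose sleep times make later inputs tend to finish first.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_triple(x):
    # Smaller inputs sleep longer, so completion order tends to differ from input order.
    time.sleep(0.01 * (5 - x))
    return 3 * x


with ThreadPoolExecutor(max_workers=5) as pool:
    futures = [pool.submit(slow_triple, x) for x in range(5)]

    # Yields results in whatever order the work finishes.
    as_available = [f.result() for f in as_completed(futures)]

    # Walks the futures in submission order, waiting on each in turn.
    in_order = [f.result() for f in futures]

print(as_available)
print(in_order)
```

Both lists contain the same values; only the ordering guarantee differs.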


## Looping Interface w/ Decorator

Again, it's essentially the same; the only difference is that `build_map` is a method of the decorated function.

In [18]:
@htmap.htmap
def quadruple(x):
    return 4 * x

In [19]:
with quadruple.build_map('quadruple') as map_builder:
    for x in range(10):
        map_builder(x)
        
quadruple_result = map_builder.result
quadruple_result

<MapResult(map_id = quadruple)>

In [20]:
for r in quadruple_result:
    print(r)

0
4
8
12
16
20
24
28
32
36


## Controlling Maps

You can interact with the jobs behind a map by calling methods on the `MapResult`. Let's define a sleepy function so that we have time to interact with the jobs while they're running.

I'll use the command line `condor_q` here to prove that it's really working, along with the `MapResult`'s own `status()` method.

In [21]:
import time

@htmap.htmap
def sleep_and_double(x):
    time.sleep(10)
    return 2 * x

We can kill all the jobs associated with a `MapResult` using the `remove()` method.
This also removes all of the input, output, and log files associated with that map.
Therefore, this also frees up the `map_id` to use for another map.

In [22]:
sleepy_result = sleep_and_double.map('sleepy', range(10))

time.sleep(3)

!condor_q
print(sleepy_result.status())

sleepy_result.remove()

time.sleep(3)

!condor_q
# print(sleepy_result.status())



-- Schedd: jupyter0000.chtc.wisc.edu : <127.0.0.1:9618?... @ 09/21/18 16:18:35
OWNER  BATCH_NAME    SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
karpel sleepy       9/21 16:18      _      1      9     10 242.0-9

Total for query: 10 jobs; 0 completed, 0 removed, 9 idle, 1 running, 0 held, 0 suspended 
Total for karpel: 10 jobs; 0 completed, 0 removed, 9 idle, 1 running, 0 held, 0 suspended 
Total for all users: 16 jobs; 0 completed, 0 removed, 9 idle, 1 running, 6 held, 0 suspended

Map sleepy (10 inputs): Held = 0 | Idle = 9 | Run = 1 | Done = 0


-- Schedd: jupyter0000.chtc.wisc.edu : <127.0.0.1:9618?... @ 09/21/18 16:18:38
OWNER BATCH_NAME      SUBMITTED   DONE   RUN    IDLE   HOLD  TOTAL JOB_IDS

Total for query: 0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended 
Total for karpel: 0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended 
Total for all users: 6 jobs; 0 completed, 0 removed, 0 idle, 0 running, 6 held, 0 suspended



We can also hold and release jobs (and the rest of the job actions, but I won't go over them here).

In [23]:
sleepy_result = sleep_and_double.map('sleepy', range(10))

time.sleep(3)

!condor_q
print(sleepy_result.status())

hold_output = sleepy_result.hold()

time.sleep(1)

print(sleepy_result.hold_reasons())

!condor_q
print(sleepy_result.status())

sleepy_result.release()

time.sleep(1)

!condor_q
print(sleepy_result.status())



-- Schedd: jupyter0000.chtc.wisc.edu : <127.0.0.1:9618?... @ 09/21/18 16:18:41
OWNER  BATCH_NAME    SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
karpel sleepy       9/21 16:18      _      _     10     10 243.0-9

Total for query: 10 jobs; 0 completed, 0 removed, 10 idle, 0 running, 0 held, 0 suspended 
Total for karpel: 10 jobs; 0 completed, 0 removed, 10 idle, 0 running, 0 held, 0 suspended 
Total for all users: 16 jobs; 0 completed, 0 removed, 10 idle, 0 running, 6 held, 0 suspended

Map sleepy (10 inputs): Held = 0 | Idle = 10 | Run = 0 | Done = 0
 Input Index │ Hold Reason Code │                Hold Reason                
─────────────┼──────────────────┼───────────────────────────────────────────
      0      │        1         │ Python-initiated action. (by user karpel)
      1      │        1         │ Python-initiated action. (by user karpel)
      2      │        1         │ Python-initiated action. (by user karpel)
      3      │        1         │ Python-initiated action.

## Map ID Management

To get a list of all of the `map_id`s you have stored, call `htmap.map_ids()`:

In [24]:
maps = htmap.map_ids()
maps

('double', 'quadruple', 'sleepy', 'triple', 'power')

You can look at the status of all your maps using the module-level `status` function:

In [25]:
print(htmap.status())

   Map ID  │ Held │ Idle │ Run │ Done │   Data  
───────────┼──────┼──────┼─────┼──────┼─────────
   double  │  0   │  0   │  0  │  10  │ 19.9 KB
 quadruple │  0   │  0   │  0  │  10  │ 20.0 KB
   sleepy  │  0   │  10  │  0  │  0   │  5.9 KB
   triple  │  0   │  0   │  0  │  10  │ 19.6 KB
   power   │  0   │  0   │  0  │  10  │ 20.1 KB
───────────┴──────┴──────┴─────┴──────┴─────────


To recover an existing `map_id`, use the module-level `recover` function:

In [26]:
recovered_result = htmap.recover(maps[0])
print(list(recovered_result))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]


## Error Handling

Let's make a job that we know will experience an exception on the execute node.

In [27]:
@htmap.htmap
def bad(x):
    return x / 0

In [28]:
bad_result = bad.map('bad', range(10))
bad_result

<MapResult(map_id = bad)>

Wait for the jobs to finish. We can't use the `wait()` method because it watches for output files, and these jobs will never produce any. Instead, look at the cluster log (this runs forever, so you'll need to cancel the cell at some point using the stop button, the black square in the notebook toolbar):

In [None]:
bad_result.tail()

Now we can inspect the stdout and stderr of each job using the `output` and `error` methods on the `MapResult`.
The argument is the index of the input.

In [30]:
print(bad_result.output(0))

Landed on execute node karpel-244.0-jupyter0000.chtc.wisc.edu (172.17.0.3) at 2018-09-21 21:19:10.760341
Local directory contents:
    /var/lib/condor/execute/dir_225944/d238998870ae18a399d03477dad0c0a8.in
    /var/lib/condor/execute/dir_225944/_condor_stderr
    /var/lib/condor/execute/dir_225944/docker_stderror
    /var/lib/condor/execute/dir_225944/.job.ad
    /var/lib/condor/execute/dir_225944/_condor_stdout
    /var/lib/condor/execute/dir_225944/func
    /var/lib/condor/execute/dir_225944/.docker_sock
    /var/lib/condor/execute/dir_225944/.update.ad
    /var/lib/condor/execute/dir_225944/.machine.ad
    /var/lib/condor/execute/dir_225944/.chirp.config
    /var/lib/condor/execute/dir_225944/condor_exec.exe

Running
    <function bad at 0x7f1a0a130a60>
with args
    (0,)
and kwargs
    {}



In [31]:
print(bad_result.error(0))

Traceback (most recent call last):
  File "./condor_exec.exe", line 60, in <module>
    main(arg_hash = sys.argv[1])
  File "./condor_exec.exe", line 56, in main
    run_func(arg_hash = arg_hash)
  File "./condor_exec.exe", line 46, in run_func
    output = fn(*args, **kwargs)
  File "<ipython-input-27-ed5913b3e3bc>", line 3, in bad
ZeroDivisionError: division by zero
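
The same kind of traceback text can be reproduced locally with the standard-library `traceback` module. This is a sketch of the general pattern of running a function and capturing its error output, not the actual contents of `condor_exec.exe`; `bad_local` and `run_and_capture` are hypothetical names.

```python
import traceback


def bad_local(x):
    return x / 0


def run_and_capture(func, *args, **kwargs):
    """Return (output, error_text); exactly one of the two is None."""
    try:
        return func(*args, **kwargs), None
    except Exception:
        # format_exc() gives the same text Python would print to stderr.
        return None, traceback.format_exc()


output, error_text = run_and_capture(bad_local, 0)
print(error_text)
```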



In [32]:
bad_result.remove()