## Hello cruel world demo of HTCondor
This uses a simple Python script to demonstrate a general HTCondor submit file. Of course, the intent is that the trivial example be replaced with meaningful work!

The general idea is that many applications need to run a sequence of executions, each tagged with a number that it can take as input and use to label its output. To get going, make a simple Python script that accepts a single value as input and prints that value both to a file and to `stdout`.

### script `cruel.py`
The function `cruel` accepts a single number as input and prints to both a file and the screen. It also sleeps for five seconds to simulate an executable that takes some time to run.

In [None]:
import sys
import time


# define a little hello cruel world function
def cruel(run_number):
    # print out a simple statement to a file and include the run number
    with open('outfile{0}.dat'.format(run_number), 'w') as ofp:
        outmsg = 'Hello cruel world! I am but a number and it is {0}'.format(run_number)
        ofp.write(outmsg)

    # write some stuff to the screen as well just for fun
    print ('Hi there, from number {0}\n'.format(run_number))
    print ('Waiting for 5 long seconds')
    for i in range(5):
        sys.stdout.write(u'.')
        sys.stdout.flush()
        time.sleep(1)
    print ('\n')
    return outmsg
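
The worker scripts in the HTCondor sections call `python cruel.py <number>`, so the script needs an entry point that reads the number from the command line. The cell above does not show one. A minimal sketch, assuming the run number arrives as the first argument, might look like this (`main` is a hypothetical helper, and the shortened `cruel` only stands in for the function above so the sketch runs on its own):

```python
import sys


def cruel(run_number):
    # shortened stand-in for the function defined above
    outmsg = 'Hello cruel world! I am but a number and it is {0}'.format(run_number)
    with open('outfile{0}.dat'.format(run_number), 'w') as ofp:
        ofp.write(outmsg)
    return outmsg


def main(argv):
    # hypothetical helper: argv[0] is the script name, argv[1] the run number
    try:
        run_number = int(argv[1])
    except (IndexError, ValueError):
        sys.stderr.write('usage: python cruel.py <run_number>\n')
        return 1
    cruel(run_number)
    return 0


# only runs when the file is executed as a script, not when it is imported
if __name__ == '__main__':
    main(sys.argv)
```

Keeping the parsing in `main` leaves `cruel` importable by other scripts without side effects.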

### Run it with argument `99` and see the output
Note that it printed to `stdout` and also wrote a file called `outfile99.dat`.

In [None]:
cruel(99)

# Parallel on local machine
This simple example can be run for a variety of input values. Since each run is independent, this is "embarrassingly" or "pleasingly" parallel. So, we should be able to run it in parallel without needing to modify the function `cruel` at all. The simplest way is to use the `multiprocessing` module in Python.

### script `parallel_cruel.py` in `HolaCruelWorld/demo_parallel_local`

To run this script, enter `python parallel_cruel.py 0 99 8 2 4`, where the space-delimited numbers are the values for which the function `cruel` will be run. The code below is modified slightly from `parallel_cruel.py` to run in this directory. It is better to run it from the command line, though, as it is a little clunky within the IPython notebook.


In [None]:
# import the hello cruel world function
# from cruel import cruel --> don't need to import since cruel is already defined above in this notebook
from multiprocessing import Pool
import sys

# read in the arguments passed from command line 
# args = [int(i) for i in sys.argv[1:]] --> just specify in place to run in the notebook
args = [0, 99, 8, 2, 4]
# set up a pool with 4 worker processes
pool = Pool(4)

# run all the arguments
results = pool.map(cruel, args)

# shut the pool down now that all results are in
pool.close()
pool.join()

# now write out the results to a file
with open('alldata.dat', 'w') as ofp:
    for cresult in results:
        ofp.write('{0}\n'.format(cresult))
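
For comparison, the command-line form described above could be sketched as follows. This is not the actual `parallel_cruel.py`, only an illustration under some assumptions: Python 3.3 or later (for `Pool` as a context manager), a stand-in `cruel` in place of `from cruel import cruel`, and two hypothetical helpers, `parse_run_numbers` and `main`.

```python
import sys
from multiprocessing import Pool


def cruel(run_number):
    # stand-in; the real script would use `from cruel import cruel`
    return 'Hello cruel world! I am but a number and it is {0}'.format(run_number)


def parse_run_numbers(argv):
    # hypothetical helper: turn ['0', '99', '8'] into [0, 99, 8]
    return [int(arg) for arg in argv]


def main(argv):
    try:
        run_numbers = parse_run_numbers(argv)
    except ValueError:
        run_numbers = []
    if not run_numbers:
        sys.stderr.write('usage: python parallel_cruel.py N [N ...]\n')
        return 1
    # the pool is shut down when the with-block ends, after map() has returned
    with Pool(4) as pool:
        results = pool.map(cruel, run_numbers)
    # write one result per line, in the same order as the arguments
    with open('alldata.dat', 'w') as ofp:
        for cresult in results:
            ofp.write('{0}\n'.format(cresult))
    return 0


# the guard keeps worker processes from re-running main() if they import this file
if __name__ == '__main__':
    main(sys.argv[1:])
```

The `if __name__ == '__main__'` guard matters on platforms where `multiprocessing` starts workers by importing the script again rather than forking.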



# Parallel on HTCondor --> simplest case
In the directory `HolaCruelWorld` are both a submit file, `test0.sub`, and the executable it will call, `worker0.sh`.

### `test0.sub`

Important notes:
 * `notification=Never` is __VITAL__ to avoid getting an email for every run!
 * The executable is `worker0.sh`. Its argument is the sequential number assigned to each run (`$(Process)`), which it passes on to the `cruel` function.
 * No information is given on transferring output, so all new files generated on each worker will be returned to the submitting directory.
 * The `queue` statement specifies how many jobs (workers) to run.

```
notification=Never
universe = vanilla
log = log/worker_$(Cluster).log
output = log/worker_$(Cluster)_$(Process).out
error = log/worker_$(Cluster)_$(Process).err
executable = worker0.sh
stream_output = True
stream_error = True
arguments = $(Process)
requirements =((Target.OpSys=="LINUX")  && (Target.Arch=="X86_64"))
request_memory = 500
when_to_transfer_output = ON_EXIT
should_transfer_files = yes
transfer_input_files =  cruel.py
queue 2 
```
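
To see what the `$(Cluster)` and `$(Process)` macros turn into for each queued job, the substitution can be imitated in a few lines of Python. This is only an illustration of the naming pattern, not how HTCondor expands macros internally; `expand` and `preview` are hypothetical helpers, and the cluster number 17 is just an example value.

```python
def expand(template, cluster, process):
    # illustration only: substitute the two macros used in test0.sub
    return (template.replace('$(Cluster)', str(cluster))
                    .replace('$(Process)', str(process)))


def preview(cluster, n_jobs):
    # one row per queued job: (arguments, stdout file, stderr file)
    rows = []
    for process in range(n_jobs):
        rows.append((
            expand('$(Process)', cluster, process),
            expand('log/worker_$(Cluster)_$(Process).out', cluster, process),
            expand('log/worker_$(Cluster)_$(Process).err', cluster, process),
        ))
    return rows


# `queue 2` gives process numbers 0 and 1
for row in preview(cluster=17, n_jobs=2):
    print(row)
```

Each job gets its own `.out` and `.err` file, while the single `log/worker_$(Cluster).log` file is shared by the whole cluster.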

### `worker0.sh`
A very simple shell script that accepts the sequential run number as input and passes it on to the `cruel` function.

```
#!/bin/sh

# grab the number passed in from the master
run_number=$1
# run python script with the number passed in
python cruel.py $run_number
```

## Test this out!

Simply type `condor_status` to make sure resources are available. Output should look like:

```
Name                               OpSys      Arch   State     Activity LoadAv Mem  ActivityTime

slot1@igsarmewfsm000.xx.xxx.edu   LINUX      X86_64 Unclaimed Idle      0.070 2460  10+01:41:19
slot2@igsarmewfsm000.xx.xxx.edu   LINUX      X86_64 Unclaimed Idle      0.000 2460  10+01:41:52
... ... ... ...
slot1@IGSARMEWWSM46.xx.xxx.edu     WINDOWS    X86_64 Unclaimed Idle      0.570 32573   9+14:39:53
slot1@IGSARMEWWSM47.xx.xxx.edu     WINDOWS    X86_64 Unclaimed Idle      0.620 32708   3+00:34:42
slot1@IGSARMEWWSM48.xx.xxx.edu     WINDOWS    X86_64 Unclaimed Idle      0.600 32708   3+00:34:44
                     Machines Owner Claimed Unclaimed Matched Preempting

        X86_64/LINUX       44     0       0        44       0          0
      X86_64/WINDOWS       41     0       0        41       0          0

               Total       85     0       0        85       0          0

```


Then `condor_submit test0.sub` will produce:
```
Submitting job(s)..
2 job(s) submitted to cluster 16.
```

`condor_q` will show what is running:
```
-- Submitter: licon31.xx.xxx.edu : <###.##.###.##:####> : licon31.xx.xxx.edu
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
  17.0   someperson      8/13 16:46   0+00:00:01 R  0   0.0  worker0.sh 0      
  17.1   someperson      8/13 16:46   0+00:00:01 R  0   0.0  worker0.sh 1      

```


# Parallel on HTCondor --> better data management case
In the directory `HolaCruelWorld` are both a submit file, `test1.sub`, and the executable it will call, `worker1.sh`.

### `test1.sub`

Important notes:
 * `notification=Never` is __VITAL__ to avoid getting an email for every run!
 * The executable is now `worker1.sh`, which uncompresses the `data.tar` file. In this case the archive contains only `cruel.py`, but typically it would include all executables, dependencies, and data needed for each worker (unless `worker1.sh` is designed to pull data from a cache location, FTP, etc.).
 * This time, no files are generated in the base run directory because runs are performed in the `data` directory (see `worker1.sh`). So, specific files are requested and the `remap` moves them into a `results` subdirectory on the submit machine.
 * You will need to tar up the data folder using `tar czf data.tar data` in the submit directory.

```
notification=Never
universe = vanilla
log = log/worker_$(Cluster).log
output = log/worker_$(Cluster)_$(Process).out
error = log/worker_$(Cluster)_$(Process).err
executable = worker1.sh
stream_output = True
stream_error = True
arguments = $(Process)
requirements =((Target.OpSys=="LINUX")  && (Target.Arch=="X86_64"))
request_memory = 500
when_to_transfer_output = ON_EXIT
should_transfer_files = yes
transfer_output_files = data/outfile$(Process).dat
transfer_output_remaps = "outfile$(Process).dat = results/outfile$(Process).dat"
transfer_input_files =data.tar
queue 1000  
```
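
The submit directory has to be prepared before `test1.sub` is submitted. The sketch below does the equivalent of `tar czf data.tar data` from Python and also creates the `log` and `results` directories, on the assumption that HTCondor does not create them for you. `prepare_submit_dir` is a hypothetical helper, and the code assumes Python 3 (for `exist_ok`).

```python
import os
import tarfile


def prepare_submit_dir(base_dir='.'):
    # hypothetical helper; base_dir is the directory that holds test1.sub
    for name in ('log', 'results'):
        os.makedirs(os.path.join(base_dir, name), exist_ok=True)

    archive = os.path.join(base_dir, 'data.tar')
    # 'w:gz' writes a gzip-compressed archive, matching `tar czf` on the
    # submit side and `tar xzf` in worker1.sh
    with tarfile.open(archive, 'w:gz') as tar:
        # arcname keeps the top-level folder name `data` inside the archive
        tar.add(os.path.join(base_dir, 'data'), arcname='data')
    return archive
```

Calling `prepare_submit_dir()` from the submit directory would leave `data.tar`, `log`, and `results` in place next to `test1.sub`.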

### `worker1.sh`

```
#!/bin/sh

# grab the number passed in from the master
run_number=$1

# keeping our focus
SqUiRReL!!!!!!!!!!!

# uncompress the data folder and jump into it
tar xzf data.tar
cd data

# run python script with the number passed in
python cruel.py $run_number
```

Run this the same way, with `condor_submit test1.sub`, and look at the results in the `results` directory. You can also run `consol_data.py` to dump all the results output files into a single file, `alldata.dat`. In reality this would be a more sophisticated postprocessing routine, of course!
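
The contents of `consol_data.py` are not shown here. As a rough idea of what such a consolidation step could look like, the following sketch gathers every `results/outfile<N>.dat` in numeric order and writes one line per run to `alldata.dat`. The function name `consolidate` and its defaults are assumptions made for illustration.

```python
import glob
import os
import re


def consolidate(results_dir='results', outfile='alldata.dat'):
    # collect (run_number, path) pairs so the lines come out in numeric order
    found = []
    for path in glob.glob(os.path.join(results_dir, 'outfile*.dat')):
        match = re.match(r'outfile(\d+)\.dat$', os.path.basename(path))
        if match:
            found.append((int(match.group(1)), path))
    found.sort()

    # write the content of each result file as a single line
    with open(outfile, 'w') as ofp:
        for run_number, path in found:
            with open(path) as ifp:
                ofp.write('{0}\n'.format(ifp.read().strip()))
    return len(found)
```

Comparing the returned count with the number given to `queue` is a quick way to spot runs that did not return an output file.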

## Wrapup
With `test0.sub`, note what happens in the `log` directory. The main log file shows how the cluster ran overall (HTCondor's information). The `<file>.out` and `<file>.err` files return standard output and standard error from each worker. What's different with `<file>.err` for `test1.sub`?
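
One way to compare the two runs is to list which `.err` files in `log` contain anything at all. The helper below, `nonempty_err_files`, is a hypothetical convenience for that check and assumes the `log/worker_<cluster>_<process>.err` naming from the submit files.

```python
import glob
import os


def nonempty_err_files(log_dir='log'):
    # return the .err files that contain at least one byte, sorted by name
    paths = glob.glob(os.path.join(log_dir, '*.err'))
    return sorted(p for p in paths if os.path.getsize(p) > 0)


if __name__ == '__main__':
    for path in nonempty_err_files():
        print(path)
```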