## Background

For this example, we will use a Monte Carlo method to estimate pi. First,
we define a unit circle inscribed in a square, from which we will 
randomly sample points. The ratio of the points that fall inside the circle to 
the total number of points sampled approaches pi/4 as the number of samples grows. 

This method converges extremely slowly, which makes it great for a 
CPU-intensive exercise (but bad for a real estimation!).
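
To put a rough number on "extremely slowly", here is a short derivation using standard binomial statistics (it is not part of the original exercise). Let $N$ be the number of sampled points and $\hat{p}$ the fraction that land inside the circle, so that the estimate is $\hat{\pi} = 4\hat{p}$ with true value $p = \pi/4$. The standard error of the estimate is then approximately:

$$
\mathrm{SE}(\hat{\pi}) = 4\sqrt{\frac{p(1-p)}{N}} = \sqrt{\frac{\pi(4-\pi)}{N}} \approx \frac{1.64}{\sqrt{N}}
$$

Because the error shrinks only as $1/\sqrt{N}$, each additional decimal digit of accuracy costs roughly 100 times more samples.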

## Job Script

Our code is a simple R script that does the estimation. It takes a single numeric argument, which is 
added to 100 to set the number of trials, so each job runs a slightly different number of iterations. 

In [1]:
cat mcpi.R

#!/usr/bin/env Rscript

args = commandArgs(trailingOnly = TRUE)
iternum = as.numeric(args[[1]]) + 100

montecarloPi <- function(trials) {
  count = 0
  for(i in 1:trials) {
    if((runif(1,0,1)^2 + runif(1,0,1)^2)<1) {
      count = count + 1
    }
  }
  return((count*4)/trials)
}
 
montecarloPi(iternum)


The header at the top of the file indicates that this script is 
meant to be run using R. 

> If you want to test the script, start a separate terminal window and then run the following 
> two commands to start an R container and then run 
> the script with `Rscript`: 
> 
>     $ singularity shell \
>         /cvmfs/singularity.opensciencegrid.org/opensciencegrid/osgvo-r:3.5.0
>     Singularity osgvo-r:3.5.0:~> ./mcpi.R 10
>     [1] 3.14
>     Singularity osgvo-r:3.5.0:~> exit
>     $ 

If we were running a more intensive script, we would want to test our pipeline 
with a shortened test script first.


## Create a Submit File and Log Directory

Now that we have our R script written and tested, 
we can begin building the submit file for our job. If we want to submit several 
jobs, we need to track log, output, and error files for each
job. An easy way to do this is to use the Cluster and Process ID
values assigned by HTCondor to create unique files for each job in our 
overall workflow.

In the submit file below, we are separating the standard error and HTCondor 
log files from the standard output file, because the standard output file 
will have our results. 

In [2]:
cat R.container.submit

universe = container
container_image = /cvmfs/singularity.opensciencegrid.org/opensciencegrid/osgvo-r:3.5.0

executable = mcpi.R
arguments = $(Process)

#transfer_input_files 	= 
should_transfer_files 	= YES
when_to_transfer_output = ON_EXIT

output 	= output/mcpi.out.$(Cluster).$(Process)

log 	= logs/job.log.$(Cluster).$(Process)
error 	= logs/job.error.$(Cluster).$(Process)

request_cpus = 1
request_memory = 200MB
request_disk = 300MB

queue 100


Note the `queue 100`. This tells HTCondor to enqueue 100 copies of this job
as one cluster. Also, notice the use of `$(Cluster)` and `$(Process)` to specify unique 
output files. HTCondor will replace these with the Cluster and Process ID numbers for each 
individual process within the cluster. 
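
The submit file above writes into `output/` and `logs/`. HTCondor generally expects directories named in these paths to exist already rather than creating them for you, so a minimal sketch of the setup step, assuming you run it from the directory that holds the submit file, is:

```shell
# Create the directories referenced by the output, log, and error
# lines of the submit file. The -p flag makes this safe to re-run.
mkdir -p output logs
```

Running this once before `condor_submit` is enough; the directories are reused by every job in the cluster.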

## Submit the Jobs

Now it is time to submit our jobs! You'll see something like the following upon submission:

In [3]:
condor_submit R.container.submit

Submitting job(s)....................................................................................................
100 job(s) submitted to cluster 1.


Run `condor_q` to see the progress of these jobs. 
Check your `logs` and `output` folders to see the individual output files.

In [4]:
condor_q



-- Schedd: jovyan@jupyter-email-3achristinakconnect-40gmail-2ecom : <127.0.0.1:9618?... @ 03/26/23 02:53:59
OWNER  BATCH_NAME    SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
jovyan ID: 1        3/26 02:53      _      _    100    100 1.0-99

Total for query: 100 jobs; 0 completed, 0 removed, 100 idle, 0 running, 0 held, 0 suspended 
Total for all users: 100 jobs; 0 completed, 0 removed, 100 idle, 0 running, 0 held, 0 suspended
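
Alongside `condor_q`, one rough way to follow progress is to count how many result files have arrived. The sketch below assumes the `output/mcpi.out.*` naming from the submit file and treats a file as finished only once it is non-empty; the helper name `count_finished` is hypothetical, introduced here for illustration.

```shell
# count_finished (hypothetical helper): print the number of non-empty
# result files in the given directory (default: output).
# Prints 0 if the directory does not exist yet.
count_finished() {
  dir=${1:-output}
  # -size +0 matches files that occupy more than zero blocks, i.e. non-empty files.
  n=$(find "$dir" -type f -name 'mcpi.out.*' -size +0 2>/dev/null | wc -l)
  # Arithmetic expansion strips any padding that wc adds on some systems.
  echo $((n))
}

echo "finished jobs: $(count_finished output)"
```

This only looks at files on disk, so `condor_q` remains the place to check for held or idle jobs.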



## Post Process

Once the jobs have completed, we can use the information in the output files 
to calculate an average of all of our computed estimates of pi.

The following command prints each estimate with a running count, followed by the grand average: 

In [31]:
cat output/mcpi.out* | awk '{ sum += $2; print $2"   "NR} END { print "---------------\n Grand Average = " sum/NR }'

2.88   1
3.168317   2
3.2   3
3.027027   4
3.285714   5
2.973451   6
3.263158   7
3.026087   8
3.137931   9
3.282051   10
2.949153   11
3.092437   12
2.980392   13
3.3   14
3.107438   15
3.04918   16
2.861789   17
3   18
2.944   19
3.047619   20
3.11811   21
3.03125   22
3.131783   23
3.145631   24
3.107692   25
2.870229   26
3.181818   27
3.157895   28
3.253731   29
3.288889   30
3.294118   31
3.416058   32
3.217391   33
3.165468   34
3.230769   35
3.2   36
3.245283   37
2.953271   38
3.296296   39
3.155963   40
---------------
 Grand Average = 3.12593
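
The average alone does not show how much the individual estimates scatter. As a sketch of one possible extension, the `awk` program below also reports the sample standard deviation. To keep it self-contained it runs on three made-up result files in a temporary directory (the values 3.0, 3.2, and 3.4 are hypothetical stand-ins, not real job output); for real results, point the `cat` at `output/mcpi.out*` instead.

```shell
# Hypothetical sample data standing in for real job output files.
# Each file holds one line in R's print format, e.g. "[1] 3.2".
workdir=$(mktemp -d)
mkdir -p "$workdir/output"
printf '[1] 3.0\n' > "$workdir/output/mcpi.out.1.0"
printf '[1] 3.2\n' > "$workdir/output/mcpi.out.1.1"
printf '[1] 3.4\n' > "$workdir/output/mcpi.out.1.2"

# Column 2 is the estimate. Accumulate the sum and the sum of squares,
# then report the count, mean, and sample standard deviation.
summary=$(cat "$workdir"/output/mcpi.out* | awk '
  { n++; sum += $2; sumsq += $2 * $2 }
  END {
    if (n == 0) { print "no results found"; exit 1 }
    mean = sum / n
    var = (n > 1) ? (sumsq - n * mean * mean) / (n - 1) : 0
    if (var < 0) var = 0   # guard against tiny negative rounding error
    printf "n=%d mean=%.4f sd=%.4f\n", n, mean, sqrt(var)
  }')
echo "$summary"

# Remove the temporary sample data.
rm -rf "$workdir"
```

Note that this gives every job equal weight, even though each job in this exercise runs a slightly different number of trials.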
