- Create an OSG Connect account at https://osgconnect.net/signup
- Join the SimPrily project
- Create an SSH key pair
Log on to Open Science Grid Connect:
ssh user-name@login01.osgconnect.net
Clone the entire repository, although only the pegasus_workflow directory is needed:
git clone https://github.com/agladstein/SimPrily.git
Start the Singularity container and run a small test:
[agladstein@login02 ~]$ singularity shell --home $PWD:/srv --pwd /srv /cvmfs/singularity.opensciencegrid.org/agladstein/simprily\:latest
Singularity: Invoking an interactive shell within container...
$ bash
agladstein@login02:~$ export PATH=/usr/local/bin:/usr/bin:/bin
agladstein@login02:~$ python /app/simprily.py examples/eg2/Param_file_eg2.txt examples/eg2/model_file_eg2.csv 2 out_dir
All components of the Pegasus workflow are located in the pegasus_workflow directory.
Start the workflow by running submit on the command line from the pegasus_workflow directory. There are 3 required arguments and 2 optional arguments:
./submit -p PARAM -m MODEL -j NUM [-g MAP] [-a ARRAY]
Required
-p PARAM  The location of the parameter file
-m MODEL  The location of the model file
-j NUM    The number of jobs to run. The job IDs go from 1 to NUM.
Optional
-g MAP    The location of the genetic map file
-a ARRAY  The location of the array template file, in BED format
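The interface above can be mirrored in a small argument parser. The sketch below is a hypothetical Python rendering of the usage line, not the actual submit script, whose implementation is not shown here. The destination names (param, model, jobs, genetic_map, array) and the job_ids helper are assumptions chosen for illustration.

```python
import argparse


def build_parser():
    """Mirror the usage line: ./submit -p PARAM -m MODEL -j NUM [-g MAP] [-a ARRAY]."""
    parser = argparse.ArgumentParser(prog="submit")
    # Required arguments
    parser.add_argument("-p", dest="param", metavar="PARAM", required=True,
                        help="location of the parameter file")
    parser.add_argument("-m", dest="model", metavar="MODEL", required=True,
                        help="location of the model file")
    parser.add_argument("-j", dest="jobs", metavar="NUM", type=int, required=True,
                        help="number of jobs to run; IDs go from 1 to NUM")
    # Optional arguments
    parser.add_argument("-g", dest="genetic_map", metavar="MAP", default=None,
                        help="location of the genetic map file")
    parser.add_argument("-a", dest="array", metavar="ARRAY", default=None,
                        help="location of the array template file (BED format)")
    return parser


def job_ids(num_jobs):
    """Return the job IDs implied by -j NUM, i.e. 1 through NUM inclusive."""
    return list(range(1, num_jobs + 1))


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args, job_ids(args.jobs))
```

Omitting -p, -m, or -j makes the parser exit with a usage error, while -g and -a simply default to None when they are left out.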
We recommend completing all testing before submitting workflows to OSG, so the OSG workflow does not include the verbose options. Pegasus already provides run information, so the OSG workflow does not include the profile option either.
e.g. (no pseudo array and no recombination map):
./submit -p ../examples/eg2/param_file_eg2.txt -m ../examples/eg2/model_file_eg2.csv -j 10
e.g. (pseudo array, but no recombination map):
./submit -p ../examples/eg2/param_file_eg2_asc.txt -m ../examples/eg2/model_file_eg2_asc.csv -j 10 -a ../array_template/ill_650_test.bed
e.g. (recombination map, but no pseudo array):
./submit -p ../examples/eg2/param_file_eg2.txt -m ../examples/eg2/model_file_eg2.csv -j 10 -g ../genetic_map_b37/genetic_map_GRCh37_chr1.txt.macshs
e.g. (pseudo array and recombination map):
./submit -p ../examples/eg2/param_file_eg2_asc.txt -m ../examples/eg2/model_file_eg2_asc.csv -j 10 -a ../array_template/ill_650_test.bed -g ../genetic_map_b37/genetic_map_GRCh37_chr1.txt.macshs
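When these invocations are scripted, it can help to assemble the command line from its parts so that the optional flags are only added when needed. The following is a minimal sketch under that assumption; build_submit_command is a hypothetical helper and not part of SimPrily. It only builds the command string and does not run anything.

```python
import shlex


def build_submit_command(param, model, jobs, array=None, genetic_map=None):
    """Assemble a ./submit command line; -a and -g are added only if given."""
    cmd = ["./submit", "-p", param, "-m", model, "-j", str(jobs)]
    if array is not None:
        cmd += ["-a", array]
    if genetic_map is not None:
        cmd += ["-g", genetic_map]
    return cmd


def as_shell_string(cmd):
    """Quote each argument so the result can be pasted into a shell."""
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    # The four combinations of optional inputs shown in the examples above.
    plain = ("../examples/eg2/param_file_eg2.txt",
             "../examples/eg2/model_file_eg2.csv")
    asc = ("../examples/eg2/param_file_eg2_asc.txt",
           "../examples/eg2/model_file_eg2_asc.csv")
    array = "../array_template/ill_650_test.bed"
    gmap = "../genetic_map_b37/genetic_map_GRCh37_chr1.txt.macshs"

    print(as_shell_string(build_submit_command(*plain, 10)))
    print(as_shell_string(build_submit_command(*asc, 10, array=array)))
    print(as_shell_string(build_submit_command(*plain, 10, genetic_map=gmap)))
    print(as_shell_string(build_submit_command(*asc, 10, array=array, genetic_map=gmap)))
```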
To find the run times of the executable:
pegasus-statistics -s all
Then, look at Transformation statistics.
submit
-> tools/dax-generator
-> wrappers/run-sim.sh
submit will run tools/dax-generator, which constructs the workflow. The dax-generator is the main Pegasus file. It creates the HTCondor DAG file, tells Pegasus where the local files are, and transfers those files from the submit host to the compute node so they are available for the job. It also defines how to handle output files.
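To make the generator's role concrete, the sketch below models the per-job information such a generator has to assemble: one entry per job ID, the input files to transfer from the submit host, and the output to bring back. It deliberately uses plain Python dictionaries rather than the Pegasus API, and the function name plan_jobs, the field names, and the output naming pattern (sim_<ID>.tar.gz) are all hypothetical, not taken from tools/dax-generator.

```python
def plan_jobs(param, model, num_jobs, genetic_map=None, array=None):
    """Describe one job per ID (1..num_jobs) with its inputs and output."""
    # Files that must be staged from the submit host to the compute node.
    inputs = [param, model]
    if genetic_map is not None:
        inputs.append(genetic_map)
    if array is not None:
        inputs.append(array)

    jobs = []
    for job_id in range(1, num_jobs + 1):
        jobs.append({
            "id": job_id,
            "executable": "wrappers/run-sim.sh",   # wrapper run in the container
            "inputs": list(inputs),                # copied so jobs do not share a list
            "output": f"sim_{job_id}.tar.gz",      # hypothetical output name
        })
    return jobs


if __name__ == "__main__":
    for job in plan_jobs("param.txt", "model.csv", 3, array="template.bed"):
        print(job["id"], job["inputs"], "->", job["output"])
```

In a real workflow the same information would be expressed through Pegasus job, file, and transfer declarations; the dictionaries here only illustrate what has to be declared for each job.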
wrappers/run-sim.sh is the wrapper that runs in the container. It modifies the environment and runs SimPrily.
Coming soon!
In the meantime, see this example of running SimPrily on an HPC cluster with PBS: https://github.com/agladstein/ECOL-346-HPC-demo