Merge pull request #1 from IBIC/master
update my fork
jflournoy committed Oct 17, 2018
2 parents 9897265 + 4b4bde8 commit e961d91
Showing 19 changed files with 193 additions and 47 deletions.
8 changes: 3 additions & 5 deletions README.md
@@ -17,15 +17,13 @@ The `neuropointillist` package has functions to combine multiple
sets of neuroimaging data, run arbitrary R code (a "model") on each
voxel in parallel, output results, and reassemble the data. Included
are three standalone programs. `npoint` and `npointrun` use the
`neuropointillist` package, and `npointmerge` uses FSL commands to
reassemble results.
`neuropointillist` package, and `npointmerge` reassembles results.


There are some examples included in this package that use data that we
cannot release. These are useful only for looking at modeling code or
for inspiration. However, we have simulated two timepoints of fMRI
data and have a complete example and a worked vignette.




Please direct all comments and complaints to Tara Madhyastha (madhyt@uw.edu).

42 changes: 39 additions & 3 deletions docs/usage.md
@@ -1,7 +1,7 @@
# npoint
## Usage
`npoint --set1 listoffiles1.txt --setlabels1 file1.csv --set2 listoffiles2.txt --setlabels2 file2.csv`
`--covariates covariatefile.csv --mask mask.nii.gz --model code.R [ -p N | --sgeN N] --output output`
`--covariates covariatefile.csv --mask mask.nii.gz --model code.R [ -p N | --sgeN N | --slurmN N ] --output output`
`--debugfile outputfile `

If a file called `readargs.R` exists that sets a vector called `cmdargs`, this file will be read to obtain the arguments for `npoint` instead of taking them from the command line. This is intended to make it a little easier to remember the long lists of arguments.
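
For illustration, a minimal `readargs.R` might look like the sketch below. The file names and the `--sgeN` split count are placeholders chosen to match the usage line above, not files shipped with the package.

```R
# Hypothetical readargs.R: npoint sources this file and takes its arguments
# from the cmdargs character vector instead of the command line.
cmdargs <- c("--set1", "listoffiles1.txt", "--setlabels1", "file1.csv",
             "--set2", "listoffiles2.txt", "--setlabels2", "file2.csv",
             "--covariates", "covariatefile.csv",
             "--mask", "mask.nii.gz",
             "--model", "code.R",
             "--sgeN", "10",
             "--output", "sgetest/n.")
```
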
@@ -30,7 +30,9 @@ The setlabel files are csv files that specify variables that correspond to the f

`-p x` The `-p` argument specifies that multicore parallelism will be implemented using `x` processors. A warning is given if the number of processors specified exceeds the number of cores. **See notes below on running a model using multicore parallelism.**

`--sgeN N` Alternatively, the `--sge` argument specifies to read the data and divide it into `N` jobs that can be submitted to the SGE (using a script that is generated called, suggestively, `runme`) or divided among machines by hand and run using GNU make. If SGE parallelism is used, we assume that the directory that the program is called from is read/writeable from all cluster nodes. **See notes below on running a model using SGE parallelism.**
`--sgeN N` The `--sgeN` argument specifies to read the data and divide it into `N` jobs that can be submitted to the SGE (using a generated script called, suggestively, `runme.sge`) or divided among machines by hand and run using GNU make. If SGE parallelism is used, we assume that the directory that the program is called from is read/writeable from all cluster nodes. **See notes below on running a model using SGE parallelism.**

`--slurmN N` The `--slurmN` argument specifies to read the data and divide it into `N` jobs that can be submitted to a Slurm scheduler (using a generated script called, suggestively, `runme.slurm`) or divided among machines by hand and run using GNU make. If Slurm is used, the template file **slurmjob.bash** must be edited! Unlike SGE, Slurm works best if you give good estimates of the time your program will take to run and the amount of memory it needs, and if you choose the number of jobs so that each one is not too small. The file that is written is currently a template based on Harvard's cluster configuration. As with SGE, we assume that the directory that the program is called from is read/writeable from all cluster nodes. At the risk of oversharing, Slurm's name derives from Simple Linux Utility for Resource Management, but I find it rather funny to sound it out in my head as I have been adding this feature. **See notes below on running a model using the Slurm Workload Manager.**

`--output` Specify an output prefix that is prepended to output files. This is useful for organizing output for SGE runs; you can specify something like `--output model-stressXtime/mod1` to organize all the output files and execution scripts into a subdirectory. In addition, the model that you used and the calling arguments will be copied with this prefix so that you can remember what you ran. This is modeled off of how FSL FEAT copies the .fsf file into the FEAT directory (so simple and so handy)! (**required**)

@@ -60,12 +62,46 @@ different machines.

The `readargs.R` file in `example.rawfmri` is configured so that it will create a directory called `sgetest` with the assembled design matrix file (in rds format), the split up fMRI data (also in rds format), and files to run the job. These files are:

`Makefile` This file contains the rules for running each subjob and assembling the results. Note that the executables `npointrun` and `npointmerge` must be in your path environment. You can run your job by typing `make -j <ncores>` at the command line in the `sgetest` directory, or by calling the script `runme.local`. You can also type `make mostlyclean` to remove all the intermediate files once your job has completed and you have reassembled your output (by any method). If instead you type `make clean`, you can remove all the rds files also.
`Makefile` This file contains the rules for running each subjob and assembling the results. Note that the executables `npointrun` and `npointmerge` must be in your path environment. You can run your job by typing `make -j <ncores>` at the command line in the `sgetest` directory, or by calling the script `runme.local`, which will use 4 cores by default. You can also type `make mostlyclean` to remove all the intermediate files once your job has completed and you have reassembled your output (by any method). If instead you type `make clean`, you can remove all the rds files also.

`sgejob.bash` This is the job submission script for processing the data using SGE. Note that `npointrun` needs to be in your path. The commands in the job submission script are bash commands.

`runme.sge` This script will submit the job to the SGE and call Make to merge the resulting files when the job has completed. It is an SGE/Make hybrid.

## Running a model using the Slurm Workload Manager


`Makefile` This file contains the rules for running each subjob and
assembling the results. Note that the executables `npointrun` and
`npointmerge` must be in your path environment. You can run your job
by typing `make -j <ncores>` at the command line, or by calling the script `runme.local`, which will use 4 cores by default. You can also type
`make mostlyclean` to remove all the intermediate files once your job
has completed and you have reassembled your output (by any method). If
instead you type `make clean`, you can remove all the rds files also.

`slurmjob.bash` This is the job submission script for submitting the
job to the Slurm Workload Manager. **Note that you must edit this file
before submitting the job.** The defaults that are written here
probably won't work for you; they are modeled after Harvard's NCF
cluster and should be thought of as placeholders. The first thing to
change is the partition, which is set to `ncf_holy` by default. You
will need to change this to a partition that you have access to on
your Slurm system. Next, you need to give a good estimate of the
amount of memory, in MB, each job will use (`--mem`). You can get a
reasonable estimate by running `make` on your local machine to run one
job sequentially; the GNU `time` command (for example, `/usr/bin/time -v make`)
will report the maximum memory the job uses. Assuming your jobs are
approximately the same size, double that figure and use it as your
estimate; for example, if one job peaks at about 1800 MB, request
roughly `--mem 3600`. You also need to provide an estimate of the time
you expect each job to take (`--time`); a job will be terminated if it
does not complete within that limit.

`runme.slurm` This script will submit the job to the Slurm Workload
Manager. The job is an array job that includes as many tasks as you
specified. You will get an email when your job has completed. At that
point, you can come back to this directory and type `make` to merge the
output files.

## Running a model using multicore parallelism

The `readargs.R` file in the `example.flournoy` directory is configured so that it will use 24 cores to compare two models. You should change this number to be lower if your machine does not have 24 cores. Note that data are not included for `example.flournoy`.
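
As a rough sketch only (the actual contents of `example.flournoy` are not reproduced here, and the file names below are placeholders), a multicore `readargs.R` differs from the SGE/Slurm variants mainly in using `-p`:

```R
# Hypothetical readargs.R fragment for multicore execution; lower "24" if your
# machine has fewer cores.
cmdargs <- c("--set1", "setfilenames1.txt", "--setlabels1", "setlabels1.csv",
             "--covariates", "covariates.csv",
             "--mask", "mask.nii.gz",
             "--model", "model.R",
             "-p", "24",
             "--output", "comparemodels/mod.")
```
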
2 changes: 1 addition & 1 deletion neuropointillist/DESCRIPTION
@@ -10,4 +10,4 @@ Depends:
License: GPL(>=2), doParallel,argparse,Rniftilib
Encoding: UTF-8
LazyData: true
RoxygenNote: 5.0.1
RoxygenNote: 6.1.0.9000
1 change: 1 addition & 0 deletions neuropointillist/NAMESPACE
@@ -8,3 +8,4 @@ export(npointWriteCallingInfo)
export(npointWriteMakefile)
export(npointWriteOutputFiles)
export(npointWriteSGEsubmitscript)
export(npointWriteSlurmsubmitscript)
3 changes: 3 additions & 0 deletions neuropointillist/R/npointWriteMakefile.R
@@ -23,6 +23,7 @@ npointWriteMakefile <- function(prefix, resultnames, modelfile, designmat, makef
fileConn <- file(localscript)
writeLines(c("make -j 4\n"), fileConn)
Sys.chmod(localscript, "775")
close(fileConn)

fileConn <- file(makefile)
alltarget <- "all: $(outputs) "
@@ -60,4 +61,6 @@ npointWriteMakefile <- function(prefix, resultnames, modelfile, designmat, makef
paste(mostlyclean,collapse=""),
clean),
fileConn)
close(fileConn)

}
50 changes: 50 additions & 0 deletions neuropointillist/R/npointWriteSlurmsubmitscript.R
@@ -0,0 +1,50 @@
#' Write an output Slurm submit script
#'
#' Generate a Slurm submit script for the given workflow
#' @param prefix Prefix for output, to be prepended to outputs
#' @param resultnames List of names for the expected outputs
#' @param modelfile Name of the model file that contains the processVoxel command
#' @param designmat Design matrix
#' @param masterscript Name of the master submit script
#' @param jobscript Name of the job submission script
#' @param njobs Number of jobs to submit
#' @export
#' @examples
#'
#' npointWriteSlurmsubmitscript()
npointWriteSlurmsubmitscript <- function(prefix, resultnames, modelfile, designmat,masterscript,jobscript,njobs) {
dir <- dirname(prefix)
if (!dir.exists(dir)) {
dir.create(dir, recursive=TRUE)
}
# the name of one of the outputfiles that is created
outputfile <- paste(resultnames[1], ".nii.gz",sep="")
fileConnMaster <- file(masterscript)
fileConnJob <- file(jobscript)
writeLines(c("#!/bin/bash",
"# This script will submit jobs to Slurm. You can also run this job locally by typing make.",
paste("sbatch --array=1-", njobs, " ", basename(jobscript), sep=""),
"echo When you get mail from slurm that your job has completed, cd to this directory and type:",
"echo make"),
fileConnMaster)


writeLines(c("#!/bin/bash",
"\n",
"#Slurm submission options",
"#LOOK AT THESE AND EDIT TO OVERRIDE FOR YOUR JOB",
"#SBATCH -p ncf_holy",
"#SBATCH --mem 4000",
"#SBATCH --time 0-6:00",
"#SBATCH --mail-type=END",
"#SBATCH -o npoint_%A_%a.out",
"#SBATCH -o npoint_%A_%a.err", "export OMP_NUM_THREADS=1",
paste("MODEL=",modelfile,sep=""),
paste("DESIGNMAT=",designmat,sep=""),
"num=$(printf \"%04d\" $SLURM_ARRAY_TASK_ID)",
paste("npointrun -m ", basename(prefix), "${num}.nii.gz --model ${MODEL} -d ${DESIGNMAT}",sep=""),
"\n"),
fileConnJob)
Sys.chmod(masterscript, "775")
}
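
For orientation, here is a hedged sketch of how this function might be called; the argument values are made up for illustration, since in normal use `npoint` constructs them from its own parsed arguments.

```R
# Hypothetical call (all paths and names are illustrative only).
npointWriteSlurmsubmitscript(prefix       = "slurmtest/n.",
                             resultnames  = "tstat-age",
                             modelfile    = "slurmtest/model.R",
                             designmat    = "slurmtest/n.designmat.rds",
                             masterscript = "slurmtest/runme.slurm",
                             jobscript    = "slurmtest/slurmjob.bash",
                             njobs        = 10)
# This writes runme.slurm (the sbatch array submission wrapper) and
# slurmjob.bash (the per-task script that calls npointrun).
```
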

1 change: 0 additions & 1 deletion neuropointillist/man/npointCheckArguments.Rd

1 change: 0 additions & 1 deletion neuropointillist/man/npointCheckSetLabels.Rd

1 change: 0 additions & 1 deletion neuropointillist/man/npointMergeDesignmatWithCovariates.Rd

1 change: 0 additions & 1 deletion neuropointillist/man/npointReadDataSets.Rd

1 change: 0 additions & 1 deletion neuropointillist/man/npointReadSetFiles.Rd

1 change: 0 additions & 1 deletion neuropointillist/man/npointSplitDataSize.Rd

1 change: 0 additions & 1 deletion neuropointillist/man/npointWarnIfNiiFileExists.Rd

4 changes: 2 additions & 2 deletions neuropointillist/man/npointWriteFile.Rd

6 changes: 4 additions & 2 deletions neuropointillist/man/npointWriteMakefile.Rd

1 change: 0 additions & 1 deletion neuropointillist/man/npointWriteOutputFiles.Rd

1 change: 0 additions & 1 deletion neuropointillist/man/npointWriteSGEsubmitscript.Rd

31 changes: 31 additions & 0 deletions neuropointillist/man/npointWriteSlurmsubmitscript.Rd
