added docs subdirectory

IBIC · May 22, 2017 · 5c5f797 · 5c5f797
1 parent 2e32869
commit 5c5f797
Show file tree

Hide file tree

Showing 21 changed files with 2,565 additions and 1,191 deletions.
diff --git a/README.md b/README.md
@@ -5,18 +5,27 @@
 # Overview
 
 This project contains an in-development R package (called
-`neuropointillist`) which defines functions to help combine multiple
+`neuropointillist`) which defines functions to help scientists to run voxel-wise models using R on neuroimaging data. Why would one like to do this, rather than using a dedicated fMRI analysis packages?
+
+First, fMRI analysis packages are generally quite limited in the models that they can run. This package can help you run structural equation models on your fMRI data if you wish.
+
+Second, it is really instructive to understand what fMRI software is doing with your data. T statistics are so much sweeter when you have generated them with your own R code.
+
+The `neuropointillist` package has functions to combine multiple
 sets of neuroimaging data, run arbitrary R code (a "model") on each
 voxel in parallel, output results, and reassemble the data. Included
-are three standalone programs. `npointillist` and `npointrun` use the
+are three standalone programs. `npoint` and `npointrun` use the
 `neuropointillist` package, and `npointmerge` uses FSL commands to
 reassemble results.
 
+There are some examples included in this package that use data that we cannot release. These are useful only for looking at modeling code or for inspiration. However, we have simulated two timepoints of fMRI data and have a complete example and a worked vignette.
+
+
 # Installation
 
-You will need the R packages `Rniftilib`, `argparse`, and `doParallel` to be installed. In addition, to locally install the development `neuropointillist` R library, it might help to have the R package `devtools`. If you are actually doing development, you should also install the R package `roxygen2`.
+You will need the R packages `Rniftilib`, `argparse`, and `doParallel` to be installed. 
+
 
-`devtools` depends on the Debian package `libcurl4-openssl-dev`, so you might need a system administrator to make sure that is installed.
 
 `argparse` requires Python version >= 2.7 and Python packages `argparse` and `json`. 
 
@@ -28,9 +37,13 @@ Once all prerequisites are installed and you have pulled the `neuropointillist`
 install.packages("neuropointillist", repos=NULL, type="source")
 ```
 
-or 
+If you are planning to do development on the R package, it might help to have the R package `devtools`. If you are actually doing development, you should also install the R package `roxygen2`.
+
+`devtools` depends on the Debian package `libcurl4-openssl-dev`, so you might need a system administrator to make sure that is installed. If you have installed `devtools`, you can locally install the package as follows.
+
 
 ``` R
+library(devtools)
 install("neuropointillist")
 ```
 **! Note that `neuropointillist` requires R version >= 3.2.3**
@@ -71,9 +84,9 @@ The setlabel files are csv files that specify variables that correspond to the f
 
 `--sgeN N` Alternatively, the `--sge` argument specifies to read the data and divide it into `N` jobs that can be submitted to  the SGE (using a script that is generated called, suggestively, `runme`) or divided among machines by hand and run using GNU make. If SGE parallelism is used, we assume that the directory that the program is called from is read/writeable from all cluster nodes. **See notes below on running a model using SGE parallelism.**
 
-`--output` Specify an output prefix that is prepended to output files. This is useful for organizing output for SGE runs; you can specify something like `--output model-stressXtime/mod1` to organize all the output files and execution scripts into a subdirectory. In addition, the model that you used and the calling arguments will be copied with this prefix so that you can remember what you ran. This is modeled off of how FSL FEAT copies the .fsf file into the FEAT directory (so simple and so handy)!
+`--output` Specify an output prefix that is prepended to output files. This is useful for organizing output for SGE runs; you can specify something like `--output model-stressXtime/mod1` to organize all the output files and execution scripts into a subdirectory. In addition, the model that you used and the calling arguments will be copied with this prefix so that you can remember what you ran. This is modeled off of how FSL FEAT copies the .fsf file into the FEAT directory (so simple and so handy)! (**required**)
 
-`--debug debugfile` Write out external representations of the design matrix, the fMRI data, and a function called `imagecoordtovertex`, which maps three-dimensional image coordinates (e.g. from fslview) into a vertex number, to the file `debugfile`. This may be useful for development and testing of your model, or troubleshooting problems with the setfiles or covariate files. 
+`--debug debugfile` Write out external representations of the design matrix, the fMRI data, and a function called `imagecoordtovertex`, which maps three-dimensional image coordinates (e.g. from fslview) into a vertex number, to the file `debugfile`. This may be useful for development and testing of your model, or troubleshooting problems with the setfiles or covariate files. The debugfile will be prefixed by the output prefix. See the Vignette for instructions for how to use the debugfile.
 
 ## Writing the processVoxel function
 This function takes a single value `v` which is a numeric index for a voxel. The code should also load any libraries that you need to support your model (e.g., `nlme`). Before calling `processVoxel`, the code will have attached to the design matrix, so that you have access to all of the named columns in this matrix. Note that we attach to minimize memory copies that might be necessary when running in multicore mode.
@@ -99,40 +112,45 @@ different machines.
 
 The `readargs.R` file in `example.rawfmri` is configured so that it will create a directory called `sgetest` with the assembled design matrix file (in rds format), the split up fMRI data (also in rds format), and files to run the job. These files are:
 
-`Makefile` This file contains the rules for running each subjob and assembling the results. Note that the executables `npointrun` and `npointmerge` must be in your path environment. You can run your job by typing `make -j <ncores>` at the command line in the `sgetest` directory. You can also type `make mostlyclean` to remove all the intermediate files once your job has completed and you have reassembled your output (by any method). If instead you type `make clean`, you can remove all the rds files also. 
-
+`Makefile` This file contains the rules for running each subjob and assembling the results. Note that the executables `npointrun` and `npointmerge` must be in your path environment. You can run your job by typing `make -j <ncores>` at the command line in the `sgetest` directory, or by calling the script `runme.local`. You can also type `make mostlyclean` to remove all the intermediate files once your job has completed and you have reassembled your output (by any method). If instead you type `make clean`, you can remove all the rds files also. 
 
 `sgejob.bash` This is the job submission script for processing the data using SGE. Note that `npointrun` needs to be in your path. The commands in the job submission script are bash commands.
 
-`runme` This script will submit the job to the SGE and call Make to merge the resulting files when the job has completed. It is an SGE/Make hybrid.
+`runme.sge` This script will submit the job to the SGE and call Make to merge the resulting files when the job has completed. It is an SGE/Make hybrid.
 
 ## Running a model using multicore parallelism
 
-The `readargs.R` file in the `example.flournoy` directory is configured so that it will use 24 cores to compare two models. Make sure that you change this number to be lower if your  machine does not have 24 cores.  
+The `readargs.R` file in the `example.flournoy` directory is configured so that it will use 24 cores to compare two models. You should change this number to be lower if your  machine does not have 24 cores. Note that data are not included for `example.flournoy`.
 
 Note that the syntax for trapping errors is a little bit different. We check to see whether the error inherits from `try-error` as in this example. 
 
 ``` R
+library(nlme)
+
 processVoxel <-function(v) {
     Y <- voxeldat[,v]
-    e <- try(mod1 <- lme(Y ~ AllCue+AllCueDerivative+AllTarget+AllTargetDerivat
-ive+conf1+conf2+conf3+conf4+conf5+conf6+conf7+conf8+conf9+conf10 +conf11+conf12
-+conf13+conf14, random=~1|idnum, method=c("ML"), na.action=na.omit, control=lme
-Control(returnObject=TRUE,singular.ok=TRUE)))
+    e <- try(mod <- lme(Y ~ High+Low, random=~1|subject, method=c("ML"), na.action=na.omit, corr=corAR1(form=~1|subject), control=lmeControl(returnObject=TRUE,singular.ok=TRUE)))
     if(inherits(e, "try-error")) {
-        mod1 <- NULL
+        mod <- NULL
     }
-    if(!is.null(mod1)) {
-        retvals <- list(summary(mod1)$tTable["AllCue", "t-value"],
-                        summary(mod1)$tTable["AllTarget", "t-value"])
+    if(!is.null(mod)) {
+        contr <- c(0, 1,-1)
+        out <- anova(mod,L=contr)
+        t.stat <- (t(contr)%*%mod$coefficients$fixed)/sqrt(t(contr)%*%vcov(mod)%*%contr)
+        p <- 1-pt(t.stat,df=out$denDF)
+        retvals <- list(summary(mod)$tTable["High", "t-value"],
+                        summary(mod)$tTable["Low", "t-value"], t.stat, p)
     } else {
-        retvals <- list(999,999)
+    # If we are returning 4 dimensional data, we need to be specify how long
+    # the array will be in case of errors
+        retvals <- list(999,999,999,999)
     }
-    names(retvals) <- c("tstat-AllCue", "tstat-AllTarget")
+    names(retvals) <- c("tstat-High", "tstat-Low", "tstat-High.gt.Low", "p-High.gt.Low")
     retvals
 }
-```
 
+
+```
 # npointrun
 Run a model on a single data set sequentially, without data splitting
 

diff --git a/docs/index.md b/docs/index.md
@@ -0,0 +1 @@
+# Documentation
diff --git a/example.flournoy/README b/example.flournoy/README
diff --git a/example.flournoy/mask.nii.gz b/example.flournoy/mask.nii.gz
diff --git a/example.flournoy/model.R b/example.flournoy/model.R
@@ -6,14 +6,14 @@ library(nlme)
 
 processVoxel <-function(v) {
 Y <- voxeldat[,v]
-tryCatch({
-    mod <- lme(Y ~ time + domain + target + target*domain, random=~1+time|idnum, method=c("ML"), na.action=na.omit, control=lmeControl(returnObject=TRUE,singular.ok=TRUE))}, error=function(e){
-        message("error thrown at voxel",v)
-        message(e)
-        return(NULL)})
+e <- try(mod <- lme(Y ~ time + domain + target + target*domain, random=~1+time|idnum, method=c("ML"), na.action=na.omit, control=lmeControl(returnObject=TRUE,singular.ok=TRUE)))
+if (inherits(e, "try-error")) {
+    mod <- NULL
+}
 if (!is.null(mod)) {
     mod.tt <- summary(mod)$tTable
-    retvals <- list(mod.tt["time", "p-value"],
+    retvals <- list(
+        mod.tt["time", "p-value"],
                 mod.tt["domain", "p-value"],
                 mod.tt["target", "p-value"],
                 mod.tt["domain:target", "p-value"])

diff --git a/example.flournoy/model2.R b/example.flournoy/model2.R
@@ -5,29 +5,58 @@ library(nlme)
 
 processVoxel <-function(v) {
     Y <- voxeldat[,v]
-    tryCatch({
-        mod1 <- lme(Y ~ age+ time + domain + target, random=~1+time|idnum, method=c("ML"), na.action=na.omit, control=lmeControl(returnObject=TRUE,singular.ok=TRUE))}, error=function(e){
+    e <-    try( mod1 <- lme(Y ~ age+ time + domain + target, random=~1+time|idnum, method=c("ML"), na.action=na.omit, control=lmeControl(returnObject=TRUE,singular.ok=TRUE)))
+    if (inherits(e, "try-error")) {
             message("error thrown at voxel",v)
             message(e)
-            return(NULL)})
-    tryCatch({
-        mod2 <- lme(Y ~ age + time + domain + target + target*domain, random=~1+time|idnum, method=c("ML"), na.action=na.omit, control=lmeControl(returnObject=TRUE,singular.ok=TRUE))}, error=function(e){
-            message("error thrown at voxel",v)
-            message(e)
-            return(NULL)})
+            mod1 <- NULL
+        }
+    e <- try( mod2 <- lme(Y ~ age + time + domain + target + target*domain, random=~1+time|idnum, method=c("ML"), na.action=na.omit, control=lmeControl(returnObject=TRUE,singular.ok=TRUE)))
+    if (inherits(e, "try-error")) {
+        message("error thrown at voxel",v)
+        message(e)
+        mod2 <- NULL
+    }
     if (!is.null(mod1) & !is.null(mod2)) {
         comparison <- anova(mod1, mod2)
         mod2v1.aic <- comparison$AIC[2]-comparison$AIC[1] 
         mod2v1.bic <- comparison$BIC[2]-comparison$BIC[1] 
         mod2v1.n2ll <- -2*(comparison$logLik[2]-comparison$logLik[1])
         mod2v1.n2llp <- comparison$`p-value`[2]
+        mod2.tt <- summary(mod2)$tTable
+        mod1.tt <- summary(mod1)$tTable
         retvals <- data.frame(mod2v1.aic,
                       mod2v1.bic,
                       mod2v1.n2ll,
-                      mod2v1.n2llp)
+                              mod2v1.n2llp,
+                              mod2.tt["time", "p-value"],
+                              mod2.tt["age", "p-value"],                        
+                              mod2.tt["domain", "p-value"],
+                              mod2.tt["target", "p-value"],
+                              mod2.tt["domain:target", "p-value"],
+                              mod2.tt["time", "t-value"],
+                              mod2.tt["age", "t-value"],                              
+                              mod2.tt["domain", "t-value"],
+                              mod2.tt["target", "t-value"],
+                              mod2.tt["domain:target", "t-value"],
+                              mod1.tt["time", "p-value"],
+                              mod1.tt["age", "p-value"],                              
+                              mod1.tt["domain", "p-value"],
+                              mod1.tt["target", "p-value"],
+                              mod1.tt["time", "t-value"],
+                              mod1.tt["age", "t-value"],                        
+                              mod1.tt["domain", "t-value"],
+                              mod1.tt["target", "t-value"])
+
     } else {
-        retvals <- list(999,999,999,999)
+        retvals <- list(999,999,999,999,999,999,999,999,999,999,999,999,999,999,999,999,999,999,999,999,999,999)
     }
-    names(retvals) <- c("dAIC", "dBIC", "LR", "LRp")
+    names(retvals) <- c("dAIC", "dBIC", "LR", "LRp",
+                        "mod2time.pvalue", "mod2age.pvalue","mod2domain.pvalue", "mod2target.pvalue", "targetXdomain.pvalue",
+                        "mod2time.tvalue", "mod2age.tvalue","mod2domain.tvalue", "mod2target.tvalue", "targetXdomain.tvalue",
+                        "mod1time.pvalue", "mod1age.pvalue", "mod1domain.pvalue", "mod1target.pvalue",
+                        "mod1time.tvalue", "mod1age.tvalue", "mod1domain.tvalue", "mod1target.tvalue"
+
+                        )                        
     retvals
 }
diff --git a/example.flournoy/readargs.R b/example.flournoy/readargs.R
@@ -1,12 +1,13 @@
 cmdargs <- c("-m","mask.nii.gz", "--set1", "setfilenames1.txt",
              "--set2", "setfilenames2.txt",
-             "--set3", "setfilenames3.txt",
-             "--setlabels1", "setlabels1.csv",
-             "--setlabels2", "setlabels2.csv",
-             "--setlabels3", "setlabels3.csv",
+             "--set3", "setfilenames3.txt",             
+             "--setlabels1", "setlabelsn1.csv",
+             "--setlabels2", "setlabelsn2.csv",
+             "--setlabels3", "setlabelsn3.csv",             
              "--covariates", "Flournoy.new.csv",
-             "--model", "model2.R",
-             "--output", "comparemodels/fl.",
+             "--model", "logModel.R",
+             "--testvoxel", "10000",
+             "--output", "logonly/fl.",
              "--debugfile", "debugfileoutput",
-             "--sgeN", "24")
+             "--sgeN", "50")
 
diff --git a/example.lavaan/correlated_lgc_model.R b/example.lavaan/correlated_lgc_model.R
@@ -1,4 +1,4 @@
-adf <- read.csv('/home/jflournoy/code/neuropointillist/example.lavaan/Flournoy.new.csv')
+adf <- read.csv('Flournoy.new.csv')
 
 #assume v is thet voxel-wise estimate of the self-other contrast
 adf$v <- rnorm(dim(adf)[1], 0, 1)
@@ -34,7 +34,7 @@ summary(cor_lgc_fit)
 #many options in lavaan for extracting information from the model
 #see http://lavaan.ugent.be/tutorial/inspect.html
 param_ests <- parameterEstimates(cor_lgc_fit)
-std_param_ests <- standardizedsolution(cor_lgc_fit)
+std_param_ests <- standardizedsolution(cor_lgc_fit,type="std.all")
 fit_measures <- fitMeasures(cor_lgc_fit)
 
 #the user will almost certainly want the values for the intercepts and covariances of the latent growth parameters

diff --git a/example.rawfmri/fmrimodel.R b/example.rawfmri/fmrimodel.R
@@ -5,19 +5,23 @@ library(nlme)
 
 processVoxel <-function(v) {
     Y <- voxeldat[,v]
-    e <- try(mod1 <- lme(Y ~ AllCue+AllCueDerivative+AllTarget+AllTargetDerivative+conf1+conf2+conf3+conf4+conf5+conf6+conf7+conf8+conf9+conf10 +conf11+conf12+conf13+conf14, random=~1|idnum, method=c("ML"), na.action=na.omit, control=lmeControl(returnObject=TRUE,singular.ok=TRUE)))
+    e <- try(mod <- lme(Y ~ High+Low, random=~1|subject, method=c("ML"), na.action=na.omit, corr=corAR1(form=~1|subject), control=lmeControl(returnObject=TRUE,singular.ok=TRUE)))
     if(inherits(e, "try-error")) {
-        mod1 <- NULL
+        mod <- NULL
     }
-    if(!is.null(mod1)) {
-        retvals <- list(summary(mod1)$tTable["AllCue", "t-value"],
-                        summary(mod1)$tTable["AllTarget", "t-value"])
+    if(!is.null(mod)) {
+        contr <- c(0, 1,-1)
+        out <- anova(mod,L=contr)
+        t.stat <- (t(contr)%*%mod$coefficients$fixed)/sqrt(t(contr)%*%vcov(mod)%*%contr)
+        p <- 1-pt(t.stat,df=out$denDF)
+        retvals <- list(summary(mod)$tTable["High", "t-value"],
+                        summary(mod)$tTable["Low", "t-value"], t.stat, p)
     } else {
     # If we are returning 4 dimensional data, we need to be specify how long
     # the array will be in case of errors
-        retvals <- list(999,999)
+        retvals <- list(999,999,999,999)
     }
-    names(retvals) <- c("tstat-AllCue", "tstat-AllTarget")
+    names(retvals) <- c("tstat-High", "tstat-Low", "tstat-High.gt.Low", "p-High.gt.Low")
     retvals
 }
 
diff --git a/example.rawfmri/readargs.R b/example.rawfmri/readargs.R
@@ -1,11 +1,9 @@
-cmdargs <- c("-m","fmrimask.nii.gz", "--set1", "setfilenames1.txt",
-             "--set2", "setfilenames2.txt",
-             "--set3", "setfilenames3.txt",
+cmdargs <- c("-m","mask_4mm.nii.gz", "--set1", "setfilenames1.txt",
+             "--set2", "setfilenames2.txt",             
              "--setlabels1", "setlabels1.csv",
-             "--setlabels2", "setlabels2.csv",
-             "--setlabels3", "setlabels3.csv",
+             "--setlabels2", "setlabels2.csv",             
              "--model", "fmrimodel.R",
-             "--output", "sgetest/a.",
+             "--output", "sgedata/sim.",
              "--debug", "debug.Rdata",
              "--sgeN", "10")
 
diff --git a/example.rawfmri/setfilenames1.txt b/example.rawfmri/setfilenames1.txt
@@ -1,2 +1,8 @@
-fmri/RC4101_A_func2std.nii.gz
-fmri/RC4103_A_func2std.nii.gz
+fmri/s1_Time01.nii.gz
+fmri/s2_Time01.nii.gz
+fmri/s3_Time01.nii.gz
+fmri/s4_Time01.nii.gz
+fmri/s5_Time01.nii.gz
+fmri/s6_Time01.nii.gz
+fmri/s7_Time01.nii.gz
+fmri/s8_Time01.nii.gz
diff --git a/example.rawfmri/setfilenames2.txt b/example.rawfmri/setfilenames2.txt
@@ -1,2 +1,8 @@
-fmri/RC4101_B_func2std.nii.gz
-fmri/RC4103_B_func2std.nii.gz
+fmri/s1_Time02.nii.gz
+fmri/s2_Time02.nii.gz
+fmri/s3_Time02.nii.gz
+fmri/s4_Time02.nii.gz
+fmri/s5_Time02.nii.gz
+fmri/s6_Time02.nii.gz
+fmri/s7_Time02.nii.gz
+fmri/s8_Time02.nii.gz
diff --git a/example.rawfmri/setfilenames3.txt b/example.rawfmri/setfilenames3.txt