This repo is intended to contain code to reproduce results from the paper, False Discovery Rates: A New Deal.
The hardest part may be making sure R is set up with all the right
packages installed. I have used the R package
packrat to try to
make this easier, but on some systems it may be easier to install all
the packages you need by hand. If you run into problems, see also
"If things go wrong" below.
Preliminaries: I made the paper with
R v3.2.3, so you might start by installing this version. More recent versions should (!) also work. Install
pandoc v1.12.3 or higher. You will also need a working
pdflatex installation to make the paper.
mosek also needs to be installed, so do that now, too. Don't forget to follow the instructions regarding the license file. You may come across other dependencies as you run the following steps.
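Before going further, it may help to confirm that the command-line tools named above are on your PATH. The following is only a sketch: `check_tools` is a hypothetical helper, not something shipped with this repository, and it checks presence only, not versions or the mosek license.

```shell
# Hypothetical helper (not part of this repository): report which of the
# named command-line tools are missing from PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" > /dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tools named in the preliminaries.
check_tools R pandoc pdflatex || echo "Install the missing tools before continuing."
```

To compare against the versions mentioned above, run `R --version` and `pandoc --version` by hand.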
Clone (or download and unzip) this repository.
Install the R packages you need. I have tried to use the
packrat package to automate this process, with some measure of success. To do it this way, start up
R (e.g. from the command line) within the repository directory. The first time you enter R, the
.Rprofile file will cause
R to try to install all the packages you need to a local library in the
packrat subdirectory. (Specifically, it should create a
packrat/lib directory with more files in a subdirectory whose name will depend on your architecture.)
To do: Mention that the data generation scripts use the
package; see here.
If this does not work the first time (e.g. because some
dependencies are missing), install the dependencies and try again.
This time, on entering
R you will have to tell
packrat to try again
yourself by typing
packrat::restore(). If this still does not work
for you, or you already have the packages you need installed, then you
may prefer to remove the packrat subdirectory and install the packages
you need yourself. Quit R.
Within the repository directory type
make clean. This will remove the figures and other generated files that I have already included in the repository.
Within the repository directory type
make. This will try to:
i) Run all the code for the simulation studies. It will take a while (hours), so you might want to run it overnight. This should create a bunch of output files in the
output directory. In particular, you will know that it worked iff you can find the files
ii) Build/render the .Rmd files in the
analysis directory. If successful you should have a file
analysis/index.html that you can open to see a list of all the rendered files.
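Because the simulation step can run for hours, it can be useful to keep a log of everything make prints. The sketch below is one way to do that; `run_logged` is a hypothetical helper (not part of this repository) and the log file name is arbitrary.

```shell
# Hypothetical helper (not part of this repository): run a command,
# send its stdout and stderr to a log file, and report the exit status.
run_logged() {
    log_file=$1
    shift
    "$@" > "$log_file" 2>&1
    status=$?
    echo "exit status: $status (log: $log_file)"
    return "$status"
}
```

For example, `run_logged make.log make` from the repository directory. To leave the build running overnight after logging out, `nohup make > make.log 2>&1 &` is a common alternative.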
If you have problems (more than likely!) you might like to try each of
these steps in turn, by sequentially typing
make analysis, and
If things go wrong
If you have trouble installing Rmosek, maybe this will help.
Ultimately you don't need Rmosek to make things run - if you don't have Rmosek installed then the ashr package will use an EM algorithm instead. The results from this method are very similar to those from using the interior point method (but the interior point method is faster and provides better convergence).
If things go wrong in making the output files, try looking at the
.Rout files created in the appropriate output subdirectory (
output/dsc-robust) to see what went wrong.
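With many .Rout files it can be quicker to search them than to open each one. The sketch below lists the files that look like failed runs. It assumes that a failed R batch run leaves a line starting with "Error" or the text "Execution halted" in its .Rout file; `find_failed_rout` is a hypothetical helper, not part of this repository.

```shell
# Hypothetical helper (not part of this repository): list .Rout files
# under a directory (default: output) that contain an R error message.
# Assumption: failed runs contain a line starting with "Error" or the
# text "Execution halted".
find_failed_rout() {
    find "${1:-output}" -type f -name '*.Rout' \
        -exec grep -l -E '^Error|Execution halted' {} +
}
```

For example, `find_failed_rout output/dsc-robust` prints the path of each suspicious file, which you can then open to read the full message.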
If things go wrong in making the analysis files, try looking at the
.html files produced to see what went wrong.
The directory structure here, and features of the
subdirectory (including the
Makefile), are based on
a brief summary of the directory structure.
analysis: Rmd files for investigations; will generate figures in
R: R scripts/functions used in analysis; no code that is actually run is put here
output: largish files of data, typically created by code (e.g. post-processed data, simulations)
code: code used for preprocessing, generation of output files, etc.; may be long-running
data: datasets that generally will not change once deposited
paper: the paper
packrat: a directory that contains information about all the R packages used. See the R package
packrat for more details.
talks: any presentations