CONTRIBUTING.md

How to contribute

Running RStudio Server

After SSHing into spudhead, request an interactive session with sufficient memory:

ql 8g

Next start an instance of RStudio Server running in the background:

rstudio-start

rstudio-start is an alias for the following:

rserver --auth-none 0 --auth-validate-users 1 --auth-required-user-group $USER &

Then on your local computer, run the following:

ssh -N -f -L localhost:8787:spudling##:8787 user-name@pps-gateway.uchicago.edu

replacing spudling## with the name of the spudling where you started the RStudio Server instance, e.g. spudling87 or bigmem01, and user-name with your login ID.
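To avoid retyping the placeholders, the tunnel command above can be assembled from its parts. This is only a sketch: the function name `tunnel_cmd`, its argument order, and the example login `jdoe` are hypothetical and not part of the repository. It prints the command rather than running it, so you can check it first.

```shell
# Hypothetical helper: print the ssh tunnel command for a node and login ID.
# Arguments: $1 = node (e.g. spudling87 or bigmem01), $2 = login ID,
#            $3 = port (optional, defaults to 8787 as used in this guide).
tunnel_cmd() {
  port="${3:-8787}"
  printf 'ssh -N -f -L localhost:%s:%s:%s %s@pps-gateway.uchicago.edu\n' \
    "$port" "$1" "$port" "$2"
}

# Print the command for an example node and a made-up login ID.
tunnel_cmd spudling87 jdoe
```

Copy the printed line into your local terminal to open the tunnel.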

Open a browser to the address http://127.0.0.1:8787/. Enter your username and password to access your RStudio instance. From here, you can choose "Open Project" and select singleCellSeq.Rproj.

When you're finished, close the browser tab and then kill the RStudio instance:

rstudio-end

rstudio-end is an alias for the following:

pkill rserver && pkill rsession && find /tmp -name 'rstudio*' -type d -user $USER -exec rm -r {} + 2> /dev/null

If you restart RStudio Server, you may see an error message about an unexpected crash. This is harmless as long as you saved your files before deliberately running rstudio-end.

If you have trouble starting a new RStudio instance, remove all RStudio-related temporary files:

rstudio-clean

rstudio-clean is an alias for the following:

rm ~/.rstudio -r && find /tmp -name 'rstudio*' -type d -user $USER -exec rm -r {} + 2> /dev/null

Creating a new analysis

Here are the steps for creating a new analysis:

  • Open RStudio project singleCellSeq.Rproj.
  • Set working directory to analysis with setwd or "Session" -> "Set Working Directory" -> "To Files Pane Location".
  • Create a copy of template.Rmd.
  • Change the author, title, and date at the top of the file.
  • Add the analysis code.
  • Use the RStudio button "Preview HTML" to view the results.
  • Add the analysis to the list in index.Rmd.
  • Add, commit, and push the new analysis:

cd analysis
git add new-analysis.Rmd index.Rmd
git commit -m "Add new analysis on..."
git push origin master
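The "create a copy of template.Rmd" step can also be done from the shell. The sketch below assumes you are already in the analysis directory. The function name `new_analysis` is hypothetical. It refuses to overwrite an existing file so that an earlier analysis is not clobbered by accident.

```shell
# Hypothetical helper: start a new analysis from template.Rmd.
# Run from the analysis directory. $1 is the file name without the
# extension, using dashes as described in the style guide.
new_analysis() {
  if [ -e "$1.Rmd" ]; then
    echo "$1.Rmd already exists" >&2
    return 1
  fi
  cp template.Rmd "$1.Rmd"
}
```

For example, `new_analysis new-analysis` creates new-analysis.Rmd, which you then edit, add to index.Rmd, and commit.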

Style guide

For consistency, we'll use the following conventions:

  • Name variables using snake_case, e.g. gene_exp_mat.
  • Name files with dashes, e.g. this-is-a-long-filename.txt.
  • Name directories with camelCase, e.g. data, rawData.
  • Use <- for assignment.
  • Surround binary operators with spaces, e.g. 1 + 1, not 1+1.
  • Use two spaces for indentation.
  • Lines should not be longer than 80 characters.

When in doubt, follow either Google's R Style Guide or Hadley Wickham's R Style Guide.

When writing text, aim to write one sentence per line. This makes it easier to understand edits when reviewing the version control log. The limit of 80 characters for code described above does not need to be applied to text.
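Because the 80-character limit applies to code but not to text, a check has to look only inside the R code chunks of an Rmd file. The following is a rough sketch, not a tool from the repository. The function name `long_code_lines` is hypothetical, and it assumes chunks open with three backticks followed by `{r` at the start of a line. The fence string is built with an octal escape so that the example does not contain literal backticks.

```shell
# Hypothetical helper: report code-chunk lines longer than 80 characters.
# Usage: long_code_lines file1.Rmd [file2.Rmd ...]
long_code_lines() {
  fence=$(printf '\140\140\140')   # three backticks, via octal escape
  awk -v fence="$fence" '
    index($0, fence "{r") == 1 { in_chunk = 1; next }   # an R chunk opens
    index($0, fence) == 1      { in_chunk = 0; next }   # any other fence closes it
    in_chunk && length($0) > 80 {
      printf "%s:%d: %d characters\n", FILENAME, FNR, length($0)
    }
  ' "$@"
}
```

Each reported line has the form `file:line: N characters`. Long prose lines outside chunks are ignored.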

Adding figures

We have configured knitr so that plots are automatically saved according to the following format: analysis/figure/<filename>/<chunk-name>.png, e.g. analysis/figure/total-counts.Rmd/counts-by-variables-1.png. Since these are generated by the code, they only appear in the gh-pages branch.

To add your own custom figure, follow the same format. First create a subdirectory in figure:

mkdir -p analysis/figure/<filename>.Rmd

Then add your figure to this directory. To display the figure in your file, use the following markdown syntax:

![Text to display when hovering over fig](figure/<filename>.Rmd/<figurename>.png)

By default we have Git ignore png files. To override this behavior and add specific image files, use the -f flag (for force):

git add -f analysis/figure/<filename>.Rmd/<figurename>.png
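The directory creation and the Markdown link can be combined into one step. This is a sketch under stated assumptions: the function name `add_figure` and its argument order are hypothetical, and it is meant to be run from the analysis directory so that the printed path matches the relative path used in the Markdown syntax above. It does not run git, so you still need `git add -f` afterwards.

```shell
# Hypothetical helper: copy a custom figure into place and print the
# Markdown needed to display it. Run from the analysis directory.
# Arguments: $1 = analysis file name (e.g. total-counts.Rmd),
#            $2 = path to the png file, $3 = hover text.
add_figure() {
  dest="figure/$1"
  mkdir -p "$dest"
  cp "$2" "$dest/"
  printf '![%s](%s/%s)\n' "$3" "$dest" "$(basename "$2")"
}
```

Paste the printed line into the Rmd file where the figure should appear.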

Building the site

git checkout gh-pages
git merge master
cd analysis
make
git add -f *html figure/*
git commit -m "Build site."
git push origin gh-pages

Building the paper

We write the paper in R Markdown. We run the R Markdown files using knitr and then convert the output to various formats (HTML, PDF, Word) using pandoc. These are essentially the steps performed by the function render from the R package rmarkdown, but performing them manually gives us more control. For example, we can write each section in its own file and combine the files into one document.

To make these steps convenient, we have written a Makefile to automate the process. The Makefile commands must be run in the paper subdirectory. Below is a description of the available options:

  • make or make all: Build HTML, PDF, and Word versions of paper
  • make html: Build HTML version of paper
  • make pdf: Build PDF version of paper
  • make word: Build Word version of paper
  • make clean: Delete all intermediate Markdown files and rendered HTML, PDF, and Word documents
  • make bibtex: Format reference file (see next section for details)

Adding citations

We manage citations using BibTeX. To start, select the references you wish to cite in your reference manager and export them in BibTeX format. For example, in Mendeley, choose "File" -> "Export...", making sure the file type is BibTeX. Each entry should have an ID, which by convention is the surname of the first author followed by the year of publication, e.g. "Santander2015". Unfortunately, if you are using EndNote to manage your reference library, you will have to manually add the ID for each entry. Using the example above, the first line of each entry should look like the following: @article{Santander2015,.

In the R Markdown file, cite the reference using the ID:

X does Y [@Santander2015].

If you need to cite multiple references at once, separate the citations with semicolons:

X does Y and Z [@Santander2015; @Zhang2014].

The file that contains all the references cited in the paper is refs.bib. Copy and paste your exported references into this file. Next, from within the paper subdirectory, run the command make bibtex. This runs the Python script format-bibtex.py, which alphabetizes your new references and removes unneeded fields.

A problem arises when more than one reference has the same ID. In this case, manually edit the IDs to make them unique. Using the example above, if refs.bib already contained a reference with the ID "Santander2015", you would need to change one to "Santander2015a" and the other to "Santander2015b" (preferably with the letters ordered by publication date, but this is not critical).
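Duplicate IDs are easy to miss in a long file, so it can help to list them before editing. The following is a minimal sketch, not part of format-bibtex.py. The function name `duplicate_bib_ids` is hypothetical, and it assumes each entry starts at the beginning of a line in the form `@type{ID,`.

```shell
# Hypothetical helper: list BibTeX entry IDs that occur more than once.
# Usage: duplicate_bib_ids refs.bib
duplicate_bib_ids() {
  grep -E '^@[A-Za-z]+\{' "$1" |
    # Keep only the ID between the opening brace and the first comma.
    sed -E 's/^@[A-Za-z]+\{[[:space:]]*([^,[:space:]]+).*/\1/' |
    sort |
    uniq -d
}
```

If the command prints nothing, every ID in the file is unique.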