Laurent Gatto edited this page Jul 6, 2017 · 11 revisions

Getting started at CPU

The Computational Proteomics Unit web page: https://lgatto.github.io/cpu-lab/

Essential tools

  • Learn git (and a bit more about git) and GitHub.
  • Learn the shell
  • Arguably the most important piece of software when doing computational work is an editor. Choose one wisely. It might take time to master it, but it is definitely a good time investment.

R programming

Style guide

  • Consistency is key.
  • R installation.
  • Coding style: we follow the Bioconductor coding style. Also, use TRUE/FALSE instead of T/F.
  • Use <- for assignments.
  • Use a leading dot in the names of internal functions: .internalFunction.
  • We generally prefer camel case. Snake case would be fine for a set of related internal helper functions: something like is_scalar_character, is_logical_character, ... that all return a logical(1). Never ever mix (exported) snake and camel case within one package. Remember, consistency is key.
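
The conventions above can be sketched in a few lines of base R. This is only an illustration: formatSampleName, .trimAndUpper and is_scalar_logical are hypothetical names made up for the example, not functions from any CPU package.

```r
## Related internal helpers may use snake case, provided each one
## returns a logical(1). (is_scalar_logical is a hypothetical name.)
is_scalar_character <- function(x)
    is.character(x) && length(x) == 1L
is_scalar_logical <- function(x)
    is.logical(x) && length(x) == 1L && !is.na(x)

## Internal function: leading dot, camel case, not exported.
.trimAndUpper <- function(x)
    toupper(trimws(x))

## User-facing function: camel case, '<-' for assignment and
## TRUE/FALSE spelled out (never T/F).
formatSampleName <- function(x, upper = TRUE) {
    stopifnot(is_scalar_character(x), is_scalar_logical(upper))
    if (upper)
        .trimAndUpper(x)
    else
        trimws(x)
}

formatSampleName("  sample_a ")   ## "SAMPLE_A"
```

The snake-case helpers and the camel-case function coexist here only because the helpers form one related, internal family; exported names would all follow a single convention.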


OO programming

  • For OO programming, prefer S4 over S3. If relevant, use S4 Reference Classes. Consider R6, but discuss/motivate your choice.
  • Only define generics and methods when a plain function will not do, i.e. when the same function name has to be used for different classes (within the same or different packages).
  • Before writing a new generic, check that it doesn't already exist in BiocGenerics or ProtGenerics, and consider asking for the new generic to be added to one of those if relevant.
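
As a minimal sketch of these points, the example below defines an S4 class, a generic, and two methods that share one function name across classes. The class Peptide and the generic nResidues are hypothetical names chosen for illustration. The isGeneric() check only looks at the current session; in a real package, a generic that already exists in BiocGenerics or ProtGenerics would be imported from there rather than redefined.

```r
library(methods)

## A small S4 class (hypothetical, for illustration only).
setClass("Peptide",
         slots = c(sequence = "character",
                   charge = "integer"))

## Constructor function, so users never call new() directly.
Peptide <- function(sequence, charge)
    new("Peptide", sequence = sequence, charge = as.integer(charge))

## Define the generic only if no generic of that name is already
## visible in the session.
if (!isGeneric("nResidues"))
    setGeneric("nResidues",
               function(object, ...) standardGeneric("nResidues"))

## The same function name dispatches on different classes, which is
## the case where a generic is justified.
setMethod("nResidues", "Peptide",
          function(object, ...) nchar(object@sequence))
setMethod("nResidues", "character",
          function(object, ...) nchar(object))

nResidues(Peptide("PEPTIDE", charge = 2))   ## 7
nResidues("ACDK")                           ## 4
```

If nResidues were only ever needed for one class, an ordinary function would be the simpler choice.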

R package development

  • Use Authors@R to define authors and their respective roles in the DESCRIPTION file.
  • Making R packages: maker.
  • devtools and roxygen.
  • Note: roxygen is not only valuable in the context of package development. Using it to document a function outside of a package is also recommended.
  • Use git/GitHub - all discussion about the package (software architecture, bugs, vignettes, ...) should be done through GitHub issues.
  • Use maker to automate and standardise development.
  • Use a .Rbuildignore file to bundle only what is needed (see below).
  • Use testthat for unit testing.
  • Use covr for coverage.
  • Use travis-ci and codecov for continuous integration. Ideally also test on Windows using appveyor.
  • Use BiocStyle::html_document2() with floating toc for vignettes.
  • Write a README.md file. If it contains R code (which would be a good thing), use a README.Rmd file with a pre-commit hook, and use make README from maker to build the md file. The Rmd file should be added to .Rbuildignore so that only the md file is kept in the package bundle (tarball).
  • Write a NEWS.md file. The NEWS file should then be generated from it using make NEWS from maker. Add NEWS.md to .Rbuildignore and the generated NEWS file to .gitignore(s).
  • Use pkgdown to generate the package webpage. Do not add the docs directory to the Bioconductor svn server (see https://lgatto.github.io/branch-specific-gitignore/ for details).
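
Two of the points above lend themselves to a short base R sketch: the Authors@R field and the .Rbuildignore file. The author names and e-mail address are placeholders. Excluding the docs directory from the bundle is an assumption made for the example; the text above only says to keep it off the Bioconductor svn server. The helper .isIgnored is hypothetical and merely mimics how R CMD build treats each .Rbuildignore line, namely as a Perl-like regular expression matched case-insensitively against paths relative to the package root.

```r
## Authors@R holds an R expression built with utils::person().
## Roles: "aut" (author), "cre" (maintainer), "ctb" (contributor).
authors <- c(
    person("Jane", "Doe", email = "jane.doe@example.org",
           role = c("aut", "cre")),
    person("John", "Smith", role = "ctb"))

## Lines one might put in .Rbuildignore: the Rmd source of the README,
## NEWS.md, and (assumption) the pkgdown docs directory.
ignorePatterns <- c("^README\\.Rmd$",
                    "^NEWS\\.md$",
                    "^docs$")

## Hypothetical helper: TRUE for every path matched by any pattern.
.isIgnored <- function(paths, patterns) {
    ignored <- logical(length(paths))
    for (pattern in patterns)
        ignored <- ignored |
            grepl(pattern, paths, perl = TRUE, ignore.case = TRUE)
    ignored
}

pkgFiles <- c("DESCRIPTION", "NAMESPACE", "README.md", "README.Rmd",
              "NEWS", "NEWS.md", "docs", "R/utils.R")

## Files that would end up in the package bundle.
pkgFiles[!.isIgnored(pkgFiles, ignorePatterns)]
```

In the DESCRIPTION file itself, the expression assigned to authors would appear after the field name, as in Authors@R: c(person(...), person(...)). Each string in ignorePatterns, with single rather than doubled backslashes, would be one line of .Rbuildignore.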

General resources about research software and computing

Reproducible research


Modern and digital scholarship

  • How to be a modern scientist is a nice introduction to many aspects of modern, open digital scholarship that are valued and applied at CPU.

Lab meetings

We recently (November 2016) introduced official lab meetings in addition to more casual and daily interactions. Here's some advice on lab meeting code reviews.