Getting started at CPU
The Computational Proteomics Unit web page: https://lgatto.github.io/cpu-lab/
git(and a bit more about
- Learn the
- Arguably the most important piece of software when doing computational work is an editor. Choose one wisely. It might take time to master it, but it is definitely a good time investment.
- Consistency is key.
- Coding style: we follow the Bioconductor coding style. Also, use
- Use a dotted function name for internal functions:
- We generally prefer camel case. Snake case would be fine for a set of related internal helper functions: something like is_logical_character, ... that all return a logical(1). Never ever mix (exported) snake and camel case for one package. Remember, consistency is key.
- Does your editor know R? If you use emacs, go for ess; if you use vim, look at the vim R plugin. See also
- For OO programming, prefer S4 over S3. If relevant, use S4 Reference Classes. Consider R6, but discuss/motivate your choice.
- Only use generics and methods when using a function is not possible at all, i.e. the same function name is used for different classes (within the same or different package).
- Before writing a new generic, check that it doesn't already exist in ProtGenerics, and consider asking for the new generic to be added in one of those if relevant.
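The naming and S4 advice above can be sketched as follows. This is only an illustration: the `Peptide` class, the `nResidues` generic and every helper name are hypothetical and do not come from any CPU package. The generic is justified here only because the same function name has to dispatch on two different classes; with a single class, a plain function would be preferred.

```r
library(methods)

## Internal function: dotted name, never exported (hypothetical helper).
.cleanSequence <- function(x) toupper(gsub("[^A-Za-z]", "", x))

## A small family of related internal helpers may use snake case, as long
## as each of them returns a logical(1) (hypothetical names).
is_single_character <- function(x) is.character(x) && length(x) == 1L
is_single_logical <- function(x) is.logical(x) && length(x) == 1L

## S4 class with a validity method (hypothetical class).
setClass("Peptide",
         slots = c(sequence = "character", charge = "integer"),
         validity = function(object) {
             if (!is_single_character(object@sequence))
                 return("'sequence' must be a single character string")
             TRUE
         })

## Exported constructor: camel case.
newPeptide <- function(sequence, charge = 2L)
    new("Peptide",
        sequence = .cleanSequence(sequence),
        charge = as.integer(charge))

## Generic plus methods: the same name is needed for two classes.
setGeneric("nResidues", function(object, ...) standardGeneric("nResidues"))
setMethod("nResidues", "Peptide",
          function(object, ...) nchar(object@sequence))
setMethod("nResidues", "character",
          function(object, ...) nchar(.cleanSequence(object)))

pep <- newPeptide("ac-defk")
nResidues(pep)
nResidues("AC DE")
```

Exported names (`newPeptide`, `nResidues`) are all camel case, while the snake case helpers stay internal, so the two styles are never mixed in the exported interface.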
R package development
- Use Authors@R to define authors and their respective roles in the
- roxygen is not only valuable in the frame of package development. Documenting a function with it outside of a package is recommended.
- Use git/GitHub - all discussion about the package (software architecture, bugs, vignettes, ...) should be done through GitHub issues.
- Use maker to automate and standardise development.
- Use a .Rbuildignore file to bundle only what is needed (see below).
- Use testthat for unit testing.
- Use travis-ci and codecov for continuous integration. Ideally also test on Windows using appveyor.
- Use BiocStyle::html_document2() with floating toc for vignettes.
- Write a README.md file. If it contains R code (that would be a good thing), use a README.Rmd file with a pre-commit hook, and use maker to build the md file. The Rmd file should be added to .Rbuildignore so as to only keep the md file in the package bundle (tarball).
- Write a
NEWS file should then be generated from the latter using
NEWS.md to be added to
.Rbuildignore and the latter to
pkgdown to generate the package webpage. Do not add the docs directory to the Bioconductor svn server (see https://lgatto.github.io/branch-specific-gitignore/ for details)
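As a rough sketch of three of the points above (Authors@R, roxygen documentation and unit testing), the following stands alone outside any package. The author names, the email address and the `countResidues` function are placeholders invented for illustration. The testthat call is guarded so that the snippet still runs when testthat is not installed.

```r
## Authors@R: the field holds R code that builds a 'person' vector with
## utils::person(). Names and email are placeholders.
authors <- c(
    person("Jane", "Doe", email = "jane.doe@example.org",
           role = c("aut", "cre")),
    person("John", "Smith", role = "ctb"))

#' Count the residues in peptide sequences
#'
#' Hypothetical helper, documented with roxygen tags even though it does
#' not live in a package.
#'
#' @param x A character vector of peptide sequences.
#' @return An integer vector with one residue count per sequence.
#' @examples
#' countResidues(c("ACDEFK", "LGK"))
countResidues <- function(x) {
    stopifnot(is.character(x))
    nchar(x)
}

## Unit test in the testthat style; skipped if testthat is unavailable.
if (requireNamespace("testthat", quietly = TRUE)) {
    testthat::test_that("countResidues counts characters", {
        testthat::expect_identical(countResidues(c("ACDEFK", "LGK")),
                                   c(6L, 3L))
        testthat::expect_error(countResidues(1:3))
    })
}
```

In a package, the `person()` calls would go in the Authors@R field and the test would live in its own file under the package's test directory, rather than next to the function as shown here.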
General resources about research software and computing
- Ten simple rules for making research software more robust
- Good Enough Practices in Scientific Computing
- Before being reproducible, your research should be organised.
RMarkdown [1, 2] vignettes.
- How we make our papers replicable, by Titus Brown.
- pandoc, the document converter.
- Using knitr and pandoc to create reproducible scientific reports by Peter Humburg.
- A Reproducibility Reading List
- Teaching material
- R packages by Hadley Wickham
- Advanced R by Hadley Wickham
- R Programming for Bioinformatics by Robert Gentleman (ask me for the pdf)
- Managing Research Software Projects
Modern and digital scholarship
- How to be a modern scientist is a nice introduction to many aspects of modern, open digital scholarship, that are valued and applied at CPU.
We recently (November 2016) introduced official lab meetings in addition to more casual and daily interactions. Here's some advice on lab meeting code reviews.