YRLS2017: R for statistics and reproducible research in the life sciences
RR (Reproducible Research) course material for the YRLS 2017 meeting.
The course, R for statistics and reproducible research in the life sciences, introduces the following topics:
- What's reproducible research?
- What is R?
- Why R? R vs. Python / MATLAB / Java / C, etc.
- How to get R
- Learning R
- R syntax and use basics
- A simple example of reproducible analysis with R
RMarkdown files (with the .Rmd extension) contain the main part of the course (Pouzat_YRLS_20170516.Rmd) and an actual, short (and not simple enough!) RR application. The HTML outputs for both of these files are also included.
To regenerate the HTML outputs from the source files, you first need to install the rmarkdown package. This is done from within R.
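A minimal sketch of this installation step, to be typed at the R prompt. It assumes a CRAN mirror is reachable from your machine; the check with requireNamespace() is an addition of this sketch, there only to skip the download when the package is already present.

```r
# Install rmarkdown from CRAN only if it is not already available.
# Assumption: a CRAN mirror is reachable (R may ask you to pick one).
if (!requireNamespace("rmarkdown", quietly = TRUE)) {
  install.packages("rmarkdown")
}
```

Calling install.packages("rmarkdown") directly works just as well; it simply reinstalls the package if it is already there.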
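Once this is done, start R in the directory where the two .Rmd files were downloaded and render Pouzat_YRLS_20170516.Rmd with the rmarkdown package to get Pouzat_YRLS_20170516.html. To regenerate the HTML output of the RR application, you also have to install rhdf5 before rendering its .Rmd file.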
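The rendering and installation steps described above could look like the sketch below. It is not necessarily the exact set of commands used for the course. It assumes that rmarkdown is already installed and that rhdf5 is obtained from Bioconductor through the BiocManager package, which is the installation route currently documented by Bioconductor. The variable name course_rmd and the file.exists() guard are additions of this sketch.

```r
# File name taken from the text above; R must be started in (or moved with
# setwd() to) the directory that contains it.
course_rmd <- "Pouzat_YRLS_20170516.Rmd"

# rmarkdown::render() writes the output next to the source file by default.
if (file.exists(course_rmd)) {
  rmarkdown::render(course_rmd)
} else {
  message("Cannot find ", course_rmd, " in ", getwd())
}

# rhdf5 is a Bioconductor package, so it is not installed with a plain
# install.packages("rhdf5"). Assumption: BiocManager is the installer used.
if (!requireNamespace("rhdf5", quietly = TRUE)) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install("rhdf5")
}
```

Once rhdf5 is installed, the RR application file can be rendered with the same rmarkdown::render() call, giving it the name of that .Rmd file.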
Questions and Answers
Here are a few questions that came up at the end of the course and some (tentative) answers.
R and Excel
- To import Excel data into R, check the R Data Import/Export manual; section 9 covers Excel data in depth.
- A collection of links mainly discussing R for Excel users is available from: https://www.r-bloggers.com/search/Excel/.
- There is an R add-in for Excel (http://rcom.univie.ac.at/), but I've never tried it.
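As a small illustration of the import question, here is a sketch of one common route: export the sheet from Excel as CSV and read it with base R. The data, the column names and the file measurements.xlsx are all hypothetical and only there to make the example self-contained. The optional second part assumes the readxl package is installed.

```r
# Hypothetical data standing in for a sheet exported from Excel as CSV.
csv_file <- tempfile(fileext = ".csv")
writeLines(c("sample,concentration",
             "A,1.5",
             "B,2.25"),
           csv_file)

# read.csv() ships with R (package utils), so no extra package is needed.
measurements <- read.csv(csv_file, stringsAsFactors = FALSE)
str(measurements)

# Alternative for reading an .xlsx file directly. Assumptions: the readxl
# package is installed and a (hypothetical) file "measurements.xlsx" exists
# in the working directory; otherwise this part is skipped.
if (requireNamespace("readxl", quietly = TRUE) &&
    file.exists("measurements.xlsx")) {
  measurements_xlsx <- readxl::read_excel("measurements.xlsx", sheet = 1)
}
```

Going through CSV keeps the data in a plain-text format that is easy to inspect and to put under version control, which fits well with a reproducible workflow.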
Modeling "at large"
A question came up about general modeling strategies, or "how does one go from data to models?". A tricky question! There is no general rule that I know of, but the issue is touched upon in Philipp K. Janert's book Data Analysis with Open Source Tools (mentioned in the course), in chapters 7 to 11 (part II), as well as in his (excellent) book on gnuplot, Gnuplot in Action; look at part IV of that book.