Over the course of my time working with the Carolina Institute for Developmental Disabilities (CIDD) and the Infant Brain Imaging Study (IBIS) network, I have seen great interest among the trainees in learning how to do basic statistical analyses and data processing. Specifically, there is interest in learning how to use R, due to its popularity across the sciences and its zero financial cost. As a statistician in training, I feel it is a great benefit for scientists to learn R. Scientists need to understand the fundamentals of statistics to communicate well with the statisticians they work with, and learning some statistical computing facilitates this understanding. Furthermore, R has great tools for data processing, which is an essential first step in any data analysis.
The objectives of this set of R tutorials are four-fold.
- Learn the interface of RStudio and understand the fundamentals of how R works (i.e., learn the "language" of R)
- Learn how to use the data processing tools in R
- Learn how to do basic data analysis methods in R (plotting, 1-way and 2-way tables, regression modeling, contingency table analyses, comparing means)
- Learn to present these results in a report directly through R (called R Markdown)
Intermediate and advanced statistical analyses, such as machine learning techniques, are not covered in these tutorials. While exploratory and standard regression analyses are useful for non-statisticians to understand and learn how to do, these other types of analyses are beyond the scope of these tutorials.
Much of the content and structure of these tutorials is based on Hadley Wickham's excellent book R for Data Science. For those who want more detail and some exercises for the techniques detailed here, I recommend going through Wickham's book. All examples and exercises in these tutorials are based on IBIS data. I hope these tutorials are useful and make R more inviting to use and learn; after the initial learning curve, I think you will find that R is an intuitive tool for data analysis and processing.
To read the book online, visit the webpage https://kmdono02.github.io/Data_Analysis_with_R_IBIS/. See below for instructions on accessing the book offline, without an internet connection.
The repository contains the following materials:
- Book
The docs folder contains all of the files which comprise the book. This includes each chapter's HTML file, as well as some of the figures used in the book. To open the book offline, open any of the chapter HTML files. Preferably, open the file named "index.html", though you can access the entire book through any chapter's HTML file.
- RMD Files
All of the R Markdown files (file type .RMD) used to create this book are included. These can be used in conjunction with the R Markdown chapter in the book to learn how to use R Markdown. Each chapter's RMD file has the same file name as the corresponding chapter title and number.
- Datasets
The data folder contains all of the datasets referenced in this book, in .CSV format. These are to be used in conjunction with the RMD files to run the code provided in the tutorials and recreate all of the output seen there. The RMD files and datasets provide essential hands-on practice with using R.
Thank you,
Kevin Donovan
PhD student in Biostatistics
University of North Carolina-Chapel Hill