This introductory R workshop aims to teach participants with minimal programming experience the basics of the R statistical programming language for reproducible laboratory data analytics. R is a freely available programming environment that is aimed squarely at common activities in data analysis including complex data manipulation, statistical analysis, automation, and publication-quality data visualization. We will introduce basic concepts of R programming as well as more generalizable best practices in working with laboratory data.
- Instructors:
- Daniel Herman
- Stephan Kadauke
- Patrick Mathias
- Amrom Obstfeld
- Joseph Rudolf
- Preferably two monitors (or two laptops): one for the Zoom conference software and one in which you will work. An iPad can serve as the Zoom platform in a pinch.
- We will be using Zoom Meeting for this workshop. Please download and install the latest version here.
- The program that we will be using to interact with R during the course is called RStudio. In our workshop, we will be using a cloud-based version of RStudio, hosted at RStudio.Cloud.
- Please follow the instructions in this presentation to set up an RStudio.Cloud account.
- Note: Some older internet browsers may not be compatible with RStudio.Cloud. See this web page for additional information.
- Although it is not required, we highly recommend installing RStudio Desktop on your laptop as well (see instructions below). We won't be using it during the workshop, but you'll need it for future R work.
- Please complete the following survey so we can better understand your R experience and what you want out of the course: API R Workshop Participant Survey.
We will be using our cloud-based RStudio instance in the workshop. In the long term, however, you will need R and RStudio installed on your own computer in order to work with private or PHI-containing data. You can find a video with step-by-step instructions for installing on Mac or PC by following the links below:
Please complete each step in the video in turn including the final step, installing the tidyverse packages.
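For reference, the final step usually comes down to two commands typed into the R console. This is a minimal sketch, assuming R and RStudio are already installed and your computer can reach CRAN. If the video differs in any detail, follow the video.

```r
# Install the tidyverse collection of packages from CRAN.
# This is a one-time step and requires an internet connection.
install.packages("tidyverse")

# Load the core tidyverse packages to confirm the installation worked.
library(tidyverse)
```

If `library(tidyverse)` prints a list of attached packages rather than an error, the installation succeeded.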
There are multiple ways to access and interact with the course content, depending on whether you choose to proceed through the workshop using the cloud-based RStudio or a copy of RStudio installed on your own laptop.
- All the course content will be pre-loaded in the RStudio Cloud instance and is available for download as a .zip file.
- The coursebook will be emailed to participants ahead of the workshop.
- After the workshop, the content will be available at our course GitHub and can be downloaded from there as well.
Please note that this is the first virtual R workshop that our team is hosting. We are preparing extensively to ensure that the workshop is productive for you; however, some technical challenges are to be expected.
- Please refrain from screen grabbing other users' information, recording the workshop, or otherwise disrupting the flow of the Zoom Meeting.
- Microphones will be muted during the main workshop sessions. During breakout exercises, you will be asked to work collaboratively with your colleagues and an instructor. We encourage you to activate your video for the workshop.
We will be holding an A/V and technical check for participants to make sure they are prepared for the workshop. This will take place on Wednesday, July 15 at 1 pm ET. Details to follow.
The workshop is scheduled to begin at 1 pm ET. To make sure that we can begin on schedule, please make every effort to log into the Zoom conference 10 to 15 minutes early to allow time to get settled, ensure your computer audio and video are set up, and get RStudio.Cloud up and running.
All of the course instructors have previous experience implementing and executing R workshops at a variety of venues. The workshop we are presenting for the API community is in many ways a product of these past experiences. The workshop also integrates content, best practices, and lessons from a variety of educators in the R community. We would like to specifically acknowledge:
- MSACL Data Science 201, a course produced by Patrick Mathias and several collaborators, presented at the Mass Spectrometry: Applications to the Clinical Lab meeting.
- Stephan Kadauke's R workshop for Pathology trainees and faculty, developed at the Massachusetts General Hospital and the Hospital of the University of Pennsylvania.
- Steve Master and Dan Holmes's AACC Introduction to R Workshop.
- Data Science in the Tidyverse, an RStudio course with materials posted online.
- R for Data Science, the online textbook by Garrett Grolemund and Hadley Wickham, which is invaluable in navigating the tidyverse and learning R in general.
- Blog posts and documentation by Jenny Bryan, which helped steer the project content as well as some discussion about packages.
- Amy Willis' Advanced R Course repository, which served as a resource for understanding content in a longer, advanced R course.
- Keith Baggerly and Karl Broman's Reproducible Research module at the Summer Institute in Statistics for Big Data - a big thank you to Keith Baggerly for all of his input and guidance!
- Greg Wilson's Teaching Tech Together, which offers practical advice about teaching programming.
- Claus Wilke's Fundamentals of Data Visualization, a compendium of Do's and Don'ts of data visualization.
- Method validation and some other content have been borrowed from the basic R course at AACC.
All of the material in this GitHub repository is licensed under the Creative Commons BY-SA 4.0 license to make the material easy to reuse. We encourage you to reuse it and adapt it for your own teaching as you like!