four week intro to R: basic syntax, data manipulation, data viz

Introduction to R


Description: This four-week course introduces attendees to R statistical programming and its broad applications. Each two-hour session includes brief tutorials interspersed with challenge exercises. The course assumes attendees have no prior computer coding experience. At the end of this course, you will be able to use R to import, manipulate, and visualize data.

This repository is adapted from content originally appearing in R for data analysis and visualization of Ecological Data, Copyright (c) Data Carpentry.

Software requirements for this course can be found on's Software page.


  • Week 1: R syntax, assigning objects, using functions
  • Week 2: Data types and structures; slicing and subsetting data
  • Week 3: Data manipulation with dplyr
  • Week 4: Data visualization in ggplot2
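
As a rough preview of how the four weeks build on one another, here is a minimal sketch. It uses a small made-up data frame (the names `patients`, `weights_kg`, and `group` are hypothetical and do not come from the course materials) and assumes the dplyr and ggplot2 packages are installed.

```r
# Toy example only: the data below are invented for illustration,
# not taken from the course dataset.
library(dplyr)
library(ggplot2)

# Week 1: assign objects and call functions
weights_kg <- c(61.5, 72.0, 80.3, 55.1)
mean_weight <- mean(weights_kg)

# Week 2: build a data frame, then slice and subset it
patients <- data.frame(
  id = 1:4,
  group = c("a", "b", "a", "b"),
  weight_kg = weights_kg,
  stringsAsFactors = FALSE
)
first_two <- patients[1:2, ]                                      # rows 1 and 2
heavy <- patients[patients$weight_kg > 60, c("id", "weight_kg")]  # filter rows, pick columns

# Week 3: summarize by group with dplyr
group_means <- patients %>%
  group_by(group) %>%
  summarize(mean_weight_kg = mean(weight_kg))

# Week 4: build a scatter plot with ggplot2
weight_plot <- ggplot(patients, aes(x = group, y = weight_kg)) +
  geom_point()
```

Running `print(weight_plot)` (or typing `weight_plot` at the console) draws the figure. The actual lessons work through each of these steps in much more detail.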


  • Each week's materials are in the R script whose filename begins with the number of that week.
  • 0dataset.R includes the code used to derive the original data from the National Cancer Institute's Genomic Data Commons (NCI-GDC). extra/ holds the original data files that are downloaded during the activities, as well as the intermediate data files for each cancer type downloaded directly from NCI-GDC.
  • exercises/ includes one file per week containing the in-class exercises together with supplemental exercises for practice
  • solutions/ includes the solutions for all files in exercises/
  • includes useful links mentioned during lessons; additional information about continued learning in R as well as Hutch-specific resources can be found on the Data Science Wiki
  • hackmdio.txt is an archive of the interactive webpage used during lessons