Data Wrangling and Manipulation in R
It is often said that 80% of data analysis is spent cleaning and preparing the data. This workshop introduces tools (notably dplyr and tidyr) that make data wrangling and manipulation much easier. Participants will learn how to use these packages to subset and reshape data sets, do calculations across groups of data, clean data, and perform other common wrangling tasks.
Prior knowledge: Previous experience with (basic) R is assumed.
Offered: UC Berkeley D-Lab, August 22, 2016
To participate in this workshop, you will need access to the software described below. In addition, you will need an up-to-date web browser. I recommend Google Chrome.
Once you've installed all of the software below, test your installation by following the instructions at the bottom of this page.
R and RStudio
Mac OS X
Linux
You can download the binary files for your distribution from CRAN, or you can use your package manager (e.g. for Debian/Ubuntu run sudo apt-get install r-base and for Fedora run sudo yum install R). Also, please install the RStudio IDE.
Download and unpack the zip-file of the materials by clicking on this link: https://github.com/rochelleterman/r-graphics/archive/master.zip
Testing your installation
Find the Tutorial.Rmd file in the directory you just downloaded and open it in RStudio. If all goes well, RStudio should launch and you should see the tutorial.
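As an optional extra check, you may want to confirm that the two packages named in the workshop description (dplyr and tidyr) are available in your R library. The snippet below is only a minimal sketch: the helper name missing_packages is hypothetical and not part of the workshop materials, and whether the tutorial needs further packages is not stated here.

```r
# Hypothetical helper (not from the workshop materials): returns the names
# of the packages in `pkgs` that cannot be loaded from the current library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# dplyr and tidyr are the packages named in the workshop description.
to_install <- missing_packages(c("dplyr", "tidyr"))

if (length(to_install) == 0) {
  message("dplyr and tidyr are both installed.")
} else {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  message("Install them by running install.packages() on the names above.")
}
```

The sketch only reports what is missing rather than installing anything, so it is safe to run in the RStudio console before the workshop.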
Software Carpentry maintains a list of common issues that occur during installation, which may be useful for our class: see the Configuration Problems and Solutions wiki page.
Credit: Thanks to Software Carpentry for providing installation guidelines.