This workshop series is geared toward learning basic data management in R. This includes tasks like manipulating variables, creating new variables, subsetting data, reshaping data, and merging. We will also cover some introductory regular expression applications. In this workshop series we will cover only basic visualization methods in R. For a more thorough introduction to ggplot2, see my workshops on data visualization and creating maps in R.
Sessions:
- Intro R 1 (working directories, arithmetic, logical operators, basic indexing, data types, basic functions such as sum(), mean(), names(), seq(), rep(), etc.).
- Intro R 2 (reading and writing data, dealing with missing data, data frames, indexing on data frames, getting an overview of the data with multivariate numerical and graphical summaries).
- Basic data management: Introduction to dplyr.
- Data shaping and reshaping: tidyr in connection with dplyr.
- Introduction to string operations with stringr. Primer on web-scraping.
- Bringing it all together: Advanced data cleaning tasks with tidyr, dplyr, and stringr.
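To give a flavor of the first session, here is a minimal sketch of the basic functions named in the list above, using only base R. The vector `scores` and its values are made up for illustration and are not part of the workshop data.

```r
# A small named numeric vector (hypothetical values, for illustration only)
scores <- c(a = 2, b = 4, c = 6, d = 8)

sum(scores)     # total of all elements: 20
mean(scores)    # arithmetic mean: 5
names(scores)   # element names: "a" "b" "c" "d"

seq(1, 10, by = 3)            # the sequence 1, 4, 7, 10
rep(c("x", "y"), times = 2)   # "x" "y" "x" "y"

# Logical operators combined with basic indexing
scores[scores > 3]            # keeps the elements b, c, and d
scores[c("a", "d")]           # index by name instead of by condition
```

Running the lines one at a time in the RStudio console is a good way to see what each function returns.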
R is a programming language for statistical computing and data visualization and an open-source alternative to commercial statistical packages such as Stata or SPSS. R is maintained and developed by a vibrant community of programmers and statisticians and offers many user-written packages to extend basic functionality.
In this workshop, we will be using R together with the integrated development environment (IDE) RStudio. In addition to offering a 'cleaner' programming environment than the basic R editor, RStudio offers a large number of added functionalities for integrating code into documents, built-in tools, and web development. To get started, please download the latest version of RStudio and R from this website:
https://www.rstudio.com/products/rstudio/download/
The key to learning R is: Google! This workshop will give you an overview of basic R functions, but to really learn R you will have to actively use it yourself, troubleshoot, ask questions, and google! The R mailing list and other help pages such as http://stackoverflow.com offer a rich archive of questions and answers by the R community. For example, if you google "recode data in r" you will find a variety of useful websites explaining how to do this on the first page of the search results. Also, don't be surprised if you find a variety of different ways to execute the same task.
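As an illustration of how one task can be solved in several ways, here is a minimal sketch of recoding a variable with three different base R approaches. The variable `answer` and its coding (1 = "yes", 2 = "no") are hypothetical and chosen only for this example.

```r
# Hypothetical survey variable coded 1 = "yes", 2 = "no", with one missing value
answer <- c(1, 2, 2, 1, NA)

# Approach 1: ifelse() tests a condition element by element (NA stays NA)
recoded_a <- ifelse(answer == 1, "yes", "no")

# Approach 2: use the numeric codes as positions in a vector of labels
labels <- c("yes", "no")
recoded_b <- labels[answer]

# Approach 3: factor() with explicit labels, converted back to character
recoded_c <- as.character(
  factor(answer, levels = c(1, 2), labels = c("yes", "no"))
)

# All three approaches produce the same result
identical(recoded_a, recoded_b)  # TRUE
identical(recoded_b, recoded_c)  # TRUE
```

Which approach reads best depends on the task. Later sessions cover dplyr, which offers further ways to do the same thing.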
RStudio also has a useful help menu. In addition, you can get information on any function or integrated data set in R through the console, for example:
?plot
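A few other base R help tools work the same way from the console. This is a minimal sketch; the functions and the data set looked up here (mean(), sd(), mtcars) are arbitrary choices for illustration.

```r
help("mean")                  # same as typing ?mean
?mtcars                       # help page for a built-in data set
help.search("linear model")   # search installed help pages by keyword
example("mean")               # run the examples from the help page of mean()
args(sd)                      # show the arguments a function accepts
```

help.search() is useful when you know what you want to do but not what the function is called.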
In addition, there are many free, comprehensive R guides, such as Quick-R at http://www.statmethods.net or the R cookbook at http://www.cookbook-r.com.
This workshop series was first taught in 2016 as part of the data science training for the Security and Political Economy Lab at the University of Southern California.
The first two intro sessions to basic R programming are heavily inspired by the first chapter of Kosuke Imai's (2017) book "Quantitative Social Science: An Introduction" (http://assets.press.princeton.edu/chapters/s11025.pdf).