Materials for the ESRA 2021 workshop "Introduction to Survey Data Cleaning Using Tidyverse in R"

jobreu/tidyverse-workshop-esra-2021

ESRA 2021 Short Course "Introduction to Survey Data Cleaning Using Tidyverse in R"

Materials for the ESRA 2021 Short Course "Introduction to Survey Data Cleaning Using Tidyverse in R"

Johannes Breuer (johannes.breuer@gesis.org, @MattEagle09)

Stefan Jünger (stefan.juenger@gesis.org, @StefanJuenger)

Please link to the workshop GitHub repository


Course description

Before researchers can start to analyze their data, they first have to wrangle (i.e., clean and transform) them. While it may not be the most exciting part of data analysis, it can take up a substantial part of researchers’ time. An often-used phrase applies the Pareto principle to working with research data and states that 80% of the time is spent wrangling the data and only 20% actually analyzing it. Most statistical software packages offer various options for data wrangling that differ in their accessibility and versatility. Among these options, the R programming language is a very powerful tool for data wrangling. While all data wrangling can be done with base R, the syntax for this is typically verbose and not intuitive and, hence, difficult to learn, remember, and read. The tidyverse, which is “an opinionated collection of R packages designed for data science” in which “all packages share an underlying design philosophy, grammar, and data structures” (see https://www.tidyverse.org/), addresses this problem by providing a consistent syntax that is also easy to read, learn, and remember. This workshop will introduce participants to the tidyverse and its packages as well as the concepts that it builds on, such as tidy data. In the workshop's practical parts, we will work through examples of common data wrangling steps: importing, tidying, and transforming data.
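To illustrate the contrast between base R and tidyverse syntax described above, here is a minimal sketch. The toy survey data, the variable names, and the use of -99 as a missing-value code are assumptions made up for this example and are not taken from the workshop materials.

```r
library(dplyr)

# Hypothetical toy survey data (all names and values are illustrative)
survey <- tibble(
  id     = 1:6,
  gender = c("f", "m", "f", "m", "f", "m"),
  age    = c(34, 51, -99, 28, 45, 62),  # -99 is an assumed missing-value code
  trust  = c(4, 2, 5, 3, NA, 1)
)

# Base R: recode, subset, and aggregate in separate steps
base_tmp <- survey
base_tmp$age[base_tmp$age == -99] <- NA
base_tmp <- base_tmp[!is.na(base_tmp$trust), ]
base_result <- aggregate(trust ~ gender, data = base_tmp, FUN = mean)

# tidyverse: the same steps written as one pipeline of verbs
tidy_result <- survey %>%
  mutate(age = na_if(age, -99)) %>%   # recode -99 to NA
  filter(!is.na(trust)) %>%           # drop respondents without a trust score
  group_by(gender) %>%
  summarise(mean_trust = mean(trust))

tidy_result
```

Both versions compute the same group means; the pipeline version reads top to bottom as a sequence of named steps, which is the kind of consistency the tidyverse aims for.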

Prerequisites

The course is meant for people who already have some experience with R and are looking for an accessible, hands-on introduction to data cleaning with the tidyverse, as well as for more advanced R users who want to switch from base R to the tidyverse for their data cleaning tasks.

Participants will need a working installation of R and RStudio and should, ideally, also install the tidyverse packages before the course by running the command install.packages("tidyverse") in R/RStudio.
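One possible way to prepare and verify the setup is sketched below. It installs the tidyverse only if it is not yet available and then loads it; the conditional check is an optional convenience added here, not part of the official course instructions.

```r
# Install the tidyverse only if it is not already installed
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}

# Loading the package attaches the core tidyverse packages (e.g., dplyr, tidyr, readr)
library(tidyverse)

# Show the installed version to confirm that everything works
packageVersion("tidyverse")
```

If `library(tidyverse)` runs without an error, the installation is ready for the exercises.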

Time Table

| When? | What? |
| --- | --- |
| 13:00 - 13:20 | Introduction: Welcome to the tidyverse |
| 13:20 - 13:30 | Exercise 1 |
| 13:30 - 13:45 | Data Import |
| 13:45 - 14:00 | Exercise 2 |
| 14:00 - 14:30 | Data Wrangling - Part 1 |
| 14:30 - 14:45 | Exercise 3 |
| 14:45 - 15:00 | Coffee break |
| 15:00 - 15:30 | Data Wrangling - Part 2 |
| 15:30 - 15:45 | Exercise 4 |
| 15:45 - 16:00 | Wrap-Up |

Slides

Introduction

Data Import

Data Wrangling I

Data Wrangling II

Wrap-Up

Exercises

Exercise 1

Exercise 2

Exercise 3

Exercise 4

Solutions

Exercise 1

Exercise 2

Exercise 3

Exercise 4
