This repository provides materials for a session of the I2DS Tools for Data Science workshop, held at the Hertie School, Berlin, in November 2021. The student-run workshop is part of the course Introduction to Data Science, taught by Simon Munzert at the Hertie School in Fall 2021.
This session will introduce you to manipulating categorical variables in R using forcats, the tidyverse package for working with factors.
A factor is the R data structure for categorical data, that is, variables with a fixed and known set of possible values. Factors are useful when working with character vectors that need to be ordered, for visualization, and for statistical analyses of ordered data (e.g. ordinal regression).
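To make the definition above concrete, here is a minimal sketch in base R. The `responses` vector and its values are made up for illustration and are not taken from the session materials. It shows how declaring levels fixes both the set of allowed values and their order.

```r
# Hypothetical survey responses stored as plain character strings.
responses <- c("medium", "low", "high", "low", "medium")

# Sorting a character vector is alphabetical, which is rarely the
# order you want for categories like these.
sort(responses)

# Declaring the levels fixes the allowed values and their order.
sizes <- factor(responses, levels = c("low", "medium", "high"))

levels(sizes)  # the known set of values, in the declared order
sort(sizes)    # sorts by level order rather than alphabetically
table(sizes)   # counts per level, reported in level order
```

Any value that is not listed in `levels` would become `NA`, which is one way factors enforce a fixed set of categories. The forcats functions covered in the session operate on factors like `sizes`.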
The goals of this session are to (1) equip you with conceptual knowledge about factors and the forcats package, (2) show you key functions of the package, and (3) provide you with practice material as well as some further readings.
- Janine De Vera
- Victor Möslein
- forcats overview at forcats.tidyverse.org
- R for Data Science book - Chapter 15: Factors
- Wrangling categorical data with R
The material in this repository is made available under the MIT license.
Janine De Vera prepared the presentation slides and recording.
Victor Möslein prepared the tutorial and practice materials.