Select Variables
Cghlewis edited this page Jan 30, 2023 · 20 revisions
Selecting or removing columns is typically done with the dplyr package. However, I also cover a few alternative packages, specifically for when you want to drop empty rows or columns.
Some common reasons we might want to select columns are:
- We want to remove columns added by a survey platform that are irrelevant for our analysis
- We want to remove identifier variables in our dataset
- We want to select certain variables to perform an operation on (such as filter, recode, or calculate)
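As a minimal sketch of the first two reasons, assuming a small made-up survey export (the data frame `survey` and all of its column names are hypothetical), columns can be dropped by name or kept with a selection helper:

```r
library(dplyr)

# Hypothetical survey export; every column name here is invented for illustration
survey <- data.frame(
  id = c(101, 102, 103),
  ip_address = c("1.1.1.1", "2.2.2.2", "3.3.3.3"),
  item_1 = c(1, 3, 2),
  item_2 = c(4, 5, 3),
  stringsAsFactors = FALSE
)

# Drop an identifier and a platform-added column by name
deidentified <- survey %>%
  select(-c(id, ip_address))

# Keep only the columns whose names start with "item_"
items <- survey %>%
  select(starts_with("item_"))
```

Both calls return a new data frame and leave `survey` unchanged, so the result has to be assigned to keep it.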
Examples of selecting variables by name, as well as selecting variables using tidyselect selection helpers such as starts_with(), are provided below.
- Select columns
- Remove empty columns
- Select columns using a data dictionary
- Select columns within an operation
- Select columns from each data frame in a list (see Import Files)
Main functions used in examples
Package | Functions |
---|---|
dplyr | select(); across() |
expss | drop_empty_columns() |
janitor | remove_empty() |
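A short sketch of dropping empty columns, assuming a hypothetical data frame `dat` in which the `notes` column is entirely missing. It uses janitor's remove_empty() and, for comparison, a dplyr-only version built on where():

```r
library(dplyr)
library(janitor)

# Hypothetical data; `notes` contains only NA values
dat <- data.frame(
  score = c(10, 12, NA),
  notes = c(NA, NA, NA),
  group = c("a", "b", "a"),
  stringsAsFactors = FALSE
)

# Drop columns in which every value is NA
no_empty <- remove_empty(dat, which = "cols")

# A dplyr-only alternative: keep columns that are not entirely NA
no_empty_dplyr <- dat %>%
  select(where(~ !all(is.na(.x))))
```

A column with at least one non-missing value, such as `score`, is kept in both versions.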
Other functions used in examples
Package | Functions |
---|---|
dplyr | filter(); pull(); mutate(); if_all(); case_when() |
tidyselect | starts_with(); contains(); matches(); all_of(); last_col(); where() |
base | is.numeric(); ncol(); round() |
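The sketch below shows how several of these functions might combine, under stated assumptions: the data frame `scores`, the data dictionary `dictionary`, and its `var_name` and `keep` columns are all hypothetical, and a real data dictionary may be laid out differently.

```r
library(dplyr)

# Hypothetical data
scores <- data.frame(
  id = c(1, 2, 3),
  item_1 = c(1.26, NA, 3.91),
  item_2 = c(2.55, 4.12, 1.08),
  site = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Hypothetical data dictionary flagging which variables to keep
dictionary <- data.frame(
  var_name = c("id", "item_1", "item_2", "site"),
  keep = c(TRUE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Pull the names of the variables to keep into a character vector
keep_vars <- dictionary %>%
  filter(keep) %>%
  pull(var_name)

# all_of() errors if any name in keep_vars is missing from the data
selected <- scores %>%
  select(all_of(keep_vars))

# Select columns within an operation: round only the item columns
rounded <- selected %>%
  mutate(across(starts_with("item_"), ~ round(.x, 1)))

# Keep rows where every item column is non-missing
complete <- rounded %>%
  filter(if_all(starts_with("item_"), ~ !is.na(.x)))
```

Using all_of() rather than passing the bare vector makes the selection fail loudly when the dictionary and the data disagree on a variable name.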
Resources: