
Select Variables

Cghlewis edited this page Jan 30, 2023 · 20 revisions

Selecting or removing columns is typically done with the dplyr package. However, I also cover a few alternative packages, specifically for cases where you want to drop empty rows or columns.

Some common reasons we might want to select columns are:

  1. We want to remove columns added by a survey platform that are irrelevant for our analysis
  2. We want to remove identifier variables in our dataset
  3. We want to select certain variables to perform an operation on (such as filter, recode, or calculate)
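The three cases above can be sketched with dplyr's select(). This is a minimal illustration, not a prescribed workflow: the `survey` data frame and all of its column names (`StartDate`, `IPAddress`, `stu_id`, `item1`, `item2`) are hypothetical.

```r
library(dplyr)

# Hypothetical survey export; every column name here is made up for illustration
survey <- data.frame(
  StartDate = c("2023-01-05", "2023-01-06"),
  IPAddress = c("10.0.0.1", "10.0.0.2"),
  stu_id = c(101, 102),
  item1 = c(3, 4),
  item2 = c(2, 5)
)

# 1. Remove columns added by the survey platform
no_meta <- survey %>% select(-c(StartDate, IPAddress))

# 2. Remove the identifier variable
deidentified <- no_meta %>% select(-stu_id)

# 3. Keep only the variables we want to operate on later
items <- survey %>% select(item1, item2)
```

A minus sign in front of a name (or a `c()` of names) drops those columns and keeps everything else, while listing names without a minus keeps only those columns.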

Examples of selecting variables by name, and of selecting variables with tidyselect selection helpers such as starts_with(), are provided below.
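As a quick orientation before the linked examples, here is a small sketch contrasting selection by name with a few selection helpers. The `scores` data frame and its column names are assumptions made for this illustration only.

```r
library(dplyr)

# Hypothetical data; column names are invented for this sketch
scores <- data.frame(
  id = 1:3,
  math_pre = c(10, 12, 9),
  math_post = c(14, 15, 11),
  read_pre = c(20, 18, 22)
)

# Select by name
by_name <- scores %>% select(id, math_pre)

# Select every column whose name starts with "math"
math_only <- scores %>% select(starts_with("math"))

# Select every column whose name contains "pre"
pre_only <- scores %>% select(contains("pre"))

# Select from a character vector of names; all_of() errors if any name is missing
wanted <- c("id", "read_pre")
from_vector <- scores %>% select(all_of(wanted))
```

Helpers are most useful when variable names follow a consistent pattern, because the selection keeps working as matching columns are added.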

Select columns


Main functions used in examples

| Package | Functions |
| --- | --- |
| dplyr | select(); across() |
| expss | drop_empty_columns() |
| janitor | remove_empty() |
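For dropping empty rows or columns, a minimal sketch using janitor's remove_empty() might look like the following. The `raw` data frame is hypothetical, and "empty" here means every value in the row or column is NA.

```r
library(janitor)

# Hypothetical data: the `notes` column and the third row are entirely NA
raw <- data.frame(
  id = c(1, 2, NA),
  score = c(5, 7, NA),
  notes = c(NA, NA, NA)
)

# Drop columns in which every value is NA
no_empty_cols <- remove_empty(raw, which = "cols")

# Drop both empty rows and empty columns in one call
no_empty <- remove_empty(raw, which = c("rows", "cols"))
```

expss::drop_empty_columns() serves a similar purpose for columns; it is not shown here to keep the sketch to one package.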

Other functions used in examples

| Package | Functions |
| --- | --- |
| dplyr | filter(); pull(); mutate(); if_all(); case_when() |
| tidyselect | starts_with(); contains(); matches(); all_of(); last_col(); where() |
| base | is.numeric(); ncol(); round() |
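Several of these supporting functions combine naturally with selection. The sketch below shows one plausible combination, assuming a recent dplyr (1.0.4 or later, for if_all()); the `dat` data frame is hypothetical.

```r
library(dplyr)

# Hypothetical data with one character column and two numeric columns
dat <- data.frame(
  name = c("a", "b", "c"),
  x = c(1.234, 2.345, NA),
  y = c(3.456, 4.567, 5.678)
)

# Select columns by type using a predicate function
numeric_only <- dat %>% select(where(is.numeric))

# Select the last column by position
last_one <- dat %>% select(last_col())

# Apply an operation to the selected columns only
rounded <- dat %>% mutate(across(where(is.numeric), ~ round(.x, 1)))

# Keep rows where every numeric column is non-missing
complete <- dat %>% filter(if_all(where(is.numeric), ~ !is.na(.x)))
```

The same tidyselect expression, `where(is.numeric)`, is reused inside select(), across(), and if_all(), which is why the selection syntax is worth learning once.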

Resources:
