Strings

Cghlewis edited this page Apr 13, 2024 · 54 revisions

When working with education data, the following are a few examples of why you may need to edit strings in variables:

  1. A string has purposefully been inserted into a variable but you want to remove it (ex: $ in salary)
  2. A string has accidentally been inserted into a variable (ex: spaces in a numeric variable) and you need to remove it
  3. A string has been used but you want to replace it with a better string (ex: school name is spelled wrong and you want to replace it with the correct spelling)
  4. Strings need to be combined (ex: a prefix needs to be added to variable names)
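
The four situations above can be sketched with stringr as follows. This is a minimal illustration, not the wiki's own worked examples: the vectors `salary`, `score`, `school`, and `vars`, and all of their values, are made up for demonstration.

```r
library(stringr)

# Hypothetical example data (assumed for illustration only)
salary <- c("$45,000", "$52,500")
score  <- c("12 ", " 7", "1 5")
school <- c("Linclon Elementary", "Lincoln Elementary")
vars   <- c("age", "grade")

# 1. Remove a character that was entered on purpose.
#    "$" is a regex metacharacter, so it is escaped.
salary_clean <- str_remove(salary, "\\$")

# 2. Remove characters that were entered by accident (all whitespace),
#    then convert to numeric.
score_clean <- as.numeric(str_remove_all(score, "\\s"))

# 3. Replace a misspelled value with the correct spelling.
school_clean <- str_replace(school, "Linclon", "Lincoln")

# 4. Combine strings: add a prefix to each variable name.
vars_prefixed <- str_c("stu_", vars)
```

Note that `str_remove()` and `str_replace()` act on the first match in each string only; the `_all` variants act on every match.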

Regular expressions (regex) are central to matching patterns in strings, so it is worth getting familiar with regex syntax as implemented in R: https://stat.ethz.ch/R-manual/R-devel/library/base/html/regex.html

Here is a helpful regular expression editor: https://rubular.com/
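
As a small sketch of how a regex pattern is used from R, the example below matches and extracts parts of some made-up ID values (the `ids` vector and its naming scheme are assumptions for illustration). One detail to keep in mind when moving a pattern from an online editor into R: backslashes must be doubled inside an R string, so `\d` is written `"\\d"`.

```r
library(stringr)

# Hypothetical IDs (assumed for illustration only)
ids <- c("stu_001", "tch_014", "stu_202", "unknown")

# ^ anchors the start, \\d{3} matches exactly three digits, $ anchors the end
is_student <- str_detect(ids, "^stu_\\d{3}$")

# \\d+ matches one or more digits; strings with no match return NA
id_number <- str_extract(ids, "\\d+")
```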

Truncate strings

Remove values in strings

Replace values in strings

Combine strings

Locate strings

  • Detect patterns in strings (see several examples throughout Filter and Recode)

Split strings

Count strings

Extract values in strings


Main functions used in examples

| Package | Functions |
|---|---|
| stringr | str_remove(); str_c(); str_count(); str_length(); str_sub(); str_extract(); str_trim(); str_squish(); str_detect(); str_replace(); str_trunc() |
| tidyr | extract() |
| readr | parse_number() |
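
For orientation, here is a brief sketch of what several of the stringr functions listed above return. The input string is a made-up school name, not data from the wiki's examples.

```r
library(stringr)

# Hypothetical value with stray whitespace (assumed for illustration only)
x <- "  Washington   Middle School  "

trimmed  <- str_trim(x)             # drops leading and trailing whitespace only
squished <- str_squish(x)           # also collapses repeated internal spaces

n_chars  <- str_length(squished)    # number of characters
first10  <- str_sub(squished, 1, 10)   # characters 1 through 10
short    <- str_trunc(squished, 13)    # total width 13, including the "..." ellipsis
n_o      <- str_count(squished, "o")   # number of matches of the pattern "o"
```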

Other functions used in examples

| Package | Functions |
|---|---|
| stringr | str_to_lower() |
| dplyr | mutate(); across(); group_by(); summarize() |
| tidyr | pivot_wider(); unite() |
| base | paste(); paste0(); basename(); list2env(); ls(); nchar(); max(); lapply() |
| janitor | tabyl() |
| fs | dir_ls() |
| here | here() |
| purrr | set_names() |

Resources
