Strings
Cghlewis edited this page Apr 13, 2024 · 54 revisions
When working with education data, the following are a few examples of why you may need to edit strings in variables:
- A string has purposefully been inserted into a variable but you want to remove it (ex: $ in salary)
- A string has accidentally been inserted into a variable (ex: spaces in a numeric variable) and you need to remove them
- A string has been used but you want to replace it with a better string (ex: a school name is misspelled and you want to replace it with the correct spelling)
- Strings need to be combined (ex: a prefix needs to be added to variable names)
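As a minimal sketch of the four situations above, the following uses stringr on small made-up vectors. All object names and values here are hypothetical and chosen only for illustration. `str_remove_all()` is the stringr variant of `str_remove()` that removes every match rather than only the first.

```r
library(stringr)

# Hypothetical example values, not taken from any real dataset
salary    <- c("$45,000", "$52,500")
score     <- c("1 0", " 85", "9 2 ")
school    <- c("Linclon Elementary", "Washington Middle")
var_names <- c("item1", "item2")

# 1. Remove characters that were purposefully inserted ($ and ,)
salary_num <- as.numeric(str_remove_all(salary, "[$,]"))

# 2. Remove spaces accidentally inserted into a numeric variable
score_num <- as.numeric(str_remove_all(score, "\\s"))

# 3. Replace a misspelled string with the correct spelling
school_fixed <- str_replace(school, "Linclon", "Lincoln")

# 4. Combine strings: add a prefix (assumed here to be "pre_") to variable names
var_names_new <- str_c("pre_", var_names)
```

After cleaning, `salary_num` and `score_num` are numeric vectors, so they can be summarized or used in calculations directly.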
Regular expressions (regex) are central to matching patterns in strings, so it is worth becoming familiar with regex syntax as used in R: https://stat.ethz.ch/R-manual/R-devel/library/base/html/regex.html
Here is a helpful regular expression editor: https://rubular.com/
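To give a feel for how regex patterns are used with stringr, here is a small sketch. The ID values and the `stu_`/`tch_` naming scheme are invented for this example and are not part of any dataset used on this page.

```r
library(stringr)

# Hypothetical ID values for illustration only
ids <- c("stu_1042", "tch_88", "stu_7", "unknown")

# "^" anchors the pattern to the start of the string
is_student <- str_detect(ids, "^stu_")

# "\\d+" matches one or more digits; str_extract() returns NA when nothing matches
id_digits <- str_extract(ids, "\\d+")

# "(stu|tch)" matches either prefix; "$" anchors the pattern to the end of the string
is_valid <- str_detect(ids, "^(stu|tch)_\\d+$")
```

Note that backslashes are doubled inside R strings, so the regex `\d` is written as `"\\d"`.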
- Replace strings
- Replace strings in variable names (see Name Variables)
- Combine strings in long data
- Combine strings in wide data
- Combine strings in variable names (see Name Variables)
- Split strings (see Separate)
- Count strings
- Sum row occurrences of strings (see Calculate Row Values)
Main functions used in examples
Package | Functions |
---|---|
stringr | str_remove(); str_c(); str_count(); str_length(); str_sub(); str_extract(); str_trim(); str_squish(); str_detect(); str_replace(); str_trunc() |
tidyr | extract() |
readr | parse_number() |
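For orientation, a quick sketch of what a few of the stringr functions listed above return. The input strings are made up for illustration.

```r
library(stringr)

# Hypothetical input with extra leading, trailing, and internal spaces
school <- "  Lincoln   Elementary "

trimmed  <- str_trim(school)             # removes leading and trailing whitespace only
squished <- str_squish(school)           # also collapses repeated internal whitespace
n_chars  <- str_length("Lincoln")        # number of characters in the string
year     <- str_sub("2023-2024", 1, 4)   # characters in positions 1 through 4
n_a      <- str_count("banana", "a")     # number of matches of the pattern "a"
```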
Other functions used in examples
Package | Functions |
---|---|
stringr | str_to_lower() |
dplyr | mutate(); across(); group_by(); summarize() |
tidyr | pivot_wider(); unite() |
base | paste(); paste0(); basename(); list2env(); ls(); nchar(); max(); lapply() |
janitor | tabyl() |
fs | dir_ls() |
here | here() |
purrr | set_names() |
Resources
- https://cran.r-project.org/web/packages/stringr/vignettes/regular-expressions.html
- http://edrub.in/CheatSheets/cheatSheetStringr.pdf
- https://luisdva.github.io/RLadiesSTLregex/#/title-slide
- https://stackoverflow.com/questions/68978199/if-any-string-values-in-a-character-vector-are-in-a-column-of-a-data-frame-retu