In [2]:
library(tidyverse)

# Extract and replace substrings from a character vector

**`str_sub()`** recycles all of its arguments to the length of the longest one. If any argument has length 0, the output is a zero-length character vector.

```r
str_sub(string, start = 1L, end = -1L)                           # Similar to Python slicing, but 1-based and end is inclusive

str_sub(string, start = 1L, end = -1L, omit_na = FALSE) <- value
```

# Value

A character vector of substrings running from `start` to `end` (inclusive). Its length equals that of the longest input argument.
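As a small illustration of the recycling rule and the length of the result, the sketch below passes one string with three start/end pairs, then a zero-length input. The variable names `windows` and `empty` are made up for this example.

```r
library(stringr)

# One string, three start/end pairs: the string is recycled three times,
# so the result has length 3.
windows <- str_sub("abcdef", start = 1:3, end = 3:5)
windows
length(windows)

# A zero-length input gives a zero-length character vector.
empty <- str_sub(character(0), 1, 2)
length(empty)
```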

# Examples

In [4]:
hw <- "Hadley Wickham"

hw %>% str_sub(1, 6)

In [7]:
hw <- "Hadley Wickham"

str_sub(hw, 1, 6)
str_sub(hw, end = 6)
str_sub(hw, 8, 14)
str_sub(hw, 8)
str_sub(hw, c(1, 8), c(6, 14))

In [8]:
# Negative indices
str_sub(hw, -1)
str_sub(hw, -7)
str_sub(hw, end = -7)
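Negative indices count from the right-hand end of the string, so `-1` is the last character. One way to check this reading is to convert a negative index to its positive equivalent by hand, as in the sketch below; the helper variable `n` is introduced only for this example.

```r
library(stringr)

hw <- "Hadley Wickham"
n <- str_length(hw)

# -7 is the 7th character from the right, i.e. position n - 7 + 1.
same_start <- str_sub(hw, -7) == str_sub(hw, n - 7 + 1)
same_end   <- str_sub(hw, end = -7) == str_sub(hw, end = n - 7 + 1)
same_start
same_end
```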

In [9]:
# Alternatively, you can pass in a two-column matrix, as in the
# output from str_locate_all
pos <- str_locate_all(hw, "[aeio]")[[1]]
str_sub(hw, pos)
str_sub(hw, pos[, 1], pos[, 2])

In [14]:
# Vectorisation

str_sub(hw, end = seq_len(str_length(hw)))

In [12]:
# Replacement form
x <- "BBCDEF"
str_sub(x, 1, 1) <- "A"; x
str_sub(x, -1, -1) <- "K"; x
str_sub(x, -2, -2) <- "GHIJ"; x
str_sub(x, 2, -2) <- ""; x

Note that **`str_sub()`** won’t fail if the string is too short; it just returns as much as possible:

In [3]:
'abc' %>% str_sub(2, 10)
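The same tolerance applies when the whole requested range lies past the end of the string. The sketch below contrasts the two cases; `partial` and `nothing` are illustrative names only.

```r
library(stringr)

# end is past the string: the available characters are returned
partial <- str_sub("abc", 2, 10)

# start is past the string: an empty string is returned, not an error
nothing <- str_sub("abc", 5, 10)

partial
nothing
```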

In [15]:
# If start, end, or value is NA, the replacement form normally returns NA.
# Use omit_na = TRUE to leave the original string unchanged instead.
x1 <- x2 <- x3 <- x4 <- "AAA"
str_sub(x1, 1, NA) <- "B"
str_sub(x2, 1, 2) <- NA
str_sub(x3, 1, NA, omit_na = TRUE) <- "B"
str_sub(x4, 1, 2, omit_na = TRUE) <- NA
x1; x2; x3; x4