Releases: tidyverse/stringr

stringr 1.5.1

15 Nov 12:37
  • Some minor documentation improvements.

  • str_trunc() now correctly truncates strings when side is "left" or
    "center" (@UchidaMizuki, #512).

stringr 1.5.0

04 Dec 17:34

Breaking changes

  • stringr functions now consistently implement the tidyverse recycling rules
    (#372). There are two main changes:

    • Only vectors of length 1 are recycled. Previously, (e.g.)
      str_detect(letters, c("x", "y")) worked, but it now errors.

    • str_c() ignores NULLs, rather than treating them as length 0
      vectors.

    Additionally, many more arguments now throw errors, rather than warnings,
    if supplied the wrong type of input.

  • regex() and friends now generate class names with a stringr_ prefix (#384).

  • str_detect(), str_starts(), str_ends() and str_subset() now error
    when used with either an empty string ("") or a boundary(). These
    operations didn't really make sense (str_detect(x, "") returned TRUE
    for all non-empty strings) and made it easy to make mistakes when programming.
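
A minimal sketch of the breaking changes above, assuming stringr 1.5.0 or later is installed. The errors() helper is hypothetical and exists only so the script keeps running when a call fails:

```r
library(stringr)

# Hypothetical helper, for illustration only: TRUE if evaluating expr signals an error.
errors <- function(expr) inherits(try(expr, silent = TRUE), "try-error")

# A length-1 pattern is still recycled against a longer input.
has_x <- str_detect(letters, "x")

# A length-2 pattern against 26 strings is no longer recycled.
recycle_fails <- errors(str_detect(letters, c("x", "y")))

# NULL arguments are ignored instead of being treated as length 0 vectors.
joined <- str_c("a", NULL, "b")

# An empty pattern is now rejected by str_detect().
empty_fails <- errors(str_detect("abc", ""))
```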

New features

  • Many tweaks to the documentation to make it more useful and consistent.

  • New vignette("from-base") by @sastoudt provides a comprehensive comparison
    between base R functions and their stringr equivalents. It's designed to
    help you move to stringr if you're already familiar with base R string
    functions (#266).

  • New str_escape() escapes regular expression metacharacters, providing
    an alternative to fixed() if you want to compose a pattern from user
    supplied strings (#408).

  • New str_equal() compares two character vectors using Unicode rules,
    optionally ignoring case (#381).

  • str_extract() can now optionally extract a capturing group instead of
    the complete match (#420).

  • New str_flatten_comma() is a special case of str_flatten() designed for
    comma-separated flattening. It correctly handles the Oxford comma
    when there are only two elements (#444).

  • New str_split_1() is tailored for the special case of splitting up a single
    string (#409).

  • New str_split_i() extracts a single piece from a string (#278, @bfgray3).

  • New str_like() allows the use of SQL wildcards (#280, @rjpat).

  • New str_rank() to complete the set of order/rank/sort functions (#353).

  • New str_sub_all() to extract multiple substrings from each string.

  • New str_unique() is a wrapper around stri_unique() and returns unique
    string values in a character vector (#249, @seasmith).

  • str_view() uses ANSI colouring rather than an HTML widget (#370). This
    works in more places and requires fewer dependencies. It includes a number
    of other small improvements:

    • It no longer requires a pattern so you can use it to display strings with
      special characters.
    • It highlights unusual whitespace characters.
    • It's vectorised over both string and pattern (#407).
    • It defaults to displaying all matches, making str_view_all() redundant
      (and hence deprecated) (#455).

  • New str_width() returns the display width of a string (#380).

  • stringr is now licensed as MIT (#351).
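
As a rough illustration of several of the new functions listed above (assuming stringr 1.5.0 or later; the input strings are made up for the example):

```r
library(stringr)

# str_escape(): treat user-supplied text as a literal inside a larger regex.
user_input <- "a.b"
literal_hit <- str_detect(c("a.b", "axb"), str_c("^", str_escape(user_input), "$"))

# str_equal(): comparison that can optionally ignore case.
same <- str_equal("stringr", "STRINGR", ignore_case = TRUE)

# str_extract(): return a capturing group instead of the complete match.
value <- str_extract("key: value", "(\\w+): (\\w+)", group = 2)

# str_flatten_comma(): the comma in `last` is dropped when there are two elements.
two <- str_flatten_comma(c("a", "b"), last = ", and ")
three <- str_flatten_comma(c("a", "b", "c"), last = ", and ")

# str_split_1() returns a character vector; str_split_i() returns one piece.
pieces <- str_split_1("a-b-c", "-")
second <- str_split_i("a-b-c", "-", 2)

# str_like(): SQL-style wildcards, where % matches any run of characters.
like <- str_like("apple", "app%")
```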

Minor improvements and bug fixes

  • Better error message if you supply a non-string pattern (#378).

  • A new data source for sentences has fixed many small errors.

  • str_extract() and str_extract_all() now work correctly when pattern
    is a boundary().

  • str_flatten() gains a last argument that optionally overrides the
    final separator (#377). It also gains a na.rm argument to remove missing
    values (since it's a summary function) (#439).

  • str_pad() gains a use_width argument to control whether to use the total
    code point width or the number of code points as "width" of a string (#190).

  • str_replace() and str_replace_all() can use standard tidyverse formula
    shorthand for replacement function (#331).

  • str_starts() and str_ends() now correctly respect regex operator
    precedence (@carlganz).

  • str_wrap() breaks only at whitespace by default; set
    whitespace_only = FALSE to return to the previous behaviour (#335, @rjpat).

  • word() now returns the whole sentence when using a negative start parameter
    whose magnitude is greater than or equal to the number of words
    (@pdelboca, #245).
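
A short sketch of the str_flatten() and str_replace_all() improvements above, assuming stringr 1.5.0 or later. The formula uses .x for the matched text, following the usual tidyverse shorthand:

```r
library(stringr)

x <- c("apple", "banana", "cherry")

# `last` overrides the separator before the final element.
listed <- str_flatten(x, ", ", last = " and ")

# `na.rm = TRUE` drops missing values before flattening.
no_na <- str_flatten(c("a", NA, "b"), na.rm = TRUE)

# Formula shorthand for a replacement function.
upper <- str_replace_all("abc", "[ab]", ~ toupper(.x))
```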

stringr 1.4.1

21 Aug 16:18

Hot patch release to resolve R CMD check failures.

stringr 1.4.0

17 Feb 14:18
  • str_interp() now renders lists consistently, independent of the presence of
    additional placeholders (@amhrasmussen).

  • New str_starts() and str_ends() functions to detect patterns at the
    beginning or end of strings (@jonthegeek, #258).

  • str_subset(), str_detect(), and str_which() gain a negate argument,
    which is useful when you want the elements that do NOT match (#259,
    @yutannihilation).

  • New str_to_sentence() function to capitalize with sentence case
    (@jonthegeek, #202).
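
The additions in this release could be used along these lines (a sketch with made-up inputs, assuming stringr 1.4.0 or later):

```r
library(stringr)

x <- c("apple", "banana", "cherry")

# Detect a pattern at the beginning or end of each string.
starts_b <- str_starts(x, "b")
ends_y <- str_ends(x, "y")

# negate = TRUE keeps the elements that do NOT match.
without_an <- str_subset(x, "an", negate = TRUE)

# Sentence case capitalises the first letter of the sentence.
sentence <- str_to_sentence("the quick brown fox")
```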

stringr 1.3.1

10 May 21:42
  • str_replace_all() with a named vector now respects modifier functions (#207)

  • str_trunc() is once again vectorised correctly (#203, @austin3dickey).

  • str_view() handles NA values more gracefully (#217). I've also
    tweaked the sizing policy so hopefully it should work better in notebooks,
    while preserving the existing behaviour in knit documents (#232).

stringr 1.3.0

19 Feb 23:22

API changes

  • During package build, you may see
    Error : object ‘ignore.case’ is not exported by 'namespace:stringr'.
    This is because the long deprecated str_join(), ignore.case() and
    perl() have now been removed.

New features

  • str_glue() and str_glue_data() provide convenient wrappers around
    glue() and glue_data() from the glue package (#157).

  • str_flatten() is a wrapper around stri_flatten() and clearly
    conveys flattening a character vector into a single string (#186).

  • New str_remove() and str_remove_all() functions. These wrap
    str_replace() and str_replace_all() to remove patterns from strings
    (@Shians, #178).

  • str_squish() removes spaces from both the left and right side of strings,
    and also converts multiple spaces (or space-like characters) to a single
    space within strings (@stephlocke, #197).

  • str_sub() gains an omit_na argument for ignoring NA. Accordingly,
    str_replace() now ignores NAs and keeps the original strings
    (@yutannihilation, #164).
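
A brief sketch of the new helpers above, assuming stringr 1.3.0 or later with the glue package available. The variable names and inputs are illustrative only:

```r
library(stringr)

# str_glue() interpolates variables from the calling environment.
name <- "stringr"
greeting <- str_glue("Hello from {name}")

# str_flatten() collapses a character vector into a single string.
flat <- str_flatten(c("a", "b", "c"), collapse = "-")

# str_remove_all() deletes every match of the pattern.
no_digits <- str_remove_all("a1b22c", "[0-9]+")

# str_squish() trims both ends and collapses internal runs of whitespace.
squished <- str_squish("  too   many\tspaces  ")
```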

Bug fixes and minor improvements

  • str_trunc() now preserves NAs (@ClaytonJY, #162)

  • str_trunc() now throws an error when width is shorter than ellipsis
    (@ClaytonJY, #163).

  • Long deprecated str_join(), ignore.case() and perl() have now been
    removed.

stringr 1.2.0

20 Feb 16:31

API changes

  • str_match_all() now returns NA if an optional group doesn't match
    (previously it returned ""). This is more consistent with str_match()
    and other match failures (#134).

New features

  • In str_replace(), replacement can now be a function that is called once
    for each match and whose return value is used to replace the match.
  • New str_which() mimics grep() (#129).
  • A new vignette (vignette("regular-expressions")) describes the
    details of the regular expressions supported by stringr.
    The main vignette (vignette("stringr")) has been updated to
    give a high-level overview of the package.
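
For illustration, a replacement function and str_which() might be used as follows (a sketch assuming stringr 1.2.0 or later; toupper() stands in for any function that maps a match to its replacement):

```r
library(stringr)

# The function receives the matched text and returns its replacement.
shouted <- str_replace_all("abc", "[ac]", toupper)

# str_which() returns the positions of matching elements, like grep().
positions <- str_which(c("apple", "kiwi", "grape"), "ap")
```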

Minor improvements and bug fixes

  • str_order() and str_sort() gain an explicit numeric argument for sorting
    mixed numbers and strings.
  • str_replace_all() now throws an error if replacement is not a character
    vector. If replacement is NA_character_ it replaces the complete string
    with NA (#124).
  • All functions that take a locale (e.g. str_to_lower() and str_sort())
    default to "en" (English) to ensure that the default is consistent across
    platforms.
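
A small sketch of the numeric sorting and NA replacement behaviour described above, assuming stringr 1.2.0 or later:

```r
library(stringr)

# numeric = TRUE compares embedded digit runs as numbers, so "a2" sorts before "a10".
sorted <- str_sort(c("a10", "a2", "a1"), numeric = TRUE)

# An NA replacement turns the whole string into NA when the pattern matches.
replaced <- str_replace_all("abc", "b", NA_character_)
```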

stringr 1.1.0

19 Aug 22:11
  • Add sample datasets: fruit, words and sentences.
  • fixed(), regex(), and coll() now throw an error if you use them with
    anything other than a plain string (#60). I've clarified that the replacement
    for perl() is regex() not regexp() (#61). boundary() has improved
    defaults when splitting on non-word boundaries (#58, @lmullen).
  • str_detect() now can detect boundaries (by checking for a str_count() > 0)
    (#120). str_subset() works similarly.
  • str_extract() and str_extract_all() now work with boundary(). This is
    particularly useful if you want to extract logical constructs like words
    or sentences. str_extract_all() respects the simplify argument
    when used with fixed() matches.
  • str_subset() now respects custom options for fixed() patterns
    (#79, @gagolews).
  • str_replace() and str_replace_all() now behave correctly when a
    replacement string contains $s, \\\\1, etc. (#83, #99).
  • str_split() gains a simplify argument to match str_extract_all()
    etc.
  • str_view() and str_view_all() create HTML widgets that display regular
    expression matches (#96).
  • word() returns NA for indexes greater than number of words (#112).
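
A minimal sketch of a few of the changes above (made-up inputs; the calls are written against the current stringr API):

```r
library(stringr)

# Extract words using a boundary() modifier rather than a regular expression.
words_found <- str_extract_all("Hello, world!", boundary("word"))

# simplify = TRUE returns a character matrix instead of a list.
split_matrix <- str_split(c("a b", "c d e"), " ", simplify = TRUE)

# Asking for a word beyond the end of the string gives NA.
missing_word <- word("a b", 3)
```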

stringr 1.0.0

30 Apr 13:17
  • stringr is now powered by stringi instead of base R regular expressions. This improves Unicode support, and makes most operations considerably faster. If you find stringr inadequate for your string processing needs, I highly recommend looking at stringi in more detail.

  • stringr gains a vignette, currently a straightforward update of the article that appeared in the R Journal.

  • str_c() now returns a zero length vector if any of its inputs are zero length vectors. This is consistent with all other functions, and standard R recycling rules. Similarly, using str_c("x", NA) now yields NA. If you want "xNA", use str_replace_na() on the inputs.

  • str_replace_all() gains a convenient syntax for applying multiple pairs of pattern and replacement to the same vector:

    input <- c("abc", "def")
    str_replace_all(input, c("[ad]" = "!", "[cf]" = "?"))
  • str_match() now returns NA if an optional group doesn't match (previously it returned ""). This is more consistent with str_extract() and other match failures.

  • New str_subset() keeps values that match a pattern. It's a convenient wrapper for x[str_detect(x, pattern)] (#21, @jiho).

  • New str_order() and str_sort() allow you to sort and order strings in a specified locale.

  • New str_conv() to convert strings from specified encoding to UTF-8.

  • New modifier boundary() allows you to count, locate and split by character, word, line and sentence boundaries.

  • The documentation got a lot of love, and very similar functions (e.g. first and all variants) are now documented together. This should hopefully make it easier to locate the function you need.

  • ignore.case(x) has been deprecated in favour of fixed|regex|coll(x, ignore.case = TRUE), and perl(x) has been deprecated in favour of regex(x).

  • str_join() is deprecated, please use str_c() instead.
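
The str_c() and str_replace_all() behaviour described above can be sketched as follows (assuming a current stringr installation; the variable names are illustrative):

```r
library(stringr)

# A zero length input gives a zero length result.
empty <- str_c("x", character())

# NA is contagious in str_c() ...
missing_joined <- str_c("x", NA)

# ... unless the NA is first converted to the string "NA".
literal_na <- str_c("x", str_replace_na(NA_character_))

# Named vector: each name is a pattern and each value its replacement,
# applied in turn to every element of input.
input <- c("abc", "def")
swapped <- str_replace_all(input, c("[ad]" = "!", "[cf]" = "?"))
```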