Releases: gagolews/stringi

stringi_1.7.3

15 Jul 03:30

[BUGFIX] Fixed the previous patch of ICU55 causing a build failure on,
amongst others, CRAN's Solaris-based target.

stringi_1.7.2

14 Jul 10:21
  • [BUGFIX] Workaround for a bug in tools::checkFF failing
    when NA_character_ is passed to .Call.

stringi_1.7.1

14 Jul 04:52

What Is New in stringi

1.7.1 (2021-07-14)

  • [BACKWARD INCOMPATIBILITY] %s$% and %stri$% now use the new stri_sprintf
    (see below) function instead of base::sprintf.

  • [BACKWARD INCOMPATIBILITY, NEW FEATURE] In stri_sub<- and stri_sub_all<-,
    a negative length now leaves the corresponding input string unaltered.

  • [BACKWARD INCOMPATIBILITY, NEW FEATURE] In stri_sub and stri_sub_all,
    negative length results in the corresponding output being NA
    or not extracted at all, depending on the setting of the new argument
    ignore_negative_length.

  • [BACKWARD INCOMPATIBILITY, BUGFIX, NEW FEATURE] In stri_subset*
    and their replacement versions, pattern and value cannot be longer
    than str (but now they are recycled if necessary).

  • [BACKWARD INCOMPATIBILITY, NEW FEATURE] stri_sub* now accept the
    from argument being a matrix like cbind(from, length=length).
    Unnamed columns or any other names are still interpreted as cbind(from, to).
    Also, the new argument use_matrix can be used to disable
    the special treatment of such matrices.

  • [DOCUMENTATION] It has been clarified that the syntax of *_charclass
    (e.g., used in stri_trim*) differs slightly from regex character
    classes.

  • [NEW FEATURE] #420: stri_sprintf (alias: stri_string_format)
    is a Unicode-aware replacement for and enhancement of the base sprintf:
    it adds a customised handling of NAs (on demand), computing field size
    based on code point width, outputting substrings of at most given width,
    variable width and precision (both at the same time), etc. Moreover,
    stri_printf can be used to display formatted strings conveniently.

  • [NEW FEATURE] #153: stri_match_*_regex now extract capture group names.

  • [NEW FEATURE] #25: stri_locate_*_regex now have a new argument,
    capture_groups, which allows for extracting positions of matches
    to parenthesised subexpressions.

  • [NEW FEATURE] stri_locate_* now have a new argument, get_length,
    whose setting may result in generating from-length matrices
    (instead of from-to ones).

  • [NEW FEATURE] #438: stri_trans_general now supports rule-based
    as well as reverse-direction transliteration.

  • [NEW FEATURE] #434: stri_datetime_format and stri_datetime_parse
    are now vectorised also with respect to the format argument.

  • [NEW FEATURE] stri_datetime_fstr has a new argument, ignore_special,
    which defaults to TRUE for backward compatibility.

  • [NEW FEATURE] stri_datetime_format, stri_datetime_add, and
    stri_datetime_fields now call as.POSIXct more eagerly.

  • [NEW FEATURE] stri_trim* now have a new argument, negate.

  • [NEW FEATURE] stri_replace_rstr converts gsub-style replacement strings
    to stri_replace-style.

  • [INTERNAL] stri_prepare_arg* have been refactored; buffer overruns
    in the exception handling subsystem are now avoided.

  • [BUGFIX] A few functions (stri_length, stri_enc_toutf32, etc.)
    did not throw an exception on an invalid UTF-8
    byte sequence (and merely issued a warning instead).

  • [BUGFIX] stri_datetime_fstr did not honour NA_character_
    and did not parse format strings such as "%Y%m%d" correctly.
    It has now been completely rewritten (in C).

  • [BUGFIX] stri_wrap did not recognise the width of certain Unicode sequences
    correctly.
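
As a rough illustration of some of the 1.7.1 additions listed above, here is a minimal sketch, assuming stringi >= 1.7.1 is installed. The input strings and variable names are made up for the example; only the stringi functions and arguments named in the notes are used.

```r
library(stringi)

# stri_sprintf: fields are padded to the requested width
# (right-aligned by default, left-aligned with the "-" flag).
padded <- stri_sprintf("%5s|%-5s|", "ab", "cd")

# Missing values stay NA by default; na_string substitutes a label on demand.
na_default <- stri_sprintf("%s", NA_character_)
na_label <- stri_sprintf("%s", NA_character_, na_string="<NA>")

# %s$% now dispatches to stri_sprintf rather than base::sprintf.
via_operator <- "value='%d'" %s$% 3L

# get_length=TRUE yields a from-length matrix, which stri_sub accepts directly.
loc <- stri_locate_first_regex("abcde", "cd", get_length=TRUE)
piece <- stri_sub("abcde", loc)

# Names of capture groups become column names of the match matrix.
m <- stri_match_first_regex("key=val", "(?<k>\\w+)=(?<v>\\w+)")
```

Here piece holds the matched substring, and m has one column per capture group in addition to the column for the whole match.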

stringi_1.6.2

15 May 00:19
  • [BACKWARD INCOMPATIBILITY] In stri_enc_list(),
    simplify now defaults to TRUE.

  • [NEW FEATURE] #425: The outputs of stri_enc_list(), stri_locale_list(),
    stri_timezone_list(), and stri_trans_list() are now sorted.

  • [NEW FEATURE] #428: In stri_flatten, na_empty=NA now omits missing values.

  • [BUILD TIME] #431: Pre-4.9.0 GCC has ::max_align_t
    but not std::max_align_t; a (possible) workaround has been added,
    see the INSTALL file.

  • [BUGFIX] #429: stri_width() misclassified the width of certain
    code points (including grave accent, Eszett, etc.);
    General category Sk (Symbol, modifier) is no longer of width 0,
    UCHAR_EAST_ASIAN_WIDTH of U_EA_AMBIGUOUS is no longer of width 2.

  • [BUGFIX] #354: ALTREP CHARSXPs were not copied, and thus could have been
    garbage collected in the so-called meanwhile (with thanks to @jimhester).
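
The na_empty change in stri_flatten (#428) can be sketched as follows, assuming stringi >= 1.6.2. The vector x and the variable names are illustrative only.

```r
library(stringi)

x <- c("a", NA, "b")

# Default behaviour: a missing value makes the whole result missing.
kept_na <- stri_flatten(x, collapse=",")

# na_empty=TRUE treats missing values as empty strings.
as_empty <- stri_flatten(x, collapse=",", na_empty=TRUE)

# na_empty=NA omits missing values altogether.
omitted <- stri_flatten(x, collapse=",", na_empty=NA)
```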

stringi_1.6.1

05 May 01:53

1.6.1 (2021-05-05)

  • [GENERAL] #401: stringi is now bundled with ICU4C 69.1 (upgraded from 61.1),
    which is used on most Windows and OS X builds as well as on *nix systems
    not equipped with system ICU. However, if the C++11 support is disabled,
    stringi will be built against the battle-tested ICU4C 55.1.
    The update to ICU brings Unicode 13.0 and CLDR 39 support.

  • [DOCUMENTATION] A draft version of a paper on stringi is now available at
    https://stringi.gagolewski.com/_static/vignette/stringi.pdf

  • [GENERAL] stringi now requires R >= 3.1 (CXX_STD of CXX11 or CXX1X).

  • [NEW FEATURE] #408: stri_trans_casefold() performs case folding;
    this is different from case mapping, which is locale-dependent.
    Folding makes two pieces of text that differ only in case identical.
    This can come in handy when comparing strings.

  • [NEW FEATURE] #421: stri_rank() ranks strings in a character vector
    (e.g., for ordering data frames with regard to multiple criteria,
    the ranks can be passed to order(), see #219).

  • [NEW FEATURE] #266: stri_width() now supports emojis.

  • [NEW FEATURE] %s$% and %stri$% are now vectorised with respect to
    both arguments.

  • [BUGFIX] stri_sort_key() now outputs bytes-encoded strings.

  • [BUGFIX] #415: locale='' was not equivalent to locale=NULL
    in stri_opts_collator().

  • [INTERNAL] #414: Use LEVELS(x) macro instead of accessing (x)->sxpinfo.gp
    directly (@lukaszdaniel).
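
A small sketch of stri_trans_casefold() and stri_rank() as described above, assuming stringi >= 1.6.1. The data frame df, its columns, and the "en" locale are assumptions chosen for the example.

```r
library(stringi)

# Case folding makes strings that differ only in case compare equal;
# "\u00df" is the German sharp s.
same_after_folding <-
    stri_trans_casefold("STRASSE") == stri_trans_casefold("stra\u00dfe")

# stri_rank gives locale-aware ranks that can be passed to order(),
# e.g., to sort a data frame by several criteria at once.
x <- c("pear", "Apple", "fig")
ranks <- stri_rank(x, locale="en")

df <- data.frame(group=c(2, 1, 1), name=x, stringsAsFactors=FALSE)
sorted <- df[order(df$group, stri_rank(df$name, locale="en")), ]
```

Passing the ranks to order() rather than the strings themselves keeps the comparison under the control of the ICU collator instead of the current R locale.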

stringi_1.5.3

04 Sep 06:18

1.5.3 (2020-09-04) CRAN

  • [NEW FEATURE] #400: %s$% and %stri$% are now binary operators
    that call base R's sprintf().

  • [NEW FEATURE] #399: The %s*% and %stri*% operators can be used
    in addition to stri_dup(), for the very same purpose.

  • [NEW FEATURE] #355: stri_opts_regex() now accepts the time_limit and
    stack_limit options so as to prevent malformed or malicious regexes
    from running for too long.

  • [NEW FEATURE] #345: stri_startswith() and stri_endswith() are now equipped
    with the negate parameter.

  • [NEW FEATURE] #382: Incorrect regexes are now reported to ease debugging.

  • [DEPRECATION WARNING] #347: Any unknown option passed to stri_opts_fixed(),
    stri_opts_regex(), stri_opts_coll(), and stri_opts_brkiter() now
    generates a warning. In the future, the ... parameter will be removed,
    so that will be an error.

  • [DEPRECATION WARNING] stri_duplicated()'s fromLast argument
    has been renamed from_last. fromLast is now its alias scheduled
    for removal in a future version of the package.

  • [DEPRECATION WARNING] stri_enc_detect2()
    is scheduled for removal in a future version of the package.
    Use stri_enc_detect() or the more targeted stri_enc_isutf8(),
    stri_enc_isascii(), etc., instead.

  • [DEPRECATION WARNING] stri_read_lines(), stri_write_lines(),
    stri_read_raw(): use the con argument instead of fname.
    The argument fallback_encoding is scheduled for removal and is no longer
    used. stri_read_lines() does not support encoding="auto" anymore.

  • [DEPRECATION WARNING] nparagraphs in stri_rand_lipsum() has been renamed
    n_paragraphs.

  • [NEW FEATURE] #398: Alternative, British spelling of function parameters
    has been introduced, e.g., stri_opts_coll() now supports both
    normalization and normalisation.

  • [NEW FEATURE] #393: stri_read_bin(), stri_read_lines(), and
    stri_write_lines() are no longer marked as draft API.

  • [NEW FEATURE] #187: stri_read_bin(), stri_read_lines(), and
    stri_write_lines() now support connection objects as well.

  • [NEW FEATURE] #386: New function stri_sort_key() for generating
    locale-dependent sort keys which can be ordered at the byte level and
    return an equivalent ordering to the original string (@DavisVaughan).

  • [BUGFIX] #138: stri_encode() and stri_rand_strings()
    now can generate strings of much larger lengths.

  • [BUGFIX] stri_wrap() did not honour indent correctly when
    use_width was TRUE.
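
A few of the 1.5.x additions listed above can be tried out as in the sketch below, assuming stringi >= 1.5.3. The inputs and the particular time_limit value are arbitrary choices made for the example.

```r
library(stringi)

# %s*% duplicates strings, just like stri_dup("ab", 3).
repeated <- "ab" %s*% 3

# negate=TRUE flags the strings that do NOT start with the pattern.
not_prefixed <- stri_startswith_fixed(c("apple", "banana"), "a", negate=TRUE)

# time_limit caps the work the regex engine may spend on a match;
# a simple pattern like this one should finish well within the limit.
limited <- stri_detect_regex("aaa", "a+",
    opts_regex=stri_opts_regex(time_limit=1000L))
```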

stringi_1.5.2

01 Sep 06:28

1.5.2 (2020-09-01) CRAN

  • [NEW FEATURE] #400: %s$% and %stri$% are now binary operators
    that call base R's sprintf().

  • [NEW FEATURE] #399: The %s*% and %stri*% operators can be used
    in addition to stri_dup(), for the very same purpose.

  • [NEW FEATURE] #355: stri_opts_regex() now accepts the time_limit and
    stack_limit options so as to prevent malformed or malicious regexes
    from running for too long.

  • [NEW FEATURE] #345: stri_startswith() and stri_endswith() are now equipped
    with the negate parameter.

  • [NEW FEATURE] #382: Incorrect regexes are now reported to ease debugging.

  • [DEPRECATION WARNING] #347: Any unknown option passed to stri_opts_fixed(),
    stri_opts_regex(), stri_opts_coll(), and stri_opts_brkiter() now
    generates a warning. In the future, the ... parameter will be removed,
    so that will be an error.

  • [DEPRECATION WARNING] stri_duplicated()'s fromLast argument
    has been renamed from_last. fromLast is now its alias scheduled
    for removal in a future version of the package.

  • [DEPRECATION WARNING] stri_enc_detect2()
    is scheduled for removal in a future version of the package.
    Use stri_enc_detect() or the more targeted stri_enc_isutf8(),
    stri_enc_isascii(), etc., instead.

  • [NEW FEATURE] #398: Alternative, British spelling of function parameters
    has been introduced, e.g., stri_opts_coll() now supports both
    normalization and normalisation.

  • [NEW FEATURE] #393: stri_read_bin(), stri_read_lines(), and
    stri_write_lines() are no longer marked as draft API.
    stri_read_lines() does not support encoding="auto" anymore.

  • [NEW FEATURE] #187: stri_read_bin(), stri_read_lines(), and
    stri_write_lines() now support connection objects as well.

  • [NEW FEATURE] #386: New function stri_sort_key() for generating
    locale-dependent sort keys which can be ordered at the byte level and
    return an equivalent ordering to the original string (@DavisVaughan).

  • [BUGFIX] #138: stri_encode() and stri_rand_strings()
    now can generate strings of much larger lengths.

  • [BUGFIX] stri_wrap() did not honour indent correctly when
    use_width was TRUE.

stringi_1.5.1

31 Aug 10:11

1.5.1 (2020-08-31)

  • [NEW FEATURE] #400: %s$% and %stri$% are now binary operators
    that call base R's sprintf().

  • [NEW FEATURE] #399: The %s*% and %stri*% operators can be used
    in addition to stri_dup(), for the very same purpose.

  • [NEW FEATURE] #355: stri_opts_regex() now accepts the time_limit and
    stack_limit options so as to prevent malformed or malicious regexes
    from running for too long.

  • [NEW FEATURE] #345: stri_startswith() and stri_endswith() are now equipped
    with the negate parameter.

  • [NEW FEATURE] #382: Incorrect regexes are now reported to ease debugging.

  • [DEPRECATION WARNING] #347: Any unknown option passed to stri_opts_fixed(),
    stri_opts_regex(), stri_opts_coll(), and stri_opts_brkiter() now
    generates a warning. In the future, the ... parameter will be removed,
    so that will be an error.

  • [DEPRECATION WARNING] stri_duplicated()'s fromLast argument
    has been renamed from_last. fromLast is now its alias scheduled
    for removal in a future version of the package.

  • [DEPRECATION WARNING] stri_enc_detect2()
    is scheduled for removal in a future version of the package.
    Use stri_enc_detect() or the more targeted stri_enc_isutf8(),
    stri_enc_isascii(), etc., instead.

  • [NEW FEATURE] #398: Alternative, British spelling of function parameters
    has been introduced, e.g., stri_opts_coll() now supports both
    normalization and normalisation.

  • [NEW FEATURE] #393: stri_read_bin(), stri_read_lines(), and
    stri_write_lines() are no longer marked as draft API.
    stri_read_lines() does not support encoding="auto" anymore.

  • [NEW FEATURE] #187: stri_read_bin(), stri_read_lines(), and
    stri_write_lines() now support connection objects as well.

  • [NEW FEATURE] #386: New function stri_sort_key() for generating
    locale-dependent sort keys which can be ordered at the byte level and
    return an equivalent ordering to the original string (@DavisVaughan).

  • [BUGFIX] #138: stri_encode() and stri_rand_strings()
    now can generate strings of much larger lengths.

  • [BUGFIX] stri_wrap() did not honour indent correctly when
    use_width was TRUE.

stringi_1.4.6

17 Feb 07:10

CRAN release v1.4.5

  • [NEW FEATURE] #369: stri_c() now returns an empty string
    when input is empty and collapse is set.

  • [BUGFIX] #370: fixed an issue in stri_prepare_arg_POSIXct()
    reported by rchk.

  • [BUGFIX] #372: documented arguments not in \usage in
    documentation object stri_datetime_format: ...
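
The stri_c() change (#369) amounts to the following, sketched here assuming stringi >= 1.4.6. The variable names are illustrative.

```r
library(stringi)

# With collapse set, an empty input now yields a single empty string.
collapsed <- stri_c(character(0), collapse="")

# Without collapse, an empty input still yields an empty character vector.
plain <- stri_c(character(0))
```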

stringi_1.4.5

10 Jan 22:11
v1.4.5