Changelog

Unicode String v1.4.1

This is the changelog for Unicode String v1.4.1 released on March 14th, 2024. For older changelogs, please consult the release tag on GitHub.

Bug Fixes

  • Fix a performance regression in Unicode.String.Break.next/4. Added the script bench/next.exs to allow for regression testing. Thanks to @mntns for the report. Closes #6.

Unicode String v1.4.0

This is the changelog for Unicode String v1.4.0 released on March 10th, 2024. For older changelogs, please consult the release tag on GitHub.

Enhancements

  • Adds dictionary-based word breaking for Chinese (zh, zh-Hant, zh-Hans, zh-Hant-HK, yue, yue-Hans), Japanese (ja), Thai (th), Lao (lo), Khmer (km) and Burmese (my). These languages don't typically use whitespace to separate words, so a dictionary lookup is more appropriate, although not perfect. The same dictionary is used for Chinese and Japanese. The dictionaries implemented are those used in the CLDR, since they are under an open source license and this keeps behavior consistent with ICU. Note that these dictionaries need to be downloaded with mix unicode.string.download.dictionaries prior to use. Each dictionary is parsed and loaded into persistent_term on demand. Each dictionary has a sizable memory footprint, as measured by :persistent_term.info/0:
| Dictionary | Memory (MB) |
| --- | --- |
| Chinese | 104.8 |
| Thai | 9.6 |
| Lao | 11.4 |
| Khmer | 38.8 |
| Burmese | 23.1 |
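
The footprint figures above are reported by :persistent_term.info/0, which returns the total number of bytes held by all persistent terms. The following is a minimal sketch of how a before/after measurement of that kind might be taken. The {:demo, :dictionary} key and the stand-in map are hypothetical and are not what the library actually stores.

```elixir
# Measure how much memory a term adds to :persistent_term.
# The key and the map below are illustrative stand-ins, not the library's data.
key = {:demo, :dictionary}

before_bytes = :persistent_term.info().memory

# Stand-in for a parsed dictionary: 100_000 small entries.
dictionary = Map.new(1..100_000, fn n -> {Integer.to_string(n), n} end)
:persistent_term.put(key, dictionary)

after_bytes = :persistent_term.info().memory
delta_mb = (after_bytes - before_bytes) / (1024 * 1024)

IO.puts("Added #{Float.round(delta_mb, 1)} MB to persistent_term")

# Clean up so the demo term does not linger.
:persistent_term.erase(key)
```

Because the measurement is a difference of two global totals, it is only meaningful when nothing else writes to persistent_term between the two calls.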

Unicode String v1.3.1

This is the changelog for Unicode String v1.3.1 released on March 6th, 2024. For older changelogs, please consult the release tag on GitHub.

Bug Fixes

  • Fix Unicode.String.split/2 and Unicode.String.next/2 when the rule that passes is a :no_break rule. Thanks to @GregLMcDonald for the report. Closes #5.

Unicode String v1.3.0

This is the changelog for Unicode String v1.3.0 released on February 27th, 2024. For older changelogs, please consult the release tag on GitHub.

Bug Fixes

  • Fix case folding for codepoints that fold to themselves.

Enhancements

  • Adds case mapping functions Unicode.String.upcase/2, Unicode.String.downcase/2 and Unicode.String.titlecase/2. These functions implement the full Unicode Casing algorithm, including conditional mappings. They are locale-aware, and a locale can be specified as a string, an atom or a Cldr.LanguageTag, thereby providing basic integration between unicode_string and ex_cldr.

  • Case folding always follows the :full path which allows mapping of single code points to multiple code points. There is no practical reason to implement the :simple path. As a result, the type parameter to Unicode.String.Case.Folding.fold/2 is no longer required or supported.

  • Support an ex_cldr Language Tag as a parameter to Unicode.String.Case.Folding.fold/2. In fact, any map that has a :language key whose value is an ISO 639-1 language code as a lowercase atom may be passed as a parameter.
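
A hedged sketch of how the case mapping functions listed above might be called. It assumes the package is available on Hex as unicode_string and that the locale is passed under a :locale option key. Both are assumptions to verify against the function documentation. No outputs are shown because they depend on the Unicode data shipped with the installed version.

```elixir
# Assumes network access so Mix.install/1 can fetch the dependency.
Mix.install([{:unicode_string, "~> 1.3"}])

# Default (locale-independent) mappings.
upper = Unicode.String.upcase("straße")
lower = Unicode.String.downcase("THE QUICK BROWN FOX")
title = Unicode.String.titlecase("the quick brown fox")

# Locale given as an atom and as a string; the :locale key is an assumption.
turkish_upper = Unicode.String.upcase("diyarbakır", locale: :tr)
turkish_lower = Unicode.String.downcase("DİYARBAKIR", locale: "tr")

IO.inspect({upper, lower, title, turkish_upper, turkish_lower})
```

Turkish is used here because its dotted and dotless "i" are a common case where locale-aware mapping differs from the default.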

Unicode String v1.2.1

This is the changelog for Unicode String v1.2.1 released on June 2nd, 2023. For older changelogs, please consult the release tag on GitHub.

Bug Fixes

  • Resolve segments dir at runtime, not compile time. Thanks to @crkent for the report. Closes #4.

Unicode String v1.2.0

This is the changelog for Unicode String v1.2.0 released on March 14th, 2023. For older changelogs, please consult the release tag on GitHub.

Enhancements

  • Adds Unicode.String.stream/2 to support streaming graphemes, words, sentences and line breaks.

Unicode String v1.1.0

This is the changelog for Unicode String v1.1.0 released on September 21st, 2022. For older changelogs, please consult the release tag on GitHub.

Enhancements

  • Updates the segmentation supplemental data (including locales) for CLDR. This adds the "sv" and "fi" locale data for sentence break suppressions.

Unicode String v1.0.1

This is the changelog for Unicode String v1.0.1 released on September 15th, 2021. For older changelogs, please consult the release tag on GitHub.

Bug Fixes

  • Whoops, the priv/segments directory was not included in the build artifact.

Unicode String v1.0.0

This is the changelog for Unicode String v1.0.0 released on September 14th, 2021. For older changelogs, please consult the release tag on GitHub.

Enhancements

Unicode String v0.3.0

This is the changelog for Unicode String v0.3.0 released on October 11th, 2020. For older changelogs, please consult the release tag on GitHub.

Bug Fixes

  • Correct deps and docs to align with Elixir 1.11 and recent releases of ex_unicode.

Unicode String v0.2.0

This is the changelog for Unicode String v0.2.0 released on July 12th, 2020. For older changelogs, please consult the release tag on GitHub.

Enhancements

This release implements the Unicode break rules for graphemes, words, lines (word-wrapping) and sentences.

  • Adds Unicode.String.split/2

  • Adds Unicode.String.break?/2

  • Adds Unicode.String.break/2

  • Adds Unicode.String.splitter/2

  • Adds Unicode.String.next/2
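
A short sketch of how the segmentation functions listed above might be used. It assumes the package is available on Hex as unicode_string and that the break type and trimming are selected with :break and :trim options. Treat those option names as assumptions to check against the function documentation. Outputs are left out because they depend on the installed rule data.

```elixir
# Assumes network access so Mix.install/1 can fetch the dependency.
Mix.install([{:unicode_string, "~> 1.0"}])

text = "This is a sentence. And another."

# Split at word boundaries; whitespace segments are kept unless trimmed.
words = Unicode.String.split(text, break: :word)
words_trimmed = Unicode.String.split(text, break: :word, trim: true)

# Split at sentence boundaries.
sentences = Unicode.String.split(text, break: :sentence)

# Fetch only the first segment and the remainder of the string.
first = Unicode.String.next(text, break: :word)

IO.inspect(words_trimmed, label: "words")
IO.inspect(sentences, label: "sentences")
IO.inspect(first, label: "next")
```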

Unicode String v0.1.0

This is the changelog for Unicode String v0.1.0 released on May 17th, 2020. For older changelogs, please consult the release tag on GitHub.

Enhancements

  • Initial release