Releases: simdutf/simdutf

Version 4.0.1

22 Oct 16:58

What's Changed

Full Changelog: v4.0.0...v4.0.1

Version 4.0.0

21 Oct 00:01
e428b79

What's Changed

This version is meant to be identical to the latest maintenance release (3.2.18), except that it adds full support for transcoding to and from Latin1:

  • Latin1 to UTF-8 transcoding
  • Latin1 to UTF-16LE/BE transcoding
  • Latin1 to UTF-32 transcoding
  • UTF-8 to Latin1 transcoding
  • UTF-16LE/BE to Latin1 transcoding
  • UTF-32 to Latin1 transcoding

All kernels (westmere, haswell, icelake, arm64) contain optimized routines. The performance is expected to be excellent. We also have extensive tests and fuzzing. We do not expect bugs, but we are still proceeding with a pre-release because this version contains many new functions and it is possible that the API is not perfect and could require changes.

Latin1 usage is similar to the rest of the library.

// suppose source contains UTF-8 (possibly invalid)
size_t expected_latin1words = simdutf::latin1_length_from_utf8(source.c_str(), source.size());
std::unique_ptr<char[]> latin1_output{new char[expected_latin1words]};
// returns the number of Latin1 bytes written, or 0 on failure
size_t latin1words = simdutf::convert_utf8_to_latin1(source.c_str(), source.size(), latin1_output.get());
if (latin1words != 0) { // success!
  // Let us go back, from Latin1 to UTF-8
  size_t expected_utf8words = simdutf::utf8_length_from_latin1(latin1_output.get(), latin1words);
  std::unique_ptr<char[]> utf8_output{new char[expected_utf8words]};
  // convert to UTF-8
  size_t utf8words = simdutf::convert_latin1_to_utf8(latin1_output.get(), latin1words, utf8_output.get());
  // we could verify that we have come back to the UTF-8 input
} else {
  // the input was not valid UTF-8, or it contained code points above U+00FF
  // that cannot be represented in Latin1 (an empty input also yields 0)
}
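
To make concrete what the Latin1 to UTF-8 step of the round trip computes, here is a minimal scalar sketch. It is not the library's SIMD code, and the names scalar_utf8_length_from_latin1, scalar_convert_latin1_to_utf8 and latin1_to_utf8 are hypothetical helpers introduced only for illustration. It relies on the fact that Latin1 bytes below 0x80 stay one byte in UTF-8 while all other bytes become two bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical scalar helpers for illustration; these are NOT simdutf functions.

// Number of UTF-8 bytes needed to encode `len` Latin1 bytes.
std::size_t scalar_utf8_length_from_latin1(const char *in, std::size_t len) {
  std::size_t n = len;
  for (std::size_t i = 0; i < len; i++) {
    // Bytes >= 0x80 have their top bit set and need one extra UTF-8 byte.
    n += static_cast<unsigned char>(in[i]) >> 7;
  }
  return n;
}

// Writes the UTF-8 encoding of the Latin1 input to `out` and returns the
// number of bytes written. `out` must have room for the length computed above.
std::size_t scalar_convert_latin1_to_utf8(const char *in, std::size_t len,
                                          char *out) {
  char *start = out;
  for (std::size_t i = 0; i < len; i++) {
    unsigned char b = static_cast<unsigned char>(in[i]);
    if (b < 0x80) {
      *out++ = static_cast<char>(b); // ASCII is copied unchanged
    } else {
      *out++ = static_cast<char>(0xC0 | (b >> 6));   // leading byte
      *out++ = static_cast<char>(0x80 | (b & 0x3F)); // continuation byte
    }
  }
  return static_cast<std::size_t>(out - start);
}

// Convenience wrapper: size the buffer first, then convert.
std::string latin1_to_utf8(const std::string &latin1) {
  std::string utf8(
      scalar_utf8_length_from_latin1(latin1.data(), latin1.size()), '\0');
  std::size_t written =
      scalar_convert_latin1_to_utf8(latin1.data(), latin1.size(), &utf8[0]);
  assert(written == utf8.size());
  (void)written; // silence unused-variable warnings when NDEBUG is defined
  return utf8;
}
```

The two-pass shape (compute the length, allocate, convert) mirrors the usage pattern shown above; this direction cannot fail because every Latin1 byte has a UTF-8 encoding.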

Note that Latin1 input cannot be validated: every possible byte sequence is valid Latin1 content.
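
The asymmetry behind this remark can be sketched in scalar code. Widening Latin1 to UTF-32 is a plain zero-extension of each byte and cannot fail, whereas the reverse direction must reject any code point above U+00FF. The functions widen_latin1 and narrow_to_latin1 are hypothetical names for this illustration and are not part of the simdutf API.

```cpp
#include <string>

// Hypothetical helpers for illustration; these are NOT simdutf functions.

// Latin1 -> UTF-32: every byte value 0x00..0xFF is the code point itself,
// so there is nothing to validate.
std::u32string widen_latin1(const std::string &latin1) {
  std::u32string out;
  out.reserve(latin1.size());
  for (char c : latin1) {
    out.push_back(static_cast<char32_t>(static_cast<unsigned char>(c)));
  }
  return out;
}

// UTF-32 -> Latin1: fails (returns false and clears `out`) as soon as a
// code point does not fit in a single byte.
bool narrow_to_latin1(const std::u32string &in, std::string &out) {
  out.clear();
  for (char32_t cp : in) {
    if (cp > 0xFF) {
      out.clear();
      return false;
    }
    out.push_back(static_cast<char>(cp));
  }
  return true;
}
```

In this sketch only the narrowing direction reports an error, which matches the idea that the input check always concerns the Unicode side, never the Latin1 side.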

Many people contributed to this release. We especially credit @Nick-Nuon for substantial contributions, and we thank @anonrig for his support and reviews.

There are also some minor updates:

Version 4.0.0-pre (prerelease)

15 Oct 19:18
Pre-release

What's Changed

This version is meant to be identical to the latest maintenance release (3.2.18), except that it adds full support for transcoding to and from Latin1:

  • Latin1 to UTF-8 transcoding
  • Latin1 to UTF-16LE/BE transcoding
  • Latin1 to UTF-32 transcoding
  • UTF-8 to Latin1 transcoding
  • UTF-16LE/BE to Latin1 transcoding
  • UTF-32 to Latin1 transcoding

All kernels (westmere, haswell, icelake, arm64) contain optimized routines. The performance is expected to be excellent. We also have extensive tests and fuzzing. We do not expect bugs, but we are still proceeding with a pre-release because this version contains many new functions and it is possible that the API is not perfect and could require changes.

Latin1 usage is similar to the rest of the library.

// suppose source contains UTF-8 (possibly invalid)
size_t expected_latin1words = simdutf::latin1_length_from_utf8(source.c_str(), source.size());
std::unique_ptr<char[]> latin1_output{new char[expected_latin1words]};
// returns the number of Latin1 bytes written, or 0 on failure
size_t latin1words = simdutf::convert_utf8_to_latin1(source.c_str(), source.size(), latin1_output.get());
if (latin1words != 0) { // success!
  // Let us go back, from Latin1 to UTF-8
  size_t expected_utf8words = simdutf::utf8_length_from_latin1(latin1_output.get(), latin1words);
  std::unique_ptr<char[]> utf8_output{new char[expected_utf8words]};
  // convert to UTF-8
  size_t utf8words = simdutf::convert_latin1_to_utf8(latin1_output.get(), latin1words, utf8_output.get());
  // we could verify that we have come back to the UTF-8 input
} else {
  // the input was not valid UTF-8, or it contained code points above U+00FF
  // that cannot be represented in Latin1 (an empty input also yields 0)
}

Note that Latin1 input cannot be validated: every possible byte sequence is valid Latin1 content.

Version 3.2.18

08 Oct 17:55
228f074

This patch disables the icelake kernel when building with Visual Studio 2019.

@deepak1556 demonstrated an issue that we can reproduce only with Visual Studio 2019 (not Visual Studio 2022) when transcoding from UTF-16 to UTF-8 using the AVX-512 routines. It seems possible that the AVX-512 support in Visual Studio 2019 is faulty.
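
A compiler-version guard of the following kind is one way such a kernel could be switched off at build time. This is only a sketch: the macro EXAMPLE_ENABLE_ICELAKE and the function example_icelake_enabled are hypothetical names, and the library's actual configuration macros may differ. It assumes the Microsoft convention that Visual Studio 2019 reports _MSC_VER values from 1920 to 1929 and Visual Studio 2022 reports 1930 or higher.

```cpp
// Hypothetical macro name for illustration; the library's real macros may differ.
#if defined(_MSC_VER) && !defined(__clang__) && _MSC_VER < 1930
// Visual Studio 2019 (and older) report _MSC_VER below 1930:
// leave the AVX-512 kernel out of the build.
#define EXAMPLE_ENABLE_ICELAKE 0
#else
// Any other compiler, including Visual Studio 2022, keeps the kernel.
#define EXAMPLE_ENABLE_ICELAKE 1
#endif

// Compile-time query that the rest of the code could consult.
constexpr bool example_icelake_enabled() {
  return EXAMPLE_ENABLE_ICELAKE != 0;
}
```

Note that this sketch also excludes compilers older than Visual Studio 2019, which is a simplification on our part rather than something the release note states.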

Version 3.2.17

11 Aug 17:33
e0c859c

This patch release solves a rarely occurring issue with validate_utf8_with_errors.

Version 3.2.16

09 Aug 15:31

Backport of #267 to the 3.2 series.

When compiling for 32-bit x86 targets, Visual Studio may refuse to compile the following code because of the type mismatch between size_t and int.

size_t leftovers = ...;
for (int i = 0; i < leftovers; i++) {
   ...
}

This pattern appears in the source of the sutf executable.
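
One common way to avoid this kind of mismatch is to give the loop index the same type as the bound. The sketch below is not the actual patch from #267; count_nonzero is a hypothetical helper that only demonstrates the loop shape.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, only to show the loop shape; not code from simdutf.
std::size_t count_nonzero(const std::vector<unsigned char> &tail) {
  std::size_t leftovers = tail.size();
  std::size_t count = 0;
  // The index has the same type as `leftovers`, so the comparison no
  // longer mixes signed and unsigned operands.
  for (std::size_t i = 0; i < leftovers; i++) {
    if (tail[i] != 0) {
      count++;
    }
  }
  return count;
}
```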

Credit: @JuliusBrueggemann

Version 3.2.15

04 Aug 17:29

Fix for node issue nodejs/node#48995

Full Changelog: v3.2.14...v3.2.15

Version 3.2.14

05 Jun 13:08

The icelake kernel now explicitly requires popcnt. Credit: @thesamesam

Full Changelog: v3.2.13...v3.2.14

Version 3.2.13

03 Jun 00:40

What's Changed

New Contributors

Full Changelog: v3.2.12...v3.2.13

Version 3.2.12

23 May 01:48

What's Changed

  • Another patch for Node issue 45427 by @lemire in #249

Full Changelog: v3.2.11...v3.2.12