Releases: simdutf/simdutf
Version 4.0.1
What's Changed
- pkg-config support by @lemire in #336 requested by @clausecker
- Rewrite and simplify sse_convert_utf32_to_latin1 by @lemire in #338, fixing issue #337 reported by @anonrig
Full Changelog: v4.0.0...v4.0.1
Version 4.0.0
What's Changed
This version is meant to be identical to the latest maintenance release (3.2.18) except that it adds full support for transcoding to and from Latin1:
- Latin1 to UTF-8 transcoding
- Latin1 to UTF-16LE/BE transcoding
- Latin1 to UTF-32 transcoding
- UTF-8 to Latin1 transcoding
- UTF-16LE/BE to Latin1 transcoding
- UTF-32 to Latin1 transcoding
All kernels (westmere, haswell, icelake, arm64) contain optimized routines. The performance is expected to be excellent. We also have extensive tests and fuzzing. We do not expect bugs, but we are still proceeding with a pre-release because this version contains many new functions and it is possible that the API is not perfect and could require changes.
Latin1 usage is similar to the rest of the library.
    // suppose source is a std::string containing UTF-8 (possibly invalid)
    size_t expected_latin1words = simdutf::latin1_length_from_utf8(source.c_str(), source.size());
    std::unique_ptr<char[]> latin1_output{new char[expected_latin1words]};
    size_t latin1words = simdutf::convert_utf8_to_latin1(source.c_str(), source.size(), latin1_output.get());
    if (latin1words != 0) { // success!
      // Let us go back, from Latin1 to UTF-8
      size_t expected_utf8words = simdutf::utf8_length_from_latin1(latin1_output.get(), latin1words);
      std::unique_ptr<char[]> utf8_output{new char[expected_utf8words]};
      // convert to UTF-8
      size_t utf8words = simdutf::convert_latin1_to_utf8(latin1_output.get(), latin1words, utf8_output.get());
      // we could verify that we have come back to the UTF-8 input
    } else {
      // the input was not valid UTF-8, or it cannot be represented as Latin1
    }
Note that Latin1 needs no validation: every possible byte sequence is valid Latin1 content.
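To illustrate why conversion from Latin1 cannot fail, here is a minimal scalar sketch of Latin1 to UTF-8 transcoding. It is not simdutf's implementation (the library uses SIMD kernels), and the function names `utf8_length_from_latin1_scalar` and `latin1_to_utf8_scalar` are hypothetical, chosen only for this example. Every byte below 0x80 maps to one UTF-8 byte and every other byte maps to exactly two, so there is no error path.

```cpp
#include <cstddef>
#include <string>

// Scalar illustration only; hypothetical names, not part of simdutf.
// UTF-8 bytes needed for a Latin1 buffer: one byte per ASCII value
// (0x00-0x7F), two bytes per value in 0x80-0xFF.
std::size_t utf8_length_from_latin1_scalar(const char* input, std::size_t length) {
  std::size_t out = length;
  for (std::size_t i = 0; i < length; i++) {
    // The high bit is set exactly when the byte needs a second UTF-8 byte.
    out += static_cast<unsigned char>(input[i]) >> 7;
  }
  return out;
}

// Converts Latin1 to UTF-8. No input can be rejected, so nothing signals failure.
std::string latin1_to_utf8_scalar(const char* input, std::size_t length) {
  std::string out;
  out.reserve(utf8_length_from_latin1_scalar(input, length));
  for (std::size_t i = 0; i < length; i++) {
    const unsigned char byte = static_cast<unsigned char>(input[i]);
    if (byte < 0x80) {
      out.push_back(static_cast<char>(byte));
    } else {
      // Two-byte sequence: 110000xx 10xxxxxx
      out.push_back(static_cast<char>(0xC0 | (byte >> 6)));
      out.push_back(static_cast<char>(0x80 | (byte & 0x3F)));
    }
  }
  return out;
}
```

The reverse direction is different: UTF-8 input can be malformed or can contain code points above U+00FF, which is why the UTF-8 to Latin1 conversion in the example above has a failure branch.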
There are many contributors to this new release but we need to credit @Nick-Nuon for substantial contributions. We thank @anonrig for his support and reviews.
There are also some minor updates:
- Fix for issue 331 (building as a shared library, but static is hardcoded) by @SimeonStoykovQC
- Fix for issue 333 (build fails on armv7 FreeBSD 13.2) by @clausecker
Version 4.0.0-pre (prerelease)
What's Changed
This version is meant to be identical to the latest maintenance release (3.2.18) except that it adds full support for transcoding to and from Latin1:
- Latin1 to UTF-8 transcoding
- Latin1 to UTF-16LE/BE transcoding
- Latin1 to UTF-32 transcoding
- UTF-8 to Latin1 transcoding
- UTF-16LE/BE to Latin1 transcoding
- UTF-32 to Latin1 transcoding
All kernels (westmere, haswell, icelake, arm64) contain optimized routines. The performance is expected to be excellent. We also have extensive tests and fuzzing. We do not expect bugs, but we are still proceeding with a pre-release because this version contains many new functions and it is possible that the API is not perfect and could require changes.
Latin1 usage is similar to the rest of the library.
    // suppose source is a std::string containing UTF-8 (possibly invalid)
    size_t expected_latin1words = simdutf::latin1_length_from_utf8(source.c_str(), source.size());
    std::unique_ptr<char[]> latin1_output{new char[expected_latin1words]};
    size_t latin1words = simdutf::convert_utf8_to_latin1(source.c_str(), source.size(), latin1_output.get());
    if (latin1words != 0) { // success!
      // Let us go back, from Latin1 to UTF-8
      size_t expected_utf8words = simdutf::utf8_length_from_latin1(latin1_output.get(), latin1words);
      std::unique_ptr<char[]> utf8_output{new char[expected_utf8words]};
      // convert to UTF-8
      size_t utf8words = simdutf::convert_latin1_to_utf8(latin1_output.get(), latin1words, utf8_output.get());
      // we could verify that we have come back to the UTF-8 input
    } else {
      // the input was not valid UTF-8, or it cannot be represented as Latin1
    }
Note that Latin1 needs no validation: every possible byte sequence is valid Latin1 content.
Version 3.2.18
This patch disables the icelake kernel when building with Visual Studio 2019.
@deepak1556 demonstrated an issue that we reproduce solely with Visual Studio 2019 (not Visual Studio 2022) when transcoding from UTF-16 to UTF-8 using AVX-512 routines. It seems possible that the AVX-512 support in Visual Studio 2019 is faulty.
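A compiler-version guard of this kind can be sketched as follows. This is an illustration only: the macro `EXAMPLE_ENABLE_AVX512_KERNEL` and the function `avx512_kernel_enabled` are hypothetical names, and simdutf's actual build logic may be written differently. The sketch relies on `_MSC_VER` being in the range 1920 to 1929 for Visual Studio 2019 and 1930 or higher for Visual Studio 2022, so the test below also excludes anything older than Visual Studio 2019.

```cpp
// Hypothetical macro and function names; not taken from simdutf.
// _MSC_VER < 1930 means Visual Studio 2019 or older. Excluding clang
// (which also defines _MSC_VER in clang-cl mode) is an assumption made
// for this sketch.
#if defined(_MSC_VER) && !defined(__clang__) && _MSC_VER < 1930
#define EXAMPLE_ENABLE_AVX512_KERNEL 0
#else
#define EXAMPLE_ENABLE_AVX512_KERNEL 1
#endif

// Compile-time query that the rest of a code base could use to decide
// whether the AVX-512 code path is built at all.
constexpr bool avx512_kernel_enabled() {
  return EXAMPLE_ENABLE_AVX512_KERNEL != 0;
}
```

Disabling the kernel at compile time means the affected compiler never generates the AVX-512 code, and the remaining kernels are still available on the same hardware.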
Version 3.2.17
This patch release solves a rarely occurring issue with validate_utf8_with_errors.
Version 3.2.16
Backport of #267 to the 3.2 series.
Visual Studio, when compiling for 32-bit x86 targets, may refuse to compile the following due to the mismatch between size_t and int.
    size_t leftovers = ...;
    for (int i = 0; i < leftovers; i++) {
      ...
    }
This pattern appears in the source of the sutf executable.
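One common way to avoid this kind of mismatch is to give the loop counter the same type as its bound. The sketch below is not the code from the patch: `sum_leftovers` is a hypothetical helper written only to show a loop whose counter and bound are both size_t.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not code from simdutf: sums the last `leftovers`
// bytes of `data`. The caller must ensure leftovers <= data.size().
std::size_t sum_leftovers(const std::vector<unsigned char>& data, std::size_t leftovers) {
  std::size_t total = 0;
  const std::size_t start = data.size() - leftovers;
  // Counter and bound share the type size_t, so the comparison never
  // mixes signed and unsigned operands.
  for (std::size_t i = 0; i < leftovers; i++) {
    total += data[start + i];
  }
  return total;
}
```

For example, calling `sum_leftovers` on a vector holding 1, 2, 3, 4 with `leftovers` set to 2 adds the last two elements.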
Credit: @JuliusBrueggemann
Version 3.2.15
Fix for node issue nodejs/node#48995
Full Changelog: v3.2.14...v3.2.15
Version 3.2.14
Requiring popcnt explicitly for icelake kernel, credit to @thesamesam
Full Changelog: v3.2.13...v3.2.14
Version 3.2.13
What's Changed
- Make dllexport on Windows optional by @jblazquez in #250
- Requiring popcnt explicitly for westmere kernel #251 credit to @thesamesam
New Contributors
- @jblazquez made their first contribution in #250
Full Changelog: v3.2.12...v3.2.13
Version 3.2.12
What's Changed
Full Changelog: v3.2.11...v3.2.12