Releases · simdutf/simdutf

What's Changed

Most text today is represented using the Unicode standard. The simdutf library provides high-performance Unicode functions for C++ programmers. Version 2.0 introduces a richer API, with support for the most popular Unicode formats (UTF-32, UTF-16BE, UTF-16LE and UTF-8). Users can transcode between these formats and validate the inputs as needed. For users who want it, we also return a structure containing failure information, including the nature and location of the error.
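To illustrate what "failure information, including the nature and location of the error" can look like, here is a minimal scalar sketch of UTF-8 validation that reports an error kind and a byte offset. All names in it (`utf8_error`, `validation_result`, `validate_utf8_scalar`) are hypothetical and chosen for this example; they are not simdutf's types or functions, and the library's kernels are vectorized rather than byte-at-a-time.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical error kinds for illustration; not simdutf's actual enum.
enum class utf8_error {
  none,                  // input is valid
  stray_continuation,    // 10xxxxxx byte where a lead byte was expected
  invalid_lead,          // lead byte 0xF8..0xFF can never appear in UTF-8
  truncated,             // sequence runs past the end of the input
  missing_continuation,  // a following byte is not of the form 10xxxxxx
  overlong,              // code point encoded with more bytes than needed
  too_large,             // code point above U+10FFFF
  surrogate              // code point in U+D800..U+DFFF
};

// Hypothetical result type: on failure, `position` is the byte offset of the
// offending sequence; on success, it is the number of bytes validated.
struct validation_result {
  utf8_error error;
  std::size_t position;
};

inline validation_result validate_utf8_scalar(const char* buf, std::size_t len) {
  const unsigned char* data = reinterpret_cast<const unsigned char*>(buf);
  std::size_t pos = 0;
  while (pos < len) {
    const unsigned char lead = data[pos];
    if (lead < 0x80) { pos++; continue; }  // ASCII byte

    std::size_t n;       // total bytes in this sequence
    std::uint32_t cp;    // code point being decoded
    std::uint32_t min;   // smallest code point allowed for this length
    if ((lead & 0xE0) == 0xC0)      { n = 2; cp = lead & 0x1F; min = 0x80; }
    else if ((lead & 0xF0) == 0xE0) { n = 3; cp = lead & 0x0F; min = 0x800; }
    else if ((lead & 0xF8) == 0xF0) { n = 4; cp = lead & 0x07; min = 0x10000; }
    else if ((lead & 0xC0) == 0x80) return {utf8_error::stray_continuation, pos};
    else                            return {utf8_error::invalid_lead, pos};

    if (pos + n > len) return {utf8_error::truncated, pos};
    for (std::size_t i = 1; i < n; i++) {
      const unsigned char c = data[pos + i];
      if ((c & 0xC0) != 0x80) return {utf8_error::missing_continuation, pos};
      cp = (cp << 6) | (c & 0x3F);
    }
    if (cp < min) return {utf8_error::overlong, pos};
    if (cp > 0x10FFFF) return {utf8_error::too_large, pos};
    if (cp >= 0xD800 && cp <= 0xDFFF) return {utf8_error::surrogate, pos};
    pos += n;
  }
  return {utf8_error::none, len};
}
```

A caller would check `result.error` first and, on failure, use `result.position` to locate the bad sequence, for example to report it or to process only the valid prefix.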

For advanced x64 processors, we introduce a whole new AVX-512 kernel which includes novel algorithms by @WojciechMula and @clausecker. It can be twice as fast as the previous kernels, reaching speeds close to 5 GB/s on non-trivial Unicode inputs. The library relies on runtime dispatching, so if your processor supports the new kernel, it is used automatically. The currently supported processors include Ice Lake, Rocket Lake, and Zen 4.
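The general idea behind runtime dispatching can be sketched as follows: probe the CPU once, pick a kernel, and route every later call through it. This is only an illustration under stated assumptions. The names (`kernel`, `select_kernel`, `active_kernel`, `count_code_points`) are hypothetical, all three "kernels" share one scalar routine as a placeholder, and the probe uses the GCC/Clang builtin `__builtin_cpu_supports` with the baseline `avx512f` flag, which is not necessarily the feature set the real AVX-512 kernel requires. It is not how simdutf itself is implemented.

```cpp
#include <cstddef>

// Signature shared by every kernel in this sketch.
using count_fn = std::size_t (*)(const char*, std::size_t);

// Hypothetical kernel descriptor: a label plus the function to call.
struct kernel {
  const char* name;
  count_fn count;
};

// Placeholder implementation: counts UTF-8 code points by counting every
// byte that is not a continuation byte (10xxxxxx). Assumes valid UTF-8.
inline std::size_t count_scalar(const char* buf, std::size_t len) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < len; i++) {
    n += (static_cast<unsigned char>(buf[i]) & 0xC0) != 0x80;
  }
  return n;
}

// Probe the CPU and return the best kernel it supports. In a real library
// each entry would point at a differently compiled, vectorized routine.
inline const kernel& select_kernel() {
  static const kernel avx512{"avx512 (placeholder)", count_scalar};
  static const kernel avx2{"avx2 (placeholder)", count_scalar};
  static const kernel fallback{"fallback", count_scalar};
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
  if (__builtin_cpu_supports("avx512f")) return avx512;
  if (__builtin_cpu_supports("avx2")) return avx2;
#else
  (void)avx512;  // unused on targets without the x86 probe
  (void)avx2;
#endif
  return fallback;
}

// The selection runs once, on first use, and is cached afterwards.
inline const kernel& active_kernel() {
  static const kernel& chosen = select_kernel();
  return chosen;
}

// Public entry point: callers never name a kernel explicitly.
inline std::size_t count_code_points(const char* buf, std::size_t len) {
  return active_kernel().count(buf, len);
}
```

Because the choice is cached in a function-local static, the probe cost is paid once and each subsequent call adds only one indirect function call.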

On an Ice Lake processor, we get the following speeds with the Arabic-Lipsum.utf8.txt test file:

library	UTF-8 to UTF-16 speed
simdutf (AVX-512)	4.6 GB/s
simdutf (AVX2)	2.3 GB/s
ICU	1.4 GB/s
iconv	0.7 GB/s

Major changes

AVX-512 kernel for Ice Lake / Zen 4 processors by @WojciechMula and @clausecker in #174
Support for UTF-32, UTF-16BE and transcoding between UTF-32, UTF-16BE, UTF-16LE and UTF-8, by @NicolasJiaxin, @clausecker and others
ASCII validation by @NicolasJiaxin in #110
One-pass encoding autodetection by @NicolasJiaxin in #134
Returning a struct indicating success and length for some functions by @NicolasJiaxin in #157
Iconv-like tool (sut) by @NicolasJiaxin in #160

Performance

Optimize ARM utf16 validation by @danlark1 in #145

Bug fixes

fix valid_utf8_to_utf16.h producing invalid utf16 (issue111) by @lemire in #119
Fix Buffer Overrun on aarch64 by @wx257osn2 in #171
fix some typos by @striezel in #139

Testing

Fuzzer for buffer overflow by @NicolasJiaxin in #163
update actions/checkout in GitHub Actions to v3 by @striezel in #138

Building

👷‍♀️ CMake: Guard Tests/Examples Behind CMake Variables by @ThePhD in #149

Benchmarking

Added iconv to the benchmarks, by @lemire in #164
Use simpler performance counters, since on Graviton 2 (AWS) you may only access two counters at a time, by @lemire in #123

New Contributors

@striezel made their first contribution in #139
@danlark1 made their first contribution in #145
@ThePhD made their first contribution in #149
@wx257osn2 made their first contribution in #171

Full Changelog: v1.0.1...v2.0.0
