Releases: kimwalisch/primesieve

primesieve-12.3

18 Apr 17:50

This release adds runtime dispatching to AVX512 for MinGW (on x64 CPUs that support it). For x64 CPUs, AVX512 runtime dispatching is now enabled by default when compiling with GCC or Clang on all operating systems.

  • Improve Windows multiarch support (now works with MinGW64).
  • Add runtime POPCNT detection using CPUID for x86 CPUs.
  • Improve GCC/Clang multiarch preprocessor logic.
  • CMakeLists.txt: Remove POPCNT/BMI check for x86 CPUs.
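To illustrate what runtime POPCNT detection via CPUID can look like, here is a minimal sketch. It is not primesieve's actual code: the function names (cpu_has_popcnt, popcount_hw, popcount_portable, popcount64) are hypothetical, and the sketch assumes GCC or Clang on x86 for the CPUID path. On other compilers or architectures it falls back to a portable bit-twiddling popcount.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: GCC/Clang on x86 provide <cpuid.h> with __get_cpuid().
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  #include <cpuid.h>
  #define HAS_CPUID 1
#endif

// Portable fallback: classic bit-twiddling population count.
inline int popcount_portable(std::uint64_t x)
{
  x = x - ((x >> 1) & 0x5555555555555555ull);
  x = (x & 0x3333333333333333ull) + ((x >> 2) & 0x3333333333333333ull);
  x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0Full;
  return static_cast<int>((x * 0x0101010101010101ull) >> 56);
}

// CPUID leaf 1 reports POPCNT support in bit 23 of ECX.
inline bool cpu_has_popcnt()
{
#if defined(HAS_CPUID)
  unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
  if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
    return (ecx & (1u << 23)) != 0;
#endif
  return false;
}

#if defined(HAS_CPUID)
// Only this function is compiled with POPCNT enabled, so the rest of
// the program still runs on CPUs without the instruction.
__attribute__((target("popcnt")))
inline int popcount_hw(std::uint64_t x)
{
  return __builtin_popcountll(x);
}
#endif

// Dispatch once at runtime: hardware POPCNT if available, else fallback.
inline int popcount64(std::uint64_t x)
{
#if defined(HAS_CPUID)
  static const bool has_popcnt = cpu_has_popcnt();
  if (has_popcnt)
    return popcount_hw(x);
#endif
  int count = popcount_portable(x);
  assert(count >= 0 && count <= 64);
  return count;
}
```

Because the check happens at runtime, a build-time POPCNT check is no longer required for the binary to be safe on older x86 CPUs.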

primesieve-12.1

10 Mar 10:22

This is a new maintenance release; it is fully backwards compatible with the previous release.

  • CMakeLists.txt: Fix undefined reference to pthread_create #146.
  • test/Riemann_R.cpp: Fix musl libc issue #147.
  • src/app/test.cpp: Fix -ffast-math failure.
  • test/count_primes2.cpp: Fix -ffast-math failure.
  • PrimeSieve.cpp: Improve status output.

primesieve-12.0

19 Feb 14:46

The C/C++ API and ABI of primesieve-12.0 are fully backwards compatible with primesieve-11.*

The stress test is the main new feature of primesieve-12.0. It can be launched using the --stress-test[=MODE] option of the primesieve command-line application. The stress test supports two modes: CPU (default) and RAM. The CPU mode uses little memory (< 5 MiB per thread) and puts the highest load on the CPU. The RAM mode uses much more memory than the CPU mode (about 1.16 GiB per thread), but the CPU usually won't get as hot. Due to primesieve's function multi-versioning support, on x64 CPUs the stress test will run an AVX512 algorithm if your CPU supports it.

  • stressTest.cpp: New -S[=MODE] and --stress-test[=MODE] command-line options.
  • RiemannR.cpp: Faster Riemann R function implementation #144.
  • CmdOptions.cpp: New -R and --RiemannR command line options.
  • CmdOptions.cpp: New --RiemannR-inverse command line option.
  • CmdOptions.cpp: Add new --timeout option for stress testing.
  • main.cpp: Improve command-line option handling.
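For readers unfamiliar with the Riemann R function mentioned above, the following is a minimal sketch that evaluates it with the Gram series, R(x) = 1 + Σ_{k≥1} (ln x)^k / (k · k! · ζ(k+1)). This is an assumption-laden illustration, not the implementation in RiemannR.cpp: the helper zeta() is a hypothetical, simple Euler-Maclaurin approximation, and the code uses plain double precision.

```cpp
#include <cassert>
#include <cmath>

// Riemann zeta function for real s > 1 (hypothetical helper):
// partial sum up to N-1 plus an Euler-Maclaurin tail correction.
inline double zeta(double s)
{
  assert(s > 1.0);
  const int N = 20;
  double sum = 0.0;
  for (int n = 1; n < N; n++)
    sum += std::pow(static_cast<double>(n), -s);
  double Ns = std::pow(static_cast<double>(N), -s);
  sum += Ns * N / (s - 1.0);   // integral of x^-s from N to infinity
  sum += Ns / 2.0;             // half of the first omitted term
  sum += s * Ns / (12.0 * N);  // first Bernoulli correction term
  return sum;
}

// Riemann R function via the Gram series (x > 0).
inline double RiemannR(double x)
{
  assert(x > 0.0);
  double logx = std::log(x);
  double term = 1.0;  // holds (ln x)^k / k!
  double sum = 1.0;
  for (int k = 1; k < 1000; k++) {
    term *= logx / k;
    double add = term / (k * zeta(k + 1.0));
    sum += add;
    if (std::abs(add) < 1e-15 * std::abs(sum))
      break;
  }
  return sum;
}
```

As a sanity check, RiemannR(1e6) evaluates to roughly 78527, close to the true prime count below 10^6, which is 78498.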

primesieve-11.2

10 Jan 17:13

This is a new maintenance release; it is fully backwards compatible with the previous release. It contains one CMake bug fix and documentation improvements, the tests have been ported to GitHub Actions, and the nth prime code has been cleaned up.

  • nthPrime.cpp: Rewritten using more accurate nth prime approximation.
  • nthPrimeApprox.cpp: Added logarithmic integral and Riemann R function implementations.
  • cmake/libatomic.cmake: Fix failed to find libatomic #141.
  • .github/workflows/ci.yml: Port AppVeyor CI tests to GitHub Actions.
  • doc/C_API.md: Fix off by 1 error in OpenMP example #137.
  • doc/CPP_API.md: Fix off by 1 error in OpenMP example #137.
  • Vector.hpp: Rename pod_vector to Vector and pod_array to Array.
  • iterator.h: Improve documentation.
  • iterator.hpp: Improve documentation.
  • C_API.md: Add SIMD (vectorization) section.
  • CPP_API.md: Add SIMD (vectorization) section.
  • README.md: Add C & C++ API badges.
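The nth prime approximation idea referenced in the ChangeLog can be sketched with the logarithmic integral: since li(x) approximates the number of primes ≤ x, its inverse approximates the nth prime. The code below is only an illustration under that assumption. The names li and nth_prime_approx are hypothetical, and primesieve's actual nthPrimeApprox.cpp may use a different and more accurate method (for example one based on the Riemann R function).

```cpp
#include <cassert>
#include <cmath>

// Logarithmic integral for x > 1 using the series
// li(x) = gamma + ln(ln x) + sum_{k>=1} (ln x)^k / (k * k!).
inline double li(double x)
{
  assert(x > 1.0);
  const double gamma = 0.57721566490153286;  // Euler-Mascheroni constant
  double logx = std::log(x);
  double sum = gamma + std::log(logx);
  double term = 1.0;  // holds (ln x)^k / k!
  for (int k = 1; k < 1000; k++) {
    term *= logx / k;
    double add = term / k;
    sum += add;
    if (std::abs(add) < 1e-16 * std::abs(sum))
      break;
  }
  return sum;
}

// Approximate the nth prime (n >= 2) by solving li(x) = n with
// Newton's method, using li'(x) = 1 / ln(x).
inline double nth_prime_approx(double n)
{
  assert(n >= 2.0);
  double x = n * std::log(n);  // rough initial guess
  for (int i = 0; i < 100; i++) {
    double delta = (li(x) - n) * std::log(x);
    x -= delta;
    if (std::abs(delta) < 1e-9 * x)
      break;
  }
  return x;
}
```

For example, the 78498th prime is 999983, and nth_prime_approx(78498) lands within about 0.2% of it. An exact nth prime algorithm can start sieving near such an approximation and then correct for the small remaining error.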

Thanks to @sethtroisi and Sven S. for being primesieve sponsors in this release cycle!

primesieve-11.1

13 May 07:47

When primesieve is distributed via distro package managers, it is often not compiled using the highest optimization level -O3. Because of this, primesieve's pre-sieving algorithm was not auto-vectorized in many cases. As a workaround for this issue, I have now manually vectorized the pre-sieving algorithm for x64 CPUs (using portable SSE2) and for ARM64 CPUs (using portable ARM NEON). This can improve performance by up to 40%.

  • PreSieve.cpp: Vectorize loop using x64 SSE2 & ARM NEON.
  • popcount.cpp: Add POPCNT algorithm for x64 & AArch64.
  • primesieve.h: Fix -Wstrict-prototypes warning.
  • examples/c/*.c: Fix -Wstrict-prototypes warning.
  • test/*.c: Fix -Wstrict-prototypes warning.
  • CMakeLists.txt: New WITH_AUTO_VECTORIZATION option (with default ON).
  • cmake/auto_vectorize.cmake: Enable auto-vectorization if the compiler supports it.
  • scripts/build_mingw64_x64.sh: Build primesieve x64 release binary.
  • scripts/build_mingw64_arm64.sh: Build primesieve arm64 release binary.
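The kind of loop that benefits from manual vectorization can be sketched as follows: a bitwise AND of two byte buffers, processed 16 bytes at a time with SSE2 on x64 or NEON on ARM64, with a scalar loop for the remainder. This is a simplified illustration, not the code from PreSieve.cpp, and the function name and_buffers is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__SSE2__)
  #include <emmintrin.h>
#elif defined(__ARM_NEON)
  #include <arm_neon.h>
#endif

// Computes dst[i] = a[i] & b[i] for i in [0, n).
inline void and_buffers(const std::uint8_t* a,
                        const std::uint8_t* b,
                        std::uint8_t* dst,
                        std::size_t n)
{
  assert(n == 0 || (a && b && dst));
  std::size_t i = 0;

#if defined(__SSE2__)
  // SSE2 is part of the x64 baseline, so no runtime check is needed.
  for (; i + 16 <= n; i += 16) {
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i),
                     _mm_and_si128(va, vb));
  }
#elif defined(__ARM_NEON)
  for (; i + 16 <= n; i += 16)
    vst1q_u8(dst + i, vandq_u8(vld1q_u8(a + i), vld1q_u8(b + i)));
#endif

  // Scalar loop: handles the tail, or everything if no SIMD is available.
  for (; i < n; i++)
    dst[i] = static_cast<std::uint8_t>(a[i] & b[i]);
}
```

Writing the SIMD loop by hand means the result no longer depends on whether the package was built with -O3 or a lower optimization level.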

primesieve-11.0

06 Dec 17:56

This version fixes two annoying libprimesieve issues. Firstly, from now on the shared libprimesieve version (.so version) will match the primesieve version. This makes it easier to depend on libprimesieve and to update to the latest libprimesieve. Secondly, primesieve_jump_to() has been added to libprimesieve's API. The new primesieve_jump_to(iter, start, stop) includes the start number (generates primes ≥ start), whereas the old primesieve_skipto(iter, start, stop) excludes the start number (generates primes > start). In practice, the use of primesieve_jump_to() requires up to 2x fewer start number corrections (e.g. start-1) compared to primesieve_skipto().

C API deprecations

The libprimesieve C API and ABI are backwards compatible with libprimesieve ≥ 10.0. However, the primesieve_skipto() function from the libprimesieve C API has been marked as deprecated; please use the new primesieve_jump_to() instead.

C++ API breaking changes

Unlike the C API, in the C++ API the primesieve::iterator::skipto() method has been replaced by primesieve::iterator::jump_to(). The new method includes the start number whereas the old method excluded the start number. The primesieve::iterator constructors now also include the start number while they previously excluded the start number. Please read the documentation for more information.
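The difference between the inclusive and exclusive start semantics can be demonstrated with a toy iterator. The sketch below does not use libprimesieve at all: ToyPrimeIterator and its trial-division is_prime() helper are hypothetical stand-ins that only mimic the documented behavior (jump_to generates primes ≥ start, skipto generates primes > start).

```cpp
#include <cassert>
#include <cstdint>

// Trial division; fine for a demo, far too slow for real use.
inline bool is_prime(std::uint64_t n)
{
  if (n < 2)
    return false;
  for (std::uint64_t d = 2; d * d <= n; d++)
    if (n % d == 0)
      return false;
  return true;
}

// Toy stand-in for a prime iterator; NOT the primesieve implementation.
class ToyPrimeIterator
{
public:
  // New semantics: the next prime returned is >= start.
  void jump_to(std::uint64_t start) { next_ = start; }

  // Old semantics: the next prime returned is > start.
  void skipto(std::uint64_t start) { next_ = start + 1; }

  std::uint64_t next_prime()
  {
    while (!is_prime(next_))
      next_++;
    return next_++;
  }

private:
  std::uint64_t next_ = 0;
};

// With the old exclusive semantics, callers who wanted primes >= start
// had to apply a "start - 1" correction themselves.
inline std::uint64_t first_prime_from_old_api(std::uint64_t start)
{
  assert(start > 0);  // start - 1 would underflow for start == 0
  ToyPrimeIterator it;
  it.skipto(start - 1);
  return it.next_prime();
}
```

With this toy, jump_to(13) followed by next_prime() yields 13, while skipto(13) followed by next_prime() yields 17. Code migrating from the old to the new method therefore needs to drop any start-1 corrections it previously applied.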

ChangeLog

  • CMakeLists.txt: Improve Emscripten WebAssembly support.
  • iterator.cpp: Add new primesieve::iterator::jump_to().
  • iterator.cpp: Fix use after free in primesieve::iterator::clear().
  • iterator-c.cpp: Add new primesieve_jump_to().
  • iterator-c.cpp: Mark primesieve_skipto() as deprecated.
  • iterator-c.cpp: Fix use after free in primesieve_iterator_clear().
  • pod_vector.hpp: Added support for types with destructors.
  • malloc_vector.hpp: Fix potential memory leak.
  • api.cpp: Support non power of 2 sieve sizes.
  • PrimeSieve.cpp: Support non power of 2 sieve sizes.
  • PreSieve.cpp: Use std::initializer_list instead of std::vector.
  • Erat.cpp: Improve documentation.
  • C_API.md: Improve next_prime() and prev_prime() documentation.
  • CPP_API.md: Improve next_prime() and prev_prime() documentation.

Acknowledgements

I would like to thank Philip Vetter for his detailed feedback on the libprimesieve API, which caused me to create the new primesieve_jump_to().

primesieve-8.0

05 Jul 13:46

This is a new major release. The API of libprimesieve is backwards compatible, but the ABI (Application Binary Interface) of libprimesieve is not. This means that if your program uses the C/C++ libprimesieve API, you can simply recompile it against the latest libprimesieve without modifying your code. If, on the other hand, you have written libprimesieve bindings for another programming language, you will have to migrate your code to the new libprimesieve ABI.

Highlights of primesieve-8.0

  • libprimesieve now has multiarch support for x64 CPUs. At runtime libprimesieve now dispatches to the latest supported CPU instruction set like POPCNT, BMI2, AVX512 #116.
  • libprimesieve now generates an array (or vector) of primes up to 20% faster #123.

ChangeLog

  • primesieve::iterator's ABI has been modified in both the C & C++ API.
    primesieve::iterator's API remains backwards compatible.
  • CPP_API.md: Renamed doc/CPP_Examples.md to doc/CPP_API.md.
  • C_API.md: Renamed doc/C_Examples.md to doc/C_API.md.
  • Fix undefined behavior (g++-12 issue) caused by resizeUninitialized.hpp, use new pod_vector<uint64_t> from pod_vector.hpp instead.
  • iterator.cpp: Enable pre-sieving for primesieve::iterator.prev_prime().
  • iterator-c.cpp: Enable pre-sieving for primesieve::iterator.prev_prime().
  • PreSieve.cpp: Detect if the user sieves many consecutive intervals.
  • PrimeGenerator.cpp: Improve AVX512 of fillNextPrimes().
  • PrimeGenerator.cpp: Reduce memory usage for tiny stop numbers.
  • PrimeGenerator.hpp: Add GCC/Clang's function multiversioning for AVX512.
  • Erat.cpp: Dynamically grow the sieve size: use a small sieve size for small stop numbers and a large sieve size for large stop numbers.
  • Erat.cpp: Reduce memory usage, allocate the minimum required memory to store all sieving primes.
  • CpuInfo.cpp: Detect AVX512 using CPUID.
  • pmath.hpp: Use compiler intrinsics for ilog2() & floorPow2().
  • StorePrimes.hpp: Use vector::insert() instead of vector::push_back(), see: #123.
  • CMakeLists.txt: Automatically enable expensive debug assertions in debug mode (if CMAKE_BUILD_TYPE=Debug).
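As an illustration of the "compiler intrinsics for ilog2() & floorPow2()" item, here is a minimal sketch. The function names come from the ChangeLog, but the bodies are assumptions rather than the code in pmath.hpp: on GCC and Clang the sketch uses __builtin_clzll (count leading zeros), and elsewhere it falls back to a shift loop.

```cpp
#include <cassert>
#include <cstdint>

// Integer base-2 logarithm, i.e. the index of the highest set bit.
inline int ilog2(std::uint64_t x)
{
  assert(x > 0);  // log2(0) is undefined, and so is clz(0)
#if defined(__GNUC__) || defined(__clang__)
  // Typically compiles to a single instruction (e.g. BSR/LZCNT on x64).
  return 63 - __builtin_clzll(x);
#else
  // Portable fallback: shift until nothing is left.
  int log2 = 0;
  while (x >>= 1)
    log2++;
  return log2;
#endif
}

// Largest power of 2 that is <= x.
inline std::uint64_t floorPow2(std::uint64_t x)
{
  return std::uint64_t(1) << ilog2(x);
}
```

For example, ilog2(1024) is 10 and floorPow2(1000) is 512.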

primesieve-7.9

03 May 10:46

This is a new minor release; the API and ABI of libprimesieve are backwards compatible.

The focus of this release has been to reduce the memory usage of libprimesieve and to reduce its initialization overhead. I have also added support for big.LITTLE CPU detection on Linux, which provides a significant speedup on Intel's latest consumer CPUs. Many of the improvements in this release originated from Jason's patch set. Thank you, Jason!

ChangeLog

  • intrinsics.hpp: Improved x64 BSF assembly.
  • iterator.cpp: Reduce memory allocations in generate_prev_primes().
  • iterator-c.cpp: Reduce memory allocations.
  • CpuInfo.cpp: Improve hybrid CPU detection on Linux.
  • Erat.cpp: Reduce memory usage when sieving a single segment.
  • EratBig.cpp: Improve instruction level parallelism.
  • EratBig.cpp: Improve next wheel index code.
  • EratBig.cpp: Use std::copy() instead of std::rotate().
  • SievingPrimes.cpp: Reduce branch mispredictions.
  • PreSieve.cpp: Hardcode buffersDist.
  • MemoryPool.cpp: Reduce memory usage.
  • StorePrimes.hpp: Improve nth prime approximation.
  • config.hpp: Tune FACTOR_ERATMEDIUM constant.
  • Use a single MemoryPool per thread (previously 2).
  • Increase max sieve array size to 8 KiB.

primesieve-7.8

29 Jan 08:59

This is a new minor release; the API and ABI of libprimesieve are backwards compatible.

The primesieve command-line program runs up to 10% faster due to improved pre-sieving, and libprimesieve's primesieve::iterator runs up to 15% faster due to improved pre-sieving, reduced branch mispredictions and increased instruction level parallelism. primesieve now pre-sieves the multiples of small primes < 100 (previously ≤ 19) using only half as much memory as before. Instead of using a single large pre-sieved buffer, primesieve now uses 8 smaller pre-sieved buffers which are bitwise ANDed together before being copied into the sieve array. Thanks to @zielaj for this amazing work!

ChangeLog

  • PreSieve.cpp: Add multiple pre-sieve buffers #110.
  • PrimeGenerator.cpp: Reduce branch mispredictions #109.
  • PrimeGenerator.cpp: Add AVX512 algorithm #109.
  • iterator.cpp: Avoid default initialization of primes vector.
  • iterator-c.cpp: Avoid default initialization of primes vector.
  • ParallelSieve.cpp: Initialize PreSieve.
  • ALGORITHMS.md: Update documentation.
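The idea of combining several small pre-sieved buffers with bitwise AND can be sketched in a simplified form. The code below is an illustration only: it uses one byte per integer instead of primesieve's compact bit layout, two buffers instead of 8, and hypothetical names (PreSieveBuffer, make_buffer, pre_sieve).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct PreSieveBuffer
{
  std::uint64_t period;             // product of the primes in this group
  std::vector<std::uint8_t> flags;  // flags[i] == 0 if a group prime divides i
};

// Build one periodic buffer for a small group of primes. Note that the
// primes themselves are also crossed off, since they are multiples too.
inline PreSieveBuffer make_buffer(const std::vector<std::uint64_t>& primes)
{
  assert(!primes.empty());
  PreSieveBuffer buf;
  buf.period = 1;
  for (std::uint64_t p : primes)
    buf.period *= p;
  buf.flags.assign(buf.period, 1);
  for (std::uint64_t p : primes)
    for (std::uint64_t i = 0; i < buf.period; i += p)
      buf.flags[i] = 0;
  return buf;
}

// Pre-sieve the interval [low, low + n) by ANDing all buffers together.
// sieve[i] stays 1 only if no pre-sieved prime divides low + i.
inline std::vector<std::uint8_t> pre_sieve(
    std::uint64_t low,
    std::size_t n,
    const std::vector<PreSieveBuffer>& buffers)
{
  std::vector<std::uint8_t> sieve(n, 1);
  for (const PreSieveBuffer& buf : buffers) {
    std::uint64_t pos = low % buf.period;
    for (std::size_t i = 0; i < n; i++) {
      sieve[i] &= buf.flags[pos];
      if (++pos == buf.period)
        pos = 0;
    }
  }
  return sieve;
}
```

The memory saving comes from the periods: in this toy layout, separate buffers for {7, 11} and {13, 17} need 77 + 221 entries, whereas a single buffer covering all four primes would need 7 · 11 · 13 · 17 = 17017 entries.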

primesieve-7.7

04 Dec 09:01
286ecbe

This is a new minor release; the API and ABI of libprimesieve are backwards compatible.

The CPU cache size detection has been improved on big.LITTLE CPUs such as Intel Alder Lake. The code now also handles situations where CPU cache information is only partially available: in that case it uses a more conservative approach (i.e. a smaller sieve array size) to prevent potential scaling issues.

Backwards incompatible change in primesieve command-line application

The behavior of the -q/--quiet option in the primesieve command-line application has been modified. This option now prints the result without any additional text, e.g. "1229" instead of the previous "Primes: 1229". This is a backwards incompatible change in the primesieve command-line application; however, I didn't increase primesieve's major version since this change does not affect libprimesieve's API/ABI.

ChangeLog

  • CpuInfo.cpp: Fix issues with big.LITTLE CPUs #105.
  • api.cpp: Simplify private L2 cache size detection #103.
  • config.cpp: Add fallback sieve size & L1 data cache size.
  • Erat.cpp: If runtime CPU cache detection fails use config::L1D_CACHE_BYTES.
  • main.cpp: Improve -q/--quiet option #102.
  • api-c.cpp: Print error messages to stderr.
  • iterator-c.cpp: Print error messages to stderr.
  • doc/primesieve.1: Update man page.
  • CMakeLists.txt: Add WITH_MSVC_CRT_STATIC option to force static linking.
  • C_Examples.md: Add CMake build instructions.
  • CPP_Examples.md: Add CMake build instructions.