v0.20

@anita-intel anita-intel released this 28 Jun 17:40
· 34 commits to rls-v0.20 since this release

Performance optimizations

  • Improved performance of GEMM-based convolutions.
  • Improved softmax performance.
  • Added arbitrary eltwise fusion support in GEMM-based convolutions and inner product.

New functionality

  • Introduced bfloat16 data type support in reorders, (de-)convolution, pooling, batch normalization, local response normalization, eltwise, inner product, shuffle, sum, and concat. The implementation relies on new instructions targeting a future Intel Xeon Scalable processor (codename Cooper Lake). On processors with Intel AVX-512 support, bfloat16 arithmetic is emulated.
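
For readers unfamiliar with the format: bfloat16 keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an IEEE float32, so a value can be converted in software by rounding away the low 16 bits. The sketch below illustrates that general idea only. It is not the library's emulation code; the function names are hypothetical, and round-to-nearest-even is an assumed rounding mode.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Round a float32 value to bfloat16 and return the 16-bit pattern.

    Hypothetical helper for illustration; assumes round-to-nearest-even.
    """
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        # Truncate, and set a mantissa bit so the result remains a NaN.
        return (bits >> 16) | 0x0040
    # Add 0x7FFF plus the lowest kept bit, so exact ties round to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(b: int) -> float:
    """Widen a bfloat16 bit pattern to float32 by zero-filling the low bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]
```

Widening is exact, while narrowing loses precision: `float_to_bf16_bits(1.0)` yields `0x3F80`, and any float32 value is kept to roughly three significant decimal digits.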

Thanks to the contributors

This release contains contributions from many Intel Performance Libraries developers. We would also like to thank everyone who asked questions and reported issues.