This release contains new reduction operations, Winograd algorithm performance improvements, and bug fixes. It also adds various host-side performance improvements.
Changes
Added a GPU reference kernel implementation for faster testing.
Added TargetID support for new AMD GPU architectures.
Implemented four additional generic tensor reduction operations (AVG, AMAX, NORM1, NORM2).
Fixed a bug where Batchnorm would give incorrect results when the product of image height and image width is not a multiple of four.
Various host-side improvements for better find and tuning performance.
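
For readers unfamiliar with the four reduction operations named above, the following is a minimal pure-Python sketch of what each is conventionally taken to compute over a flat list of values. The function names are hypothetical and chosen for illustration; this is not the library's API, and it assumes the usual definitions (arithmetic mean, maximum absolute value, sum of absolute values, and Euclidean norm).

```python
import math

# Hypothetical reference helpers, not part of the library's API.
# Each reduces a non-empty flat sequence of numbers to a single value.

def reduce_avg(values):
    """AVG: arithmetic mean of the elements."""
    return sum(values) / len(values)

def reduce_amax(values):
    """AMAX: largest absolute value among the elements."""
    return max(abs(v) for v in values)

def reduce_norm1(values):
    """NORM1: sum of the absolute values of the elements."""
    return sum(abs(v) for v in values)

def reduce_norm2(values):
    """NORM2: square root of the sum of the squared elements."""
    return math.sqrt(sum(v * v for v in values))

if __name__ == "__main__":
    sample = [3.0, -4.0]
    print(reduce_avg(sample), reduce_amax(sample),
          reduce_norm1(sample), reduce_norm2(sample))
```

A tensor reduction applies the same computation along one or more chosen dimensions rather than over a flat list, but the per-slice arithmetic is the same under these assumptions.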