Documentation for hipFFT is available at https://rocm.docs.amd.com/projects/hipFFT/en/latest/.
- Implemented the hipfftMpAttachComm, hipfftXtSetDistribution, and hipfftXtSetSubformatDefault APIs to allow computing FFTs that are distributed between multiple MPI (Message Passing Interface) processes. These APIs can be enabled with the HIPFFT_MPI_ENABLE CMake option, which defaults to OFF. The backend FFT library called by hipFFT must support MPI for these APIs to work.
- Building with the address sanitizer option sets xnack+ for the relevant GPU architectures.
- Use find_package CUDAToolkit instead of CUDA in CMake for compatibility with modern CMake.
- Fixed client packages to depend on hipRAND instead of rocRAND.
- Support for the gfx1151, gfx1200, and gfx1201 architectures
- hipfft-test now includes a --smoketest option.
- The AMD backend is now compiled using amdclang++ instead of hipcc. The NVIDIA CUDA backend still uses hipcc-nvcc.
- CLI11 replaces Boost Program Options as the command line parser for clients.
- Support gfx1151 architecture.
- Added hip::host as a public link library, as hipfft.h includes HIP runtime headers.
- Prevent C++ exceptions leaking from public API functions.
- Make output of hipfftXt match cufftXt in geometry and alignment for 2D and 3D FFTs.
- When building hipFFT from source, rocFFT code no longer needs to be initialized as a git submodule.
- Fixed error when creating length-1 plans.
- hipfft-rider has been renamed to hipfft-bench; it is controlled by the BUILD_CLIENTS_BENCH CMake option (note that a link for the old file name is installed, and the old BUILD_CLIENTS_RIDER CMake option is accepted for backwards compatibility, but both will be removed in a future release)
- Binaries in debug builds no longer have a -d suffix
- The minimum rocFFT required version has been updated to 1.0.21
- hipfftXtSetGPUs, hipfftXtMalloc, hipfftXtMemcpy, hipfftXtFree, and hipfftXtExecDescriptor APIs have been implemented to allow FFT computing on multiple devices in a single process
- hipfftXtMakePlanMany, hipfftXtGetSizeMany, and hipfftXtExec APIs have been implemented to allow half-precision transform requests
- Added the --precision argument to benchmark and test clients (--double is still accepted, but has been deprecated as a method to request a double-precision transform)
- Fixed an issue where include and lib folders from older ROCm versions were not removed during upgrades
- Added the hipfftExtPlanScaleFactor API to efficiently multiply each output element of an FFT by a given scaling factor (result scaling must be supported in the backend FFT library)
- rocFFT 1.0.19 or higher is now required for hipFFT builds on the rocFFT backend
- Data are initialized directly on GPUs using hipRAND
- Build files now use standard C++17
- Cleaned up build warnings
- GNUInstallDirs enhancements
- GoogleTest 1.11 is required
- Added file and folder reorganization changes with backward compatibility support when using rocm-cmake wrapper functions
- New packages for test and benchmark executables on all supported operating systems that use CPack
- Implemented hipfftMakePlanMany64 and hipfftGetSizeMany64
- Use fft_params struct for accuracy and benchmark clients
- Fixed incorrect reporting of rocFFT version
- Unconditionally enabled callback functionality: On the CUDA backend, callbacks only run correctly when hipFFT is built as a static library, and linked against the static cuFFT library
- Added support for Windows 10 as a build target
- Packaging has been split into a runtime package (hipfft) and a development package (hipfft-devel): The development package depends on the runtime package. When installing the runtime package, the package manager will suggest installing the development package to aid users transitioning from the previous version's combined package. This suggestion is made on all supported operating systems except CentOS 7. The suggestion feature in the runtime package is introduced as a deprecated feature and will be removed in a future ROCm release.
- Add calls to rocFFT setup and cleanup
- CMake fixes for clients and backend support
- Added support for Windows 10 as a build target
- CMake updates
- New callback API in hipfftXt.h header
- No changes
- Batch support for hipfftMakePlanMany
- Work area handling during plan creation and hipfftSetWorkArea
- Honour autoAllocate flag
- Testing infrastructure reuses code from rocFFT
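
To make the hipfftMakePlanMany batch support entry more concrete, here is a minimal sketch of how a batched 1D buffer is typically addressed. It assumes the cuFFT-style advanced data layout, in which element x of batch b sits at offset b * dist + x * stride; check the hipFFT API reference before relying on this. The helper names batched_index and gather_batch are hypothetical and are not part of hipFFT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a hipFFT function): offset of element x of batch b
// in a 1D batched buffer, assuming the layout offset = b * dist + x * stride.
std::size_t batched_index(std::size_t b, std::size_t x,
                          std::size_t stride, std::size_t dist)
{
    return b * dist + x * stride;
}

// Hypothetical helper: copy the n elements of batch b out of a strided,
// batched buffer into a contiguous vector.
std::vector<float> gather_batch(const std::vector<float>& buffer,
                                std::size_t b, std::size_t n,
                                std::size_t stride, std::size_t dist)
{
    std::vector<float> out(n);
    for (std::size_t x = 0; x < n; ++x)
    {
        const std::size_t idx = batched_index(b, x, stride, dist);
        assert(idx < buffer.size()); // guard against an inconsistent layout
        out[x] = buffer[idx];
    }
    return out;
}
```

With stride 1 and dist equal to the transform length, the batches are packed back to back; larger values describe interleaved or padded layouts.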