Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pyvkfft.fft.fftn error? #34

Open
mDiracYas opened this issue Apr 18, 2024 · 3 comments
Open

pyvkfft.fft.fftn error? #34

mDiracYas opened this issue Apr 18, 2024 · 3 comments
Assignees
Labels
bug Something isn't working

Comments

@mDiracYas
Copy link

Hi!
I have found that pyvkfft.fft.fftn exhibits strange behavior.
Also, in previous versions of pyvkfft, this code raises an error.

OS:Windows 10
Python: 3.12.2

import numpy as np
import cupy as cp
import pyvkfft.fft

shape = (3,2,2)

size = np.prod(shape)
x1 = cp.arange(size,dtype=cp.complex64).reshape(shape)
x2 = x1[:,None]

y1 = pyvkfft.fft.fftn(x1,axes=(-1,-2))
y2 = pyvkfft.fft.fftn(x2,axes=(-1,-2)).squeeze()

print("shape x1:",x1.shape, "shape x2:", x2.shape)
print("shape y1:",y1.shape, "shape y2:", y2.shape)
print(y1-y2)

case (i) pyvkfft == 2024.1.1 vkfft == 1.3.4

shape x1: (3, 2, 2) shape x2: (3, 1, 2, 2)
shape y1: (3, 2, 2) shape y2: (3, 2, 2)
[[[ -9.+0.j          1.+0.j       ]
  [  2.-3.4641016j   0.+0.j       ]]

 [[ 28.+3.4641016j  -2.+0.j       ]
  [-55.+0.j          3.+0.j       ]]

 [[ 44.-3.4641016j  -2.+0.j       ]
  [  2.+3.4641016j   0.+0.j       ]]]

case (ii) pyvkfft == 2023.1.1 vkfft == 1.2.9

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[2], line 8
      5 x2 = x1[:,None]
      7 y1 = pyvkfft.fft.fftn(x1,axes=(-1,-2))
----> 8 y2 = pyvkfft.fft.fftn(x2,axes=(-1,-2)).squeeze()
     10 print("shape x1:",x1.shape, "shape x2:", x2.shape)
     11 print("shape y1:",y1.shape, "shape y2:", y2.shape)

File [~\miniconda3\Lib\site-packages\pyvkfft\fft.py:200](http://localhost:8888/lab/tree/cupy_test/vkfft_error/~/miniconda3/Lib/site-packages/pyvkfft/fft.py#line=199), in fftn(src, dest, ndim, norm, axes, cuda_stream, cl_queue, return_scale)
    167 """
    168 Perform a FFT on a GPU array, automatically creating the VkFFTApp
    169 and caching it for future re-use.
   (...)
    197 :return: the destination array if return_scale is False, or (dest, scale)
    198 """
    199 backend, inplace, dest, cl_queue = _prepare_transform(src, dest, cl_queue, False)
--> 200 app = _get_fft_app(backend, src.shape, src.dtype, inplace, ndim, axes, norm, cuda_stream, cl_queue,
    201                    strides=src.strides)
    202 if backend == Backend.PYOPENCL:
    203     app.fft(src, dest, queue=cl_queue)

File [~\miniconda3\Lib\site-packages\pyvkfft\fft.py:137](http://localhost:8888/lab/tree/cupy_test/vkfft_error/~/miniconda3/Lib/site-packages/pyvkfft/fft.py#line=136), in _get_fft_app(backend, shape, dtype, inplace, ndim, axes, norm, cuda_stream, cl_queue, strides)
    134 @lru_cache(maxsize=config.FFT_CACHE_NB)
    135 def _get_fft_app(backend, shape, dtype, inplace, ndim, axes, norm, cuda_stream, cl_queue, strides=None):
    136     if backend in [Backend.PYCUDA, Backend.CUPY]:
--> 137         return VkFFTApp_cuda(shape, dtype, ndim=ndim, inplace=inplace,
    138                              stream=cuda_stream, norm=norm, axes=axes, strides=strides)
    139     elif backend == Backend.PYOPENCL:
    140         return VkFFTApp_cl(shape, dtype, cl_queue, ndim=ndim, inplace=inplace,
    141                            norm=norm, axes=axes, strides=strides)

File [~\miniconda3\Lib\site-packages\pyvkfft\cuda.py:123](http://localhost:8888/lab/tree/cupy_test/vkfft_error/~/miniconda3/Lib/site-packages/pyvkfft/cuda.py#line=122), in VkFFTApp.__init__(self, shape, dtype, ndim, inplace, stream, norm, r2c, dct, axes, strides, **kwargs)
    121     raise RuntimeError("Error creating VkFFTConfiguration. Was the CUDA context properly initialised ?")
    122 res = ctypes.c_int(0)
--> 123 self.app = _vkfft_cuda.init_app(self.config, ctypes.byref(res))
    124 check_vkfft_result(res, shape, dtype, ndim, inplace, norm, r2c, dct, axes, "cuda")
    125 if self.app is None:

OSError: exception: integer divide by zero
@vincefn vincefn self-assigned this Apr 21, 2024
@vincefn vincefn added the bug Something isn't working label Apr 21, 2024
@vincefn
Copy link
Owner

vincefn commented Apr 21, 2024

Thanks for the report, I can confirm this behaviour also with OpenCL:

import numpy as np
import pyopencl as cl
import pyopencl.array as cla
import pyvkfft.fft
from pyvkfft.base import calc_transform_axes

ctx = cl.create_some_context()
clq = cl.CommandQueue(ctx)

shape = (3,2,2)

size = np.prod(shape)
x0 = np.arange(size,dtype=np.complex64).reshape(shape)
x1 = cla.to_device(clq, x0)
x2 = cla.to_device(clq, x0[:,None])

y1 = pyvkfft.fft.fftn(x1,axes=(-1,-2)).get()
y2 = pyvkfft.fft.fftn(x2,axes=(-1,-2)).get() #.squeeze()

y0 = np.fft.fftn(x0, axes=(-1,-2))

print("shape x1:",x1.shape, "shape x2:", x2.shape)
print("shape y1:",y1.shape, "shape y2:", y2.shape)
print(abs(y1-y0).sum())
print(abs(y2.squeeze()-y0).sum())

which gives:

shape x1: (3, 2, 2) shape x2: (3, 1, 2, 2)
shape y1: (3, 2, 2) shape y2: (3, 1, 2, 2)
0.0
152.34962482055641

Interestingly the initial re-shuffling of the internal array shape passed to VkFFT (collapsing non-transformed arrays) gives the same results:

from pyvkfft.base import calc_transform_axes

print(calc_transform_axes((3,2,2), axes=(-1,-2)))
print(calc_transform_axes((3,1,2,2), axes=(-1,-2)))
([2, 2, 1], 3, [False, False, True], 2, (-1, -2))
([2, 2, 1], 3, [False, False, True], 2, (-1, -2))

So something else is going wrong...

@vincefn
Copy link
Owner

vincefn commented Apr 21, 2024

Hmm, I think I understand what is happening - for an axis of size 1 the stride is zero:

print(x1.strides)
print(x2.strides)
(72, 24, 8)
(72, 0, 24, 8)

and pyvkfft will then assume the internal transform order must be changed, and this is where the error occurs:

print(calc_transform_axes((3,2,2), axes=(-1,-2), strides=x1.strides))
print(calc_transform_axes((3,1,2,2), axes=(-1,-2), strides=x2.strides))
([2, 2, 1], 3, [False, False, True], 2, (-1, -2))
([2, 1, 3, 1], 2, [False, True, False, True], 3, [-3, -1])

now I need to understand what is going wrong - probably the easiest solution is to ignore any 1-sized (0-strided) axis

Some background info about why an axis of size 1 has a stride=0: see numpy/numpy#22950

vincefn added a commit that referenced this issue May 1, 2024
…he complicated cases of R2C where the fast axis must be known, even when only one axis has a length>1 [#34]
@vincefn
Copy link
Owner

vincefn commented May 1, 2024

OK, this should be fixed in the current git master.

The main issue was not, in fact, strides equal to zero (that's easily handled). And it is not possible to just ignore (squeeze) axes of length 1, as R2C transforms will give different results due to the handling of the fast axis... Which is surprisingly messy to handle, e.g. when considering arrays where only one axis has a length>1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants