Input data is real, not complex #32
Comments
@fredRos what do you think? Feasible?
Yes, we should do that. I'm profiling the code right now and see a number of things we can improve. This one is high-level and makes perfect sense.
It is more work and requires more decisions:
A realistic use-case of galario employs only the For
What about shifting the real image? It seems to me that we could do that and add the imaginary part only after the shift, perhaps on the device.
I'm following your suggestion now. I tried to do it in place, but that would not allow multiple threads to operate concurrently. The alternative uses 50 % extra memory on the GPU, because both a real and a complex image have to be held until the complex image is fully constructed. This may be a problem for users with high-res images and small-memory GPUs. Perhaps I can profile whether it is faster to do the construction on the CPU and then transfer. On all systems I have seen so far, it is safe to assume that more memory is available on the host.
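To make the 50 % figure concrete, here is a rough back-of-the-envelope sketch. It assumes double precision (8 bytes per real value, 16 bytes per complex value) and a hypothetical 4096 x 4096 image; the function name `gpu_bytes` is made up for illustration and is not part of galario.

```python
def gpu_bytes(n_pixels, real_bytes=8, complex_bytes=16):
    """Return (complex-only, real-plus-complex) memory footprints in bytes."""
    complex_only = n_pixels * complex_bytes
    # Out-of-place construction: the real image and the complex image coexist
    real_plus_complex = n_pixels * (real_bytes + complex_bytes)
    return complex_only, real_plus_complex


# Hypothetical image size, chosen only for illustration
complex_only, both = gpu_bytes(4096 * 4096)
extra_fraction = (both - complex_only) / complex_only
print(f"complex only: {complex_only / 2**20:.0f} MiB")
print(f"real + complex: {both / 2**20:.0f} MiB")
print(f"extra: {extra_fraction:.0%}")
```

The extra fraction does not depend on the image size, only on the ratio of the real to the complex element size.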
fixed by #45
Currently the input image is assumed to be complex (from libcommon.pyx):
def sample(dcomplex[:,::1] data, dRA, dDec, du, dreal[::1] u, dreal[::1] v)
We assumed this so that the input data would already allocate the space needed for FFT operations. However, the image in real space is always real, so at the moment we need to cast the image to a complex type before using galario:
sample(ref_complex.astype('complex128'), dRA, dDec, du, u, v)
We could save the casting and transfer time if we copied a real 2D array rather than a complex one.
Can we easily implement this? E.g. by:
def sample(dreal[:,::1] data, dRA, dDec, du, dreal[::1] u, dreal[::1] v)
kernel unsigned int to float: https://stackoverflow.com/questions/9153861/typecasting-in-cuda-and-cublas
This change is worth making, since it makes no sense to require a complex input image.
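A small NumPy sketch of what the proposal would save. This is only a host-side stand-in, not galario code: it assumes a double-precision image and uses a preallocated complex buffer to mimic what a device-side cast could do after a real array has been transferred.

```python
import numpy as np

# Hypothetical real-valued image (float64), standing in for the user's input
rng = np.random.default_rng(0)
image = rng.random((256, 256))

# Current workflow: cast on the caller's side, then hand over a complex array
image_c = image.astype(np.complex128)
print(image.nbytes, image_c.nbytes)  # the complex copy is twice as large

# Proposed workflow (sketch): hand over the real array and fill the real part
# of a complex buffer at the destination; the imaginary part stays zero
buf = np.zeros(image.shape, dtype=np.complex128)
buf.real = image
```

Both routes end with the same complex array; the difference is that only half as many bytes need to be copied when the real array is transferred.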
As a future step, we could switch from complex-to-complex to real-to-complex FFTs, but we did not do that because it changes the mapping and the size of the matrices.
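The change in matrix size can be seen with NumPy's FFT routines, used here purely as an illustration (galario's own FFT backend is not shown, and the image size is an arbitrary assumption). For an N x N real input, a real-to-complex transform keeps only N//2 + 1 columns, because the remaining ones follow from Hermitian symmetry.

```python
import numpy as np

n = 8  # arbitrary small image size for illustration
rng = np.random.default_rng(1)
image = rng.random((n, n))

full = np.fft.fft2(image)   # complex-to-complex: shape (n, n)
half = np.fft.rfft2(image)  # real-to-complex: shape (n, n // 2 + 1)
print(full.shape, half.shape)

# The real-to-complex output matches the first n // 2 + 1 columns of the full
# transform, so any code indexing the (u, v) plane has to use a different mapping
same = np.allclose(half, full[:, : n // 2 + 1])
print(same)
```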