Implement Fast Fourier Transform ops #386

Closed

NickShahML opened this issue Dec 1, 2015 · 79 comments

@NickShahML commented Dec 1, 2015

Hey TF,

Trying to integrate the new Unitary RNNs into TensorFlow. Unfortunately, they require the Fast Fourier Transform and the Inverse Fast Fourier Transform.

For efficiency this needs to be done on the GPU. In the authors' code (Theano), they do this by making a Theano op and inserting scikit-cuda's FFT and inverse FFT here:

https://github.com/amarshah/complex_RNN/blob/master/fftconv.py

How can this be done in TF? Upon completing the Unitary RNN, I plan to share it with everyone! I asked on the Google group, but didn't get a reply.

@vrv (Contributor) commented Dec 2, 2015

We'll need to add a few FFT ops to get this done -- probably using cuFFT or fbfft for GPU. Assigning to @zffchen78 since he has experience with FFTs and adding ops :)

@NickShahML (Author) commented Dec 2, 2015

Thank you @vrv and @zffchen78. Basically, these new types of RNNs are very promising and I hope to integrate them into TensorFlow over the next week if possible. I'm starting to work on it here:
https://github.com/LeavesBreathe/Seq2Seq_Upgrade_TensorFlow

However, it will not be possible without FFT and IFFT support, so I really appreciate your help!

@adder commented Dec 2, 2015

So would this imply that the FFT ops only work on GPU and not CPU? Or would they just be very slow on CPU?

@NickShahML (Author) commented Dec 2, 2015

Well, the problem is that the actual unitary RNN uses FFT and IFFT within its cell. Ideally you want the cell's computation done in parallel on the GPU. If it's on the CPU, it's going to take forever.

Here's the paper: look at their main equation, http://arxiv.org/pdf/1511.06464v1.pdf (it's equation 6).

@adder commented Dec 2, 2015

By 'taking forever', do you mean compared to the uRNN GPU version, or also compared to a 'classical' RNN or an LSTM with the same number of parameters? I would at least hope the speed of convergence demonstrated in the paper would outweigh any extra computation time per iteration. :)

Anyway, I would expect all ops defined in TensorFlow to be available for both CPU and GPU (and any future device). I, and I guess others, don't have day-to-day access to a suitable GPU. Or is that not one of the design goals?

@vrv (Contributor) commented Dec 2, 2015

We'll try to provide an implementation for both.

@vrv changed the title from "Unitary RNN --> Fast Fourier Transform on GPU" to "Implement Fast Fourier Transform ops" on Dec 2, 2015

@NickShahML (Author) commented Dec 2, 2015

Perfect! Thanks! @adder, I primarily use GPUs, and if you're going to use unitary RNNs for some serious language modeling, you've got to use GPUs (10x faster training time).

I know they may converge more quickly (one of the benefits), but keep in mind that an epoch usually takes at least 24 hrs with normal LSTMs and GRUs for big datasets. So we simply can't afford to put them on CPUs.

@zffchen78 (Contributor) commented Dec 2, 2015

@LeavesBreathe, can you elaborate a bit more on your FFT/IFFT requirements?
Specifically,

  1. do you need r2c (real-to-complex), c2r (complex-to-real), or c2c (complex-to-complex) fft/ifft support?
  2. do you need 1D, 2D, or 3D fft/ifft? Or all of them?
  3. do you need support for batching?
  4. In your original email, you pointed to Theano code, fftconv.py, which uses fft/ifft to implement a conv2d operation. Do you need conv2d in the end? If that's all you want, TF already has a conv2d operation, and it runs on GPU using cuDNN. As I was told, newer versions of cuDNN use FFT under the hood already. (A small sketch of the FFT-convolution idea follows below.)
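
To illustrate the FFT-convolution idea behind fftconv.py (referenced in item 4): convolution becomes pointwise multiplication in the frequency domain. A minimal NumPy sketch of that trick in 1-D (the helper name is made up, and fftconv.py does the equivalent in 2-D):

import numpy as np

def fft_circular_conv(x, k):
    # Convolution theorem: circular conv(x, k) == IFFT(FFT(x) * FFT(k)),
    # with the kernel zero-padded to the length of the signal.
    k_padded = np.zeros_like(x)
    k_padded[:len(k)] = k
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k_padded)))

x = np.random.randn(16)
k = np.array([1.0, -1.0, 0.5])
print(fft_circular_conv(x, k))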

@adder commented Dec 2, 2015

I guess you should ask these questions to @LeavesBreathe :)

@zffchen78 (Contributor) commented Dec 2, 2015

@LeavesBreathe, is this the paper you are trying to reproduce? http://arxiv.org/pdf/1511.06464v1.pdf And is formula (6) the core computation you need to do? AFAIK, you need 2D, non-batched, complex-to-complex fft/ifft. Is that correct?

@NickShahML (Author) commented Dec 2, 2015

Thanks @adder! @zffchen78, thanks for your help in this matter. I will do my best to answer your questions. The ultimate goal is to replicate this paper: http://arxiv.org/pdf/1511.06464v1.pdf

To do so, I was planning on making a new class in RNN_cell.py. Therefore, I believe we need complex-to-complex along with 2D support.

The reason I wanted a Pythonic TensorFlow op is so that we can assign multiple GPUs to multiple unitary RNNs when the whole integration is complete. So one uRNN would be on GPU 1, and another uRNN would be on GPU 2.

I don't know how hard it is for you to implement these FFTs and IFFTs, but I think 3D support would be nice for future users who may try unitary conv nets. I certainly don't want to ask too much of you though!

If this is of any help to you, the goal is to replicate "Complex_RNN" here: https://github.com/amarshah/complex_RNN/blob/master/models.py#L532

@psmit commented Dec 9, 2015

I would also very much like to see FFT support, not for training DNN models, but for feature extraction in a speech recognition pipeline. In this case only r2c, 1D would be required, preferably with batching, on both CPU and GPU.
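
For reference, here is roughly what that use case looks like with the real-input FFT API that landed in TensorFlow much later (tf.signal names; the frame sizes here are made up for illustration):

import tensorflow as tf

# A batch of raw audio, [batch, samples].
waveform = tf.random.normal([2, 16000])
# Frame the signal, then take a batched r2c 1-D FFT per frame.
frames = tf.signal.frame(waveform, frame_length=400, frame_step=160)   # [batch, n_frames, 400]
spectra = tf.signal.rfft(frames, fft_length=[512])                     # [batch, n_frames, 257]
power = tf.abs(spectra) ** 2                                           # power spectrogram features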

@NickShahML (Author) commented Dec 16, 2015

Hey @zffchen78, I just wanted to follow up and ask whether this has been implemented yet. I can't make any progress on the unitary RNNs without it. I don't mean to apply any pressure! Just wanted to ask.

@zffchen78 (Contributor) commented Dec 18, 2015

I've already implemented the 2D FFT in the internal TF code base. Hopefully it'll be copied to OSS soon.

@NickShahML (Author) commented Dec 18, 2015

Okay, thanks for the heads-up. Looking forward to it!

@NickShahML closed this Dec 18, 2015

@psmit commented Dec 18, 2015

@LeavesBreathe Why close this? It isn't there yet!

@NickShahML reopened this Dec 18, 2015

@zffchen78 (Contributor) commented Dec 22, 2015

@LeavesBreathe, we pushed changes today which enable fft2d and ifft2d on GPU. You can take a look at python/kernel_tests/fft_ops_test.py to see if it works for you. We are still figuring out license issues with FFTW, which CPU support for fft2d/ifft2d needs. Hopefully GPU support is sufficient for you for now. Let me know if you see any problems.
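
For anyone else who wants to try it, a minimal round-trip check along the lines of that test file (graph-mode API of this era; the ops are GPU-only at this point, so this needs a CUDA-enabled build):

import numpy as np
import tensorflow as tf

x = tf.constant(np.random.randn(8, 8) + 1j * np.random.randn(8, 8), dtype=tf.complex64)
y = tf.ifft2d(tf.fft2d(x))  # should recover x up to float32 rounding error

with tf.Session() as sess:
    x_val, y_val = sess.run([x, y])
    print(np.allclose(x_val, y_val, atol=1e-4))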

@NickShahML (Author) commented Dec 22, 2015

Awesome, GPU support is really all I needed, so let me test it out and get back to you. I am going to be pretty busy over the holidays, but come January 4 I should be back to testing this!

@zffchen78 (Contributor) commented Dec 22, 2015

@LeavesBreathe, I took a brief look at the paper you plan to replicate in TF. I suspect you'll encounter some nuances. E.g., some operations may not support complex64 for the time being. These are minor problems because they can easily be patched by updating a few .cc files to expand the types they support.

@NickShahML (Author) commented Dec 31, 2015

Okay, this is very good to know, @zffchen78. I will be sure to look into these formats.

@NickShahML closed this Dec 31, 2015

@NickShahML reopened this Dec 31, 2015

@raingo (Contributor) commented Jan 15, 2016

I think @LeavesBreathe's answer to @zffchen78's question was wrong.

What the uRNN needs is a batched 1D FFT. Both the paper and their Theano implementation indicate this.

Paper-wise, the operator in Equation (6) is certainly multiplied with a complex vector for each data instance.

Implementation-wise, check the following call stack:

  1. https://github.com/amarshah/complex_RNN/blob/master/models.py#L14
  2. https://github.com/amarshah/complex_RNN/blob/master/fftconv.py#L147

To the TF core developers: why not simply implement an interface like scikit-cuda's: https://github.com/lebedov/scikit-cuda/blob/master/skcuda/fft.py
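
To make the batched 1-D point concrete, here is a rough sketch of one FFT-based factor of the uRNN transition (hypothetical shapes; it relies on tf.fft transforming the innermost dimension, so a [batch, hidden] complex tensor gives one independent 1-D FFT per example):

import numpy as np
import tensorflow as tf

batch, hidden = 32, 128
h = tf.constant(np.random.randn(batch, hidden) + 1j * np.random.randn(batch, hidden),
                dtype=tf.complex64)
# A diagonal unitary factor: unit-modulus complex entries.
d = tf.constant(np.exp(1j * np.random.uniform(0.0, 2.0 * np.pi, hidden)), dtype=tf.complex64)

# One factor of the unitary transition: apply F^-1 * D * F to each row of h.
# The forward and inverse normalizations cancel, so the composed operator stays unitary.
h_next = tf.ifft(d * tf.fft(h))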

@zffchen78 (Contributor) commented Jan 16, 2016

We welcome the community to contribute the code.

If one wants to make the FFT op support more complete in TensorFlow, feel free to send us changes. It shouldn't be too hard to copy the fft2d impl and add fft1d, fft3d, and batched_fft1d, batched_fft2d, batched_fft3d, etc. as the need arises, plus testing, etc. These ops' implementations will pretty much look like:

stream->CreatePlan(...);
stream->DoFft(...);


@tillahoffmann (Contributor) commented Mar 26, 2017

@rryan, I have working CPU/GPU kernels for complex FFTs based on Eigen but have some issues integrating your recent RFFT changes. Would you be up for a chat to iron out those bumps so we can add the CPU kernels? Ideally, I would like to move to an interface similar to conv2d which takes use_cudnn_on_gpu as an optional argument.

@rryan (Member) commented Mar 26, 2017

@tillahoffmann happy to chat! though I'm not on the TF team per se (just a contributor). I'll loop in some TF folks who could weigh in. Want to email me and we'll go from there? My google.com email is rjryan. Also, are the changes pushed publicly anywhere?

@tillahoffmann (Contributor) commented Mar 27, 2017

@rryan, no, not pushed publicly at the moment because I don't have the tests passing yet. Will drop you an e-mail.

@rryan (Member) commented Apr 3, 2017

The current plan of record is that @tillahoffmann is working on a CPU-based FFT/RFFT using Eigen's TensorFFT module. No firm ETA but hopefully soon -- thanks @tillahoffmann!

@pawarrick commented May 8, 2017

@tillahoffmann, do you have any updates on progress toward the CPU implementation of FFT? Many thanks for this.

@tillahoffmann (Contributor) commented May 18, 2017

@pawarrick, complex FFT and IFFT as well as the RFFT are now in master. The IRFFT still needs a bit of work.

@pawarrick commented May 18, 2017

Yes, thanks very much for your contribution. I saw the discussion about it in this thread: #9029

@Czxck001 (Contributor) commented May 23, 2017

@tillahoffmann Thanks very much for your contribution! I've added the IRFFT part; see #10127

@rryan (Member) commented May 24, 2017

Thanks to @tillahoffmann (#9029) and @Czxck001 (#10127), I think we can call this done!
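
With both directions available, a quick round-trip sketch of the real transforms (names as they appear once the tf.spectral namespace is in place, listed further down this thread; rfft of a length-N real signal yields N//2 + 1 complex coefficients):

import numpy as np
import tensorflow as tf

signal = tf.constant(np.random.randn(3, 16), dtype=tf.float32)    # batch of real signals
spectrum = tf.spectral.rfft(signal)                                # [3, 9], complex64
recovered = tf.spectral.irfft(spectrum, fft_length=[16])           # back to [3, 16]

with tf.Session() as sess:
    s, r = sess.run([signal, recovered])
    print(np.allclose(s, r, atol=1e-4))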

@rryan closed this May 24, 2017

@beniroquai commented Jun 10, 2017

Great work! Now I can finally hack on FFT operations on my laptop.
I'm a bit curious: do you see any chance that these ops could eventually be ported to mobile platforms such as Android?

@Androbin (Contributor) commented Jun 10, 2017

@beniroquai
As long as they are available for CPU, they can be compiled for Android.
I don't know if they are in the default op list.
Otherwise you can use selective registration.
See #10254, #10299 and #10351

@uschmidt83 (Contributor) commented Jul 21, 2017

Hi, I just tried running FFT ops on the CPU with the 1.2.1 release, but still get errors like this:
No OpKernel was registered to support Op 'FFT2D' with these attrs. Registered devices: [CPU], Registered kernels: device='GPU'

Am I missing something?

@tillahoffmann (Contributor) commented Jul 21, 2017

@uschmidt83, have you tried the nightly build?

@beniroquai commented Jul 21, 2017

This comment has been minimized.

@rryan (Member) commented Jul 21, 2017

This comment has been minimized.

@uschmidt83 (Contributor) commented Jul 25, 2017

Thanks, everyone! It does work for me with the 1.3.0-rc0 release.

@FedericoMuciaccia

This comment has been minimized.

Copy link

commented Sep 15, 2017

There isn't an alias for the real FFTs in TF 1.3.0.

present functions:
tf.spectral.fft
tf.spectral.fft2d
tf.spectral.fft3d
tf.spectral.ifft
tf.spectral.ifft2d
tf.spectral.ifft3d
tf.spectral.irfft
tf.spectral.irfft2d
tf.spectral.irfft3d
tf.spectral.rfft
tf.spectral.rfft2d
tf.spectral.rfft3d

present aliases:
tf.fft
tf.fft2d
tf.fft3d
tf.ifft
tf.ifft2d
tf.ifft3d

can someone please create those missing aliases for the next TF release?

@rryan (Member) commented Sep 15, 2017

This comment has been minimized.

@WangNuoWa commented Oct 31, 2017

Hi everyone,

I would like to do some work on the basis of this paper, but the original author uses the theano.tensor.fft module to implement batched 1-D FFT, batched 2-D FFT, batched 1-D IFFT, and batched 2-D IFFT. Can I replace the theano.tensor.fft module directly with TensorFlow's tf.fft and tf.fft2d?

https://github.com/ChihebTrabelsi/deep_complex_networks/blob/master/complexnn/fft.py

We look forward to your help, thank you very much!

@rryan (Member) commented Oct 31, 2017

@WangNuoWa -- yes, tf.fft/tf.ifft and tf.fft2d/tf.ifft2d should be drop-in replacements for those (though I have never used theano.tensor.fft), perhaps with slightly different normalization strategies (I see an "ortho" norm in there).
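
One porting note on the normalization point, as a hedged sketch: tf.fft / tf.spectral.rfft apply no scaling on the forward transform, so matching an "ortho" convention means rescaling by hand (the 1/sqrt(N) factor below is the usual orthonormal scaling; double-check it against the original code's convention):

import numpy as np
import tensorflow as tf

n = 64
x = tf.constant(np.random.randn(8, n), dtype=tf.float32)

# Forward "ortho" transform: divide the unnormalized rfft by sqrt(N).
X_ortho = tf.spectral.rfft(x) / np.complex64(np.sqrt(n))
# Inverse "ortho" transform: irfft already divides by N, so multiply back by sqrt(N).
x_back = tf.spectral.irfft(X_ortho, fft_length=[n]) * np.float32(np.sqrt(n))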

@sirgogo commented Dec 14, 2017

Is tf.fft differentiable?

@rryan (Member) commented Dec 14, 2017

This comment has been minimized.

@sirgogo commented Dec 24, 2017

@rryan, is there anything special we would have to do? I get errors like these now (you can copy-paste the whole thing):

import numpy as np
import tensorflow as tf

x = np.linspace(0,10,num=10,dtype=np.complex64)
w = np.linspace(0,1,num=10,dtype=np.complex64)
y = np.multiply(x,w)

x_tf = tf.Variable(x,dtype=tf.complex64)
w_tf = tf.Variable(w,dtype=tf.complex64)
y_tf = tf.multiply(x_tf,w_tf)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
sess.run(y_tf)
g1 = tf.gradients(y_tf,x_tf)
'''
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/intel/intelpython3/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 472, in gradients
    grad_ys = _DefaultGradYs(grad_ys, ys, colocate_gradients_with_ops)
  File "/opt/intel/intelpython3/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 233, in _DefaultGradYs
    y.dtype)
TypeError: Gradients of complex tensors must set grad_ys (y.dtype = tf.complex64)
'''

a_tf = tf.fft(x_tf)
sess.run(a_tf)
g2 = tf.gradients(a_tf,x_tf)
'''
g2 = tf.gradients(a_tf,x_tf)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/intel/intelpython3/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 472, in gradients
    grad_ys = _DefaultGradYs(grad_ys, ys, colocate_gradients_with_ops)
  File "/opt/intel/intelpython3/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 233, in _DefaultGradYs
    y.dtype)
TypeError: Gradients of complex tensors must set grad_ys (y.dtype = tf.complex64)
'''

print(x_tf)
print(w_tf)
print(y_tf)
'''
>>> print(x_tf)
<tf.Variable 'Variable:0' shape=(10,) dtype=complex64_ref>
>>> print(w_tf)
<tf.Variable 'Variable_1:0' shape=(10,) dtype=complex64_ref>
>>> print(y_tf)
Tensor("Mul:0", shape=(10,), dtype=complex64)
'''

If it's relevant:

pip list | grep tensorflow
tensorflow-gpu (1.4.1)

@rryan (Member) commented Dec 24, 2017

@rryan, is there anything special we would have to do? I get errors like these now (you can copy-paste the whole thing):

I think what you're seeing is independent of whether FFT is differentiable. Here's a colab where I copied your example. You need to do as the error message says and provide a grad_ys argument to tf.gradients if ys is complex, because the TF gradient infrastructure currently won't assume that grad_ys is all ones if the type is complex, as it does when the type is real (I'm not totally sure why this is... maybe the original author of that code didn't want to assume that 1+1j was the proper grad_ys for a complex number because its magnitude is greater than 1?).

You can see this logic here:

if grad_y is None:
  if y.dtype.is_complex:
    raise TypeError(
        "Gradients of complex tensors must set grad_ys (y.dtype = %r)" %
        y.dtype)
  new_grad_ys.append(array_ops.fill(
      array_ops.shape(y), constant_op.constant(
          1, dtype=y.dtype, name="grad_ys_%d" % i)))
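
For concreteness, here is the same example with an explicit grad_ys seed, which is all the error message is asking for (a tensor of complex ones, used purely as an illustrative seed; this assumes the FFT gradient is registered in your build):

import numpy as np
import tensorflow as tf

x_tf = tf.Variable(np.linspace(0, 10, num=10, dtype=np.complex64))
a_tf = tf.fft(x_tf)
seed = tf.ones_like(a_tf)                    # complex64 ones, same shape as a_tf
g = tf.gradients(a_tf, x_tf, grad_ys=seed)   # no TypeError once grad_ys is supplied

sess = tf.Session()
sess.run(tf.global_variables_initializer())
print(sess.run(g))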

@zaccharieramzi commented Mar 25, 2019

Hi,

I was wondering whether you guys at TF were planning to implement a non-uniform FFT layer.
I don't know if this question belongs here, or if I should create a new issue by itself.

Cheers

@rryan (Member) commented Mar 25, 2019

@zaccharieramzi nobody on the TF team is currently considering that, as far as I know.

AIUI, the most efficient algorithms for it are defined in terms of the FFT -- is it an easy extension of the FFT ops that already exist? If so, I would be happy to code review an addition to tf.signal to add it if someone submits a pull request.

If it is more involved to implement, for example requiring custom kernels, then I think it might be a harder sell to include, since there has not been high demand for it, and adding custom kernels increases the binary size and compile time of TensorFlow.

Could you please file a GitHub issue requesting the feature? Anyone else who is interested would then have a nice place to add a +1 so we can better understand how many people would benefit from it being in TensorFlow itself (as opposed to being implemented as a standalone library).

@zaccharieramzi commented Mar 27, 2019

Hi @rryan ,

Unfortunately it's not a direct extension of the FFT afaik (I am still new to the topic). So I think it does require a custom kernel. I totally understand that it could be a hard sell and I will file a github issue requesting the feature (I am totally not familiar with C/C++ so I don't think I could implement it myself).

In the meantime I will use ODL to achieve my goals, although it is slower because the implementation is in Python.

EDIT: done in #27193.
