This repository has been archived by the owner. It is now read-only.

Spectrogram fft fix #505

Merged
merged 4 commits into from Feb 14, 2018

Conversation

Projects
None yet
3 participants
@DrChainsaw
Copy link
Contributor

DrChainsaw commented Feb 7, 2018

What changes were proposed in this pull request?

Fixes for issue #502

Length of each timewindow in Spectrogram now is same sa fftSampleSize.
FastFourierTransform has a method which takes a real-valued signal. I left the old behaviour (which assumed complex valued) since there was a testcase for it already.

How was this patch tested?

New testcase for FFT added, although it has magic numbers in it. Manual inspection in debugger to see that change of real to complex format expected by FFT is correct.
Spectrogram windowing was tested manually by (temporarily) adding amplitudes as argument to buildSpectrogram, commenting out calling it from constructor and a main method like this:
public static void main(String[] args) {
Spectrogram spec = new Spectrogram(null, 4,2);
short[] testVec = {0,1,2,3,4,5,6,7};
spec.buildSpectrogram(testVec);
}

Please review
https://github.com/deeplearning4j/deeplearning4j/blob/master/CONTRIBUTING.md before opening a pull request.

DrChainsaw added some commits Feb 7, 2018

Two missing files for previous commit
Corrections for issue 502:
#502
@AlexDBlack
Copy link
Member

AlexDBlack left a comment

LGTM, though I'm not really familiar with FFT math...

@AlexDBlack AlexDBlack requested a review from saudet Feb 8, 2018

@saudet
Copy link
Member

saudet left a comment

Looks good! Just one comment inline.

* @return intensities of each frequency unit: mag[frequency_unit]=intensity
*/
public double[] getMagnitudes(double[] amplitudes) {

This comment has been minimized.

@saudet

saudet Feb 9, 2018

Member

@DrChainsaw We could keep backward compatibility by adding the following lines, what do you think?

/** Returns {@code getMagnitudes(amplitudes, true)}. */
public double[] getMagnitudes(double[] amplitudes) {
    return getMagnitudes(amplitudes, true);
}
/** Returns {@code complex ? getMagnitudesForComplex(amplitudes) : getMagnitudesForReal(amplitudes)}. */
public double[] getMagnitudes(double[] amplitudes, boolean complex) {
    return complex ? getMagnitudesForComplex(amplitudes) : getMagnitudesForReal(amplitudes);
}

This comment has been minimized.

@DrChainsaw

DrChainsaw Feb 9, 2018

Contributor

Well, its your call.

I was thinking that this might be a rare case when backwards compatibility is not desired. Chances are that people have been using the method without realizing it assumes the wierd odd-even complex format from FFT.transform and maintaining backwards comptibility would keep masking that potential issue. Otoh, who am I to second guess what people do and do not think about and break their code because of it? :)

I have also become somewhat allergic to methods with flag arguments over the years, but I'll gladly make the change if you prefer it. If it is just backwards compatibility you were after that obviously can be achieved by just calling getMagnitudesForComplex right away. If you prefer the flag way I assume you'd also like ForReal and ForComplex be private, right?

This comment has been minimized.

@saudet

saudet Feb 11, 2018

Member

No, the flag is just to indicate that it has a "default value" via the other overload. Makes it harder for people to miss it IMO...

Updates according to review comments.
Reverted to legacy method signature for complex interlaced amplitudes and added flag to select assumptions about amplitudes format for new method
@DrChainsaw

This comment has been minimized.

Copy link
Contributor

DrChainsaw commented Feb 13, 2018

@saudet Sorry for the delay. Noob as I am I accidentally pushed changes to the fork master and not to the branch.

I'm not sure this is exactly what you were after, but I assumed you didn't want four public methods to do the two things the class can do.

@saudet

This comment has been minimized.

Copy link
Member

saudet commented Feb 14, 2018

Well, looks like a good enough compromise for me for now. 👍 Thanks!

We'll need to revist this anyway to get rid of the com.sun.media.sound.FFT class at some point since it's not available in Java 9 anymore.

@saudet saudet merged commit c833d8b into deeplearning4j:master Feb 14, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@DrChainsaw

This comment has been minimized.

Copy link
Contributor

DrChainsaw commented Feb 14, 2018

@saudet Glad I could help!

Just a tip regarding the FFT: I recently converted my project to use Jtransform for transforms. It's licensed under BSD-2 Clause which from what I understand is a nicer license compared to GPL as it is non-viral. They claim comparable performance to FFTW (although the min execution time benchmark methodology looks somewhat fishy to me). Having a mix java and c in your project is bound to fail anyways /jk :)

I haven't done any measurements, but to me, there is no perceptable difference for ETL when training on raw waveforms vs spectrograms/MFSC/MFCC etc. It seems to be dominated by the file reading. For "online" inference (classifying audio stream from soundcard) even with my worst model ensembles the program spends ~70% of the time idling while the sound buffer fills up. All in all, FFT performance does not seem to be that important since other things (GPU memory for me) is bottlenecking.

I also saw that NDArrays have a built in fft method. I haven't tried it yet though but I guess it could be a kind of nice self-containment thing.

@DrChainsaw DrChainsaw deleted the DrChainsaw:SpectrogramFftFix branch Feb 14, 2018

@saudet

This comment has been minimized.

Copy link
Member

saudet commented Feb 15, 2018

https://github.com/wendykierp/JTransforms looks awesome! If you could integrate that it would be great! FFT isn't currently implemented in ND4J and since this is currently only for ETL, I think it makes sense to just use JTransforms as is in DataVec. But if you'd like to create the API in ND4J for FFT first, and then use that in DataVec, that would be fine too.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.