Fun with the Fourier transform

Images

What happens when you switch the Fourier magnitudes of two pictures, while leaving their phases intact? Are the pictures still recognizable? If so, which is which? Is the magnitude or the phase more important to human visual perception?

Have a look at this Jupyter notebook to find out.
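For readers who want to try the swap outside the notebook, here is a minimal NumPy sketch. It is not taken from the notebook: the function name `swap_magnitudes` is hypothetical, and it assumes both images are already loaded as float arrays of identical shape (height x width, optionally with a trailing channel axis).

```python
import numpy as np

def swap_magnitudes(img_a, img_b):
    """Return (phase of A with magnitude of B, phase of B with magnitude of A).

    Hypothetical helper: assumes two real-valued arrays of the same shape,
    laid out as (height, width) or (height, width, channels).
    """
    img_a = np.asarray(img_a, dtype=float)
    img_b = np.asarray(img_b, dtype=float)
    if img_a.shape != img_b.shape:
        raise ValueError("images must have the same shape")

    # 2-D transform over the two spatial axes (each channel separately).
    fa = np.fft.fft2(img_a, axes=(0, 1))
    fb = np.fft.fft2(img_b, axes=(0, 1))

    # Recombine the magnitude of one image with the phase of the other.
    a_phase_b_mag = np.abs(fb) * np.exp(1j * np.angle(fa))
    b_phase_a_mag = np.abs(fa) * np.exp(1j * np.angle(fb))

    # For real inputs the results are real up to rounding error,
    # so the tiny imaginary residue is dropped.
    out_a = np.fft.ifft2(a_phase_b_mag, axes=(0, 1)).real
    out_b = np.fft.ifft2(b_phase_a_mag, axes=(0, 1)).real
    return out_a, out_b
```

The outputs are not guaranteed to stay within the original pixel range, so they may need clipping or rescaling before display.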

Credit

I'm not sure whom to attribute this "parlor trick" to - it's been around for a while and I've seen it in at least a couple of talks by different people. The earliest reference I could find is this Berkeley course from 1997, taught by S. Shankar Sastry.

Update: After looking some more, I believe the original source is this: A. V. Oppenheim and J. S. Lim, "The importance of phase in signals," in Proceedings of the IEEE, vol. 69, no. 5, pp. 529-541, May 1981. doi: 10.1109/PROC.1981.12022 (paywall)

There are some earlier papers that look at reconstructing images from phase and unit magnitude, but this seems to be the first that does the swapping.

Image sources and processing

The images were resized to the dimensions of Bernie_Sanders.jpg using the ImageMagick identify and convert commands:

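# Read the target size as WIDTHxHEIGHT! (the "!" forces that exact size, ignoring aspect ratio)
DIMS=$(identify -format '%wx%h!' Bernie_Sanders.jpg)
convert Hillary_Clinton_official_Secretary_of_State_portrait_crop.jpg \
    -resize "$DIMS" Hillary_Clinton.jpg
convert Anoures.jpg -resize "$DIMS" frogs.jpg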

Audio

Trying the same trick with audio is a little more complicated. The Fourier transform is nonlocal (every coefficient depends on the whole signal), but humans do not hear a whole song at once. So in audio processing, instead of taking the Fourier transform of the whole signal, it is common to break the signal up into short time windows and take the Fourier transform of each window.
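The windowing idea can be sketched in a few lines of NumPy. This is an illustration rather than the script from this repository: the function name `swap_windowed` and the default window length are assumptions, and the windows are plain non-overlapping blocks.

```python
import numpy as np

def swap_windowed(x, y, window=1024):
    """Per window, combine the magnitude of x with the phase of y.

    Hypothetical helper: x and y are 1-D mono signals at the same sample
    rate. Samples beyond the last complete window are dropped.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_windows = min(len(x), len(y)) // window
    n = n_windows * window

    # One row per window.
    xs = x[:n].reshape(n_windows, window)
    ys = y[:n].reshape(n_windows, window)

    # Real-input FFT of every window at once.
    fx = np.fft.rfft(xs, axis=1)
    fy = np.fft.rfft(ys, axis=1)

    mixed = np.abs(fx) * np.exp(1j * np.angle(fy))

    # Back to the time domain, then stitch the windows together again.
    return np.fft.irfft(mixed, n=window, axis=1).reshape(n)
```

Because each block is processed independently, the output can jump at block boundaries, which is usually audible as clicks or roughness.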

A naive implementation of swapping magnitude and phase of two audio files on such windows may be found in this Python script. The results sound a little smoother using the Short-Time Fourier Transform (essentially: overlapping windows and a more carefully designed filter than simply a rectangular indicator). An implementation using this Python stft package may be found in this Python script.
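To illustrate the overlapping-window variant, here is a hand-rolled sketch using a Hann window with 50% overlap and overlap-add resynthesis. It does not use the stft package mentioned above and is not the repository's script; the name `swap_overlapping` and the parameter choices are assumptions made for the example.

```python
import numpy as np

def swap_overlapping(x, y, window=1024):
    """Magnitude of x with phase of y, on overlapping Hann windows.

    Hypothetical helper: x and y are 1-D mono signals at the same sample
    rate, and `window` is assumed to be even.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = min(len(x), len(y))
    hop = window // 2

    # Periodic Hann window: copies shifted by half a window sum to 1,
    # so plain overlap-add reconstructs an unmodified signal.
    w = np.hanning(window + 1)[:-1]

    out = np.zeros(n)
    for start in range(0, n - window + 1, hop):
        seg = slice(start, start + window)
        fx = np.fft.rfft(w * x[seg])
        fy = np.fft.rfft(w * y[seg])
        mixed = np.abs(fx) * np.exp(1j * np.angle(fy))
        out[seg] += np.fft.irfft(mixed, n=window)
    return out
```

The first and last half-window are covered by only one frame, so the output fades in and out at the edges. Any trailing samples that do not fill a complete frame are left at zero.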

The results depend strongly on the window size and are much harder to describe than in the case of images, but they are still quite interesting and fun. I put some short examples in the speech and music folders.

Audio sources and processing

These files were processed using sox. See the music/music.sh and speech/speech.sh scripts for details.

License

MIT. See the LICENSE file for more details.