FFT/Audio Post-processing as a Song Heuristic #86

stephen-huan · 2020-05-22T14:32:34Z

Following up on the conversation in #81 and #39: the song length heuristic is elegant and easy to program, but its weakness is remixes, which are rare but do occur. NLP is harder to program and probably error-prone. In general, neither gives a guarantee of true audio similarity.

Just adding another possible heuristic to the pot: combine song length with the Fast Fourier Transform (FFT). NumPy has an implementation, and the FFT can be used to directly compare two waveforms for similarity. In fact, the FFT can be used to minimize the L2 distance between two integer arrays (the sum of squared differences between the numbers at each index); I have an explanation here.
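To make the L2 idea concrete, here is a minimal sketch. It assumes "minimize" means minimizing over circular shifts of one array against the other, which may or may not match the linked explanation. It relies on the identity sum((a[i] - b[i+k])^2) = sum(a^2) + sum(b^2) - 2 * corr[k], where corr is the circular cross-correlation, which the FFT computes for all shifts at once. The function name `min_l2_over_shifts` is hypothetical.

```python
import numpy as np

def min_l2_over_shifts(a, b):
    """Return (best_shift, squared_l2_distance) over circular shifts of b.

    Hypothetical helper: shift k compares a[i] with b[(i + k) % n].
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("expected two 1-D arrays of equal length")
    n = a.size
    # Circular cross-correlation for every shift in O(n log n):
    # corr[k] = sum_i a[i] * b[(i + k) % n]
    corr = np.fft.irfft(np.conj(np.fft.rfft(a)) * np.fft.rfft(b), n)
    # Squared L2 distance for every shift, using the identity above.
    dist = np.dot(a, a) + np.dot(b, b) - 2.0 * corr
    k = int(np.argmin(dist))
    # Clamp tiny negative values caused by floating-point round-off.
    return k, float(max(dist[k], 0.0))
```

A brute-force loop over all shifts gives the same answer in O(n^2) time, which is a convenient way to sanity-check the FFT version on short arrays.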

Although there is a large body of research trying to compute music similarity, I think a simple algorithm is sufficient in this case since the songs compared should be almost identical.

However, this introduces a non-intuitive extra parameter, FFT_CUTOFF, which would probably have to be determined experimentally (if the L2 distance computed via the FFT for two songs is > FFT_CUTOFF, warn the user that the song found is likely incorrect).

An algorithm other than the FFT would be fine too, as long as it works on the actual audio.

In summary:
First, check the songs to make sure they have similar lengths.
Then, run an FFT over the songs, computing the L2 distance between the songs themselves.
If the value is > FFT_CUTOFF, pick another song or warn the user.
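
The three steps above could be sketched roughly as follows. This is an illustration under several assumptions, not a proposed implementation: both songs are already decoded to mono sample arrays at the same sample rate, the names `LENGTH_TOLERANCE_S`, `normalized_min_distance`, and `looks_like_same_song` are hypothetical, and both threshold values are placeholders. The distance is also divided by the total signal energy (my addition) so that the cutoff does not depend on volume or song length.

```python
import numpy as np

LENGTH_TOLERANCE_S = 5.0  # hypothetical: max allowed length difference, seconds
FFT_CUTOFF = 0.5          # hypothetical placeholder; would be tuned experimentally

def normalized_min_distance(a, b):
    """Smallest squared L2 distance over all shifts, divided by total energy.

    Roughly 0 for identical audio (even if delayed), near 1 for unrelated audio.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    n = a.size + b.size  # zero-pad so circular shifts behave like linear shifts
    corr = np.fft.irfft(np.conj(np.fft.rfft(a, n)) * np.fft.rfft(b, n), n)
    energy = np.dot(a, a) + np.dot(b, b)
    if energy == 0.0:
        return 0.0  # two silent signals are trivially identical
    return float(max(energy - 2.0 * corr.max(), 0.0) / energy)

def looks_like_same_song(a, b, sample_rate):
    """Step 1: compare lengths. Steps 2 and 3: FFT-based distance vs. cutoff."""
    if abs(len(a) - len(b)) / sample_rate > LENGTH_TOLERANCE_S:
        return False
    return normalized_min_distance(a, b) <= FFT_CUTOFF
```

When this returns False, the caller would either try the next candidate or warn the user, as described in the last step.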

Pros:

Not an indirect heuristic, targets the exact thing we want (audio similarity)
Standard tool for wave analysis

Cons:

Could be slow depending on implementation (run async/in parallel?)
Not intuitive what the cutoff should be
More complex
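
On the speed concern, one option is to score several candidate songs concurrently. The sketch below uses a thread pool from the standard library; `fft_distance` and `rank_candidates` are hypothetical names, and the inputs are assumed to be already-decoded mono sample arrays. Whether threads actually give a speedup depends on whether the FFT calls release the GIL in the NumPy build in use, which I have not measured; a process pool would be the alternative.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def fft_distance(a, b):
    """Normalized minimum squared L2 distance over all shifts (0 = identical)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    n = a.size + b.size  # zero-pad to avoid wrap-around between the signals
    corr = np.fft.irfft(np.conj(np.fft.rfft(a, n)) * np.fft.rfft(b, n), n)
    energy = np.dot(a, a) + np.dot(b, b)
    if energy == 0.0:
        return 0.0
    return float(max(energy - 2.0 * corr.max(), 0.0) / energy)

def rank_candidates(reference, candidates, max_workers=4):
    """Return (candidate_index, distance) pairs, best match first."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so indices line up with candidates.
        distances = list(pool.map(lambda c: fft_distance(reference, c), candidates))
    return sorted(enumerate(distances), key=lambda pair: pair[1])
```

Downsampling the audio before comparing would cut the cost further, at the price of one more parameter to choose.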

stephen-huan closed this as completed Mar 22, 2023
