Audio Pattern Discovery
Pattern Discovery In Audio Collections in Rust.
The program will extract interesting regions from wav files and then cluster them using hierarchical clustering under dynamic time warping. Below we see some extracted and clustered dolphin whistles.
From each file we extract the cepstrum in the following manner:
- Extract sliding windows
- Compute the DFT for each window
- Convolve the DFT with a triangular window, using a stride of half the filter size
- Compute the log of the filtered window
- Compute the cepstrum by taking the discrete cosine transform
The parameters needed so far are:
- dft window
- dft step
- triangular window size
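The extraction steps above can be sketched in plain Rust as follows. This is only an illustration under assumptions, not the code in `spectrogram.rs` or `numerics.rs`: all function names are hypothetical, the DFT is a naive O(n^2) version, the magnitude spectrum is used, the DCT is an unnormalised DCT-II, and a small constant is added before the log to avoid `ln(0)`.

```rust
use std::f64::consts::PI;

/// Magnitude spectrum of one window via a naive DFT (bins 0..=n/2).
fn dft_magnitude(window: &[f64]) -> Vec<f64> {
    let n = window.len();
    (0..=n / 2)
        .map(|k| {
            let (mut re, mut im) = (0.0_f64, 0.0_f64);
            for (t, &x) in window.iter().enumerate() {
                let angle = -2.0 * PI * (k * t) as f64 / n as f64;
                re += x * angle.cos();
                im += x * angle.sin();
            }
            (re * re + im * im).sqrt()
        })
        .collect()
}

/// Slide a triangular filter of `size` bins over the spectrum,
/// moving by half the filter size each time.
fn triangular_filter(spectrum: &[f64], size: usize) -> Vec<f64> {
    let stride = (size / 2).max(1);
    let center = (size as f64 - 1.0) / 2.0;
    let mut out = Vec::new();
    let mut start = 0;
    while start + size <= spectrum.len() {
        let mut acc = 0.0_f64;
        for i in 0..size {
            // Weight peaks at the filter center and falls off linearly.
            let w = 1.0 - ((i as f64 - center) / (center + 1.0)).abs();
            acc += w * spectrum[start + i];
        }
        out.push(acc);
        start += stride;
    }
    out
}

/// Unnormalised DCT-II.
fn dct(x: &[f64]) -> Vec<f64> {
    let n = x.len();
    (0..n)
        .map(|k| {
            x.iter()
                .enumerate()
                .map(|(i, &v)| v * (PI / n as f64 * (i as f64 + 0.5) * k as f64).cos())
                .sum::<f64>()
        })
        .collect()
}

/// One cepstrum frame per sliding window of the signal.
fn cepstrum(signal: &[f64], dft_window: usize, dft_step: usize, filter_size: usize) -> Vec<Vec<f64>> {
    assert!(dft_window > 0 && dft_step > 0 && filter_size > 0);
    let mut frames = Vec::new();
    let mut start = 0;
    while start + dft_window <= signal.len() {
        let spectrum = dft_magnitude(&signal[start..start + dft_window]);
        let filtered = triangular_filter(&spectrum, filter_size);
        // The 1e-10 offset is an assumption to keep the log finite.
        let logged: Vec<f64> = filtered.iter().map(|v| (v + 1e-10).ln()).collect();
        frames.push(dct(&logged));
        start += dft_step;
    }
    frames
}
```

Calling `cepstrum(&samples, 8, 4, 2)` on 16 samples yields 3 frames of 4 coefficients each: the three arguments correspond to the dft window, dft step and triangular window size listed above.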
We then find slices where something interesting happens:
- For each cepstrum frame compute its variance
- Smooth the variances in each sequence using a moving average
- Extract long sequences of high variances
The parameters needed for the interesting detector are:
- percentile of variance to find variance threshold
- min size of subsequence
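A minimal sketch of such a detector is shown below. It assumes a centered moving average with a `radius` parameter (the smoothing width is not listed among the parameters above, so this is a guess) and a nearest-rank percentile; the function names are hypothetical and not taken from the project.

```rust
/// Variance of the coefficients in one cepstrum frame.
fn frame_variance(frame: &[f64]) -> f64 {
    if frame.is_empty() {
        return 0.0;
    }
    let n = frame.len() as f64;
    let mean = frame.iter().sum::<f64>() / n;
    frame.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n
}

/// Centered moving average; the window shrinks at the borders.
fn moving_average(xs: &[f64], radius: usize) -> Vec<f64> {
    (0..xs.len())
        .map(|i| {
            let lo = i.saturating_sub(radius);
            let hi = (i + radius + 1).min(xs.len());
            xs[lo..hi].iter().sum::<f64>() / (hi - lo) as f64
        })
        .collect()
}

/// Nearest-rank percentile, with `p` in 0.0..=1.0.
fn percentile(xs: &[f64], p: f64) -> f64 {
    assert!(!xs.is_empty());
    let mut sorted = xs.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let idx = ((sorted.len() - 1) as f64 * p.clamp(0.0, 1.0)).round() as usize;
    sorted[idx]
}

/// Half-open frame ranges [start, end) whose smoothed variance stays at or
/// above the percentile threshold for at least `min_len` frames.
fn interesting_regions(
    frames: &[Vec<f64>],
    radius: usize,
    p: f64,
    min_len: usize,
) -> Vec<(usize, usize)> {
    if frames.is_empty() {
        return Vec::new();
    }
    let variances: Vec<f64> = frames.iter().map(|f| frame_variance(f)).collect();
    let smooth = moving_average(&variances, radius);
    let threshold = percentile(&smooth, p);
    let mut regions = Vec::new();
    let mut start: Option<usize> = None;
    for (i, &v) in smooth.iter().enumerate() {
        match (v >= threshold, start) {
            (true, None) => start = Some(i),
            (false, Some(s)) => {
                if i - s >= min_len {
                    regions.push((s, i));
                }
                start = None;
            }
            _ => {}
        }
    }
    // Close a region that runs until the end of the sequence.
    if let Some(s) = start {
        if smooth.len() - s >= min_len {
            regions.push((s, smooth.len()));
        }
    }
    regions
}
```

Here `p` plays the role of the variance percentile and `min_len` the minimum subsequence size.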
We can also reduce the dimensionality further by adding an autoencoder. The one used here has only one hidden layer.
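To illustrate the idea, here is a sketch of the forward pass of an autoencoder with a single hidden layer. The struct layout, the sigmoid activation on the hidden layer and the linear output layer are all assumptions for illustration; training is omitted, and the actual network in `neural.rs` may differ.

```rust
/// Hypothetical autoencoder with one hidden layer:
/// input -> hidden (fewer units) -> reconstruction.
struct AutoEncoder {
    w_enc: Vec<Vec<f64>>, // hidden x input
    b_enc: Vec<f64>,      // one bias per hidden unit
    w_dec: Vec<Vec<f64>>, // input x hidden
    b_dec: Vec<f64>,      // one bias per output unit
}

/// Computes w * x + b, one output per row of `w`.
fn affine(w: &[Vec<f64>], b: &[f64], x: &[f64]) -> Vec<f64> {
    w.iter()
        .zip(b)
        .map(|(row, bias)| row.iter().zip(x).map(|(wi, xi)| wi * xi).sum::<f64>() + bias)
        .collect()
}

impl AutoEncoder {
    /// Low-dimensional code for one frame (sigmoid is an assumption).
    fn encode(&self, x: &[f64]) -> Vec<f64> {
        affine(&self.w_enc, &self.b_enc, x)
            .into_iter()
            .map(|z| 1.0 / (1.0 + (-z).exp()))
            .collect()
    }

    /// Reconstruction of the frame from its code (linear output).
    fn decode(&self, h: &[f64]) -> Vec<f64> {
        affine(&self.w_dec, &self.b_dec, h)
    }
}
```

After training, only `encode` would be needed: each cepstrum frame is replaced by its shorter hidden-layer code before clustering.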
We then cluster all sequences using dynamic time warping. The warping window can be restricted by a Sakoe-Chiba band. Furthermore, we can weigh the errors with separate weights. We also stop clustering using a threshold estimated by a percentage.
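The following sketch shows dynamic time warping restricted to a Sakoe-Chiba band. It is not the code in `alignments.rs`: the names are hypothetical, the frame distance is assumed to be Euclidean, and attaching one weight to each of the three DTW moves is only one possible reading of the separate weights mentioned above. Backtracking of the alignment path is left out.

```rust
/// Hypothetical weights, one per DTW move.
struct MoveWeights {
    diagonal: f64,
    vertical: f64,
    horizontal: f64,
}

/// Euclidean distance between two frames (an assumed local cost).
fn frame_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

/// DTW cost between two frame sequences. Cells with |i - j| > band
/// are never visited, which is the Sakoe-Chiba restriction.
fn dtw(x: &[Vec<f64>], y: &[Vec<f64>], band: usize, w: &MoveWeights) -> f64 {
    let (n, m) = (x.len(), y.len());
    // Widen the band if needed so the final cell stays reachable.
    let band = band.max(n.abs_diff(m));
    let mut cost = vec![vec![f64::INFINITY; m + 1]; n + 1];
    cost[0][0] = 0.0;
    for i in 1..=n {
        let lo = i.saturating_sub(band).max(1);
        let hi = (i + band).min(m);
        for j in lo..=hi {
            let d = frame_distance(&x[i - 1], &y[j - 1]);
            cost[i][j] = (cost[i - 1][j - 1] + w.diagonal * d)
                .min(cost[i - 1][j] + w.vertical * d)
                .min(cost[i][j - 1] + w.horizontal * d);
        }
    }
    cost[n][m]
}
```

A band of 0 on equal-length sequences forces a frame-by-frame comparison, while a wide band behaves like unrestricted DTW at a higher cost per pair.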
We cluster using agglomerative clustering with average linkage, also known as UPGMA.
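As a rough sketch, average-linkage clustering over a precomputed distance matrix (for example, pairwise DTW costs) could look like this. The function names are hypothetical, the threshold is taken as a plain argument rather than estimated from a percentage, and the naive search over all cluster pairs is chosen for clarity, not speed.

```rust
/// Average pairwise distance between the members of two clusters.
fn average_linkage(a: &[usize], b: &[usize], dist: &[Vec<f64>]) -> f64 {
    let mut total = 0.0_f64;
    for &i in a {
        for &j in b {
            total += dist[i][j];
        }
    }
    total / (a.len() * b.len()) as f64
}

/// Merge the two closest clusters until the closest pair is farther
/// apart than `threshold`. Returns the member indices of each cluster.
fn cluster(dist: &[Vec<f64>], threshold: f64) -> Vec<Vec<usize>> {
    let mut clusters: Vec<Vec<usize>> = (0..dist.len()).map(|i| vec![i]).collect();
    loop {
        let mut best: Option<(usize, usize, f64)> = None;
        for a in 0..clusters.len() {
            for b in a + 1..clusters.len() {
                let d = average_linkage(&clusters[a], &clusters[b], dist);
                if best.map_or(true, |(_, _, bd)| d < bd) {
                    best = Some((a, b, d));
                }
            }
        }
        match best {
            Some((a, b, d)) if d <= threshold => {
                // b > a, so removing b leaves index a valid.
                let merged = clusters.remove(b);
                clusters[a].extend(merged);
            }
            _ => return clusters,
        }
    }
}
```

Recording each merge and its distance inside the loop would give the information needed to draw a dendrogram.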
After this we generate an audio file for each cluster, which contains all instances of the cluster. We also generate a LaTeX document with the dendrograms of the clustering and a classification experiment showing that the models for each cluster model the data. The output of the tool is summarised in a result HTML page.
In order to generate the report and all the clusters run:
The folder should contain wav files; it will be searched recursively.
In order to configure the program use the file in
In order to change the latex templates use the
- audio.rs: Read and Write Audio
- main.rs: Tying it all together
- alignments.rs: DTW code with back tracking and alignment path information
- numerics.rs: All numerics methods
- spectrogram.rs: Implements spectrogram and slicing
- neural.rs: Implements a one layer autoencoder
The results will be generated in the output folder:
- result.html: Summary of output with all links to the tool
- log.txt: Will show the logs of the run
- img: Holds all image files, including the tikz files for the dendrograms and the png files for the spectrograms
- encoder: Binary dump of the auto encoder
- docs: Will contain the final pdf with all images and the log
- audio: Includes all interesting regions and clusters as wav files
- Rust and Cargo