Skip to content

markusMM/audioBSSwICA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Audio BSS with ICA

(c) Apr 2016 Markus Meister, Jakob Drefs, Jörn Anemüller

This project is a proof of concept for the unsupervised (blind) separation of two audio sources captured from the same scene with K channels.

We are here using the Maximum Likelihood Estimation MLE given a Cauchy in each frequency bin. This funktion shall model the super Gaussian behaviour of the speech.

Generally in the folder "DSPII_ICA" all necessary methods and funtions, with proper comments, can be found to even build your own MLE!

The MLE is calculated through gradient ascend of the Likelihood (it could also be the gradient descend of the inverse likelihood, I don't quite remember..).

We can make use of the natural gradient and the normal gradient methods. Generally the natural gradient seems to be more calculation intense.

Unfortunately, those methods are not separated, like the other methods in this package. Anyways many different gradient descent techniques do exist and there are plenty of optimization methods (like Adam) which could efficiently train this model.

Since, this was a proof of concept and might be better transformed into Python for more utility, it does not have that many optimizations. Even in the sample files, frequency bins might not be used very well. This is one reason, why I changed the dimensionality of the output figures of the example.

But really it makes a lot of sense exploring the "DSPII_ICA" folder yourself! There are a lot of useful functions for exploratory data analysis for acoustic / time series data ^^"

A few nice ones to mention are: -kurt.m for the curtosis -konv.m for convolution of two time series (also as an even faster version, ... ) -ovadd.m to reconstruct a signal previously windowed into several frequency bins

Example

As an example we have two 2 channel spoken stereo signals overlapped (see ... * or better hear * "audio/160318_02.WAV"). For the original setup we also once recorded our own signals with two microphones, but the results were terrible and the recodings quite large xP

I have to mention here, that we did not measure special frequency metrics, like MFCCs, and did not try to dereverberate the audio as part of the preprocesing! -> This is, why the standard audio signals do work much better than recorded signals!

We can take a look at the spectogramms of the signals before and after the separation and the log-likelihood of the ICA MLE below.

some replacement text for the spectograms

some replacement text for the log-likelihood

If you want to find out how well this reconstruction listens, try the "main_BSS.m" scipt yourself! It will play you the results at the end after a few beeps ^^!

Installation

requirements

It is important for you to have at least Matlab v2010a installed and successfully running. A few functions, like FFT, are still dependent in the following package:

  • Signal Procesing Library for Matlab

instruction

Just clone the repo and run the examples. Plus, you could learn some Matlab syntax to use some functions right ahead by your own and improve the code for you needs. (Note that array indices here are biased by the vector object and have the notation "A(1:N)" for the first N elements of A for example!)

License

This project is open source. Beside of Matlab itself, which is licensed under Mathworks (c), this projects source code is licensed under the GNU license for open and free software / code. Feel free to reuse and modify this code, but don't forget to eventuall mention this place and give feedback!

Credits

Markus Meister : university of Oldenburg (Olb) - Division for Machine Learning (H4A) (former student worker) meister.markus.mm@gmx.net

Jakob Drefs : university of Oldenburg (Olb) - Division for Machine Learning (H4A) jakob.heinrich.drefs@uol.de

Prof. Dr. Jörg Anemüller : university of Oldenburg (Olb) - Division for Signal Processing and Acoustics (H4A) joern.anemueller@uol.de

Final Note

ATM this repo is WIP, so please wait for the full adaptation to GIT!

About

Audio BSS with ICA

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published