From the abstract:
Direction of voice (DoV) estimation is a recent development in audio signal processing which can be leveraged to enhance video call experiences such that they more closely resemble natural audio perception. Using some of the methods from recent research in this area, we provide a filtering-based application of DoV estimation to facilitate the transmission of relevant conversation to video call recipients. An extra-trees classifier determines which segments of audio are targeted towards the recording device. A filter blocks audio segments that are projected along irrelevant axes from being transmitted to the wrong audience. Empirically, the proposed work validates this filtering functionality using existing datasets and recordings that were acquired to emulate video call and streaming experiences.
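
The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the project's implementation: `filter_segments`, `segment_len`, and `energy_stand_in` are hypothetical names, and the energy threshold is only a placeholder for the extra-trees classifier (which could, for instance, be a trained scikit-learn `ExtraTreesClassifier` wrapped in a function that returns True for segments directed at the recording device). Segments judged not to be directed at the device are replaced with silence so the output keeps its original length.

```python
from typing import Callable, List, Sequence


def filter_segments(
    samples: Sequence[float],
    segment_len: int,
    is_directed: Callable[[Sequence[float]], bool],
) -> List[float]:
    """Mute every segment that is_directed marks as not aimed at the device."""
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    out: List[float] = []
    for start in range(0, len(samples), segment_len):
        segment = samples[start:start + segment_len]
        if is_directed(segment):
            out.extend(segment)                 # keep relevant audio as-is
        else:
            out.extend([0.0] * len(segment))    # replace with silence
    return out


def energy_stand_in(segment: Sequence[float]) -> bool:
    """Placeholder decision rule (assumption): louder segments count as directed.

    A real system would call the trained DoV classifier here instead.
    """
    return sum(x * x for x in segment) / len(segment) > 0.1


filtered = filter_segments([0.5, 0.5, 0.01, 0.01, 0.9], 2, energy_stand_in)
```

Passing the decision rule as a callable keeps the filter independent of how the direction-of-voice prediction is made, so the placeholder can be swapped for a trained model without changing the filtering loop.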
Place the data in a subdirectory named data/.