# PyListener

PyListener is open-source software, written in Python, for sound comparison.<br>
Currently, its main function is to listen to sound from a microphone and save a recognized sound as a WAV file when it is similar to a loaded template sound.

Jinook Oh, Cognitive Biology department, University of Vienna<br>
Contact: jinook0707@gmail.com, tecumseh.fitch@univie.ac.at<br>
September 2019.



## What it does

In short, the PyListener app does the following.<br>
**1)** Loads template WAV file(s) and analyzes them to form the template WAV data and the parameters to compare.<br>
**2)** Captures a sound from the microphone stream.<br>
**3)** Analyzes the captured sound and compares the resulting parameters to those of the template WAV data.<br>
**4)** Stores the result text in a log file and shows it in the app UI. If the two sounds match, the captured sound is also saved as a WAV file in the 'recordings' folder.

In the PyListener app, a user can load the WAV file(s) in a folder to form a template sound, then start/stop listening from a microphone.
A sound is automatically cut out of the continuous stream when the amplitude goes over a threshold and stays there longer than a minimum duration.<br>
The related variables are listed below (defined in *\_\_init\__* of *pyListener*).<br>
- PyListener.**ampMonDur**: Time (in seconds) to look back when calculating the average amplitude. This should be longer than PyListener.**minDur4SF**.
- PyListener.**minDur4SF**: Minimum duration (in seconds) of a sound fragment.
- PyListener.**ampThr**: Amplitude threshold (0.0-1.0) to start a sound fragment.
- PyListener.**maxDurLowerThr**: Once the amplitude goes above the threshold, the program continues to capture audio data until the amplitude has stayed below PyListener.**ampThr** for longer than this duration (in seconds).
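
The cutting rule described above can be sketched roughly as follows. This is a simplified illustration, not PyListener's implementation: the function name `find_fragments`, its arguments, and the per-step amplitude list are assumptions, and the look-back averaging controlled by **ampMonDur** is left out.

```python
def find_fragments(amps, step, amp_thr=0.1, min_dur=0.5, max_dur_lower_thr=0.5):
    """Return (start_index, end_index) pairs of sound fragments.

    amps: amplitude values (0.0-1.0), one per analysis step
    step: duration of one analysis step in seconds
    The three thresholds play the roles of ampThr, minDur4SF and
    maxDurLowerThr; this function itself is hypothetical.
    """
    fragments = []
    start = None      # index where the current fragment began
    last_loud = None  # last index at or above the amplitude threshold
    for i, amp in enumerate(amps):
        if amp >= amp_thr:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and (i - last_loud) * step > max_dur_lower_thr:
            # Quiet for too long: close the fragment at the last loud step,
            # and keep it only if it is long enough.
            if (last_loud - start + 1) * step >= min_dur:
                fragments.append((start, last_loud))
            start = None
    # Close a fragment that is still open at the end of the data.
    if start is not None and (last_loud - start + 1) * step >= min_dur:
        fragments.append((start, last_loud))
    return fragments


amps = [0, 0, 0.5, 0.5, 0.5, 0.5, 0, 0.5, 0, 0, 0, 0, 0.5, 0, 0, 0, 0]
print(find_fragments(amps, step=0.1, amp_thr=0.1, min_dur=0.3, max_dur_lower_thr=0.2))
```

In this example the short dip at index 6 does not end the fragment, while the isolated loud step at index 12 is discarded because it is shorter than the minimum duration.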

The analyzed parameters of the template sound and of a sound fragment, captured as above, are compared to determine whether the two sounds are similar.
The variables below define the parameters to be analyzed. They are located in the *\_\_init\__* function of *pyListener*.<br>
- PyListener.**pKeys**: This is a list of parameters to store, such as *duration*, *summed amplitude*, *center of mass*, *low frequency*, *high frequency* and so on. Some of these are used for the comparison; others are only used for calculations in the app. **A description of each item can be found at the end of this document**.
- PyListener.**compParamList**: This is a subset of PyListener.**pKeys** to use in the comparison.
- PyListener.**indCPL**: This is a subset of PyListener.**compParamList** whose threshold ranges do not change when different template WAV files are selected. (Currently, there are only two items, 'summedAmpRatio' and 'corr2auto'.)
- PyListener.**cplInitMargin**: This is a Python dictionary holding the values to subtract from or add to the template WAV parameters. Its keys are the keys of PyListener.**compParamList** (excluding the keys in PyListener.**indCPL**) plus "\_min" and "\_max".
- PyListener.**indCPRange**: This is a Python dictionary holding the threshold ranges for the items in PyListener.**indCPL**. Its keys are the keys of PyListener.**indCPL** plus "\_min" and "\_max". Its values are the minimum and maximum values, respectively.
- PyListener.**compParamLabel**: This is a list of labels to appear in the UI of *pyListener*. The items of PyListener.**compParamList** appear in the UI of *pyListener* for manual adjustment of the threshold ranges, and these labels are shown next to them.

\* When a new parameter to be analyzed is added, the above variables should be updated accordingly. This is explained later in this document.<br>
\** A user can permanently change these variables, or temporarily disable certain parameters in PyListener.**compParamList** by unchecking them in the UI of the pyListener app.
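
As a rough sketch of how these variables could combine into initial threshold ranges: template-dependent parameters get the template value minus/plus a margin, while the independent ones take a fixed range. The function `build_threshold_ranges` and its argument names are hypothetical; only the roles of the dictionaries follow the descriptions above.

```python
def build_threshold_ranges(template_params, comp_param_list, ind_cpl,
                           cpl_init_margin, ind_cp_range):
    """Build a {key_min: value, key_max: value} dictionary of ranges.

    Hypothetical helper; the arguments mirror compParamList, indCPL,
    cplInitMargin and indCPRange as described in the text.
    """
    ranges = {}
    for key in comp_param_list:
        if key in ind_cpl:
            # Independent of the template: use the fixed range as-is.
            ranges[key + "_min"] = ind_cp_range[key + "_min"]
            ranges[key + "_max"] = ind_cp_range[key + "_max"]
        else:
            # Template-dependent: subtract/add the margin around the template value.
            ranges[key + "_min"] = template_params[key] - cpl_init_margin[key + "_min"]
            ranges[key + "_max"] = template_params[key] + cpl_init_margin[key + "_max"]
    return ranges


ranges = build_threshold_ranges(
    template_params={"duration": 1.0},
    comp_param_list=["duration", "summedAmpRatio"],
    ind_cpl=["summedAmpRatio"],
    cpl_init_margin={"duration_min": 0.25, "duration_max": 0.5},
    ind_cp_range={"summedAmpRatio_min": 0.5, "summedAmpRatio_max": 2.0},
)
print(ranges)
```

With a one-second template and margins of 0.25 and 0.5, the duration range comes out as 0.75 to 1.5 seconds.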

The analyzed results of a sound fragment are compared with the analyzed results of the template.
If the fragment is similar enough (every parameter falls within its threshold range), the app saves the sound fragment as a WAV file in a folder named **recordings**. In addition, the results of each analysis are saved in a log file in the **log** folder.
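
The "within all ranges" test can be sketched as below. This is an illustration under assumptions, not the library's comparison routine: `matches_template` is a hypothetical name, and the sample numbers are taken from the example output shown later in this document.

```python
def matches_template(fragment_params, ranges, comp_param_list):
    """Return (matched, failed_keys) for a sound fragment.

    A fragment matches only if every compared parameter lies inside
    its [key_min, key_max] range. Hypothetical helper for illustration.
    """
    failed = [
        key for key in comp_param_list
        if not (ranges[key + "_min"] <= fragment_params[key] <= ranges[key + "_max"])
    ]
    return len(failed) == 0, failed


keys = ["duration", "cmxN", "lowFreq"]
ranges = {
    "duration_min": 0.870, "duration_max": 2.800,
    "cmxN_min": 0.206, "cmxN_max": 0.596,
    "lowFreq_min": 3.550, "lowFreq_max": 8.800,
}
noise = {"duration": 0.360, "cmxN": 0.222, "lowFreq": 1.150}
phee = {"duration": 1.380, "cmxN": 0.449, "lowFreq": 7.450}
print(matches_template(noise, ranges, keys))
print(matches_template(phee, ranges, keys))
```

Returning the list of failed keys alongside the flag makes it easy to see why a fragment was rejected.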


## Example comparison, using pyListenerLib.py as a library

Currently, pyListener has three Python files:<br>
- **pyListener.py**: The pyListener app, using wxPython.<br>
- **pyListenerLib.py**: This contains the main functionality of pyListener, such as loading, comparing and saving sounds. It can be used without loading the wxPython frame in **pyListener.py**.<br>
- **fFuncNClasses.py**: Simple functions and a dialog class used in multiple places in the above files.

Here, we will test the sound comparison functionality with **pyListenerLib.py**, without the wxPython frame.<br>

First, import the necessary packages, then create a _PyListener_ object.

In [13]:
from os import getcwd
import pyListenerLib as PLL
pl = PLL.PyListener(parent=None, frame=None, logFile='log/testLog.txt', cwd=getcwd())

We don't have a wxPython frame in this notebook example; therefore, we pass 'None' for both parent and frame.
When the object is created, it makes a text file, testLog.txt, in the 'log' folder of your working directory (assuming your working directory is where the pyListener files are) to record all comparison results.

When the **PyListener** class is initialized, it finds microphones using a list of preferred device strings (PyListener.**prefDevStr**), which is currently set to ['built-in'] in the code. (PyListener was developed on Mac OS X, which has a 'built-in' microphone.)
If you want to use another microphone attached to your computer, you can search for input devices again by running the code below.

In [3]:
pl.prefDevStr = ['H4N', 'built-in'] # in case, you have H4N microphone attached.
pl.devIdx, pl.devNames = pl.find_device(devType='input')

*devIdx* and *devNames* are lists.
If you're not sure which string to give, open 'testLog.txt', created on **PyListener**'s initialization.
The first few lines list the input devices, such as 'device-name:Built-in Microphone'.
You can give any distinctive part of that name. (Letter case doesn't matter here; pl.prefDevStr=['BUILT-IN'] will also work.)
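
The case-insensitive matching of preferred strings against device names can be pictured with the sketch below. It works on a plain list of names and is not the code of `find_device`; the function `pick_devices`, the device names other than 'Built-in Microphone', and the ordering by preference are all assumptions.

```python
def pick_devices(device_names, pref_dev_str):
    """Return (indices, names) of devices whose name contains a preferred string.

    Matching ignores letter case. Devices are ordered by the position of the
    preferred string that matched them (an assumption made for this sketch).
    """
    dev_idx, dev_names = [], []
    for pref in pref_dev_str:
        for i, name in enumerate(device_names):
            if pref.lower() in name.lower() and i not in dev_idx:
                dev_idx.append(i)
                dev_names.append(name)
    return dev_idx, dev_names


names = ["Built-in Microphone", "USB Audio", "H4N"]
print(pick_devices(names, ["H4N", "BUILT-IN"]))
```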

When listening via a microphone is started, a device index can be set to a specific one among found devices.
You will see this later in this document.

### Loading template dataset

In this example, we'll try to catch the *phee* call type of common marmoset monkeys.<br>
We give a folder path which contains several WAV files of *phee* calls.

In [14]:
tSpAD, __, templP = pl.listen(flag='templateFolder', wavFP='input/sample_phee')

The PyListener.**listen** function reads and analyzes the WAV files in the given folder path, *wavFP*,<br>
and returns **data**, **amp** and **params**.

**data** is a numpy array (of the spectrogram) which contains greyscale pixel values (0-255) for drawing a spectrogram.<br>
**amp** is an RMS amplitude, used to check whether the amplitude of the sound being recorded in real time is over/under the threshold, which determines the beginning/end of a sound fragment. It's not needed here, since we'll deal with already prepared WAV files.<br>
**params** is a Python dictionary containing the analyzed parameters of the given sound data. For a template folder (instead of a sound fragment from real-time listening), it holds the average values over the WAV files in the folder. It also has '*parameter*\_min' and '*parameter*\_max' entries giving the initial threshold ranges.<br>To see what is actually calculated to generate these params, see the functions called in the *analyzeSpectrogramArray* function in *PyListener*.
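
For readers unfamiliar with RMS amplitude, here is a minimal sketch of the calculation. It assumes signed 16-bit samples scaled by 32768 to land in the 0.0-1.0 range; PyListener's own scaling may differ, and `rms_amplitude` is a hypothetical name.

```python
import math


def rms_amplitude(samples, full_scale=32768):
    """Root-mean-square amplitude of audio samples, scaled to 0.0-1.0.

    Assumes signed 16-bit integer samples (hence full_scale=32768).
    """
    if not samples:
        return 0.0
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square) / full_scale


print(rms_amplitude([16384, -16384, 16384, -16384]))
```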

### Running comparison with WAV files in testing folder

Here, instead of capturing a sound fragment in real-time listening, we go through test WAV files and compare them with the template parameters (*templP*). The test WAV files, whose names start with 'm_', are in the *input/test* folder; they contain several different sounds such as *phee*, *rapid fire tsik*, etc.

In [15]:
from glob import glob
from os import path

### Prepare min. and max. values for thresholding.
### In the app, this process will be done, using values in UI (TextCtrl widgets).
tParams = {}
for param in pl.compParamList:
    tParams[param+'_min'] = templP[param+"_min"]
    tParams[param+'_max'] = templP[param+"_max"]

fLists = glob('input/test/m_*.wav')
for fp in sorted(fLists): # loop through WAV files
    spAD, __, sfParams = pl.listen(flag='wavFile', wavFP=fp) # read & analyze a WAV file
    flag, rsltTxt = pl.compareParamsOfSF2T(sfParams, tParams, path.basename(fp)) # compare analyzed parameters of the current WAV and template WAV
    print(rsltTxt) # print output; this text is also recorded in the log file by 'compareParamsOfSF2T' function.

2019_09_05_16_36_47, [RESULT], Sound fragment (m_noise01.wav) did [NOT] match/  duration [NOT] (0.870 <= 0.360 <= 2.800)/ cmxN (0.206 <= 0.222 <= 0.596)/ cmyN (0.135 <= 0.217 <= 0.479)/ avgNumDataInCol [NOT] (-10.045 <= 63.833 <= 34.600)/ lowFreq [NOT] (3.550 <= 1.150 <= 8.800)/ highFreq (5.800 <= 9.000 <= 11.150)/ distLowRow2HighRow [NOT] (0.000 <= 157.000 <= 76.000)/ summedAmpRatio (0.500 <= 0.783 <= 3.000)


2019_09_05_16_36_47, [RESULT], Sound fragment (m_phee01.wav) [MATCHED] with following parameters ( duration/ cmxN/ cmyN/ avgNumDataInCol/ lowFreq/ highFreq/ distLowRow2HighRow/ summedAmpRatio ) duration (0.870 <= 1.380 <= 2.800)/ cmxN (0.206 <= 0.449 <= 0.596)/ cmyN (0.135 <= 0.327 <= 0.479)/ avgNumDataInCol (-10.045 <= 13.333 <= 34.600)/ lowFreq (3.550 <= 7.450 <= 8.800)/ highFreq (5.800 <= 8.300 <= 11.150)/ distLowRow2HighRow (0.000 <= 17.000 <= 76.000)/ summedAmpRatio (0.500 <= 1.228 <= 3.000)


2019_09_05_16_36_47, [RESULT], Sound fragment (m_phee02.wav) [MATCHED] with follo
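
If these result lines need to be processed further (for example, to count matches from the log file), they can be parsed with a regular expression. The sketch below is only inferred from the output format shown above; `parse_result` and both patterns are assumptions and may need adjusting if the log format differs.

```python
import re

# "name (min <= value <= max)", optionally with " [NOT]" after the name.
PARAM_RE = re.compile(r"(\w+)( \[NOT\])? \((-?[\d.]+) <= (-?[\d.]+) <= (-?[\d.]+)\)")
FILE_RE = re.compile(r"Sound fragment \((.+?)\)")


def parse_result(line):
    """Parse one [RESULT] line into a dictionary (hypothetical helper)."""
    file_match = FILE_RE.search(line)
    params = {}
    for name, not_flag, low, value, high in PARAM_RE.findall(line):
        params[name] = {
            "min": float(low),
            "value": float(value),
            "max": float(high),
            "in_range": not not_flag,  # empty string means no [NOT] marker
        }
    return {
        "file": file_match.group(1) if file_match else None,
        "matched": "[MATCHED]" in line,
        "params": params,
    }


line = ("2019_09_05_16_36_47, [RESULT], Sound fragment (m_noise01.wav) did [NOT] match/"
        "  duration [NOT] (0.870 <= 0.360 <= 2.800)/ cmxN (0.206 <= 0.222 <= 0.596)")
result = parse_result(line)
print(result["file"], result["matched"], sorted(result["params"]))
```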

### Comparing with another template dataset

This time, we load a different call type with the below line.

In [18]:
tSpAD, __, templP = pl.listen(flag='templateFolder', wavFP='input/sample_rfts')

Now, if you run the code in 'Running comparison with WAV files in testing folder' again, the matching results should be different (matched wave files should be 'm_rapFTsik##.wav' files).

### Comparing a sound fragment from microphone with template

Running the code below starts listening via the first detected microphone (pl.devIdx[0]).
It then starts another thread for continuous processing of the captured sound (in the app, this part is done with wxPython's timer), which compares the audio data and prints the output text here and to the log file.

\* These threads keep running until you run the next code to end them.
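
As a generic illustration of this capture-and-process pattern (not PyListener's actual calls; every name here is hypothetical), one thread can push audio chunks to a queue while a second thread consumes them. In this sketch a sentinel value ends the processing thread, whereas the real threads run until the user stops them.

```python
import queue
import threading


def capture(chunks, q):
    """Stand-in for a microphone thread: pushes audio chunks to a queue."""
    for chunk in chunks:
        q.put(chunk)
    q.put(None)  # sentinel: no more audio


def process(q, results, amp_thr=0.1):
    """Consume chunks and record whether each one exceeds the threshold."""
    while True:
        chunk = q.get()
        if chunk is None:
            break
        peak = max(abs(x) for x in chunk)
        results.append(peak >= amp_thr)


def run(chunks):
    q = queue.Queue()
    results = []
    t_capture = threading.Thread(target=capture, args=(chunks, q))
    t_process = threading.Thread(target=process, args=(q, results))
    t_capture.start()
    t_process.start()
    t_capture.join()
    t_process.join()
    return results


print(run([[0.0, 0.05], [0.3, -0.4], [0.01]]))
```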

## Adding a new parameter in analysis

To add a new parameter to be analyzed, follow the steps below.

1) Add a key string, such as 'duration', to PyListener.**pKeys**.<br>
If this new parameter should be used for the comparison, also carry out the following steps.
- If its value depends on the loaded template WAV sounds,<br>&nbsp;&nbsp;&nbsp; add the same key string to PyListener.**compParamList**, and add the margin values to subtract/add to PyListener.**cplInitMargin**.<br>&nbsp;&nbsp;&nbsp; *e.g.*: Add a string item, 'duration', to PyListener.**compParamList** and add *duration_min=0.25* and *duration_max=0.5* to the PyListener.**cplInitMargin** dictionary. (When the duration of a chosen template WAV file is one second, the *pyListener* UI will show a 0.75 - 1.5 second threshold range for *duration*.)<br><br>Otherwise (such as *summedAmpRatio*),<br>&nbsp;&nbsp;&nbsp; add the key string to both PyListener.**compParamList** and PyListener.**indCPL**, and add the initial threshold range to PyListener.**indCPRange**.<br>&nbsp;&nbsp;&nbsp; *e.g.*: Add a key, 'summedAmpRatio', to PyListener.**indCPL**, and add *summedAmpRatio_min=0.5* and *summedAmpRatio_max=2.0* to PyListener.**indCPRange**. (The summed amplitude of a sound fragment can be in the range of half to twice the summed amplitude of the template data.)<br><br>
- Add a label to PyListener.**compParamLabel**, to be shown in the UI of the pyListener app.

2) Add the calculation lines for the new parameter to the *analyzeSpectrogramArray* function in *PyListener*, and store the resulting value in *params[key]*, where *key* is the same key string as in PyListener.**pKeys**.<br>
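
The bookkeeping in step 1 can be summarized with the sketch below. It uses a stand-in object rather than a real *PyListener* instance, and the helper `register_param`, the key 'myParam', and the label strings are hypothetical. In the actual code these lists and dictionaries are edited directly in *\_\_init\__*, and it is an assumption here that labels follow the order of **compParamList**.

```python
from types import SimpleNamespace

# Stand-in holding only the attributes named in the steps above.
pl = SimpleNamespace(
    pKeys=["duration", "summedAmpRatio"],
    compParamList=["duration", "summedAmpRatio"],
    indCPL=["summedAmpRatio"],
    cplInitMargin={"duration_min": 0.25, "duration_max": 0.5},
    indCPRange={"summedAmpRatio_min": 0.5, "summedAmpRatio_max": 2.0},
    compParamLabel=["Duration", "Summed amp. ratio"],
)


def register_param(pl, key, label=None, margin=None, fixed_range=None):
    """Register a new parameter key (hypothetical helper).

    margin: (subtract, add) values for a template-dependent parameter
    fixed_range: (min, max) for a template-independent parameter
    With neither given, the key is stored for calculation only.
    """
    pl.pKeys.append(key)
    if margin is None and fixed_range is None:
        return
    pl.compParamList.append(key)
    pl.compParamLabel.append(label or key)
    if fixed_range is not None:
        pl.indCPL.append(key)
        pl.indCPRange[key + "_min"], pl.indCPRange[key + "_max"] = fixed_range
    else:
        pl.cplInitMargin[key + "_min"], pl.cplInitMargin[key + "_max"] = margin


register_param(pl, "myParam", label="My param", margin=(1.0, 2.0))
print(pl.compParamList, pl.cplInitMargin)
```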

# Descriptions of items (of PyListener.compParamList), used to analyze sound data

- **duration**: Duration of sound in seconds
- **cmxN**: X coordinate of the center-of-mass of the spectrogram image, normalized to the range 0.0-1.0.
- **cmyN**: Y coordinate of the center-of-mass of the spectrogram image, normalized to the range 0.0-1.0.
- **avgNumDataInCol**: Average number of non-zero data points in the columns of the spectrogram. This number is calculated after cutting off too low and too high frequency data (defined as *PyListener.comp_freq_range*), then auto-contrasting the spectrogram.
- **lowFreq**: The lowest frequency is found in each column (also after cutting off and auto-contrasting). **lowFreq** is the average of those lowest frequencies.
- **highFreq**: High-frequency counterpart of **lowFreq**.
- **distLowRow2HighRow**: Distance (in pixels of spectrogram) between 'lowFreqRow' and 'highFreqRow'.
- **summedAmpRatio**: 'summedAmp' of a captured sound fragment, divided by 'summedAmp' of template data
- **corr2auto**: [correlate(s,t)/correlate(t,t)]. Correlation between a captured sound fragment and the template data, divided by the auto-correlation of the template data. To obtain a single value, the maximum values of the result arrays from the correlation and the auto-correlation are used.
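
To make two of these definitions concrete, here is a small pure-Python sketch of a normalized center of mass and of the **corr2auto** ratio. It is an illustration under assumptions, not the code in *analyzeSpectrogramArray*: the function names are hypothetical, coordinates are normalized by (size - 1), and the correlation is computed on one-dimensional sequences.

```python
def center_of_mass_normalized(img):
    """Return (x, y) of the intensity-weighted center, each in 0.0-1.0.

    img: 2D list (rows x columns) of non-negative pixel values.
    Returns None for an all-zero image.
    """
    n_rows, n_cols = len(img), len(img[0])
    total = sum(sum(row) for row in img)
    if total == 0:
        return None
    cx = sum(c * v for row in img for c, v in enumerate(row)) / total
    cy = sum(r * v for r, row in enumerate(img) for v in row) / total
    return cx / max(n_cols - 1, 1), cy / max(n_rows - 1, 1)


def max_full_correlation(a, b):
    """Maximum of the full cross-correlation of two 1D sequences."""
    best = None
    for lag in range(-(len(b) - 1), len(a)):
        s = sum(a[i] * b[i - lag] for i in range(len(a)) if 0 <= i - lag < len(b))
        best = s if best is None else max(best, s)
    return best


def corr2auto(s, t):
    """max(correlate(s, t)) divided by max(correlate(t, t))."""
    return max_full_correlation(s, t) / max_full_correlation(t, t)


template = [0, 1, 2, 1, 0]
print(center_of_mass_normalized([[0, 0, 0], [0, 0, 4], [0, 0, 0]]))
print(corr2auto(template, template))
print(corr2auto([0, 0.5, 1, 0.5, 0], template))
```

A fragment identical to the template gives a ratio of 1.0, and the same shape at half the amplitude gives 0.5.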