Whistle Detection

TLDR

Credits

The UNSW Australia team thanks Thomas Hamboeck and the Austrian Kangaroos for their contribution to the Standard Platform League code releases, and Alexandre Mazel for his Stack Overflow answer that allowed earlier versions of this code to work with Naoqi under the 1.14 toolchain.

Testing on PC or Mac

This script was designed to work as a (more or less) standalone Python process:

# On PC or Mac, this should run pc_wav_test()
# which retests the implementation against audio recordings in rUNSWift/test/audio
python whistle_detector.py

Running on Nao

The same whistle_detector.py Python module also runs on the Nao under the 2.1 toolchain / Nao V4 & V5.

Debugging

Try changing VERBOSITY to a higher integer value to get a feel for what is going on, then follow the print statements.

For example, the detector has self-terminating code built in. This self-termination was designed to preserve the existing runswift interface: most developers shouldn't need to know how to start the whistle detector independently of their main executable, and hopefully shouldn't need to worry about rogue whistle detectors. It should be easy to update this if you wish to run with a different main executable, or to comment it out for easier debugging.
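As an illustration only, one common way for a helper process to terminate itself is to check whether the main executable's process still exists. The sketch below assumes a POSIX system and a known process id; the helper name process_is_alive is hypothetical, and this is not necessarily how whistle_detector.py implements its check.

```python
import os

def process_is_alive(pid):
    """Return True if a process with this pid exists (POSIX only).

    Hypothetical helper: an assumption about how a self-termination
    check could work, not code taken from whistle_detector.py.
    """
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without sending a signal
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # process exists but belongs to another user
    return True
```

A detector loop could call such a check periodically and exit once it returns False, which is also the kind of line one would comment out while debugging.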

Analysis

Whistle detection analysis was performed with the help of Audacity's spectrogram visualisation, since humans are good at spotting patterns. This was used to develop the algorithm and tune its parameters.

[Figure: Audacity - Whistle Analysis]

Algorithm - Adaptive Background Growth, Spectral & Temporal Filtering

Conceptually, from the image above, you can think of whistle_detector as looking for a red-on-blue or white-on-red rectangle somewhere in the 2000-4000 Hz range that lasts for at least 250 ms, among other criteria listed in the developer configurable settings section.
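To make the "rectangle in the 2000-4000 Hz range" idea concrete, here is a minimal sketch that measures how much of one chunk's spectral energy falls inside that band. The 48000 Hz rate and the band edges come from this document; the helper name band_energy_fraction, the 1024-sample chunk, and the test tones are assumptions for illustration, not the detector's actual code.

```python
import numpy as np

SAMPLE_RATE = 48000      # Hz, as stated in this document
BAND = (2000.0, 4000.0)  # Hz, the whistle range described above

def band_energy_fraction(samples, sample_rate=SAMPLE_RATE, band=BAND):
    """Fraction of spectral energy inside `band` (hypothetical helper)."""
    power = np.abs(np.fft.rfft(samples)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    total = power.sum()
    return float(power[in_band].sum() / total) if total > 0 else 0.0

# Illustrative inputs: a 3000 Hz tone (inside the band) and a 200 Hz hum.
t = np.arange(1024) / SAMPLE_RATE
whistle_like = np.sin(2 * np.pi * 3000.0 * t)
hum = np.sin(2 * np.pi * 200.0 * t)
```

The in-band tone yields a fraction close to 1 and the hum a fraction close to 0. A real whistle is neither a pure tone nor free of background noise, which is why the detector applies additional thresholds and a minimum duration.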

The main algorithm is in the interrogate function. The algorithm:

Assumes it can get 48000 Hz input data from pyalsaaudio or some other audio source.
Performs a numpy.fft.rfft
Checks that it has at least spectra_per_second = 47 spectra of data, which are used to determine what counts as background noise over time.
Calculates the whistle_threshold as the maximum of the spectrum_threshold and temporal_threshold (on review, this probably could be done a bit later which may save CPU if we can early exit on background noise).
Adaptively grows background noise zones based on the sound spectrum to focus on the whistle "signal".
Filters the remaining spectrum based on the whistle_threshold determined earlier.
If all of these checks succeed, increments a counter; 12 successes (~250 ms) count as a "whistle heard".
The counter is reset on 4 misses (~83 ms).
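The last two steps can be sketched as a small hit/miss counter. The values 12 and 4 come from the list above; the class name WhistleCounter is hypothetical, and treating the 4 misses as consecutive misses is an assumption, since the text does not say whether misses are counted consecutively or cumulatively.

```python
HITS_FOR_WHISTLE = 12  # successes needed, ~250 ms (from the text)
MISSES_TO_RESET = 4    # misses that reset the counter, ~83 ms (from the text)

class WhistleCounter:
    """Temporal filter over per-spectrum pass/fail results (illustrative)."""

    def __init__(self, hits_needed=HITS_FOR_WHISTLE, misses_to_reset=MISSES_TO_RESET):
        self.hits_needed = hits_needed
        self.misses_to_reset = misses_to_reset
        self.hits = 0
        self.misses = 0

    def update(self, spectrum_passed):
        """Feed one result; return True when a whistle is considered heard."""
        if spectrum_passed:
            self.hits += 1
            self.misses = 0  # assumption: only consecutive misses count
        else:
            self.misses += 1
            if self.misses >= self.misses_to_reset:
                self.hits = 0
                self.misses = 0
        if self.hits >= self.hits_needed:
            self.hits = 0    # start again after reporting a detection
            self.misses = 0
            return True
        return False

def heard(results):
    """True if any prefix of `results` triggers a detection."""
    counter = WhistleCounter()
    return any(counter.update(r) for r in results)
```

With this rule, a short dropout of up to three spectra in the middle of a whistle does not discard the progress so far, while a burst of noise shorter than 12 passing spectra never triggers a detection.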

Integration - Game States

As whistle_detector.py consumes around 30-40% of the Nao's Atom CPU, we decided only to run whistle detection in the READY and SET game states. Running in READY is partly a historical trade-off from the 1.14 toolchain, where we spent up to 10 seconds longer telling Naoqi to release the microphones so we could use them.

Integration - Data Flows

On the Nao, whistle_detector.py will generate ~1-2 second long timestamped files when it detects a whistle. These are then read by the GameController thread.
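A file-based hand-off of this kind might look like the sketch below: the detector writes a timestamped WAV file, and the reader lists files at least as new as a given time. The file name pattern, the helper names save_detection and detections_since, and the 16-bit mono format are all assumptions for illustration; the actual naming and format used by whistle_detector.py are not described here.

```python
import glob
import os
import tempfile
import time
import wave

def save_detection(directory, pcm_bytes, sample_rate=48000, now=None):
    """Write 16-bit mono PCM to a timestamped WAV file (hypothetical naming)."""
    stamp = time.strftime("%Y%m%d_%H%M%S", time.localtime(now))
    path = os.path.join(directory, "whistle_%s.wav" % stamp)
    with wave.open(path, "wb") as out:
        out.setnchannels(1)           # mono
        out.setsampwidth(2)           # 2 bytes per sample = 16-bit
        out.setframerate(sample_rate)
        out.writeframes(pcm_bytes)
    return path

def detections_since(directory, since):
    """Paths of detection files modified at or after `since` (epoch seconds)."""
    paths = glob.glob(os.path.join(directory, "whistle_*.wav"))
    return sorted(p for p in paths if os.path.getmtime(p) >= since)

# Example: one second of silence written to a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = save_detection(demo_dir, b"\x00\x00" * 48000)
```

A reader thread would then poll detections_since with the time of its last check. Keeping the audio on disk is also what makes it possible to copy recordings off the robot later for offline testing.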

Some of those files were extracted from the robot and manually classified so they can be run by pc_wav_test(). They are stored in rUNSWift/test/audio.

We found this very useful for testing gear noises, which caused many false positives in earlier implementations. It also helped early on at competition: we suspected a noisy crowd would cause false positives, which it did, requiring further parameter tuning (and thankfully no further algorithmic development, though we were prepared for this possibility).

Integration - Voting

In the runswift GameController thread, we also implemented voting over WiFi, as we found the Aldebaran microphones sometimes mysteriously failed in our lab games (it seemed like loose wires, as they also sometimes worked perfectly). This was particularly annoying when just our striker had the bad microphone.
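The voting rule itself is not described here, so the following is only a sketch of one plausible rule: a robot acts on the whistle when at least a minimum number of reporting teammates heard it and they are not outvoted. The function name team_heard_whistle, the min_votes parameter, and the rule are assumptions, not the runswift implementation.

```python
def team_heard_whistle(votes, min_votes=2):
    """Decide from team reports whether a whistle was heard (illustrative rule).

    votes: dict mapping player number -> True, False, or None (no report).
    """
    reports = [v for v in votes.values() if v is not None]
    yes = sum(1 for v in reports if v)
    # Require a minimum number of positive reports and at least half of
    # the robots that reported.
    return yes >= min_votes and yes * 2 >= len(reports)

# A striker (player 5) with a failed microphone is outvoted by teammates.
example = team_heard_whistle({1: True, 2: True, 3: True, 5: False})
```

Under a rule like this, a single robot with a dead microphone still starts playing with its team, and a single robot with a false positive does not start alone.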

Possible Future work

whistle_detector.py currently uses around 30-40% of the Nao's Atom CPU as measured by the top command.

The implementation would probably work with 8000 Hz input data as a possible future optimisation to lower CPU usage. This would also be a good opportunity to review the developer configurable settings as many feel duplicated and clunkily hard-coded to the specific soccer whistle sound.
It may also be beneficial to integrate at a different layer of the rather complicated Linux sound stack than the current ALSA one, as we think Naoqi or libagent sometimes unexpectedly locked the microphones, requiring a system reboot. This may also allow audioin or audiodevice to be put back into naoqi/autoload.ini.
Future work could look into PyPy or additional Python profiling.
It might also be nice to remove the wtb_pip hacks if we are prepared to easy_install pip as part of our robot setup process. Unfortunately, pip was not bundled with Python until very recent versions. It is disappointing that Aldebaran does not ship more recent versions of Python 3 with the Nao system image (ours has 2.7.x and 3.1.x, while Python 3.5 was recently released).
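On the 8000 Hz suggestion above, a quick calculation shows what changes per spectrum. It assumes each spectrum is computed from sample_rate / spectra_per_second samples, which this document does not state explicitly; the helper name spectrum_layout is hypothetical.

```python
def spectrum_layout(sample_rate, spectra_per_second=47):
    """Samples per spectrum, Nyquist frequency and bin width (illustrative)."""
    samples = sample_rate // spectra_per_second  # assumed chunk length
    return {
        "samples_per_spectrum": samples,
        "nyquist_hz": sample_rate / 2.0,
        "bin_width_hz": sample_rate / float(samples),
    }

current = spectrum_layout(48000)
proposed = spectrum_layout(8000)
```

Under that assumption each spectrum shrinks from 1021 to 170 samples while the frequency resolution stays at roughly 47 Hz per bin. Note that the Nyquist frequency at 8000 Hz is 4000 Hz, exactly the top of the 2000-4000 Hz whistle range, so the upper edge of the band would need care when downsampling.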