OrcaCNN: Detecting and Classifying Killer Whales from Acoustic Data
Passive acoustic observation of whales is an increasingly important tool for whale research. Accurately detecting whale sounds and correctly classifying them into the corresponding whale pods are essential tasks, especially when two or more species of whales vocalize in the same observed area. Most whale sound detection and classification is still done manually.
This project aims to develop two deep learning models for the detection and pod classification of orca (killer whale) calls in long, unknown audio samples. These deep neural networks will help identify and verify killer whale calls so that researchers, grad students, and shipping vessels don't have to. The end-user interface is a web app that scientists can easily use in their research.
The development of OrcaCNN involves three main stages:
- Data preparation and Pre-processing
- Hyperparameter Tuning and Training with Model Evaluation
Several possible steps are described for each of these stages to aid in the development of the OrcaCNN model. Each method has its merits and demerits, so it is essential to evaluate your choice before proceeding with any method.
The methods chosen for this project are noted beside each method.
The data was provided by Dan Olsen from his research work. At around 1 TB, it was uploaded to a Google Cloud Bucket where the files could be accessed directly. Several approaches were tried for transferring the data to Google Colab; thanks to Shane for graciously writing them down.
The data consisted of two folders:
- Autonomous Recordings NGOS
Most of the autonomous recordings are very quiet, which improves the quality and the distance at which whale calls can be detected (less distortion). The autonomous recordings need to be amplified, or sensitized, for the spectrograms to capture the whale calls properly. However, these recordings are not labelled, so they can be used only as testing data.
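The amplification step can be sketched as follows. The text does not say how the recordings were amplified, so this example assumes simple peak normalization: a single gain is applied so that the loudest sample reaches a chosen level. The function names, the `target_dbfs` parameter, and the assumption that samples are floats in the range [-1, 1] are all illustrative and not part of the project.

```python
import math
from typing import List, Sequence


def peak_dbfs(samples: Sequence[float]) -> float:
    """Return the peak level in dB relative to full scale (1.0)."""
    peak = max((abs(s) for s in samples), default=0.0)
    return -math.inf if peak == 0.0 else 20.0 * math.log10(peak)


def amplify_to_peak(samples: Sequence[float], target_dbfs: float = -1.0) -> List[float]:
    """Scale a quiet recording so that its peak sits at target_dbfs.

    Assumption: peak normalization stands in for whatever amplification
    was actually used; a silent recording is returned unchanged.
    """
    current = peak_dbfs(samples)
    if math.isinf(current):
        return list(samples)
    # Convert the required dB change into a linear gain factor.
    gain = 10.0 ** ((target_dbfs - current) / 20.0)
    return [s * gain for s in samples]


# A very quiet synthetic tone stands in for an autonomous recording.
quiet = [0.001 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
loud = amplify_to_peak(quiet, target_dbfs=-1.0)
```

Because one gain is applied to the whole signal, background noise is raised along with the calls; the goal is only to bring the recording into a range where the spectrogram shows the calls clearly.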
As mentioned by Dan, most of the E3, E4, H2, and L1 directories do not have whale calls to aid in automating the process of finding calls among the many quiet or 'boat noisy' recordings. L1 and L2 have the highest percentage of calls, with 50-90% of days having calls at some point in the day.
The recordings follow this naming convention: location followed by deployment number. For example, E1 is the first deployment at Etches (Port Etches), P2 is the second deployment at Pony Cove, H1 is the first deployment at Hanning Bay, and L2 is the second deployment at Little Bay.
File names start with the hydrophone serial number, followed by the date and time of the recording. For example, a file named 67391498.180916010013 was recorded with hydrophone 67391498 on 16 September 2018 at 1:00 am (and 13 seconds) UTC.
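The naming conventions above can be decoded programmatically. The sketch below is a hypothetical helper, not part of the project. It assumes the timestamp is laid out as YYMMDDhhmmss, and that the location letters are limited to the four that appear in the examples (E, P, H, L). The names `parse_deployment`, `parse_recording`, and `LOCATIONS` are illustrative.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple

# Location letters taken from the examples in the text; other codes are not covered.
LOCATIONS = {"E": "Port Etches", "P": "Pony Cove", "H": "Hanning Bay", "L": "Little Bay"}


class Deployment(NamedTuple):
    location: str
    number: int


class Recording(NamedTuple):
    hydrophone: str
    recorded_at: datetime


def parse_deployment(code: str) -> Deployment:
    """Split a directory code such as 'E1' into location and deployment number."""
    match = re.fullmatch(r"([A-Z])(\d+)", code)
    if match is None or match.group(1) not in LOCATIONS:
        raise ValueError(f"unrecognised deployment code: {code!r}")
    return Deployment(LOCATIONS[match.group(1)], int(match.group(2)))


def parse_recording(file_name: str) -> Recording:
    """Split '<serial>.<YYMMDDhhmmss>' into hydrophone serial and UTC timestamp."""
    # The (?!\d) lookahead rejects timestamps with more than 12 digits;
    # anything after the timestamp, such as a file extension, is ignored.
    match = re.match(r"(\d+)\.(\d{12})(?!\d)", file_name)
    if match is None:
        raise ValueError(f"unrecognised recording name: {file_name!r}")
    stamp = datetime.strptime(match.group(2), "%y%m%d%H%M%S")
    return Recording(match.group(1), stamp.replace(tzinfo=timezone.utc))


deployment = parse_deployment("L2")
recording = parse_recording("67391498.180916010013")
print(deployment.location, deployment.number)
print(recording.hydrophone, recording.recorded_at.isoformat())
```

Keeping the timestamp as a timezone-aware `datetime` makes it straightforward to group recordings by day, for instance when checking which days contain calls.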
- Field Recordings NGOS
  - 2005 field recordings
  - 2006 field recordings to ...
  - 2018 field recordings
The field recordings contain many whale calls. As mentioned by Dan, there are three main types of killer whales:
- Resident killer whales, which feed only on fish.
- Transient (or Bigg's) killer whales, which feed on mammals.
- Offshore killer whales, which prey mostly on sharks. Their calls are very similar to those of Residents.
These 14 years of field recordings cover around 22 pods.
To help address false positive detections of killer whales caused by humpback whale vocalizations, MBARI generously provided a collection of humpback whale calls recorded with the MARS hydrophone.