The goal of SILIC is to build an autonomous wildlife sound identification system that helps monitor the population status and trends of terrestrial vocal animals in Taiwan using data from Passive Acoustic Monitoring (PAM).
- Objective 1: Extract robust species, sound class, time, and frequency information from diverse and complex soundscape recordings.
- Objective 2: The model should be trainable on as small a dataset as possible, with training audio that can be acquired easily and quickly.
- Objective 3: Most terrestrial vocal wildlife species in Taiwan should be included in the model, especially those that are hard to detect with survey methods other than PAM.
SILIC is written in Python and uses the yolov5 package (Jocher et al., 2020) to construct an object detection model. The pydub (Robert, 2011), nnAudio (Cheuk et al., 2020), and matplotlib (Hunter, 2007) libraries are used for audio signal processing and Time-Frequency Representation (TFR).
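As a rough illustration of the TFR step, the sketch below computes a magnitude spectrogram with a plain NumPy short-time Fourier transform. This is a stand-in for the nnAudio-based pipeline: the function name, FFT size, and hop length here are illustrative assumptions, not SILIC's actual settings.

```python
import numpy as np

def stft_spectrogram(signal, n_fft=512, hop=256):
    """Return a magnitude spectrogram of shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping windowed frames
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    # Real FFT of each frame; keep the magnitude only
    return np.abs(np.fft.rfft(frames, axis=1)).T

# One second of a 1 kHz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)
spec = stft_spectrogram(tone)
print(spec.shape)  # → (257, 61)
```

The resulting 2-D array is what gets rendered (e.g. with matplotlib) into the spectrogram image that the object detector consumes.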
- Training and validation: ./dataset/Training_Validation_Dataset.txt
- Test with evaluation results: ./dataset/evaluation_testset.csv
- Model Weights:
- ./model/exp12 , including 27 sound classes of 16 species, updated on Apr. 2021
- ./model/exp14 , including 74 sound classes of 52 species, updated on Jul. 2021
- ./model/exp18 , including 194 sound classes of 147 species, updated on Oct. 2021
- ./model/exp20 , including 213 sound classes of 163 species, updated on Dec. 2021
- Scripts of detection: ./silic.ipynb
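Because the detector operates on spectrogram images, each predicted box maps back to a time and frequency span. The sketch below shows that conversion for YOLO-style normalized boxes; the function name, axis orientation, and the example clip duration and frequency ceiling are hypothetical assumptions, not values taken from silic.ipynb.

```python
def box_to_time_freq(x_c, y_c, w, h, clip_dur, f_max):
    """Map a YOLO box (x_center, y_center, width, height), all in [0, 1],
    to (t_start, t_end, f_low, f_high). The spectrogram's x-axis is time;
    frequency 0 sits at the bottom of the image (normalized y = 1)."""
    t_start = (x_c - w / 2) * clip_dur
    t_end = (x_c + w / 2) * clip_dur
    f_low = (1 - (y_c + h / 2)) * f_max
    f_high = (1 - (y_c - h / 2)) * f_max
    return t_start, t_end, f_low, f_high

# Example: a box centered mid-clip covering the 2-4 kHz band of a
# 10 s clip rendered up to 8 kHz
print(box_to_time_freq(0.5, 0.625, 0.2, 0.25, 10.0, 8000.0))
# → (4.0, 6.0, 2000.0, 4000.0)
```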
- Demo video of SILIC inference results on a camera trap video:
- Macaulay Library
- xeno-canto
- Asian Soundscape
- Thinning Forest Monitoring
- iNaturalist