This is a reproducible example demonstrating how to use pykanto, a Python library for managing and analysing animal vocalisations. We use a small sample dataset to answer a real research question: can individual birds be recognised by their song?
Note: this is a large repository (~600 MiB) including the data necessary to train a deep learning model and reproduce the results included in the paper. It might take a couple of minutes to download, or longer if you have a slow connection.
- Create a new environment, e.g. using miniconda:

  ```bash
  conda create -n pykanto-example python=3.8
  ```

- Install pykanto (see installing pykanto for a complete installation guide for the library), or just run:

  ```bash
  pip install pykanto
  ```

- Clone this repository to your computer, navigate to its root and install using pip:

  ```bash
  git clone https://github.com/nilomr/pykanto-example.git
  cd pykanto-example
  pip install .
  ```
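As a quick sanity check before moving on, you can confirm from Python that the active environment resolves pykanto. This is an illustrative snippet of my own, not part of the repository:

```python
# Sanity check: report the interpreter version and whether pykanto
# is importable in the active environment.
# (Illustrative snippet, not part of this repository.)
import importlib.util
import sys

def pykanto_installed() -> bool:
    """Return True if pykanto can be found on the current path."""
    return importlib.util.find_spec("pykanto") is not None

print("Python:", sys.version.split()[0])
print("pykanto installed:", pykanto_installed())
```

If this prints `False`, double-check that you activated the `pykanto-example` environment before installing.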
One of the steps to reproduce this example involves training a deep neural network, which requires compatible GPU resources. The repository includes the feature vectors output by the model, so this step can be safely skipped if you don't want to train the NN again.
If you do want to train the model yourself, you will need a few more libraries that are not installed automatically with pykanto. The reason for this is that they are a bit finicky: which exact installation you need depends on which version of CUDA you have, and the like.
If this is the case, I recommend that you first create a fresh environment with conda:
```bash
conda create -n pykanto-example python=3.8
```

And then install torch, followed by pykanto and this example including the extra libraries:

```bash
conda install -c pytorch pytorch torchvision
pip install pykanto
# Navigate to the root of this repository, then:
pip install ."[torch]"
```
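Before launching training, it can save time to check whether a CUDA-capable torch build is actually available. The helper below is my own sketch, not a pykanto utility, and it degrades gracefully if torch is missing:

```python
# Report whether GPU training is possible in this environment.
# (Illustrative helper, not part of pykanto or this repository.)
def cuda_status() -> str:
    try:
        import torch
    except ImportError:
        return "torch not installed"
    return "cuda available" if torch.cuda.is_available() else "cpu only"

print(cuda_status())
```

If it reports `cpu only`, your torch build and CUDA driver versions may not match; check the PyTorch installation matrix for the right combination.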
You can also install in developer mode, along with extra dependencies that you might find useful: `pip install -e ."[extras]"`. See `[project.optional-dependencies]` in the pyproject.toml file for the available extras.
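For orientation, an optional-dependencies table in pyproject.toml has the following shape. The group names match those mentioned above, but the package lists here are purely illustrative; the authoritative versions live in this repository's pyproject.toml:

```toml
[project.optional-dependencies]
# Illustrative only: see this repository's pyproject.toml for the real lists
torch = ["torch", "torchvision"]
extras = ["ipykernel"]
```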
First, make sure that you have activated this project's environment (`conda activate pykanto-example` if you followed the instructions above). Then navigate to `/notebooks`, where the scripts are located. They can all be run from the terminal: `python <script-name>`.
| Script | Description | Use |
| --- | --- | --- |
| `1_prepare-dataset.py` | Ingests, creates spectrograms, and segments the dataset[^1] | To run: `python 1_prepare-dataset.py -d pykanto-example -f pykanto-example`; to see options: `python 1_prepare-dataset.py --help` |
| `2_interactive-labels.py` | Opens an interactive app to check the automatically assigned labels | This step requires user input, so it is disabled by default for reproducibility. If you want to test the app yourself, set `use_app = True` in that script. To learn how to use it, see using the interactive app |
| `3_export-training-data.py` | Exports the data required to train the deep learning model | `python 3_export-training-data.py` |
| `4_train-model.ipynb` | Model definition and training step | A separate, self-contained Jupyter notebook, which makes it easier to run interactively on a GPU-enabled HPC. If you don't want to retrain the model, you can skip this step |
| `5_measure-similarity.py` | Measures the similarity between songs across years and birds | `python 5_measure-similarity.py` |
| `6_plot-results.py` | Plots the results | `python 6_plot-results.py`: outputs to the graphics device but does not save |
| `6.1_publication-plots.R` | Reproduces the exact plots included in the paper | Switch to R and run `Rscript -e 'renv::run("6.1_publication-plots.R")'` after installing the R dependencies via `renv::restore()` |
If you use pykanto
in your own work, please cite the associated article and/or
the repository:
© Nilo M. Recalde, 2021-present
Footnotes

[^1]: If you want to run this on an HPC you can use `pykanto`'s tool for this, which makes it very easy (see the Docs for more info):

    ```bash
    pykanto-slaunch --exp BigBird2020 --p short --time 01:00:00 -n 1 --memory 40000 --gpu 0 -c "python 1_prepare-dataset.py -d pykanto-example -f pykanto-example" -env pykanto-example
    ```