Design and implementation of a pipeline that:
- Processes, given user specifications, the data generated by the WP2_T3 pipeline to create the clips used to train the AI models.
- Implements the three selected AI models (Vggish, PaSST, EffAT), as well as a standardized way of training, testing and performing inference with them.
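As an illustration of what such a standardized interface can look like, here is a minimal sketch. The class and method names below (BaseAudioModel, train_model, inference) are hypothetical stand-ins, not the repository's actual API:

```python
from abc import ABC, abstractmethod
from typing import List


class BaseAudioModel(ABC):
    """Hypothetical common interface shared by the three model wrappers."""

    def __init__(self, num_classes: int):
        self.num_classes = num_classes

    @abstractmethod
    def train_model(self, clips: List[list], labels: List[int]) -> None:
        """Fit the model on labelled audio clips."""

    @abstractmethod
    def inference(self, clip: list) -> int:
        """Return the predicted class index for one clip."""


class MajorityClassModel(BaseAudioModel):
    """Toy stand-in: always predicts the most frequent training label."""

    def train_model(self, clips, labels):
        self.prediction = max(set(labels), key=labels.count)

    def inference(self, clip):
        return self.prediction


model = MajorityClassModel(num_classes=2)
model.train_model(clips=[[0.1], [0.2], [0.3]], labels=[1, 1, 0])
print(model.inference([0.5]))  # → 1
```

The point of such a shared base class is that train.py, test.py, and inference.py can operate on any of the three models without model-specific branching.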
The recommended Python version is 3.9.13
The project uses Poetry for dependency management and packaging. Before running any code, ensure Poetry is installed and configured properly:
- Windows users: Use the command "poetry install --with intel"
- Linux users: Use the command "poetry install --without intel"
The recommended Poetry version is 1.8.3
The repository is organized as follows:
- scripts/: Contains scripts for training, testing, and inference.
  - train.py: Script to train the AI models.
  - test.py: Script to test the trained models.
  - inference.py: Script to perform inference using the trained models.
- src/:
  - config/: Contains all the .yaml files that define modifiable parameters for training and testing each AI model.
  - models/: Houses the implementations and architectures of the three models:
    - effat_model.py: Implementation of the EfficientAT model for training, testing, and inference (EfficientAT repo).
    - passt_model.py: Implementation of the PaSST model for training, testing, and inference (PaSST repo).
    - vggish_model.py: Implementation of the Vggish model for training, testing, and inference (Vggish repo).
  - pipeline/:
    - dataset.py: Contains the EcossDataset class, responsible for modeling and processing datasets.
    - utils.py: Contains utility functions and helper classes used within the EcossDataset class and other modules related to the AI models.
- poetry.lock: A file generated by Poetry, locking in the exact versions of all dependencies.
- pyproject.toml: Defines the project's metadata, including required packages, their versions, and the compatible Python versions.
To set up the necessary variables to run the code, create a file named .env with the following content:
# Common Parameters
EXP_NAME = # desired name of the folder to store the results
# Dataset Management parameters
NEW_ONTOLOGY = 'Ship,Biological,...' # Top level to group labels
UNWANTED_LABELS = 'Tursiops,SpermWhale,...' # Labels to drop
TEST_SIZE = # Size of the test set in decimal format (e.g. 0.3)
MIN_DURATION = # The minimum duration, in seconds, that each signal must have to be preserved in the dataset
CLASSES_FILTER_TIME = # The set of classes to which the MIN_DURATION filter is applied. Should be formatted like this: 'Delphinids,Tursiops'. To apply the filter to all classes, simply use an empty string (CLASSES_FILTER_TIME = '')
DESIRED_MARGIN = # The fraction of the signal's frequency content needed in order to keep it (e.g. 0.2 (20%))
REDUCIBLE_CLASSES = # The classes of the ontology whose number of samples needs to be reduced (e.g. 'Ship,Biological'). If you don't want to reduce, simply keep this as an empty string
TARGET_COUNT = # Number of samples to keep for each of the REDUCIBLE_CLASSES (e.g. '2000,1800'). If you don't want to reduce, simply keep this as an empty string
# Train parameters
ANNOTATIONS_PATHS = 'path/to/data1,path/to/data2,...'
YAML_PATH = path/to/yaml/config/file
MODEL_TYPE = # effat, passt or vggish
PATH_STORE_DATA = data/ # Desired path to save processed data
PAD_MODE = # zeros, random or white_noise
OVERWRITE_DATA = # True or False. If PATH_STORE_DATA already exists and OVERWRITE_DATA = False, the data are not reprocessed; they are simply loaded from the path defined.
# Inference / Test parameters
INFERENCE_DATA_PATH = path/to/file/to/predict # Can also be the path to a folder with more than one audio; inference will be performed on all the audios in the folder. Only audios should be inside that folder.
PATH_MODEL_TEST = path/to/trained/model/weights
# Check labels
DATASET_PATH_CHECK = path/to/dataset/to/analyze
YAML_LABELS_CHECK = path/to/specific/yaml
STORE_PATH_CHECK = path/where/the/labels/are/stored
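To make the variables above concrete, here is a sketch of how they might be consumed. The parse_env helper and pad_signal function below are illustrative only (they are not part of the repository), and the exact behaviour of the zeros/random/white_noise padding modes is an assumption:

```python
import random


def parse_env(text: str) -> dict:
    """Minimal reader for the KEY = value format shown above.

    Strips inline '#' comments and surrounding quotes; not a full
    dotenv implementation.
    """
    env = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip().strip("'\"")
    return env


def pad_signal(signal, target_len, mode="zeros"):
    """Pad a clip to target_len samples; illustrates the PAD_MODE options."""
    missing = target_len - len(signal)
    if missing <= 0:
        return signal[:target_len]
    if mode == "zeros":
        pad = [0.0] * missing                                  # silence
    elif mode == "random":
        pad = [random.uniform(-1.0, 1.0) for _ in range(missing)]  # uniform noise
    elif mode == "white_noise":
        pad = [random.gauss(0.0, 0.01) for _ in range(missing)]    # gaussian noise
    else:
        raise ValueError(f"Unknown PAD_MODE: {mode}")
    return signal + pad


env = parse_env("""
NEW_ONTOLOGY = 'Ship,Biological'
TEST_SIZE = 0.3
PAD_MODE = zeros
""")
classes = env["NEW_ONTOLOGY"].split(",")  # comma-separated lists become Python lists
padded = pad_signal([0.5, -0.5], target_len=4, mode=env["PAD_MODE"])
print(classes)  # → ['Ship', 'Biological']
print(padded)   # → [0.5, -0.5, 0.0, 0.0]
```

Note how an empty string (e.g. REDUCIBLE_CLASSES = '') naturally splits to no usable class names, which matches the "keep this as an empty string to disable" convention used above.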
- Install Poetry in your base Python virtual environment with "pip install poetry".
- Run "poetry init" to create the project.
- In the pyproject.toml file, change the name parameter in [tool.poetry] to "pipeline".
- Run "poetry config virtualenvs.in-project true". This tells Poetry to place the .venv directory that will be created in the root of the project.
- Run "poetry install". This creates the .venv directory and installs all the packages and dependencies needed for your project.
- If you want to add more packages, simply run "poetry add package-name".
- Install Poetry in your base Python virtual environment with "pip install poetry".
- Run "poetry config virtualenvs.in-project true". This tells Poetry to place the .venv directory that will be created in the root of the project.
- Run "poetry install". This creates the .venv directory and installs all the packages and dependencies needed for your project.
- If you want to add packages and dependencies, run "poetry add package-name".
Two ways:
- Run "poetry shell". This activates the venv so you can navigate to your script and run "python X.py". To exit this shell, write "exit".
- Activate the venv manually.
If you use Linux, you don't need the tensorflow-intel package (it will raise an error), so install everything with "poetry install --without intel". If you use Windows, you do need the tensorflow-intel package, otherwise you will experience errors; install everything with "poetry install --with intel".
It is possible to compile the EffAT and PaSST models before using them, as an optimization. To do this, simply set the compile parameter to True in the .yaml configuration file. Please note that this feature requires a Linux environment. If you are using Windows, you can use the Windows Subsystem for Linux (WSL).
To install WSL, follow the instructions provided at Microsoft's official documentation. Once WSL is installed, proceed with the steps below to install this repository in a Linux environment. This process has been tested on Ubuntu 22.04.
Windows by default limits the RAM available to WSL to 50% of the machine's RAM. To raise this limit, add a memory=48GB entry (or your preferred setting) under the [wsl2] section of a .wslconfig file placed in your Windows home directory (\Users\{username}\):
[wsl2]
memory=48GB