Official repository of the paper Audio Classification of the Content of Food Containers and Drinking Glasses


Audio Classification of the Content of Food Containers and Drinking Glasses

Audio Content Classification (ACC) is the method proposed in the paper Audio classification of the content of food containers and drinking glasses. ACC is a convolutional-neural-network ensemble that estimates the type and level of content in a container from the sound produced when a person manipulates the container. ACC estimates type and level jointly as a single class, framed as a spectrogram classification task. ACC is designed for the CORSMAL Containers Manipulation (CCM) dataset, but can be trained or run inference on other data. Additionally, we provide another set of recordings from a different setup for validation.
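Since ACC treats content estimation as spectrogram classification, it may help to see how a spectrogram is derived from a waveform. The sketch below is only an illustration with NumPy; it is not the repository's actual pre-processing (which uses librosa), and the frame length and hop size are assumed values:

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=1024, hop=512):
    """Frame the signal, apply a Hann window, and take the rFFT magnitude."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Rows are frequency bins, columns are time frames: (frame_len // 2 + 1, n_frames)
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Example: 1 s of a 440 Hz tone sampled at 22.05 kHz
sr = 22050
t = np.arange(sr) / sr
spec = magnitude_spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)
```

A 2D image like this is what each CNN in the ensemble receives as input.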


[arXiv] [webpage] [pre-trained models] [CCM dataset] [ACM-S2 dataset]

Installation

Requirements

  • python=3.8.3
  • librosa=0.8.0
  • numpy=1.20.1
  • pandas=1.2.3
  • scikit-image=0.18.1
  • scikit-learn=0.24.1
  • tensorflow=2.3.0
  • matplotlib-base=3.3.4
  • tikzplotlib=0.9.8

Instructions

  1. Clone repository

git clone https://github.com/CORSMAL/ACC

  2. From a terminal or an Anaconda Prompt, go to the project's root directory (/acc-net) and run:

conda env create -f environment.yml

conda activate acc

This will create a new conda environment (acc) and install all software dependencies.

If this step fails on a Linux machine, run the following commands (also from the project's root directory) and try again:

  • module load anaconda3
  • source activate

Alternatively, you can create the new environment and install the software manually with the following commands:

  • conda create --name acc python=3.8.3 or conda create -n acc python==3.8.3
  • conda activate acc
  • conda install -c conda-forge tensorflow
  • conda install -c conda-forge librosa
  • conda install -c conda-forge pandas
  • conda install -c conda-forge scikit-image
  • conda install -c conda-forge tikzplotlib
  3. Load the pre-trained models into the project: move the following files to /acc-net/methods/acc/models/. You might need to create the last directory (/models):
  • acc_action.json
  • acc_action.h5
  • acc_pouring.json
  • acc_pouring.h5
  • acc_shaking.json
  • acc_shaking.h5

These models can be our pre-trained models or your exported models after training.
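A quick way to confirm the files landed in the right place before running anything is a small check like the one below. This helper is hypothetical (it is not part of the repository); it only assumes the directory layout and filenames listed above:

```python
from pathlib import Path

# Expected location and sub-model names, as described in the instructions above
MODEL_DIR = Path("methods/acc/models")
SUB_MODELS = ("acc_action", "acc_pouring", "acc_shaking")

def missing_model_files(model_dir=MODEL_DIR):
    """Return the names of any missing .json/.h5 model files."""
    missing = []
    for name in SUB_MODELS:
        for ext in (".json", ".h5"):
            if not (model_dir / f"{name}{ext}").is_file():
                missing.append(f"{name}{ext}")
    return missing

# Prints [] when all six files are in place
print(missing_model_files())
```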

Demo

Once the previous steps are completed, you can run a demo to check that the installation, setup and models are working as expected.

Note: You do not need to download the CCM dataset for the demo, as it uses a few samples from the CCM validation split that are provided in this repository. To run the demo, from the project's root directory (/acc-net), run:

python demo.py

This script will display each step (data loading, pre-processing and model estimation) together with some TensorFlow logs that depend on your hardware and version. It will then display the demo sample's filename, true label and predicted label, followed by OK: Ready to run main.py, if the installation and setup succeeded and the pre-trained model is loaded. If something fails, it will raise the detected errors.


Running arguments

(In order to run the following commands you need the CORSMAL Containers Manipulation (CCM) dataset.)

main.py, located in the project's root directory (/acc-net), is the main procedure. Running it without arguments uses the defaults, i.e. it evaluates on the public test split of the CCM dataset:

python main.py

Use the arguments to select the execution mode. The main options are: train ACC on a labelled dataset, or run inference (and evaluation, if labels are available) on any data or dataset. Arguments:

  • --datapath: Absolute or relative path to dataset or files. Change code as needed if not using the CCM dataset.
  • --outdir: (optional) Absolute or relative path to the output directory. Default output directory: /acc-net/outputs
  • --mode: train to train ACC from scratch, test (default) to predict with a pre-trained model.

For training:

  • --model: Either train all components at once with acc (default), or each sub-model individually with acc_action, acc_pouring or acc_shaking.

This will train ACC on the train split of the CCM dataset. If you want to train on different data, change the code as needed.

For inference:

  • --data_split: Selects which data split of the CCM dataset to use for estimation. It can be: train, test (default), private_test, new_containers. Change the code as needed if not using the CCM dataset.
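The command-line interface described above can be sketched with argparse as follows. This is only an illustration of the documented flags and defaults; the repository's actual parser may differ in option names, defaults and help text:

```python
import argparse

def build_parser():
    """Sketch of the ACC command-line interface described in this README."""
    p = argparse.ArgumentParser(description="ACC: train or run inference")
    p.add_argument("--datapath", help="path to the dataset or audio files")
    p.add_argument("--outdir", default="outputs", help="output directory")
    p.add_argument("--mode", choices=["train", "test"], default="test",
                   help="train from scratch, or predict with a pre-trained model")
    p.add_argument("--model", default="acc",
                   choices=["acc", "acc_action", "acc_pouring", "acc_shaking"],
                   help="train all components at once, or one sub-model")
    p.add_argument("--data_split", default="test",
                   choices=["train", "test", "private_test", "new_containers"],
                   help="CCM split to run inference on")
    return p

args = build_parser().parse_args(["--mode", "train", "--model", "acc_action"])
print(args.mode, args.model)  # train acc_action
```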

Running examples

Train the action classifier:

python main.py --mode train --model acc_action

Estimate on the private test data split:

python main.py --data_split private_test

Data format

Input

  • Dataset (datapath argument): Audio files in .wav format. For training a model on CCM, keep the same directory structure as the CCM dataset.
  • Trained models: Each trained model requires two files (.json and .h5).

Output

In the output directory (outdir argument), two folders will be generated: train and test.

  • train: Stores the output of each training run: the trained models (.json and .h5 files), accuracy and loss curves (.png and .tex), and a confusion matrix of validation scores (.png).
  • test: Stores the model predictions as a .csv file that can be evaluated with the CCM Evaluation toolkit.

Evaluation

To evaluate estimations on one of our datasets and get the scores for each task (filling level and filling type):

  1. Running main.py will export the estimations to the provided output directory. Example: estimations.csv

  2. Optional: Move the estimations file (e.g. estimations.csv) from your output directory to /acc-net/eval. Otherwise, provide the full path to the file as an argument in the next step.

  3. Move to the eval directory and run the eval.py script, providing the estimations file and the dataset to evaluate. The available arguments are:

    • --submission: CSV file to evaluate. Default: example/acc.csv
    • --testset: dataset to evaluate. Options: ccm_train, ccm_val (default), ccm_private, ccm_public, novel_containers. Note that the annotations for the test splits are not available, so no score will be calculated for them.
  4. Scores will be displayed and exported as a .txt file to the current directory.

Example: Evaluating estimations on the validation split of the CCM dataset:

python eval.py --submission example/acc.csv --testset ccm_val
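The kind of per-task scoring that eval.py reports can be sketched as below. The column names ("filling_level", "filling_type" and their "pred_" counterparts) are illustrative assumptions; the actual CCM submission format may differ:

```python
import csv, io

def task_accuracies(csv_file, tasks=("filling_level", "filling_type")):
    """Compute the fraction of correct predictions per task from a CSV file."""
    rows = list(csv.DictReader(csv_file))
    return {task: sum(r[task] == r[f"pred_{task}"] for r in rows) / len(rows)
            for task in tasks}

# A tiny in-memory example standing in for an estimations .csv file
sample = io.StringIO(
    "filling_level,pred_filling_level,filling_type,pred_filling_type\n"
    "half,half,water,water\n"
    "full,half,rice,rice\n"
)
print(task_accuracies(sample))  # {'filling_level': 0.5, 'filling_type': 1.0}
```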


Enquiries, Questions and Comments

If you have any further enquiries, questions, or comments, please contact s.donaher@qmul.ac.uk or a.xompero@qmul.ac.uk. If you would like to file a bug report or a feature request, use the GitHub issue tracker.

Licence

This work is licensed under the MIT License. To view a copy of this license, see LICENSE.
