
ToyADMOS dataset

The ToyADMOS dataset is a collection of machine operating sounds comprising approximately 540 hours of normal operating sounds and over 12,000 samples of anomalous sounds, recorded with four microphones at a 48 kHz sampling rate. It was prepared by Yuma Koizumi and members of NTT Media Intelligence Laboratories. The dataset is designed for research on anomaly detection in machine operating sounds (ADMOS). We collected normal and anomalous operating sounds of miniature machines, producing the anomalous sounds by deliberately damaging the machines' components. The dataset covers three ADMOS tasks: product inspection (toy car), fault diagnosis for a fixed machine (toy conveyor), and fault diagnosis for a moving machine (toy train). For more information, refer to the paper [1]. If you use the ToyADMOS dataset in your work, please cite this paper, where the dataset was introduced.

[1] Yuma Koizumi, Shoichiro Saito, Noboru Harada, Hisashi Uematsu, and Keisuke Imoto, "ToyADMOS: A Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection," in Proc. of the Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019. Paper URL: https://arxiv.org/abs/1908.03299

Download:

The dataset can be downloaded at https://zenodo.org/record/3351307#.XT-JZ-j7QdU.

Since the total size of the ToyADMOS dataset is over 440 GB, each sub-dataset is split into 7 to 9 files with 7-Zip (7z format). The compressed dataset totals approximately 180 GB, or approximately 60 GB per sub-dataset. Download the archive files for the sub-datasets of interest and extract them with any compression tool that supports split 7z archives.
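For readers who prefer to script the extraction, the following is a minimal sketch rather than an official tool. It assumes that the `7z` command-line program is installed and on the `PATH`, and that the downloaded volumes of one sub-dataset sit in the same directory. The file name `ToyCar.7z.001` and the output directory are hypothetical placeholders; use the names of the files you actually downloaded.

```python
import shutil
import subprocess
from pathlib import Path


def build_extract_command(first_volume, out_dir):
    """Return the 7z command line that extracts an archive into out_dir."""
    # "x" extracts with full paths; "-o<dir>" sets the output directory
    # (there is no space between -o and the path).
    return ["7z", "x", str(first_volume), "-o" + str(out_dir)]


def extract_split_archive(first_volume, out_dir):
    """Extract a split 7z archive, failing early if 7z is not installed."""
    if shutil.which("7z") is None:
        raise RuntimeError("7z executable not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if 7z reports a failure.
    subprocess.run(build_extract_command(first_volume, out_dir), check=True)


# Hypothetical usage; replace both arguments with your own paths:
# extract_split_archive("ToyCar.7z.001", "ToyADMOS/ToyCar")
```

Keeping the command construction in its own function makes it easy to print and inspect the command before running it on a 60 GB archive.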

Detailed description of dataset

See the file named DETAIL.pdf

Usage examples

To illustrate how the dataset can be used, Python scripts for data generation, training, and testing are provided.

Tutorial on the small training/test datasets described in [1]:
- Download "C01_create_small_INT_dataset", "E01_simple_AE_test", and "anomaly_conditions".
- Run "make_dataset_for_car_and_conveyor.py" and "make_dataset_for_train.py" in "C01_create_small_INT_dataset" to create the datasets.
- Run "01_train.py" in "E01_simple_AE_test" to train a model.
- Run "02_test.py" in "E01_simple_AE_test" to evaluate the model.
- Note that the paths in each script need to be changed to match your environment.
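The steps above can also be chained from a small driver script. The sketch below is an illustration only: it assumes that each script is launched from inside its own folder, that it takes no command-line arguments, and that the paths inside the scripts have already been edited for your environment. The helper names `build_commands` and `run_tutorial` are hypothetical and not part of the repository.

```python
import subprocess
import sys
from pathlib import Path

# (folder, script) pairs in the order given in the tutorial steps.
TUTORIAL_STEPS = [
    ("C01_create_small_INT_dataset", "make_dataset_for_car_and_conveyor.py"),
    ("C01_create_small_INT_dataset", "make_dataset_for_train.py"),
    ("E01_simple_AE_test", "01_train.py"),
    ("E01_simple_AE_test", "02_test.py"),
]


def build_commands(repo_root):
    """Return (working_directory, command) pairs for every tutorial step."""
    root = Path(repo_root)
    return [(root / folder, [sys.executable, script])
            for folder, script in TUTORIAL_STEPS]


def run_tutorial(repo_root):
    """Run the steps one after another, stopping at the first failure."""
    for cwd, command in build_commands(repo_root):
        print("Running", command[1], "in", cwd)
        # Assumption: each script is started from its own folder, no arguments.
        subprocess.run(command, cwd=str(cwd), check=True)


# Hypothetical usage, with the repository checked out in the current directory:
# run_tutorial(".")
```

Because `check=True` raises on a non-zero exit code, a failure during dataset creation stops the run before training starts.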

We have tested this code in the following environment:

Python: 3.6.8
Chainer: 4.5.0
NumPy: 1.16.2
CuPy:
  CuPy Version          : 4.1.0
  CUDA Build Version    : 9000
  CUDA Driver Version   : 10000
  CUDA Runtime Version  : 9000
  cuDNN Build Version   : 7104
  cuDNN Version         : 7600
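To compare a local setup against the versions listed above, a short check such as the following may help. It is a sketch, not part of the repository: it assumes each package exposes a `__version__` attribute, and the function names are hypothetical. A mismatch does not necessarily mean the scripts will fail; it only shows where your environment differs from the tested one.

```python
import importlib
import platform

# Versions listed above as the tested environment.
TESTED_PYTHON = "3.6.8"
TESTED_VERSIONS = {"chainer": "4.5.0", "numpy": "1.16.2", "cupy": "4.1.0"}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:  # missing package, or a GPU library that fails to load
        return None
    return getattr(module, "__version__", None)


def environment_report():
    """Return (name, tested_version, installed_version) rows."""
    rows = [("python", TESTED_PYTHON, platform.python_version())]
    for name, tested in TESTED_VERSIONS.items():
        rows.append((name, tested, installed_version(name)))
    return rows


if __name__ == "__main__":
    for name, tested, found in environment_report():
        status = "same" if found == tested else "differs"
        print(f"{name:8s} tested={tested:8s} installed={found} ({status})")
```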

License:

See the file named LICENSE.pdf

Authors and Contact
