Legacy support

This version supports reading the datasets from DCASE2020 task2, DCASE2021 task2, DCASE2022 task2, and DCASE2023 task2 as inputs.

Description

Legacy-support scripts are similar to the main scripts. They are located in the tools directory.

Helper scripts
- tools/data_download_2020.sh
  - This script downloads development and evaluation data files into data/dcase2020t2/dev_data/raw/ and data/dcase2020t2/eval_data/raw/.
  - Rename evaluation data after downloading the dataset to evaluate and calculate AUC score. Renamed data is stored in data/dcase2020t2/eval_data/raw/test_rename
- tools/data_download_2021.sh
  - This script downloads development data and evaluation data files and puts them into data/dcase2021t2/dev_data/raw/ and data/dcase2021t2/eval_data/raw/.
  - Merge source_test and target_test into test so that the data can be handled like the other datasets.
  - Rename evaluation data after downloading the dataset to evaluate and calculate the AUC score. Renamed data is stored in data/dcase2021t2/eval_data/raw/test_rename
- tools/data_download_2022.sh
  - This script downloads development data and evaluation data files and puts them into data/dcase2022t2/dev_data/raw/ and data/dcase2022t2/eval_data/raw/.
  - Rename evaluation data after downloading the dataset to evaluate and calculate AUC score. Renamed data is stored in data/dcase2022t2/eval_data/raw/test_rename
- tools/data_download_2023.sh
  - This script downloads development data and evaluation data files and puts them into data/dcase2023t2/dev_data/raw/ and data/dcase2023t2/eval_data/raw/.
  - Rename evaluation data after downloading the dataset to evaluate and calculate AUC score. Renamed data is stored in data/dcase2023t2/eval_data/raw/test_rename
tools/01_train_legacy.sh
- DCASE2020 task2 mode:
  - "Development" mode:
    - This script trains a model for each machine type for each section ID by using the directory data/dcase2020t2/dev_data/raw/<machine_type>/train/<section_id>
  - "Evaluation" mode:
    - This script trains a model for each machine type for each section ID by using the directory data/dcase2020t2/eval_data/raw/<machine_type>/train/<section_id>.
- DCASE2021 task2 mode:
  - "Development" mode:
    - This script trains a model for each machine type for each section ID by using the directory data/dcase2021t2/dev_data/raw/<machine_type>/train/<section_id>
  - "Evaluation" mode:
    - This script trains a model for each machine type for each section ID by using the directory data/dcase2021t2/eval_data/raw/<machine_type>/train/<section_id>.
- DCASE2022 task2 mode:
  - "Development" mode:
    - This script trains a model for each machine type for each section ID by using the directory data/dcase2022t2/dev_data/raw/<machine_type>/train/<section_id>
  - "Evaluation" mode:
    - This script trains a model for each machine type for each section ID by using the directory data/dcase2022t2/eval_data/raw/<machine_type>/train/<section_id>.
- DCASE2023 task2 mode:
  - "Development" mode:
    - This script trains a model for each machine type for each section ID by using the directory data/dcase2023t2/dev_data/raw/<machine_type>/train/<section_id>
  - "Evaluation" mode:
    - This script trains a model for each machine type for each section ID by using the directory data/dcase2023t2/eval_data/raw/<machine_type>/train/<section_id>.
tools/02a_test_legacy.sh (Use MSE as a score function for the Simple Autoencoder mode)
- DCASE2020 task2 mode:
  - "Development" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each WAV file in the directories data/dcase2020t2/dev_data/raw/<machine_type>/test/.
    - The generated CSV files will be stored in the directory results/.
    - The generated CSV file contains AUC, pAUC, precision, recall, and F1-score for each section.
  - "Evaluation" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each WAV file in the directories data/dcase2020t2/eval_data/raw/<machine_type>/test/. (These directories will be made available with the "evaluation dataset".)
    - The generated CSV files are stored in the directory results/.
    - If test_rename directory is available, this script generates a CSV file that contains AUC, pAUC, precision, recall, and F1-score for each section.
- DCASE2021 task2 mode:
  - "Development" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories data/dcase2021t2/dev_data/raw/<machine_type>/test/.
    - The generated CSV files will be stored in the directory results/.
    - This script also generates a CSV file that contains AUC, pAUC, precision, recall, and F1-score for each section.
  - "Evaluation" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories data/dcase2021t2/eval_data/raw/<machine_type>/test/. (These directories will be made available with the "evaluation dataset".)
    - The generated CSV files are stored in the directory results/.
    - If test_rename directory is available, this script generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.
- DCASE2022 task2 mode:
  - "Development" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories data/dcase2022t2/dev_data/raw/<machine_type>/test/.
    - The generated CSV files will be stored in the directory results/.
    - It also generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.
  - "Evaluation" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories data/dcase2022t2/eval_data/raw/<machine_type>/test/. (These directories will be made available with the "evaluation dataset".)
    - The generated CSV files are stored in the directory results/.
    - If test_rename directory is available, this script generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.
- DCASE2023 task2 mode:
  - "Development" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories data/dcase2023t2/dev_data/raw/<machine_type>/test/.
    - The generated CSV files will be stored in the directory results/.
    - It also generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.
  - "Evaluation" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories data/dcase2023t2/eval_data/raw/<machine_type>/test/. (These directories will be made available with the "evaluation dataset".)
    - The generated CSV files are stored in the directory results/.
    - If test_rename directory is available, this script generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.
tools/02b_test_legacy.sh (Use Mahalanobis distance as a score function for the Selective Mahalanobis mode)
- DCASE2021 task2 mode:
  - "Development" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories data/dcase2021t2/dev_data/raw/<machine_type>/test/.
    - The CSV files will be stored in the directory results/.
    - It also generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.
  - "Evaluation" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories data/dcase2021t2/eval_data/raw/<machine_type>/test/. (These directories will be made available with the "evaluation dataset".)
    - The CSV files are stored in the directory results/.
    - If test_rename directory is available, this script generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.
- DCASE2022 task2 mode:
  - "Development" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories data/dcase2022t2/dev_data/raw/<machine_type>/test/.
    - The CSV files will be stored in the directory results/.
    - It also generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.
  - "Evaluation" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories data/dcase2022t2/eval_data/raw/<machine_type>/test/. (These directories will be made available with the "evaluation dataset".)
    - The generated CSV files are stored in the directory results/.
    - This script also generates a CSV file, containing AUC, pAUC, precision, recall, and F1-score for each section.
- DCASE2023 task2 mode:
  - "Development" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories data/dcase2023t2/dev_data/raw/<machine_type>/test/.
    - The CSV files will be stored in the directory results/.
    - It also generates a CSV file including AUC, pAUC, precision, recall, and F1-score for each section.
  - "Evaluation" mode:
    - This script generates a CSV file for each section, including the anomaly scores for each wav file in the directories data/dcase2023t2/eval_data/raw/<machine_type>/test/. (These directories will be made available with the "evaluation dataset".)
    - The generated CSV files are stored in the directory results/.
    - This script also generates a CSV file, containing AUC, pAUC, precision, recall, and F1-score for each section.
03_summarize_results.sh
- This script summarizes results into a CSV file.
- Use it in the same way as when summarizing DCASE2023T2 and DCASE2024T2 results.

Usage

Legacy scripts in the tools directory can be executed regardless of the current directory.

1. Download datasets

DCASE2020T2
- "Development dataset"
  - Download "dev_data_<machine_type>.zip" from https://zenodo.org/record/3678171.
- "Additional training dataset", i.e., the evaluation dataset for training
  - Download "eval_data_<machine_type>_train.zip" from https://zenodo.org/record/3727685.
- "Evaluation dataset", i.e., the evaluation dataset for test
  - Download "eval_data_<machine_type>_test.zip" from https://zenodo.org/record/3841772.
DCASE2021T2
- "Development dataset"
  - Download "dev_data_<machine_type>.zip" from https://zenodo.org/record/4562016.
- "Additional training dataset", i.e., the evaluation dataset for training
  - Download "eval_data_<machine_type>_train.zip" from https://zenodo.org/record/4660992.
- "Evaluation dataset", i.e., the evaluation dataset for test
  - Download "eval_data_<machine_type>_test.zip" from https://zenodo.org/record/4884786.
DCASE2022T2
- "Development dataset"
  - Download "dev_data_<machine_type>.zip" from https://zenodo.org/record/6355122.
- "Additional training dataset", i.e., the evaluation dataset for training
  - Download "eval_data_<machine_type>_train.zip" from https://zenodo.org/record/6462969.
- "Evaluation dataset", i.e., the evaluation dataset for test
  - Download "eval_data_<machine_type>_test.zip" from https://zenodo.org/record/6586456.
DCASE2023T2
- "Development Dataset"
  - Download "dev_data_<machine_type>.zip" from https://zenodo.org/record/7882613.
- "Additional Training Dataset", i.e., the evaluation dataset for training
  - Download "eval_data_<machine_type>_train.zip" from https://zenodo.org/record/7830345.
- "Evaluation Dataset", i.e., the evaluation dataset for test
  - Download "eval_data_<machine_type>_test.zip" from https://zenodo.org/record/7860847.
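The download list above can be kept as a small lookup table. The following is only a sketch: the record URLs and archive name patterns are copied from the list, while the names ZENODO_RECORDS, ARCHIVE_PATTERNS, and archive_info are hypothetical and are not part of this repository. It returns the record page and the archive name, not a direct download link.

```python
# Record URLs copied from the download list; all helper names are hypothetical.
ZENODO_RECORDS = {
    "DCASE2020T2": {
        "dev": "https://zenodo.org/record/3678171",
        "eval_train": "https://zenodo.org/record/3727685",
        "eval_test": "https://zenodo.org/record/3841772",
    },
    "DCASE2021T2": {
        "dev": "https://zenodo.org/record/4562016",
        "eval_train": "https://zenodo.org/record/4660992",
        "eval_test": "https://zenodo.org/record/4884786",
    },
    "DCASE2022T2": {
        "dev": "https://zenodo.org/record/6355122",
        "eval_train": "https://zenodo.org/record/6462969",
        "eval_test": "https://zenodo.org/record/6586456",
    },
    "DCASE2023T2": {
        "dev": "https://zenodo.org/record/7882613",
        "eval_train": "https://zenodo.org/record/7830345",
        "eval_test": "https://zenodo.org/record/7860847",
    },
}

# Archive name patterns as written in the list above.
ARCHIVE_PATTERNS = {
    "dev": "dev_data_{machine_type}.zip",
    "eval_train": "eval_data_{machine_type}_train.zip",
    "eval_test": "eval_data_{machine_type}_test.zip",
}


def archive_info(dataset, part, machine_type):
    """Return (record_url, archive_name) for one archive of one dataset."""
    record_url = ZENODO_RECORDS[dataset][part]
    archive_name = ARCHIVE_PATTERNS[part].format(machine_type=machine_type)
    return record_url, archive_name


if __name__ == "__main__":
    print(archive_info("DCASE2021T2", "eval_train", "fan"))
```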

3. Unzip the downloaded files and make the directory structure as follows:

The legacy dataset directory structure is the same as that of DCASE2023 task2. The following parent directories result when all the datasets have been downloaded.

dcase2023_task2_baseline_ae
- /data
  - /dcase2020t2
  - /dcase2021t2
  - /dcase2022t2
  - /dcase2023t2
  - /dcase2024t2

Learn more about the directory structure in "Directory structure of the downloaded dataset" below.

4. Change parameters

Change parameters in baseline.yaml in the same way as in DCASE2024 mode.

4.1. Enable Auto-download dataset

If you have not downloaded the dataset yourself and have not run a download script (for example, data_download_2020.sh), you may want to use auto-download. To enable it, set the parameter --is_auto_download (default: False) to True in baseline.yaml. When --is_auto_download is True, the dataset is downloaded automatically.

5. Run the training script

Run the training script 01_train_legacy.sh. This script differs from 01_train.sh in that it takes two options. The first option is the dataset name, for example DCASE2020T2. The second option selects the development data (-d) or the evaluation data (-e). For example, use the options DCASE2020T2 and -d to train on the development dataset in data/dcase2020t2/dev_data/raw/<machine_type>/train/.

01_train_legacy.sh can be run from any current directory. Specify the script with a relative path.

# if your current directory is dcase2023_task2_baseline_ae.
$ bash tools/01_train_legacy.sh DCASE2020T2 -d

First parameters
- DCASE2020T2
- DCASE2021T2
- DCASE2022T2
- DCASE2023T2
Second parameters
- -d
- -e

Others are the same as in 01_train.sh.
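How the two options relate to the data directories can be sketched as follows. This is an illustration of the directory layout described in this README, not the scripts' actual implementation; the function name legacy_data_dir and its parameters are hypothetical.

```python
from pathlib import Path

# Values accepted as the first and second options, as listed above.
DATASETS = ("DCASE2020T2", "DCASE2021T2", "DCASE2022T2", "DCASE2023T2")
MODES = {"-d": "dev_data", "-e": "eval_data"}


def legacy_data_dir(dataset, mode_flag, machine_type, split="train", root="data"):
    """Build data/<dataset>/<dev_data|eval_data>/raw/<machine_type>/<split>.

    Hypothetical helper: it only mirrors the layout documented in this README.
    """
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    if mode_flag not in MODES:
        raise ValueError(f"unknown mode flag: {mode_flag}")
    return Path(root) / dataset.lower() / MODES[mode_flag] / "raw" / machine_type / split


if __name__ == "__main__":
    print(legacy_data_dir("DCASE2020T2", "-d", "ToyCar").as_posix())
    print(legacy_data_dir("DCASE2023T2", "-e", "bandsaw", split="test").as_posix())
```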

6. Run the test script

6.1. Testing with the Simple Autoencoder mode

Run the test script 02a_test_legacy.sh. This script differs from 02a_test.sh in that it takes two options. The first option is the dataset name, for example DCASE2020T2. The second option selects the development data (-d) or the evaluation data (-e). For example, use the options DCASE2020T2 and -d to test on the development dataset in data/dcase2020t2/dev_data/raw/<machine_type>/test/.

# if your current directory is dcase2023_task2_baseline_ae.
$ bash tools/02a_test_legacy.sh DCASE2020T2 -d

First parameters
- DCASE2020T2
- DCASE2021T2
- DCASE2022T2
- DCASE2023T2
Second parameters
- -d
- -e

Others are the same as in 02a_test.sh.

6.2. Testing with the Selective Mahalanobis mode

Run the test script 02b_test_legacy.sh. This script differs from 02b_test.sh in that it takes two options. The first option is the dataset name, for example DCASE2020T2. The second option selects the development data (-d) or the evaluation data (-e). For example, use the options DCASE2020T2 and -d to test on the development dataset in data/dcase2020t2/dev_data/raw/<machine_type>/test/.

# if your current directory is dcase2023_task2_baseline_ae.
$ bash tools/02b_test_legacy.sh DCASE2020T2 -d

First parameters
- DCASE2020T2
- DCASE2021T2
- DCASE2022T2
- DCASE2023T2
Second parameters
- -d
- -e

Others are the same as in 02b_test.sh.

7. Check results

You can check the anomaly scores in the CSV files anomaly_score_<machine_type>_section_<section_index>_test.csv in the directory results/. Each anomaly score corresponds to a wav file in the test directory of the dataset you used. For example, if you trained with DCASE2020T2 in development mode, that directory is data/dcase2020t2/dev_data/raw/<machine_type>/test/.

Also, anomaly detection results based on the corresponding threshold can be checked in the CSV files decision_result_<machine_type>_section_<section_index>_test.csv. In addition, you can check performance indicators such as AUC, pAUC, precision, recall, and F1-score.
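To make the relation between anomaly scores and the performance indicators concrete, here is a minimal sketch. It assumes each row of an anomaly score CSV is "file name,score" without a header, and that the label can be read from "normal" or "anomaly" in the file name; both are assumptions, and the function names are hypothetical. The scripts in this repository may compute the metrics differently, and pAUC is not covered here.

```python
import csv
import io


def load_scores(csv_text):
    """Parse rows of 'file_name,score' into (label, score); 1 means anomaly."""
    rows = []
    for name, score in csv.reader(io.StringIO(csv_text)):
        label = 1 if "anomaly" in name else 0
        rows.append((label, float(score)))
    return rows


def auc(rows):
    """AUC as the fraction of (anomaly, normal) pairs ranked correctly; ties count half."""
    pos = [s for y, s in rows if y == 1]
    neg = [s for y, s in rows if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall_f1(rows, threshold):
    """Treat scores above the threshold as detected anomalies."""
    tp = sum(1 for y, s in rows if y == 1 and s > threshold)
    fp = sum(1 for y, s in rows if y == 0 and s > threshold)
    fn = sum(1 for y, s in rows if y == 1 and s <= threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up scores for illustration only.
EXAMPLE = (
    "section_00_source_test_normal_0000.wav,0.10\n"
    "section_00_source_test_normal_0001.wav,0.40\n"
    "section_00_source_test_anomaly_0000.wav,0.35\n"
    "section_00_source_test_anomaly_0001.wav,0.80\n"
)

if __name__ == "__main__":
    rows = load_scores(EXAMPLE)
    print(auc(rows))
    print(precision_recall_f1(rows, threshold=0.3))
```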

8. Summarize results

After executing 02a_test_legacy.sh, 02b_test_legacy.sh, or both, run the summarize script 03_summarize_results.sh with the same options as 01_train_legacy.sh, 02a_test_legacy.sh, and 02b_test_legacy.sh.

# Summarize development dataset
$ 03_summarize_results.sh DCASE2020T2 -d

# Summarize evaluation dataset
$ 03_summarize_results.sh DCASE2020T2 -e

First parameters
- DCASE2020T2
- DCASE2021T2
- DCASE2022T2
- DCASE2023T2
- DCASE2024T2
Second parameters
- -d
- -e

If you want to change the summarized results directory or the export directory, edit 03_summarize_results.sh.

Directory structure of the downloaded dataset

Note the parent directory of the wav files. In the original datasets, the dataset directories are dev_data and eval_data, but in this repository they are data/dcase202xt2/dev_data/raw and data/dcase202xt2/eval_data/raw.

DCASE2020 task2

dcase2023_task2_baseline_ae
- /data/dcase2020t2/dev_data/raw
  - /ToyCar
    - /train (Only normal data for all Machine IDs are included.)
      - /normal_id_01_00000000.wav
      - ...
      - /normal_id_01_00000999.wav
      - /normal_id_02_00000000.wav
      - ...
      - /normal_id_04_00000999.wav
    - /test (Normal and anomaly data for all Machine IDs are included.)
      - /normal_id_01_00000000.wav
      - ...
      - /normal_id_01_00000349.wav
      - /anomaly_id_01_00000000.wav
      - ...
      - /anomaly_id_01_00000263.wav
      - /normal_id_02_00000000.wav
      - ...
      - /anomaly_id_04_00000264.wav
  - /ToyConveyor (The other Machine Types have the same directory structure as ToyCar.)
  - /fan
  - /pump
  - /slider
  - /valve
- /data/dcase2020t2/eval_data/raw
  - /ToyCar
    - /train (Unzipped "additional training dataset". Only normal data for all Machine IDs are included.)
      - /normal_id_05_00000000.wav
      - ...
      - /normal_id_05_00000999.wav
      - /normal_id_06_00000000.wav
      - ...
      - /normal_id_07_00000999.wav
    - /test (Unzipped "evaluation dataset". Normal and anomaly data for all Machine IDs are included, but there is no label about normal or anomaly.)
      - /id_05_00000000.wav
      - ...
      - /id_05_00000514.wav
      - /id_06_00000000.wav
      - ...
      - /id_07_00000514.wav
    - /test_rename (converted from the test directory using tools/rename.py)
      - /normal_id_05_00000000.wav
      - ...
      - /normal_id_05_00000249.wav
      - /anomaly_id_05_00000000.wav
      - ...
      - /anomaly_id_05_00000264.wav
      - /normal_id_06_00000000.wav
      - ...
      - /anomaly_id_07_00000264.wav
  - /ToyConveyor (The other machine types have the same directory structure as ToyCar.)
  - /fan
  - /pump
  - /slider
  - /valve
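The DCASE2020 file names in the tree above encode the label, the machine ID, and a running index. The following sketch parses such names and groups files by machine ID. The regular expression is inferred from the example names shown above, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches e.g. normal_id_01_00000000.wav and the unlabeled id_05_00000514.wav.
NAME_2020 = re.compile(r"^(?:(normal|anomaly)_)?id_(\d{2})_(\d{8})\.wav$")


def parse_2020_name(file_name):
    """Split a DCASE2020-style file name into label, machine ID, and index."""
    match = NAME_2020.match(file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name}")
    label, machine_id, index = match.groups()
    # label is None for unlabeled evaluation files.
    return {"label": label, "machine_id": machine_id, "index": int(index)}


def group_by_machine_id(file_names):
    """Group file names by machine ID, e.g. to handle one ID at a time."""
    groups = defaultdict(list)
    for name in file_names:
        groups[parse_2020_name(name)["machine_id"]].append(name)
    return dict(groups)


if __name__ == "__main__":
    names = ["normal_id_01_00000000.wav", "anomaly_id_01_00000263.wav", "id_05_00000514.wav"]
    for name in names:
        print(name, parse_2020_name(name))
    print(group_by_machine_id(names))
```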

DCASE2021 task2

dcase2023_task2_baseline_ae
- /data/dcase2021t2/dev_data/raw
  - /fan
    - /train (Normal data in the source and target domains for all sections are included.)
      - /section_00_source_train_normal_0000_<attribute>.wav
      - ...
      - /section_00_source_train_normal_0999_<attribute>.wav
      - /section_00_target_train_normal_0000_<attribute>.wav
      - /section_00_target_train_normal_0001_<attribute>.wav
      - /section_00_target_train_normal_0002_<attribute>.wav
      - /section_01_source_train_normal_0000_<attribute>.wav
      - ...
      - /section_02_target_train_normal_0002_<attribute>.wav
    - /source_test (Normal and anomaly data in the source domain for all sections are included.)
    - /target_test (Normal and anomaly data in the target domain for all sections are included.)
    - /test (Normal and anomaly data in the source and target domains for all sections are included)
      - /section_00_source_test_normal_0000.wav
      - ...
      - /section_00_source_test_normal_0099.wav
      - /section_00_source_test_anomaly_0000.wav
      - ...
      - /section_00_source_test_anomaly_0099.wav
      - /section_01_source_test_normal_0000.wav
      - ...
      - /section_02_source_test_anomaly_0099.wav
      - /section_00_target_test_normal_0000.wav
      - ...
      - /section_00_target_test_normal_0099.wav
      - /section_00_target_test_anomaly_0000.wav
      - ...
      - /section_00_target_test_anomaly_0099.wav
      - /section_01_target_test_normal_0000.wav
      - ...
      - /section_02_target_test_anomaly_0099.wav
  - /gearbox (The other machine types have the same directory structure as fan.)
  - /pump
  - /slider
  - /valve
  - /ToyCar
  - /ToyTrain
- /data/dcase2021t2/eval_data/raw
  - /fan
    - /train (Unzipped additional training dataset. Normal data in the source and target domains for all sections are included.)
      - /section_03_source_train_normal_0000_<attribute>.wav
      - ...
      - /section_03_source_train_normal_0999_<attribute>.wav
      - /section_03_target_train_normal_0000_<attribute>.wav
      - /section_03_target_train_normal_0001_<attribute>.wav
      - /section_03_target_train_normal_0002_<attribute>.wav
      - /section_04_source_train_normal_0000_<attribute>.wav
      - ...
      - /section_05_target_train_normal_0002_<attribute>.wav
    - /source_test (Unzipped evaluation dataset. Normal and anomaly data in the source domain for all sections are included.)
    - /target_test (Unzipped evaluation dataset. Normal and anomaly data in the target domain for all sections are included.)
    - /test (Unzipped evaluation dataset. Normal and anomaly data in the source and target domains for all sections are included)
      - /section_03_source_test_0000.wav
      - ...
      - /section_03_source_test_0199.wav
      - /section_04_source_test_0000.wav
      - ...
      - /section_05_source_test_0199.wav
      - /section_03_target_test_0000.wav
      - ...
      - /section_03_target_test_0199.wav
      - /section_04_target_test_0000.wav
      - ...
      - /section_05_target_test_0199.wav
    - /test_rename (converted from the test directory using tools/rename.py)
      - /section_03_source_test_normal_0000.wav
      - ...
      - /section_03_source_test_normal_0099.wav
      - /section_03_source_test_anomaly_0000.wav
      - ...
      - /section_03_source_test_anomaly_0099.wav
      - /section_04_source_test_normal_0000.wav
      - ...
      - /section_05_source_test_anomaly_0099.wav
      - /section_03_target_test_normal_0000.wav
      - ...
      - /section_03_target_test_normal_0099.wav
      - /section_03_target_test_anomaly_0000.wav
      - ...
      - /section_03_target_test_anomaly_0099.wav
      - /section_04_target_test_normal_0000.wav
      - ...
      - /section_05_target_test_anomaly_0099.wav
  - /gearbox (The other machine types have the same directory structure as fan.)
  - /pump
  - /slider
  - /valve
  - /ToyCar
  - /ToyTrain
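For DCASE2021, the source_test and target_test directories are merged into a single test directory. A minimal sketch of such a merge is shown below. It copies files and leaves the originals in place, which is an assumption; the download script may move or link them instead. The function name merge_test_dirs is hypothetical.

```python
import shutil
from pathlib import Path


def merge_test_dirs(machine_dir):
    """Copy wav files from source_test/ and target_test/ into test/.

    Returns the number of files copied. Missing source directories are skipped.
    """
    machine_dir = Path(machine_dir)
    test_dir = machine_dir / "test"
    test_dir.mkdir(exist_ok=True)
    copied = 0
    for domain_dir in ("source_test", "target_test"):
        src = machine_dir / domain_dir
        if not src.is_dir():
            continue
        for wav in sorted(src.glob("*.wav")):
            # File names already contain the domain, so they do not collide.
            shutil.copy2(wav, test_dir / wav.name)
            copied += 1
    return copied


if __name__ == "__main__":
    # Example path following the layout above; adjust to your checkout.
    print(merge_test_dirs("data/dcase2021t2/dev_data/raw/fan"))
```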

DCASE2022 task2

dcase2023_task2_baseline_ae
- /data/dcase2022t2/dev_data/raw
  - /fan
    - /train (only normal clips)
      - /section_00_source_train_normal_0000_<attribute>.wav
      - ...
      - /section_00_source_train_normal_0989_<attribute>.wav
      - /section_00_target_train_normal_0000_<attribute>.wav
      - ...
      - /section_00_target_train_normal_0009_<attribute>.wav
      - /section_01_source_train_normal_0000_<attribute>.wav
      - ...
      - /section_02_target_train_normal_0009_<attribute>.wav
    - /test
      - /section_00_source_test_normal_0000_<attribute>.wav
      - ...
      - /section_00_source_test_normal_0049_<attribute>.wav
      - /section_00_source_test_anomaly_0000_<attribute>.wav
      - ...
      - /section_00_source_test_anomaly_0049_<attribute>.wav
      - /section_00_target_test_normal_0000_<attribute>.wav
      - ...
      - /section_00_target_test_normal_0049_<attribute>.wav
      - /section_00_target_test_anomaly_0000_<attribute>.wav
      - ...
      - /section_00_target_test_anomaly_0049_<attribute>.wav
      - /section_01_source_test_normal_0000_<attribute>.wav
      - ...
      - /section_02_target_test_anomaly_0049_<attribute>.wav
    - attributes_00.csv (attribute csv for section 00)
    - attributes_01.csv (attribute csv for section 01)
    - attributes_02.csv (attribute csv for section 02)
  - /gearbox (The other machine types have the same directory structure as fan.)
  - /bearing
  - /slider (slider means "slide rail")
  - /ToyCar
  - /ToyTrain
  - /valve
- /data/dcase2022t2/eval_data/raw
  - /fan
    - /train (after the launch of the additional training dataset)
      - /section_03_source_train_normal_0000_<attribute>.wav
      - ...
      - /section_03_source_train_normal_0989_<attribute>.wav
      - /section_03_target_train_normal_0000_<attribute>.wav
      - ...
      - /section_03_target_train_normal_0009_<attribute>.wav
      - /section_04_source_train_normal_0000_<attribute>.wav
      - ...
      - /section_05_target_train_normal_0009_<attribute>.wav
    - /test (after the launch of the evaluation dataset)
      - /section_03_test_0000.wav
      - ...
      - /section_03_test_0199.wav
      - /section_04_test_0000.wav
      - ...
      - /section_05_test_0199.wav
    - /test_rename (converted from the test directory using tools/rename.py)
      - /section_03_source_test_normal_0000_<attribute>.wav
      - ...
      - /section_03_source_test_normal_0049_<attribute>.wav
      - /section_03_source_test_anomaly_0000_<attribute>.wav
      - ...
      - /section_03_source_test_anomaly_0049_<attribute>.wav
      - /section_03_target_test_normal_0000_<attribute>.wav
      - ...
      - /section_03_target_test_normal_0049_<attribute>.wav
      - /section_03_target_test_anomaly_0000_<attribute>.wav
      - ...
      - /section_03_target_test_anomaly_0049_<attribute>.wav
      - /section_04_source_test_normal_0000_<attribute>.wav
      - ...
      - /section_05_target_test_anomaly_0049_<attribute>.wav
    - attributes_03.csv (attribute csv for train data in section 03)
    - attributes_04.csv (attribute csv for train data in section 04)
    - attributes_05.csv (attribute csv for train data in section 05)
  - /gearbox (The other machine types have the same directory structure as fan.)
  - /bearing
  - /slider (slider means "slide rail")
  - /ToyCar
  - /ToyTrain
  - /valve

DCASE2023 task2

dcase2023_task2_baseline_ae
- /data/dcase2023t2/dev_data/raw
  - /bearing
    - /train (only normal clips)
      - /section_00_source_train_normal_0000_<attribute>.wav
      - ...
      - /section_00_source_train_normal_0989_<attribute>.wav
      - /section_00_target_train_normal_0000_<attribute>.wav
      - ...
      - /section_00_target_train_normal_0009_<attribute>.wav
    - /test
      - /section_00_source_test_normal_0000_<attribute>.wav
      - ...
      - /section_00_source_test_normal_0049_<attribute>.wav
      - /section_00_source_test_anomaly_0000_<attribute>.wav
      - ...
      - /section_00_source_test_anomaly_0049_<attribute>.wav
      - /section_00_target_test_normal_0000_<attribute>.wav
      - ...
      - /section_00_target_test_normal_0049_<attribute>.wav
      - /section_00_target_test_anomaly_0000_<attribute>.wav
      - ...
      - /section_00_target_test_anomaly_0049_<attribute>.wav
    - attributes_00.csv (attributes CSV for section 00)
  - /fan (The other machine types have the same directory structure as bearing.)
  - /gearbox
  - /slider (slider means "slide rail")
  - /ToyCar
  - /ToyTrain
  - /valve
- /data/dcase2023t2/eval_data/raw/
  - /bandsaw
    - /train (after launch of the additional training dataset)
      - /section_00_source_train_normal_0000_<attribute>.wav
      - ...
      - /section_00_source_train_normal_0989_<attribute>.wav
      - /section_00_target_train_normal_0000_<attribute>.wav
      - ...
      - /section_00_target_train_normal_0009_<attribute>.wav
    - /test (after launch of the evaluation dataset)
      - /section_00_test_0000.wav
      - ...
      - /section_00_test_0199.wav
    - /test_rename (converted from the test directory using tools/rename.py)
      - /section_00_source_test_normal_0000_<attribute>.wav
      - ...
      - /section_00_source_test_normal_0049_<attribute>.wav
      - /section_00_source_test_anomaly_0000_<attribute>.wav
      - ...
      - /section_00_source_test_anomaly_0049_<attribute>.wav
      - /section_00_target_test_normal_0000_<attribute>.wav
      - ...
      - /section_00_target_test_normal_0049_<attribute>.wav
      - /section_00_target_test_anomaly_0000_<attribute>.wav
      - ...
      - /section_00_target_test_anomaly_0049_<attribute>.wav
    - attributes_00.csv (attributes CSV for section 00)
  - /grinder
  - /shaker
  - /ToyDrone
  - /ToyNscale
  - /ToyTank
  - /Vacuum
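From DCASE2021 onward, the file names in the trees above start with a section index and may carry a domain, a label, and an attribute string. The sketch below parses these names. The pattern is inferred from the example names in this README, and the function name is hypothetical.

```python
import re

# Matches e.g. section_00_source_train_normal_0000_<attribute>.wav,
# section_03_source_test_0000.wav, and section_03_test_0199.wav.
SECTION_NAME = re.compile(
    r"^section_(\d{2})_"
    r"(?:(source|target)_)?"
    r"(train|test)"
    r"(?:_(normal|anomaly))?"
    r"_(\d{4})"
    r"(?:_(.+))?"
    r"\.wav$"
)


def parse_section_name(file_name):
    """Split a section-style file name into its parts; absent parts are None."""
    match = SECTION_NAME.match(file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name}")
    section, domain, split, label, index, attribute = match.groups()
    return {
        "section": section,
        "domain": domain,
        "split": split,
        "label": label,
        "index": int(index),
        "attribute": attribute,
    }


if __name__ == "__main__":
    # "vel_6" stands in for <attribute>; it is a made-up value.
    print(parse_section_name("section_00_source_test_normal_0000_vel_6.wav"))
    print(parse_section_name("section_03_test_0199.wav"))
```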

Truth attribute of evaluation data

Public ground truth

The following evaluator repositories were used to calculate the official scores. They include the ground truth for the evaluation datasets.

dcase2020_task2_evaluator
dcase2021_task2_evaluator
dcase2022_task2_evaluator
dcase2023_task2_evaluator

In this repository

This repository contains ground-truth CSV files for the evaluation data. These CSV files are used to rename the evaluation dataset files. Once the ground truth is added to the evaluation file names, you can calculate AUC and other scores. Note: the rename function is usually executed together with the download script and the auto-download function.

DCASE2020 task2
DCASE2021 task2
DCASE2022 task2
DCASE2023 task2
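The idea of adding the ground truth to the evaluation file names can be sketched as follows. This is a simplified illustration and not the behavior of tools/rename.py: it assumes a hypothetical ground-truth CSV with rows "file name,label" where 0 means normal and 1 means anomaly, it handles DCASE2020-style names only, and it keeps the original index, whereas the renamed files shown in the directory trees above use different indices. All function names are hypothetical.

```python
import csv
import io
import shutil
from pathlib import Path

# Assumed label encoding of the hypothetical ground-truth CSV.
LABELS = {"0": "normal", "1": "anomaly"}


def labelled_name(file_name, label):
    """Prefix a name such as id_05_00000000.wav with its label."""
    return f"{LABELS[label]}_{file_name}"


def rename_with_ground_truth(test_dir, ground_truth_csv_text, out_dir):
    """Copy each listed file from test_dir to out_dir under its labelled name."""
    test_dir, out_dir = Path(test_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for file_name, label in csv.reader(io.StringIO(ground_truth_csv_text)):
        target = out_dir / labelled_name(file_name, label)
        shutil.copy2(test_dir / file_name, target)
        written.append(target.name)
    return written


if __name__ == "__main__":
    print(labelled_name("id_05_00000000.wav", "0"))
    print(labelled_name("id_05_00000001.wav", "1"))
```

Copying into a separate test_rename directory keeps the original, unlabeled test directory untouched.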

Citation

If you use this system, please cite all papers that correspond to the datasets you use.

DCASE2020 task2

Yuma Koizumi, Shoichiro Saito, Hisashi Uematsu, Noboru Harada, and Keisuke Imoto. ToyADMOS: a dataset of miniature-machine operating sounds for anomalous sound detection. In Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 308-312. November 2019.
Harsh Purohit, Ryo Tanabe, Takeshi Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi. MIMII Dataset: sound dataset for malfunctioning industrial machine investigation and inspection. In Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), 209-213. November 2019.
Yuma Koizumi, Yohei Kawaguchi, Keisuke Imoto, Toshiki Nakamura, Yuki Nikaido, Ryo Tanabe, Harsh Purohit, Kaori Suefusa, Takashi Endo, Masahiro Yasuda, and Noboru Harada. Description and discussion on DCASE2020 challenge task2: unsupervised anomalous sound detection for machine condition monitoring. In Proceedings of the Detection and Classification of Acoustic Scenes and Events 2020 Workshop (DCASE2020), 81-85. November 2020.

DCASE2021 task2

Ryo Tanabe, Harsh Purohit, Kota Dohi, Takashi Endo, Yuki Nikaido, Toshiki Nakamura, and Yohei Kawaguchi. MIMII DUE: sound dataset for malfunctioning industrial machine investigation and inspection with domain shifts due to changes in operational and environmental conditions. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pages 21-25, 2021. doi:10.1109/WASPAA52581.2021.9632802.
Noboru Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Masahiro Yasuda, and Shoichiro Saito. ToyADMOS2: another dataset of miniature-machine operating sounds for anomalous sound detection under domain shift conditions. In Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021), 1-5. Barcelona, Spain, November 2021.
Yohei Kawaguchi, Keisuke Imoto, Yuma Koizumi, Noboru Harada, Daisuke Niizumi, Kota Dohi, Ryo Tanabe, Harsh Purohit, and Takashi Endo. Description and discussion on dcase 2021 challenge task 2: unsupervised anomalous detection for machine condition monitoring under domain shifted conditions. In Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021), 186-190. Barcelona, Spain, November 2021.

DCASE2022 task2

Kota Dohi, Tomoya Nishida, Harsh Purohit, Ryo Tanabe, Takashi Endo, Masaaki Yamamoto, Yuki Nikaido, and Yohei Kawaguchi. MIMII DG: sound dataset for malfunctioning industrial machine investigation and inspection for domain generalization task. In Proceedings of the 7th Detection and Classification of Acoustic Scenes and Events 2022 Workshop (DCASE2022), 1-5. Nancy, France, November 2022.
Noboru Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Masahiro Yasuda, and Shoichiro Saito. ToyADMOS2: another dataset of miniature-machine operating sounds for anomalous sound detection under domain shift conditions. In Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021), 1-5. Barcelona, Spain, November 2021.
Kota Dohi, Keisuke Imoto, Noboru Harada, Daisuke Niizumi, Yuma Koizumi, Tomoya Nishida, Harsh Purohit, Ryo Tanabe, Takashi Endo, Masaaki Yamamoto, and Yohei Kawaguchi. Description and discussion on DCASE 2022 challenge task 2: unsupervised anomalous sound detection for machine condition monitoring applying domain generalization techniques. In Proceedings of the 7th Detection and Classification of Acoustic Scenes and Events 2022 Workshop (DCASE2022), 1-5. Nancy, France, November 2022.

DCASE2023 task2

Kota Dohi, Keisuke Imoto, Noboru Harada, Daisuke Niizumi, Yuma Koizumi, Tomoya Nishida, Harsh Purohit, Ryo Tanabe, Takashi Endo, and Yohei Kawaguchi. Description and discussion on DCASE 2023 challenge task 2: first-shot unsupervised anomalous sound detection for machine condition monitoring. In arXiv e-prints: 2305.07828, 2023.
Noboru Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Masahiro Yasuda, and Shoichiro Saito. ToyADMOS2: another dataset of miniature-machine operating sounds for anomalous sound detection under domain shift conditions. In Proceedings of the Detection and Classification of Acoustic Scenes and Events Workshop (DCASE), 1-5. Barcelona, Spain, November 2021.
Kota Dohi, Tomoya Nishida, Harsh Purohit, Ryo Tanabe, Takashi Endo, Masaaki Yamamoto, Yuki Nikaido, and Yohei Kawaguchi. MIMII DG: sound dataset for malfunctioning industrial machine investigation and inspection for domain generalization task. In Proceedings of the 7th Detection and Classification of Acoustic Scenes and Events 2022 Workshop (DCASE2022). Nancy, France, November 2022.
Noboru Harada, Daisuke Niizumi, Yasunori Ohishi, Daiki Takeuchi and Masahiro Yasuda, "First-Shot Anomaly Sound Detection for Machine Condition Monitoring: A Domain Generalization Baseline," 2023 31st European Signal Processing Conference (EUSIPCO), Helsinki, Finland, 2023, pp. 191-195, doi: 10.23919/EUSIPCO58844.2023.10289721.
