When I run the program, it shows an error like this #7

Open
ak-karimzai opened this issue Nov 28, 2023 · 4 comments

Comments

@ak-karimzai

python train_and_test.py --asv_path ./datasets/ASVspoof2021_DF_eval/ --in_the_wild_path ./datasets/release_in_the_wild --config configs/finetuning/whisper_frontend_mesonet.yaml --batch_size 8 --epochs 5  --train_amount 100000 --valid_amount 25000

/home/khalid/anaconda3/lib/python3.11/site-packages/torchaudio/functional/functional.py:584: UserWarning: At least one mel filterbank has all zero values. The value for `n_mels` (128) may be set too high. Or, the value for `n_freqs` (257) may be set too low.
  warnings.warn(
2023-11-28 16:41:07,747 - INFO - Loading data...
Traceback (most recent call last):
  File "/home/khalid/Desktop/temp/deepfake-whisper-features/train_and_test.py", line 125, in <module>
    evaluation_config_path, model_path = train_models.train_nn(
                                         ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/khalid/Desktop/temp/deepfake-whisper-features/train_models.py", line 65, in train_nn
    data_train, data_test = get_datasets(
                            ^^^^^^^^^^^^^
  File "/home/khalid/Desktop/temp/deepfake-whisper-features/train_models.py", line 31, in get_datasets
    data_train = DetectionDataset(
                 ^^^^^^^^^^^^^^^^^
  File "/home/khalid/Desktop/temp/deepfake-whisper-features/src/datasets/detection_dataset.py", line 38, in __init__
    datasets = self._init_datasets(
               ^^^^^^^^^^^^^^^^^^^^
  File "/home/khalid/Desktop/temp/deepfake-whisper-features/src/datasets/detection_dataset.py", line 70, in _init_datasets
    asvspoof_dataset = DeepFakeASVSpoofDataset(asvspoof_path, subset=subset)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/khalid/Desktop/temp/deepfake-whisper-features/src/datasets/deepfake_asvspoof_dataset.py", line 29, in __init__
    self.samples = self.read_protocol()
                   ^^^^^^^^^^^^^^^^^^^^
  File "/home/khalid/Desktop/temp/deepfake-whisper-features/src/datasets/deepfake_asvspoof_dataset.py", line 67, in read_protocol
    samples = self.add_line_to_samples(samples, line)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/khalid/Desktop/temp/deepfake-whisper-features/src/datasets/deepfake_asvspoof_dataset.py", line 81, in add_line_to_samples
    sample_path = self.flac_paths[sample_name]
                  ~~~~~~~~~~~~~~~^^^^^^^^^^^^^
KeyError: 'DF_E_2015779'

But when I search for this file in the dataset, it exists:

find ./datasets/ASVspoof2021_DF_eval/flac -name 'DF_E_2015779.flac' -type f
./datasets/ASVspoof2021_DF_eval/flac/DF_E_2015779.flac
@piotrkawa
Owner

Hi!
Do you have all parts of the eval DF subset, and are you using this keys & metadata file?

@ak-karimzai
Author

Yes, I downloaded all parts of the dataset (around 60k+ files), and I also downloaded the keys & metadata file.

For clarity, here are the contents of the ASVspoof2021_DF_eval directory:

ls -al
total 32388
drwxrwxr-x 4 khalid khalid     4096 Nov 28 17:34 .
drwxrwxr-x 4 khalid khalid     4096 Nov 28 14:41 ..
-rw-r--r-- 1 khalid khalid  7953777 May 27  2021 ASVspoof2021.DF.cm.eval.trl.txt
drwxrwxr-x 2 khalid khalid 25182208 Nov 28 17:34 flac
drwxr-xr-x 3 khalid khalid     4096 Dec  1  2021 keys
-rw-r--r-- 1 khalid khalid     2374 May 28  2021 LICENSE.DF.txt
-rw-r--r-- 1 khalid khalid     2227 May 28  2021 README.DF.txt
-rw-rw-r-- 1 khalid khalid      410 Nov 28 15:10 remove_those_which_not_exist.py

@ak-karimzai
Author

remove_those_which_not_exist.py is a Python script I wrote to check which files actually do not exist:

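import os

# Run from inside ASVspoof2021_DF_eval (expects ./flac and ./keys/CM).
with open("keys/CM/trial_metadata.txt", "r") as metadata:
    lines = metadata.readlines()

for line in lines:
    # Take the sample ID: the token that starts with "DF_E_" (e.g. DF_E_2015779).
    name = next((tok for tok in line.split() if tok.startswith("DF_E_")), None)
    if name is None:
        continue  # no sample ID on this line

    # os.path.isfile is False for missing paths, so exists() is not needed.
    if not os.path.isfile(os.path.join("flac", name + ".flac")):
        print(name)
    else:
        print("OK: " + name)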
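
A set-based variant of the check above may be easier to read on a dataset this size, because it scans the flac directory once and prints only what is missing. This is a minimal sketch, not code from the repository: the function names `sample_id` and `missing_ids` are hypothetical, and it assumes every sample ID is a whitespace-separated token starting with "DF_E_", as in the KeyError above.

```python
from pathlib import Path
from typing import List, Optional


def sample_id(line: str) -> Optional[str]:
    """Return the first token that looks like a DF eval ID, or None.

    Assumption: IDs start with "DF_E_" (as in the reported KeyError).
    """
    for token in line.split():
        if token.startswith("DF_E_"):
            return token
    return None


def missing_ids(metadata_path: str, flac_dir: str) -> List[str]:
    """List IDs named in the metadata file that have no <ID>.flac in flac_dir."""
    # One directory scan instead of one filesystem check per metadata line.
    present = {p.stem for p in Path(flac_dir).glob("*.flac")}
    with open(metadata_path, "r") as handle:
        wanted = {sid for sid in map(sample_id, handle) if sid is not None}
    return sorted(wanted - present)


# Example call, run from inside ASVspoof2021_DF_eval:
# print(missing_ids("keys/CM/trial_metadata.txt", "flac"))
```

An empty result only shows that the files are on disk. It says nothing about whether the training script looks for them in that location.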

@piotrkawa
Owner

I recreated the dataset preparation from scratch. Please check that you follow all of the steps exactly; the following structure works.

  1. Download all tar files from here (I downloaded them earlier, but I checked and md5sums match).
  2. Download keys and metadata - https://www.asvspoof.org/resources/DF-keys-stage-1.tar.gz
  3. Untar all archives and move them to a single directory, e.g. DF.
  4. Move the keys directory from step 2 into the DF directory.
  5. The final dataset structure should look as follows:
/path/to/DF# tree -L 3
.
├── ASVspoof2021_DF_eval_part00
│   └── ASVspoof2021_DF_eval
│       ├── ASVspoof2021.DF.cm.eval.trl.txt
│       ├── LICENSE.DF.txt
│       ├── README.DF.txt
│       └── flac
├── ASVspoof2021_DF_eval_part01
│   └── ASVspoof2021_DF_eval
│       └── flac
├── ASVspoof2021_DF_eval_part02
│   └── ASVspoof2021_DF_eval
│       └── flac
├── ASVspoof2021_DF_eval_part03
│   └── ASVspoof2021_DF_eval
│       └── flac
└── keys
    └── CM
        └── trial_metadata.txt
  6. Run the script, providing the ASV path like this: --asv_path /path/to/DF
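
The layout from the steps above can be checked before starting a training run. The following is a rough sketch that only mirrors the tree shown in step 5; it is not the repository's own loading code, and the helper name `check_df_layout` is hypothetical.

```python
from pathlib import Path
from typing import List

# Part directories as they appear in the tree above (part00 .. part03).
PART_DIRS = [f"ASVspoof2021_DF_eval_part{i:02d}" for i in range(4)]


def check_df_layout(root: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the layout matches."""
    base = Path(root)
    problems = []
    for part in PART_DIRS:
        flac_dir = base / part / "ASVspoof2021_DF_eval" / "flac"
        if not flac_dir.is_dir():
            problems.append(f"missing directory: {flac_dir}")
    metadata = base / "keys" / "CM" / "trial_metadata.txt"
    if not metadata.is_file():
        problems.append(f"missing file: {metadata}")
    return problems


# Example call:
# for problem in check_df_layout("/path/to/DF"):
#     print(problem)
```

A flat directory with a single merged flac folder, like the listing posted earlier in this thread, would be reported as missing all four part directories, even though every .flac file is present on disk.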
