VCTK missing txt file for 'p315' #483

@HsuanYang-Wang

🐛 Bug

The transcript files for speaker 'p315' are missing from the VCTK Corpus, so the './VCTK-Corpus/txt/p315' folder does not exist. Accessing any 'p315' item raises a FileNotFoundError.

The corpus README acknowledges the gap: "Please note while text files containing transcripts of the speech are provided for 109 of the 110 recordings, in the '/txt' folder, the 'p315' text was lost due to a hard disk error."
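
To confirm which speakers lack transcripts, the speaker folders under the audio and text directories can be compared. This is a minimal sketch, assuming the layout `VCTK-Corpus/wav48/<speaker>` and `VCTK-Corpus/txt/<speaker>`; the helper name `speakers_missing_txt` and its default folder names are assumptions for illustration, not part of torchaudio.

```python
import os


def speakers_missing_txt(root, folder_audio="wav48", folder_txt="txt"):
    """Return sorted speaker IDs that have an audio folder but no transcript folder.

    `folder_audio` and `folder_txt` are assumed folder names; adjust them
    to match the extracted corpus.
    """
    audio_dir = os.path.join(root, folder_audio)
    txt_dir = os.path.join(root, folder_txt)

    def subdirs(parent):
        # Collect only directory names; a missing parent yields an empty set.
        if not os.path.isdir(parent):
            return set()
        return {d for d in os.listdir(parent) if os.path.isdir(os.path.join(parent, d))}

    return sorted(subdirs(audio_dir) - subdirs(txt_dir))
```

Calling `speakers_missing_txt("./VCTK-Corpus")` on the downloaded corpus should list every speaker whose transcripts are absent.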

To Reproduce

Steps to reproduce the behavior:

!pip install "torch>=1.2.0"
!pip install torchaudio
!pip install librosa
%matplotlib inline

import torch
import torchaudio
import librosa
import matplotlib.pyplot as plt
import numpy as np

import torchaudio.datasets as dsets
vctk_data = dsets.VCTK(".", download=True)
vctk_data[1]

Expected behavior

vctk_data[1] should return a sample without raising. Instead, it fails with:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-25-a25a0369fbf9> in <module>()
----> 1 vctk_data[1]

1 frames
/usr/local/lib/python3.6/dist-packages/torchaudio/datasets/vctk.py in __getitem__(self, n)
    106             self._ext_txt,
    107             self._folder_audio,
--> 108             self._folder_txt,
    109         )
    110 

/usr/local/lib/python3.6/dist-packages/torchaudio/datasets/vctk.py in load_vctk_item(fileid, path, ext_audio, ext_txt, folder_audio, folder_txt, downsample)
     17     # Read text
     18     file_txt = os.path.join(path, folder_txt, speaker_id, fileid + ext_txt)
---> 19     with open(file_txt) as file_text:
     20         utterance = file_text.readlines()[0]
     21 

FileNotFoundError: [Errno 2] No such file or directory: './VCTK-Corpus/txt/p315/p315_041.txt'
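
One possible workaround is to treat a missing transcript as "no text" rather than letting the error propagate. This is only a sketch of the idea, not the torchaudio implementation: `read_transcript` is a hypothetical helper that mirrors the path construction shown in the traceback and returns `None` when the file is absent. Unlike the original `readlines()[0]`, it also strips the trailing newline.

```python
import os


def read_transcript(path, folder_txt, speaker_id, fileid, ext_txt=".txt"):
    """Return the first line of a transcript, or None if the file is missing.

    Hypothetical helper: the path is built the same way as in the traceback
    (path / folder_txt / speaker_id / fileid + ext_txt).
    """
    file_txt = os.path.join(path, folder_txt, speaker_id, fileid + ext_txt)
    try:
        with open(file_txt) as file_text:
            # Read only the first line and drop its trailing newline.
            return file_text.readline().rstrip("\n")
    except FileNotFoundError:
        # For example, speaker p315, whose transcripts are absent from the corpus.
        return None
```

A caller can then skip items whose transcript is `None`, or keep the audio and leave the utterance empty, depending on whether the text is needed downstream.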

Environment

  • What commands did you use to install torchaudio (conda/pip/build from source)?
    pip
  • If you are building from source, which commit is it?
    Not applicable
  • What does torchaudio.__version__ print? (If applicable)
    1.4.0
PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.12.0

Python version: 3.6
Is CUDA available: No
CUDA runtime version: 10.1.243
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5

Versions of relevant libraries:
[pip3] numpy==1.18.2
[pip3] torch==1.4.0
[pip3] torchaudio==0.4.0
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.3.1
[pip3] torchvision==0.5.0
[conda] Could not collect

Additional context
