torchaudio.compliance.kaldi.fbank does NOT support GPU #613

csukuangfj · 2020-05-06T07:31:41Z

The readme.md says:

> The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration,

However, strong GPU acceleration is not available for torchaudio.compliance.kaldi.fbank.

The following test code

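    import torch
    import torchaudio

    filename = '/path/to/test.wav'
    waveform, sample_rate = torchaudio.load(filename)
    device = torch.device('cuda', 0)
    waveform = waveform.to(device)  # move the audio to the first GPU

    # Fails in torchaudio 0.5.0 when the input is a CUDA tensor.
    fbank = torchaudio.compliance.kaldi.fbank(waveform)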

produces the following error:

Traceback (most recent call last):
  File "./fbank_test.py", line 27, in <module>
    main()
  File "./fbank_test.py", line 23, in main
    fbank = torchaudio.compliance.kaldi.fbank(waveform)
  File "/xxxxxx/py35/lib/python3.5/site-packages/torchaudio/compliance/kaldi.py", line 554, in fbank
    snip_edges, raw_energy, energy_floor, dither, remove_dc_offset, preemphasis_coefficient)
  File "/xxxxx/py35/lib/python3.5/site-packages/torchaudio/compliance/kaldi.py", line 180, in _get_window
    signal_log_energy = _get_log_energy(strided_input, EPSILON, energy_floor)  # size (m)
  File "/xxxxx/py35/lib/python3.5/site-packages/torchaudio/compliance/kaldi.py", line 113, in _get_log_energy
    log_energy = torch.max(strided_input.pow(2).sum(1), epsilon).log()  # size (m)
RuntimeError: iter.device(arg).is_cuda() INTERNAL ASSERT FAILED at /pytorch/aten/src/ATen/native/cuda/Loops.cuh:56, please report a bug to PyTorch.

torchaudio was installed using

    pip install torchaudio

and

    import torch
    import torchaudio

    print(torch.version.__version__)
    print(torch.version.git_version)
    print(torchaudio.version.__version__)
    print(torchaudio.version.git_version)

produces

1.5.0
4ff3872a2099993bf7e8c588f7182f3df777205b
0.5.0
3305d5c2f935b1512c58faad0aa770432dea6d21

mthrok · 2020-05-07T02:33:57Z

I could reproduce it with 'test/assets/kaldi_file_8000.wav'.

Env

$ python -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 1.5.0
Is debug build: No
CUDA used to build PyTorch: 9.2

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: version 3.10.2

Python version: 3.5
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: Quadro GP100
GPU 1: Quadro GP100

Nvidia driver version: 418.116.00
cuDNN version: Could not collect

Versions of relevant libraries:
[pip] numpy==1.14.2
[pip] torch==1.5.0
[pip] torchaudio==0.5.0a0+3305d5c
[conda] blas                      1.0                         mkl
[conda] mkl                       2020.0                      166
[conda] pytorch                   1.5.0           py3.5_cuda9.2.148_cudnn7.6.3_0    pytorch
[conda] torchaudio                0.5.0                      py35    pytorch

It happens both on CUDA 9.2 and CUDA 10.1.

It also happens on torchaudio master. Something is definitely wrong in either torchaudio or PyTorch. I will take a look at the torchaudio side.

mthrok · 2020-05-08T03:07:03Z

Hi @csukuangfj

So it turned out that fbank was never GPU-compatible. Sorry about that.
I am working on fixing this issue and so far I have a fix for fbank.
If you have a list of functions from torchaudio.compliance.kaldi you intend to use, can you give me that?
I will also fix them.

csukuangfj · 2020-05-08T03:17:07Z

I changed torchaudio locally to support GPU.

I only need torchaudio.compliance.kaldi.fbank. It would be great if

- the window function and the melbank matrix are NOT recomputed every time fbank is called
- an interface for computing fbank for a batch of audio files is available
- an FBank interface similar to the following MFCC is implemented

audio/torchaudio/transforms.py, line 427 in 7a0d419:

    class MFCC(torch.nn.Module):

I guess during an application, the sample rate is fixed, so the window function and
melbank matrix can be reused.

> So it turned out that fbank was never GPU-compatible

Yes, I found that. For example, the EPSILON tensor is always a CPU tensor.
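
The failure mode described here can be shown in miniature: a constant created without a `device` argument lives on the CPU, so combining it with a CUDA tensor in `torch.max` fails. Below is a minimal sketch of the device-aware pattern. `log_energy` is a hypothetical stand-in for the private `_get_log_energy` helper named in the traceback, and it is not claimed to be the actual torchaudio fix.

```python
import torch


def log_energy(strided_input: torch.Tensor) -> torch.Tensor:
    """Per-frame log energy, floored at machine epsilon.

    Hypothetical stand-in for torchaudio's private helper. The key point is
    that the epsilon tensor is created on the input's device and dtype.
    """
    epsilon = torch.tensor(
        torch.finfo(strided_input.dtype).eps,
        dtype=strided_input.dtype,
        device=strided_input.device,
    )
    # Both operands of torch.max now share a device, on CPU and on CUDA alike.
    return torch.max(strided_input.pow(2).sum(1), epsilon).log()


# Use the GPU when one is present, otherwise fall back to the CPU.
device = torch.device("cuda", 0) if torch.cuda.is_available() else torch.device("cpu")
frames = torch.randn(4, 400, device=device)  # (num_frames, frame_length)
energy = log_energy(frames)
```

The same function runs unchanged on either device because nothing in it hard-codes where the constant lives.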

mthrok · 2020-05-08T23:08:37Z

> I only need torchaudio.compliance.kaldi.fbank. It would be great if
>
> - the window function and the melbank matrix are NOT recomputed every time fbank is called
> - an interface of computing fbank for a batch of audio files is available
> - a similar interface FBank as the following MFCC is implemented
>
> audio/torchaudio/transforms.py, line 427 in 7a0d419:
>
>     class MFCC(torch.nn.Module):
>
> I guess during an application, the sample rate is fixed, so the window function and
> melbank matrix can be reused.

That is a very valid point. I think a simple approach for torchaudio to support that is to

- have a version of fbank where the user can provide a precomputed melbank and window function,
- then put them in a Transform.
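
A rough sketch of that idea, assuming a fixed sample rate: build the window and the mel filterbank once in `__init__`, register them as buffers so that `.to(device)` moves them with the module, and reuse them on every call. `CachedFbank` and all of its parameters are hypothetical names, not torchaudio API. The filterbank is a simplified triangular one and the Kaldi preprocessing steps (dither, DC offset removal, pre-emphasis) are omitted, so the output is not Kaldi-compatible. It also relies on `torch.fft.rfft`, which assumes a much newer PyTorch than the 1.5.0 used in this thread.

```python
import math

import torch


def _hz_to_mel(hz: float) -> float:
    return 1127.0 * math.log(1.0 + hz / 700.0)


def _mel_filterbank(n_mels: int, n_fft: int, sample_rate: float) -> torch.Tensor:
    """Simplified triangular mel filterbank, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = torch.linspace(0.0, sample_rate / 2, n_fft // 2 + 1)
    fft_mels = 1127.0 * torch.log1p(fft_freqs / 700.0)
    # n_mels triangles need n_mels + 2 equally spaced edges on the mel scale.
    edges = torch.linspace(_hz_to_mel(0.0), _hz_to_mel(sample_rate / 2), n_mels + 2)
    left, center, right = edges[:-2, None], edges[1:-1, None], edges[2:, None]
    up = (fft_mels[None, :] - left) / (center - left)
    down = (right - fft_mels[None, :]) / (right - center)
    return torch.clamp(torch.min(up, down), min=0.0)


class CachedFbank(torch.nn.Module):
    """Hypothetical transform that caches the window and mel matrix."""

    def __init__(self, sample_rate=16000.0, frame_length=400, frame_shift=160, n_mels=23):
        super().__init__()
        self.frame_length = frame_length
        self.frame_shift = frame_shift
        # Round the FFT size up to the next power of two.
        self.n_fft = 1 << (frame_length - 1).bit_length()
        # Povey-style window: a symmetric Hann window raised to the power 0.85.
        window = torch.hann_window(frame_length, periodic=False).pow(0.85)
        # Buffers are computed once and follow the module across devices.
        self.register_buffer("window", window)
        self.register_buffer("mel_banks", _mel_filterbank(n_mels, self.n_fft, sample_rate))

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        """waveform: (batch, time) -> log mel energies: (batch, frames, n_mels)."""
        frames = waveform.unfold(-1, self.frame_length, self.frame_shift)
        spectrum = torch.fft.rfft(frames * self.window, n=self.n_fft)
        power = spectrum.abs().pow(2)
        mel = power @ self.mel_banks.T
        return torch.clamp(mel, min=torch.finfo(mel.dtype).eps).log()


transform = CachedFbank()
features = transform(torch.randn(2, 16000))  # a batch of two one-second clips
```

Because the input is `(batch, time)`, the same module also covers the batched case, provided the clips in a batch have equal length. Moving it with `transform.to("cuda")` carries both cached tensors along.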

csukuangfj · 2020-05-11T09:49:57Z

> That is a very valid point. I think a simple approach for torch audio to support that is
>
> - have a version of fbank where user can provide precomputed melbank and window function
> - then put them in a Transform.

Are there any plans to implement the items listed above?

mthrok · 2020-05-13T02:40:22Z

> > That is a very valid point. I think a simple approach for torch audio to support that is
> >
> > - have a version of fbank where user can provide precomputed melbank and window function
> > - then put them in a Transform.
>
> Are there any plans to implement the items listed above?

No, we do not have the resources for that at the moment, but we welcome contributions.

shanesyy-1992 · 2020-11-22T02:07:37Z

> Hi @csukuangfj
>
> So it turned out that fbank was never GPU-compatible. Sorry about that.
> I am working on fixing this issue and so far I have a fix for fbank.
> If you have a list of functions from torchaudio.compliance.kaldi you intend to use, can you give me that?
> I will also fix them.

Hello, I ran into the same issue. Because of some environment constraints, I need to stay on torch 1.5.1, so the corresponding torchaudio version is 0.5.1. However, I found that only versions from 0.6.0 onward include the fix. Would it be possible to backport the changes here to the 0.5.1 release? Many thanks!

mthrok · 2020-11-22T03:23:54Z

Hi @shanesyy-1992

Unfortunately, we do not have a plan to update the past releases. Since the code is pure Python, the best you can do is to create a patch file, then apply it to your installation.
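
If patching the installed package is not practical, one possible stopgap is to compute the features on the CPU and move the result back to the GPU. This is not something suggested in the thread, so treat it as an assumption; it gives up GPU acceleration for this step. `compute_on_cpu` is a hypothetical helper, written generically so that it does not depend on torchaudio being importable.

```python
from typing import Callable

import torch


def compute_on_cpu(
    feature_fn: Callable[[torch.Tensor], torch.Tensor],
    waveform: torch.Tensor,
) -> torch.Tensor:
    """Run a CPU-only feature function and return the result on waveform's device."""
    features = feature_fn(waveform.cpu())
    return features.to(waveform.device)


# Intended usage (requires torchaudio, so it is left as a comment here):
#   fbank = compute_on_cpu(torchaudio.compliance.kaldi.fbank, waveform_on_gpu)
```

The extra host/device copies cost time on every call, so this only makes sense as a temporary measure until an upgrade or a local patch is possible.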

shanesyy-1992 · 2020-11-22T09:41:06Z

Got it. Thanks a lot for the reply!

vincentqb assigned mthrok May 6, 2020

vincentqb mentioned this issue May 6, 2020: Migrate Kaldi tests #597 (Open, 11 tasks)

mthrok added bug CUDA Kaldi labels May 7, 2020

mthrok mentioned this issue May 7, 2020: Introduce common utility for defining test matrix for device/dtype #616 (Merged)

mthrok mentioned this issue May 8, 2020: Make Kaldi fbank support cuda #619 (Merged)

vincentqb closed this as completed in #619 May 12, 2020
