Kernel size can't be greater than actual input size #362

@edwardyoon

Description

I've downloaded the pre-trained model and deepspeech.pytorch-1.1. When I try to transcribe audio, it throws `RuntimeError: Calculated padded input size per channel: (61 x 6). Kernel size: (21 x 11). Kernel size can't be greater than actual input size`. Any advice?

I'm using CUDA 10.0, and all installation steps finished successfully.

```
edward@GPU-machine:~/deepspeech.pytorch-1.1$ python transcribe.py --model_path models/librispeech_pretrained.pth --audio_path fox_question.wav 
Traceback (most recent call last):
  File "transcribe.py", line 88, in <module>
    out = model(Variable(spect, volatile=True))
  File "/home/edward/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/edward/deepspeech.pytorch-1.1/model.py", line 172, in forward
    x = self.conv(x)
  File "/home/edward/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/edward/.local/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/edward/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/edward/.local/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 301, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Calculated padded input size per channel: (61 x 6). Kernel size: (21 x 11). Kernel size can't be greater than actual input size at /pytorch/aten/src/THNN/generic/SpatialConvolutionMM.c:48
```
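For context, the traceback shows the conv layer receiving a padded input of 61 x 6 per channel against a 21 x 11 kernel, so the last (time) axis has only 6 frames where the kernel needs 11. The error can be reproduced outside deepspeech.pytorch with a minimal sketch; the channel counts below are illustrative, not the actual values in model.py:

```python
import torch
import torch.nn as nn

# Shapes taken from the traceback: the conv sees a 61 x 6 input per
# channel but applies a 21 x 11 kernel, and 11 > 6 along the time axis.
# Channel counts here are placeholders, not the values from model.py.
conv = nn.Conv2d(in_channels=32, out_channels=32, kernel_size=(21, 11))
x = torch.randn(1, 32, 61, 6)  # (batch, channels, freq, time)

try:
    conv(x)
except RuntimeError as e:
    print(e)  # "... Kernel size can't be greater than actual input size"
```

If this is the cause, it would mean the spectrogram reaching this layer is only a handful of frames wide, i.e. the audio clip may simply be too short for the model's conv stack.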
