Failed to generate timestamp for nvidia/parakeet-tdt-1.1b #8451

Closed
leohuang2013 opened this issue Feb 17, 2024 · 6 comments
Labels: bug (Something isn't working), stale

Comments

leohuang2013 commented Feb 17, 2024

Describe the bug

When I tried to generate timestamps with the model nvidia/parakeet-tdt-1.1b, I got the following error:
ValueError: char_offsets: [{'char': [tensor(607, dtype=torch.int32)], 'start_offset': 28, 'end_offset': 29}....

Call stack:

Traceback (most recent call last):
  File "/tmp/inference/nvidia_asr.py", line 103, in <module>
    main()
  File "/tmp/inference/nvidia_asr.py", line 94, in main
    tt = parakeet_rnnt( audio, 'tdt' )
  File "/tmp/inference/nvidia_asr.py", line 45, in parakeet_rnnt
    hypothesis = asr_model.transcribe([audio], return_hypotheses=True)[0][0]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/models/rnnt_models.py", line 298, in transcribe
    best_hyp, all_hyp = self.decoding.rnnt_decoder_predictions_tensor(
  File "/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/metrics/rnnt_wer.py", line 497, in rnnt_decoder_predictions_tensor
    hypotheses[hyp_idx] = self.compute_rnnt_timestamps(hypotheses[hyp_idx], timestamp_type)
  File "/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/metrics/rnnt_wer.py", line 699, in compute_rnnt_timestamps
    raise ValueError(

Steps/Code to reproduce bug
Code to reproduce the bug above:
(The same code does produce timestamps when used with the parakeet rnnt-1.1b model.)

import nemo.collections.asr as nemo_asr
from omegaconf import open_dict

# `audio` is defined elsewhere in the script (not shown in the issue).
asr_model = nemo_asr.models.ASRModel.from_pretrained("nvidia/parakeet-tdt-1.1b")

# Enable word-level timestamps in the decoding config.
decoding_cfg = asr_model.cfg.decoding
with open_dict(decoding_cfg):
    decoding_cfg.preserve_alignments = True
    decoding_cfg.compute_timestamps = True
    decoding_cfg.rnnt_timestamp_type = 'word'
asr_model.change_decoding_strategy(decoding_cfg)

# This call raises the ValueError shown above.
hypothesis = asr_model.transcribe([audio], return_hypotheses=True)[0][0]
timestamp_dict = hypothesis.timestep
word_timestamps = timestamp_dict['word']
print(word_timestamps)

Expected behavior
It should output word timestamps instead of raising an exception.

Environment overview (please complete the following information)

  • Environment location: local Ubuntu 22.04 machine
  • Method of NeMo install: pip install nemo_toolkit['all']

Environment details

If NVIDIA docker image is used you don't need to specify these.
Otherwise, please provide:

  • OS version: Ubuntu 22.04
  • PyTorch version: 2.2.0+cu118
  • Python version: 3.10.12

Additional context

GPU model: GTX 1080T

leohuang2013 added the bug (Something isn't working) label Feb 17, 2024
isaac-mcfadyen (Contributor) commented

Also seeing this error.

My temporary workaround is to catch the ValueError, append a second or two of blank audio to the end of the file, and re-process it. This seems to work as a stop-gap until the bug is fixed.
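
The workaround described above can be sketched roughly as follows. This is only an illustration, assuming the input is an uncompressed PCM WAV file. The helper names `append_silence` and `transcribe_with_padding_retry` are hypothetical, and `transcribe` stands for any callable that takes a file path and raises ValueError on failure (for example, a thin wrapper around the `asr_model.transcribe` call from the original report).

```python
import os
import tempfile
import wave


def append_silence(src_path, dst_path, seconds=2.0):
    """Copy a PCM WAV file to dst_path with `seconds` of silence appended."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(params.nframes)
    n_pad = int(seconds * params.framerate)
    # 8-bit PCM WAV is unsigned (silence is 0x80); wider samples are signed (silence is 0).
    silence_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
    padding = silence_byte * (n_pad * params.sampwidth * params.nchannels)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # The wave module corrects the frame count in the header on close.
        dst.writeframes(frames + padding)


def transcribe_with_padding_retry(transcribe, audio_path, pad_seconds=2.0):
    """Call transcribe(audio_path); on ValueError, retry once on a padded copy."""
    try:
        return transcribe(audio_path)
    except ValueError:
        # Hypothetical stop-gap: pad a temporary copy and try again.
        fd, padded_path = tempfile.mkstemp(suffix=".wav")
        os.close(fd)
        try:
            append_silence(audio_path, padded_path, pad_seconds)
            return transcribe(padded_path)
        finally:
            os.remove(padded_path)
```

With the model from the report, the callable could be something like `lambda path: asr_model.transcribe([path], return_hypotheses=True)[0][0]`. The sketch retries only once, so a ValueError raised on the padded copy still propagates to the caller.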

github-actions bot (Contributor) commented Apr 5, 2024

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions bot added the stale label Apr 5, 2024
bradmurray-dt commented

Adding a comment to prevent this issue from being closed.

github-actions bot removed the stale label Apr 10, 2024

github-actions bot (Contributor) commented

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions bot added the stale label May 10, 2024

github-actions bot (Contributor) commented

This issue was closed because it has been inactive for 7 days since being marked as stale.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) May 18, 2024

anshulwadhawan commented Jun 19, 2024

@bradmurray-dt Still facing this issue with 'parakeet-tdt-1.1b' and 'parakeet-tdt-ctc-1.1b':

  File "/lib/python3.10/site-packages/nemo/collections/asr/parts/submodules/rnnt_decoding.py", line 510, in rnnt_decoder_predictions_tensor
    hypotheses[hyp_idx] = self.compute_rnnt_timestamps(hypotheses[hyp_idx], timestamp_type)
  File "/lib/python3.10/site-packages/nemo/collections/asr/parts/submodules/rnnt_decoding.py", line 753, in compute_rnnt_timestamps
    raise ValueError(
ValueError: `char_offsets`: [{'char': [tensor(386, dtype=torch.int32)], 'start_offset': 2, 'end_offset': 3},.....
have to be of the same length, but are: `len(offsets)`: 102 and `len(processed_tokens)`: 103
