Bug triggered by empty input to the punc model in auto_model #1660

clb-123 · 2024-04-25T10:03:57Z

🐛 Bug

When the text passed to the punc model is empty, the following error is raised:

Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.DoubleTensor instead (while checking arguments for embedding)

To Reproduce

Steps to reproduce the behavior (always include the command you ran):
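1. When using the pipeline to run ASR inference with the speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch model, if the ASR result is empty (no speech is detected), the punc model in the pipeline still runs inference on the empty ASR text, which raises the error.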

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

pipeline(
    task=Tasks.auto_speech_recognition,
    model='iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
    model_revision="v2.0.4",
    vad_model='iic/speech_fsmn_vad_zh-cn-16k-common-pytorch',
    vad_model_revision="v2.0.4",
    punc_model='iic/punc_ct-transformer_zh-cn-common-vocab272727-pytorch',
    punc_model_revision="v2.0.4",
    spk_model="iic/speech_campplus_sv_zh-cn_16k-common",
    spk_model_revision="v2.0.2",
    spk_mode='punc_segment',
)

See error

Code sample

File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/modelscope/pipelines/audio/funasr_pipeline.py", line 73, in __call__
output = self.model(*args, **kwargs)
File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/modelscope/models/base/base_model.py", line 35, in __call__
return self.postprocess(self.forward(*args, **kwargs))
File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/modelscope/models/audio/funasr/model.py", line 61, in forward
output = self.model.generate(*args, **kwargs)
File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/funasr/auto/auto_model.py", line 205, in generate
return self.inference_with_vad(input, input_len=input_len, **cfg)
File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/funasr/auto/auto_model.py", line 386, in inference_with_vad
punc_res = self.inference(result["text"], model=self.punc_model, kwargs=self.punc_kwargs, disable_pbar=True, **cfg)
File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/funasr/auto/auto_model.py", line 237, in inference
results, meta_data = model.inference(**batch, **kwargs)
File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/funasr/models/ct_transformer/model.py", line 272, in inference
y, _ = self.punc_forward(**data)
File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/funasr/models/ct_transformer/model.py", line 83, in punc_forward
x = self.embed(text)
File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 162, in forward
return F.embedding(
File "/root/miniconda3/envs/funasr/lib/python3.8/site-packages/torch/nn/functional.py", line 2233, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.DoubleTensor instead (while checking arguments for embedding)
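
The traceback ends at `self.embed(text)`: an embedding lookup needs integer indices, and an empty input has most likely been turned into a floating-point tensor by default (an empty NumPy array, for instance, defaults to float64). A possible caller-side workaround is to skip punctuation whenever the ASR text is empty. The sketch below is not FunASR code; `punctuate_if_nonempty` and `punc_fn` are hypothetical names, and `punc_fn` stands in for whatever callable runs the punc model.

```python
from typing import Callable, Optional


def punctuate_if_nonempty(text: Optional[str], punc_fn: Callable[[str], str]) -> str:
    """Run punctuation restoration only when there is text to punctuate.

    `punc_fn` is a hypothetical stand-in for the call into the punc model.
    """
    if text is None or not text.strip():
        # Nothing was recognized, so skip the punc model entirely.
        return ""
    return punc_fn(text)


if __name__ == "__main__":
    # Placeholder for a real punc model: appends a full stop.
    fake_punc = lambda s: s + "。"
    print(repr(punctuate_if_nonempty("", fake_punc)))
    print(repr(punctuate_if_nonempty("你好", fake_punc)))
```

The guard treats whitespace-only text as empty as well, on the assumption that such text carries nothing to punctuate.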

Expected behavior

Please fix this bug. If it has already been fixed, please tell me which funasr version contains the fix.

Environment

Linux: Alibaba Cloud Linux 3.2104 LTS 64-bit
FunASR Version: 1.0.8
ModelScope Version: 1.11.1
PyTorch Version: 2.1.2
Installed with: pip install funasr
Python version: 3.8
GPU: NVIDIA T4
CUDA/cuDNN version: NVIDIA-SMI 530.30.02, Driver Version: 530.30.02, CUDA Version: 12.1
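
Since the fix depends on which release is installed, it can help to read the versions from the active environment instead of typing them by hand. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is made up for illustration, and the package names are the ones listed above.

```python
from importlib import metadata
from typing import Dict, Iterable


def installed_versions(packages: Iterable[str]) -> Dict[str, str]:
    """Return the installed version of each package, or 'not installed'."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Print in the same name==version form used when reporting versions.
    for name, version in installed_versions(["funasr", "modelscope", "torch"]).items():
        print(f"{name}=={version}")
```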


LauraGPT · 2024-05-07T16:43:11Z

pip install -U funasr modelscope

clb-123 · 2024-05-08T01:55:00Z

pip install -U funasr modelscope

I have updated the version. This is the current version:
funasr==1.0.25
modelscope==1.14.0

There is a new problem now. When the input audio does not contain any active speech, the following error occurs:
Traceback (most recent call last):
File "D:\work_program\call-center-asr\engine_frame\web\service\webrtc_asr.py", line 176, in dbfs_check
result.asr_content = double_channel_wav_asr(wavfile_path, file_right_path, file_left_path, is_dbfs=True,
File "D:\work_program\call-center-asr\engine_frame\web\service\webrtc_asr.py", line 190, in double_channel_wav_asr
return asr_pipeline_by_file(file_left_path, file_right_path, ascii_flag, right_bfs_cal)
File "D:\work_program\call-center-asr\engine_frame\web\service\webrtc_asr.py", line 254, in asr_pipeline_by_file
result.left_content = model.generate(input=file_left_path)
File "D:\work_install\miniconda3\envs\funasr\lib\site-packages\funasr\auto\auto_model.py", line 232, in generate
return self.inference_with_vad(input, input_len=input_len, **cfg)
File "D:\work_install\miniconda3\envs\funasr\lib\site-packages\funasr\auto\auto_model.py", line 434, in inference_with_vad
if raw_text is None:
UnboundLocalError: local variable 'raw_text' referenced before assignment
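
An `UnboundLocalError` like this usually means a local variable is assigned only on a code path that did not run, for example inside a loop over segments when there are none. The sketch below illustrates that general pattern and the usual fix of initializing the variable first. It is not the actual code in `auto_model.py`; the function names and the segment list are invented for illustration.

```python
from typing import List, Optional


def collect_text_buggy(segments: List[str]) -> Optional[str]:
    for seg in segments:
        # raw_text is only bound if the loop body runs at least once.
        raw_text = seg
    if raw_text is None:  # raises UnboundLocalError when segments is empty
        return None
    return raw_text


def collect_text_fixed(segments: List[str]) -> Optional[str]:
    raw_text = None  # bind the name before the loop so it always exists
    for seg in segments:
        raw_text = seg if raw_text is None else raw_text + seg
    return raw_text


if __name__ == "__main__":
    try:
        collect_text_buggy([])
    except UnboundLocalError as exc:
        print("buggy version failed:", exc)
    print("fixed version returned:", collect_text_fixed([]))
```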

LauraGPT · 2024-05-09T02:51:22Z

Please provide details to reproduce the issue, using only the demo code.

clb-123 · 2024-05-09T03:11:16Z

Please provide details to reproduce the issue, using only the demo code.

Demo:
from funasr import AutoModel

model = AutoModel(
    model="iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
    model_revision="v2.0.4",
    vad_model="iic/speech_fsmn_vad_zh-cn-8k-common",
    vad_model_revision="v2.0.4",
    punc_model="ct-punc-c",
    punc_model_revision="v2.0.4",
    spk_model="cam++",
    spk_model_revision="v2.0.2",
    spk_mode='punc_segment',
)
# wav_file is the path to the audio file to transcribe
res = model.generate(input=wav_file)
Here is a test audio file that triggers this bug:
test_audio.zip
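
Until the library handles audio without speech, a caller could treat this specific failure as "no result". The wrapper below is only a sketch under that assumption: `safe_generate` is a hypothetical helper, it relies only on the `generate(input=...)` call shown in the demo, and returning an empty list for "no speech" is my own choice, not documented FunASR behavior.

```python
from typing import Any


def safe_generate(model: Any, wav_file: str) -> Any:
    """Call model.generate and map the raw_text failure to an empty result."""
    try:
        return model.generate(input=wav_file)
    except UnboundLocalError as exc:
        # Only swallow the error reported in this thread; re-raise anything else.
        if "raw_text" not in str(exc):
            raise
        return []  # assumption: an empty list means "no speech recognized"
```

Matching on the variable name keeps the workaround narrow, so unrelated errors still surface.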

clb-123 added the bug (Something isn't working) label on Apr 25, 2024
