How do I align the model's output text with the timestamps when their lengths differ? #1795

@kirayomato

Notice: To help us resolve issues more efficiently, please follow the template when raising an issue and fill in the details.

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

What is your question?

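I am trying to use funasr to generate subtitles for my video, but the recognized text and the timestamp list have different lengths. How can I map the text to the timestamps?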

Code

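from funasr import AutoModel

# Placeholder path; point this at the video to transcribe.
video_path = "input.mp4"

# ASR model plus voice-activity detection and punctuation restoration.
model = AutoModel(model="paraformer-zh",
                  vad_model="fsmn-vad",
                  punc_model="ct-punc",
                  # spk_model="cam++"
                  )
res = model.generate(input=video_path,
                     batch_size_s=300,
                     # hotword='魔搭'
                     )
text = res[0]['text']      # recognized text (a string)
ts = res[0]['timestamp']   # list of timestamp entries
# The two lengths printed here do not match.
print(len(text), len(ts))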
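
One possible way to line the two up is sketched below. It rests on assumptions the issue does not confirm: that each entry of `ts` is a `[start, end]` pair (probably in milliseconds) for one token, that a token is either a single Chinese character or one whitespace-delimited English word, and that the punctuation added by `punc_model` and any spaces carry no timestamp of their own. The helper names `tokenize` and `to_segments` are hypothetical and are not part of funasr.

```python
import re
import unicodedata

# Characters treated as subtitle boundaries (an assumption; adjust as needed).
SENTENCE_BREAKS = "，。！？；,.!?;"


def is_punct_or_space(ch: str) -> bool:
    """True for whitespace and any Unicode punctuation character."""
    return ch.isspace() or unicodedata.category(ch).startswith("P")


def tokenize(text: str) -> list[str]:
    """Split text into the assumed timestamp units: one CJK character each,
    or one run of ASCII letters/digits (an English word)."""
    tokens, word = [], ""
    for ch in text:
        # Keep apostrophes that appear inside an English word ("don't").
        if ch.isascii() and (ch.isalnum() or (ch == "'" and word)):
            word += ch
            continue
        if word:
            tokens.append(word)
            word = ""
        if not is_punct_or_space(ch):
            tokens.append(ch)
    if word:
        tokens.append(word)
    return tokens


def to_segments(text, timestamps, breaks=SENTENCE_BREAKS):
    """Group tokens into (segment_text, start, end) tuples, splitting the
    text at the given punctuation and consuming timestamps in order."""
    segments, i = [], 0
    pattern = "[" + re.escape(breaks) + "]"
    for piece in re.split(pattern, text):
        n = len(tokenize(piece))
        if n == 0:
            continue
        span = timestamps[i:i + n]
        if len(span) < n:
            raise ValueError("fewer timestamps than tokens; assumptions do not hold")
        segments.append((piece.strip(), span[0][0], span[-1][1]))
        i += n
    if i != len(timestamps):
        raise ValueError("more timestamps than tokens; assumptions do not hold")
    return segments


# Synthetic data for illustration only, not real funasr output.
demo_text = "你好，世界。hello world！"
demo_ts = [[0, 100], [100, 200], [300, 400], [400, 500], [600, 900], [900, 1200]]
for segment in to_segments(demo_text, demo_ts):
    print(segment)
```

With real output, `to_segments(text, ts)` would be the corresponding call. If it raises `ValueError`, the tokenization assumption does not match what the model emits, and the rule in `tokenize` needs adjusting before the segments can be trusted for subtitles.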

What have you tried?

What's your environment?

  • OS (e.g., Linux): Windows 11
  • FunASR Version (e.g., 1.0.0): 1.0.27
  • ModelScope Version (e.g., 1.11.0): 1.12.0
  • PyTorch Version (e.g., 2.0.0): 2.2.1
  • How you installed funasr (pip, source): pip
  • Python version: 3.11.0
  • GPU (e.g., V100M32): RTX 4060
  • CUDA/cuDNN version (e.g., cuda11.7): cuda 12.1
  • Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1)
  • Any other relevant information:
