
Fun-ASR-Nano-2512: error when loading checkpoint #2799

@fumin

🐛 Bug

Simply loading Fun-ASR-Nano-2512 fails with: binascii.Error: Incorrect padding

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

  1. Run the code sample below, which does nothing but construct AutoModel with the Fun-ASR-Nano-2512 model directory.
  2. Observe the following output and traceback:
funasr version: 1.3.1.
Check update of funasr, and it would cost few times. You may disable it by set `disable_update=True` in AutoModel
You are using the latest version of funasr-1.3.1
WARNING:root:trust_remote_code: False
Traceback (most recent call last):
  File "/home/shaoyu/asr/Fun-ASR/qq.py", line 4, in <module>
    model = AutoModel(
            ^^^^^^^^^^
  File "/home/shaoyu/asr/asr/.venv/lib/python3.12/site-packages/funasr/auto/auto_model.py", line 135, in __init__
    model, kwargs = self.build_model(**kwargs)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shaoyu/asr/asr/.venv/lib/python3.12/site-packages/funasr/auto/auto_model.py", line 285, in build_model
    model = model_class(**model_conf)
            ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shaoyu/asr/asr/.venv/lib/python3.12/site-packages/funasr/models/fun_asr_nano/model.py", line 125, in __init__
    ctc_tokenizer = ctc_tokenizer_class(**ctc_tokenizer_conf)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shaoyu/asr/asr/.venv/lib/python3.12/site-packages/funasr/tokenizer/whisper_tokenizer.py", line 37, in SenseVoiceTokenizer
    tokenizer = get_tokenizer(
                ^^^^^^^^^^^^^^
  File "/home/shaoyu/asr/asr/.venv/lib/python3.12/site-packages/funasr/models/sense_voice/whisper_lib/tokenizer.py", line 454, in get_tokenizer
    encoding = get_encoding(name=encoding_name, num_languages=num_languages, vocab_path=vocab_path)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shaoyu/asr/asr/.venv/lib/python3.12/site-packages/funasr/models/sense_voice/whisper_lib/tokenizer.py", line 374, in get_encoding
    base64.b64decode(token): int(rank)
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/base64.py", line 88, in b64decode
    return binascii.a2b_base64(s, strict_mode=validate)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
binascii.Error: Incorrect padding
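
The last frames of the traceback show the failure inside get_encoding, at the expression base64.b64decode(token): int(rank), so one vocabulary entry is apparently not valid base64. Below is a minimal diagnostic sketch for locating such entries. It assumes the vocabulary file has one "<base64 token> <rank>" pair per line, which is what that frame suggests but is not confirmed here. The function name find_bad_vocab_lines is hypothetical and not part of funasr.

```python
import base64
import binascii
from pathlib import Path


def find_bad_vocab_lines(vocab_path):
    """Return (line_number, text, reason) for each line that would not parse.

    Assumption: every non-blank line is "<base64 token> <rank>".
    """
    bad = []
    with Path(vocab_path).open(encoding="utf-8", errors="replace") as f:
        for line_number, line in enumerate(f, start=1):
            text = line.strip()
            if not text:
                continue  # blank lines are ignored in this sketch
            parts = text.split()
            if len(parts) != 2:
                bad.append((line_number, text, "expected two fields"))
                continue
            token, rank = parts
            try:
                base64.b64decode(token)  # raises binascii.Error on bad padding
                int(rank)                # raises ValueError on a non-integer rank
            except (binascii.Error, ValueError) as exc:
                bad.append((line_number, text, f"{type(exc).__name__}: {exc}"))
    return bad
```

Calling find_bad_vocab_lines on the vocabulary file that the model's tokenizer configuration points to returns an empty list if every line parses. Any reported line number gives a starting point for checking whether the downloaded file is truncated or otherwise differs from the published one.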

Code sample

from funasr import AutoModel

model_dir = "../Fun-ASR-Nano-2512"
model = AutoModel(
    model=model_dir,
    device="cpu",
)

Expected behavior

The model should load without any errors.

Environment

  • OS (e.g., Linux): Ubuntu 24.04.3 LTS
  • FunASR Version (e.g., 1.0.0): 1.3.1
  • ModelScope Version (e.g., 1.11.0): 1.34.0
  • PyTorch Version (e.g., 2.0.0): 2.10.0+cu128
  • How you installed funasr (pip, source): pip
  • Python version: 3.12.3
  • GPU (e.g., V100M32): None, using CPU
  • CUDA/cuDNN version (e.g., cuda11.7): None, using CPU
  • Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1): Not using docker, using Ubuntu directly
  • Any other relevant information:

Additional context

Everything worked last week; the error appeared after a fresh reinstall in the past few days.
