Commit

Merge pull request #5666 from pengchengguo/whisper_tokenizer
Correct the argument errors in the whisper tokenizer language.
sw005320 committed Feb 19, 2024
2 parents d8b53fd + e2fd18c commit a50d6a0
Showing 4 changed files with 6 additions and 6 deletions.
@@ -28,7 +28,8 @@ decoder_conf:

 preprocessor: default
 preprocessor_conf:
-    tokenizer_language: "zh"
+    whisper_language: "zh"
+    whisper_task: "transcribe"

 model_conf:
     ctc_weight: 0.0
@@ -28,7 +28,8 @@ decoder_conf:

 preprocessor: default
 preprocessor_conf:
-    tokenizer_language: "zh"
+    whisper_language: "zh"
+    whisper_task: "transcribe"

 model_conf:
     ctc_weight: 0.0
@@ -28,7 +28,8 @@ decoder_conf:

 preprocessor: default
 preprocessor_conf:
-    tokenizer_language: "zh"
+    whisper_language: "zh"
+    whisper_task: "transcribe"

 model_conf:
     ctc_weight: 0.0
3 changes: 0 additions & 3 deletions espnet2/bin/asr_inference.py
@@ -386,9 +386,6 @@ def __init__(
             else:
                 tokenizer = None
         elif "whisper" in token_type:
-            tokenizer_language = asr_train_args.preprocessor_conf.get(
-                "tokenizer_language", "en"
-            )
             tokenizer = build_tokenizer(
                 token_type=token_type,
                 bpemodel=bpemodel,
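
Note on migrating existing configs: each of the three config hunks in this commit replaces the `tokenizer_language` key under `preprocessor_conf` with `whisper_language` and adds `whisper_task`. The following is a minimal sketch of applying the same rename to an already-loaded `preprocessor_conf` mapping. The helper name `migrate_preprocessor_conf` is hypothetical and not part of ESPnet, and defaulting `whisper_task` to "transcribe" is an assumption that simply mirrors the value used in these configs.

```python
def migrate_preprocessor_conf(conf):
    """Return a copy of a preprocessor_conf mapping that uses the renamed keys.

    Hypothetical helper for illustration only; it is not part of ESPnet.
    """
    migrated = dict(conf)  # leave the caller's mapping untouched

    # Carry the value of the old key over to the new key, unless the
    # new key has already been set explicitly.
    if "tokenizer_language" in migrated:
        old_value = migrated.pop("tokenizer_language")
        migrated.setdefault("whisper_language", old_value)

    # Assumption: default the task to the value used in this commit's configs.
    migrated.setdefault("whisper_task", "transcribe")
    return migrated


old_conf = {"tokenizer_language": "zh"}
print(migrate_preprocessor_conf(old_conf))
```

Because the helper uses `setdefault`, a config that already defines `whisper_language` or `whisper_task` keeps its own values, and only the stale `tokenizer_language` key is dropped.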
