The models are not trained to transcribe into non-English languages, so it's actually not an intended use case (related: #649 (comment)). That being said, I guess in theory you could modify the decoding logic (the `update()` method below in particular) to reject words or phrases in English.

whisper/whisper/decoding.py

Lines 199 to 211 in 0b5dcfd

    def update(self, tokens: Tensor, logits: Tensor, sum_logprobs: Tensor) -> Tuple[Tensor, bool]:
        """Specify how to select the next token, based on the current trace and logits

        Parameters
        ----------
        tokens : Tensor, shape = (n_batch, current_sequence_length)
            all tokens in the context so far, including the…
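
To make the "reject English tokens" idea concrete, here is a minimal sketch of the general technique: set the logits of unwanted token ids to negative infinity before picking the next token. It is not Whisper's actual code. It uses plain Python lists instead of tensors, and `mask_banned_tokens`, `greedy_pick`, and the `english_ids` set are hypothetical names and data made up for illustration; a real list of English token ids would have to be built from the tokenizer.

```python
import math
from typing import List, Sequence, Set


def mask_banned_tokens(logits: Sequence[float], banned_ids: Set[int]) -> List[float]:
    """Return a copy of `logits` with every banned token id set to -inf."""
    return [-math.inf if i in banned_ids else x for i, x in enumerate(logits)]


def greedy_pick(logits: Sequence[float]) -> int:
    """Return the index of the largest logit (one greedy decoding step)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Toy vocabulary of five tokens. Ids 0-2 stand in for English tokens (assumption).
toy_logits = [2.5, 0.1, 1.7, 2.0, -0.3]
english_ids = {0, 1, 2}  # hypothetical set, not derived from any real tokenizer

masked = mask_banned_tokens(toy_logits, english_ids)
next_token = greedy_pick(masked)
print(next_token)  # 3: the best-scoring token that is not banned
```

Without the mask, the greedy step would pick token 0; with it, the highest-scoring non-banned token wins. In a modified `update()`, the same masking would presumably be applied to the `logits` tensor before the next token is selected.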

Answer selected by jongwook