Audio: ASR automatically infers the wrong language.  #515

@neldivad


Describe the bug

openai.Audio.transcribe(model='whisper-1', file=wavfile, options={"language": "en"})

OpenAI Whisper transcribes the input speech and then returns the text translated into a different language. Passing the language through the options argument, as in the call above, has no effect.

Adding an [en] prefix to the transcript does not solve the problem either. It appears that accents and environmental noise are capable of influencing the language inference model.

Looking at the openai.Audio.transcribe() function, the accepted parameters are 'model', 'file', 'api_key', 'api_base', 'api_type', 'api_version', 'organization', and 'kwargs'. There is no dedicated parameter to set the language.

It would be nice to have a parameter that forces the ASR to a given language when the automatic language inference is not reliable.
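
A possible workaround, not verified in this report: the transcription endpoint documents a `language` request field (an ISO-639-1 code such as "en"), and `Audio.transcribe` in the 0.27.x library appears to forward extra keyword arguments as request fields. Under that assumption, passing `language` as a top-level keyword instead of nesting it inside `options` may be what forces the language. The sketch below is only an illustration; `transcribe_with_language` and its `transcribe` argument are hypothetical names introduced here so the call can be exercised without the network.

```python
def transcribe_with_language(audio_file, language="en", transcribe=None):
    """Request a transcription with `language` passed as a top-level keyword.

    `transcribe` is a hypothetical injection point: by default it resolves to
    openai.Audio.transcribe (openai 0.27.x), but any callable that accepts the
    same keywords can be supplied, for example a stub in a test.
    """
    if transcribe is None:
        # Imported lazily so the helper can be tried without the package installed.
        import openai
        transcribe = openai.Audio.transcribe

    # Assumption: extra keyword arguments are forwarded as request fields,
    # so `language` goes here directly rather than inside an `options` dict.
    return transcribe(model="whisper-1", file=audio_file, language=language)
```

In the reproduction script this would replace the `options={"language": "en"}` call, for example `whisp = transcribe_with_language(tmp, "en")`. Whether this removes the wrong-language output for accented or noisy audio would still need to be confirmed against the failing recording.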

To Reproduce

# Write the captured audio to the temporary file and rewind it before upload
tmp.write(audio.get_wav_data())
tmp.seek(0)  # Reset file pointer to beginning
# The "language" entry inside options is the setting that appears to be ignored
whisp = openai.Audio.transcribe(model='whisper-1', file=tmp, options={"language": "en"})
user_input = "[en] " + whisp['text']

print(f'\nYou said:\n"{user_input}"')

messages.append({"role": "user", "content": user_input})

print("\nResponding...\n")

completion = openai.ChatCompletion.create(
    model=MODEL,
    messages=messages,
    temperature=0.8,
)
# get_timestamp_from_unix is a helper defined elsewhere in the reporter's script
created = str(get_timestamp_from_unix(completion['created']))
response = completion['choices'][0].message.content
messages.append({"role": "assistant", "content": f'Created: {created}\n{response}'})
print(f"\n{created}\n{response}\n")
You said:
"[en] Jadi, perkara ini masih tidak berfungsi. Saya telah menambahkan sufis bahasa Inggeris dan untuk sebab tertentu, ia masih memberi balik transkripsi yang diterjemahkan oleh Malaysia. Jadi, saya yakin ada masalah dengan masalah transkripsi dari OpenAI Whisper."

Responding...


2023-07-03 15:31:47
Created: 2023-07-03 15:31:42
Maaf atas ketidaknyamanannya. Terkadang, ada kesalahan dalam transkripsi bahasa yang dapat menyebabkan masalah seperti yang Anda alami. Saya akan mencatat masalah ini dan melaporkannya kepada tim teknis untuk diperiksa. Apakah ada pertanyaan lain yang dapat saya bantu jawab?

Code snippets

No response

OS

Windows

Python version

Python 3.8.3

Library version

openai==0.27.7
