Audio transcription sent to undeclared/test Google Account and not to the provided llm client

Transcribing audio content uses an unspecified Google API key rather than the provided `llm_client`, which is what one would expect it to use.

This is partly solved by #326, which at least provides an option not to route everything this way.

---

Instead of relying on the provided `llm_client`, markitdown processes audio via the [SpeechRecognition](https://github.com/Uberi/speech_recognition) library (imported as `sr`):
https://github.com/microsoft/markitdown/blob/da7bcea527ed04cf6027cc8ece1e1aad9e08a9a1/packages/markitdown/src/markitdown/converters/_transcribe_audio.py#L45-L49

In SpeechRecognition, `recognize_google` is mapped to `google_legacy`:

https://github.com/Uberi/speech_recognition/blob/46e70560f605ed190b3b0c16f198ee34978de585/speech_recognition/__init__.py#L1288

The `google_legacy` method even comes with this warning (although it does not declare where this key comes from or how the data may be used):
> The Google Speech Recognition API key is specified by ``key``. If not specified, it uses a generic key that works out of the box. This should generally be used for personal or testing purposes only, as it **may be revoked by Google at any time**.

https://github.com/Uberi/speech_recognition/blob/46e70560f605ed190b3b0c16f198ee34978de585/speech_recognition/recognizers/google.py#L225-L262

Since markitdown calls `recognize_google` without passing a `key`, the request falls back to that unspecified generic API key:
https://github.com/Uberi/speech_recognition/blob/46e70560f605ed190b3b0c16f198ee34978de585/speech_recognition/recognizers/google.py#L118-L119
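
The fallback described in that docstring can be pictured roughly as below. This is a simplified illustration, not SpeechRecognition's actual source: `GENERIC_KEY` and `resolve_key` are hypothetical names, and the placeholder string stands in for whatever key the library ships with.

```python
from typing import Optional

# Hypothetical placeholder; the real default key is defined in SpeechRecognition.
GENERIC_KEY = "<generic-built-in-key>"


def resolve_key(key: Optional[str] = None) -> str:
    """Return the caller's key, or the shared generic key when none is given."""
    return GENERIC_KEY if key is None else key


# markitdown passes no key, so it lands on the shared generic key.
print(resolve_key())
# A caller-supplied key would be respected instead.
print(resolve_key("my-own-key"))
```

The point is only that a caller who never mentions a key still ends up sending audio under someone else's credentials.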

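The markitdown call in question:

	# Excerpt from inside markitdown's transcription helper (hence the bare `return`).
	recognizer = sr.Recognizer()
	with sr.AudioFile(audio_source) as source:
	    audio = recognizer.record(source)
	    # No `key` argument is passed, so the library's generic key is used.
	    transcript = recognizer.recognize_google(audio).strip()
	    return "[No speech detected]" if transcript == "" else transcript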
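
What I would have expected instead is something along these lines: use the transcriber the caller provides, and only fall back to the generic Google route when that is explicitly allowed. This is a sketch of the expected behavior, not markitdown's API. The names `transcribe_audio`, `legacy_fallback`, `allow_legacy_fallback`, and `NoTranscriberConfigured` are all hypothetical, and the transcribers are plain callables so the example runs without any speech library.

```python
from typing import Callable, Optional

# A transcriber takes raw audio bytes and returns text (hypothetical type alias).
Transcriber = Callable[[bytes], str]


class NoTranscriberConfigured(RuntimeError):
    """Raised instead of silently sending audio to a third-party service."""


def transcribe_audio(
    audio: bytes,
    transcriber: Optional[Transcriber] = None,
    legacy_fallback: Optional[Transcriber] = None,
    allow_legacy_fallback: bool = False,
) -> str:
    """Prefer the caller's transcriber; use the fallback only on explicit opt-in."""
    if transcriber is not None:
        transcript = transcriber(audio).strip()
    elif allow_legacy_fallback and legacy_fallback is not None:
        transcript = legacy_fallback(audio).strip()
    else:
        raise NoTranscriberConfigured(
            "No transcriber was provided and the legacy fallback is not enabled."
        )
    return "[No speech detected]" if transcript == "" else transcript


# Stub transcribers standing in for an LLM client and the legacy Google route.
def llm_stub(_audio: bytes) -> str:
    return "  hello from the provided client  "


def legacy_stub(_audio: bytes) -> str:
    return "hello from the legacy route"


print(transcribe_audio(b"fake-audio", transcriber=llm_stub))
print(
    transcribe_audio(
        b"fake-audio", legacy_fallback=legacy_stub, allow_legacy_fallback=True
    )
)
```

With neither a transcriber nor the opt-in flag, the call raises rather than shipping the audio somewhere the user never agreed to.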

Audio transcription sent to undeclared/test Google Account and not to the provided llm client #1284
