Description
Transcribing audio content uses an unspecified Google API key for the transcription instead of the provided llm_client, which is not what one would expect.
This is partly addressed by #326, which at least provides an option to not route everything this way.
Instead of relying on the provided LLM client (llm_client), markitdown processes audio via the SpeechRecognition library (sr).
In SpeechRecognition, recognize_google is mapped to google_legacy.
The google_legacy method even comes with this warning (although it does not declare where this key comes from or how the data may be used):
The Google Speech Recognition API key is specified by key. If not specified, it uses a generic key that works out of the box. This should generally be used for personal or testing purposes only, as it may be revoked by Google at any time.
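To make the fallback concrete, here is a minimal, self-contained sketch of the pattern the warning describes. It is not the SpeechRecognition source: the function bodies, the placeholder key string and the returned dictionary are hypothetical, and only illustrate how a caller that never passes key silently ends up on the shared default.

```python
from typing import Optional

# Hypothetical placeholder; the real library ships its own built-in key.
GENERIC_KEY = "generic-built-in-key"


def recognize_legacy(audio_data: bytes, key: Optional[str] = None) -> dict:
    """Illustrative stand-in for a recognizer with a default-key fallback."""
    if key is None:
        # No key supplied: silently fall back to the shared generic key.
        key = GENERIC_KEY
    # A real implementation would send audio_data to the remote service here.
    return {"key_used": key, "bytes_sent": len(audio_data)}


# Alias in the style described above: the public name points at the legacy function.
recognize_google = recognize_legacy

# A caller that passes only the audio ends up on the generic key.
result = recognize_google(b"\x00\x01\x02")
print(result["key_used"])
```

Nothing in the call site hints that a shared key is in play, which is why the behaviour is easy to miss when only llm_client has been configured.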
The fallback to this unspecified API key can be seen here:
https://github.com/Uberi/speech_recognition/blob/46e70560f605ed190b3b0c16f198ee34978de585/speech_recognition/recognizers/google.py#L118-L119
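One way to express the expected behaviour is a dispatcher that uses the supplied client when there is one, and only takes the SpeechRecognition route when the caller explicitly opts in. This is a sketch under assumptions, not markitdown's API: transcribe, llm_transcriber, fallback and allow_fallback are all hypothetical names, and the two backends are stand-in callables.

```python
from typing import Callable, Optional

# A backend is modelled as any callable that turns audio bytes into text.
Transcriber = Callable[[bytes], str]


def transcribe(
    audio: bytes,
    llm_transcriber: Optional[Transcriber] = None,
    fallback: Optional[Transcriber] = None,
    allow_fallback: bool = False,
) -> str:
    """Prefer the caller's own client; never fall back implicitly."""
    if llm_transcriber is not None:
        return llm_transcriber(audio)
    if allow_fallback and fallback is not None:
        return fallback(audio)
    raise RuntimeError(
        "No transcription backend configured; "
        "refusing to send audio to an implicit third-party service."
    )


# Stand-in backends for demonstration only.
def fake_llm(audio: bytes) -> str:
    return "transcript from llm_client"


def fake_speech_recognition(audio: bytes) -> str:
    return "transcript from generic-key service"


print(transcribe(b"audio", llm_transcriber=fake_llm, fallback=fake_speech_recognition))
```

With this shape, the third-party path is only reachable when allow_fallback is set to True, so the choice to send audio elsewhere is visible at the call site.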