feat(api): Add transcribe response format request parameter & adjust STT backends #8318
Conversation
I'm totally fine to do it in a separate PR, for now it's looking good!
That's nice, thank you for switching to the official client (we should probably do the same across the codebase).
I don't see these changes in the PR - is this intentional? In any case, it looks good here - thanks!
Description
Closes #1071.
This PR adds support for the `response_format` request parameter in the transcription endpoint of the API and in the `transcribe` CLI command, in accordance with the official OpenAI API (with the addition of the `lrc` format, which I am particularly interested in :) ). The responses of the transcription endpoint now mirror the behaviour of the official API, with one exception: when the parameter is omitted, the endpoint behaves as it did previously, so existing use cases are not broken.

The start/end values for each segment have also been adjusted in the `whisper` and `faster-whisper` backends, since they were returning values that yielded incorrect results when converted to the `time.Duration` field in the main application.

In the
`faster-whisper` backend, the `compute_type` was changed from `float16` to `default`, since I was mistakenly running the model on CPU and it was failing because my CPU does not support float16 properly. Once the model was configured to use `cuda`, it worked fine, but this means that `faster-whisper` currently won't work on some CPUs. `default` works on all devices; we can add a check for the `f16` config if we want to enable float16 support in this backend, either in this PR or in a separate one.

The tests were updated to use the official OpenAI Go client, since the client used before did not support the response format request parameter properly.
I also took the liberty of restricting certain CI workflows to only run on the main repo, since our forks will not have the credentials to run them correctly, nor should they. If you'd rather I not include these changes, I'll undo them; it's just a convenience to avoid the email spam from the pipelines that constantly fail.
Notes for Reviewers
These changes were tested against the `whisper` and `faster-whisper` backends. I was unable to test with `qwen-asr`. The AIO tests were also run successfully.
Signed commits