Add speech endpoint #130

Merged
merged 3 commits into from
Nov 6, 2023

m1guelpf
Contributor

@m1guelpf m1guelpf commented Nov 6, 2023

  • Added post_raw and execute_raw internal functions, since we don't want to JSON-decode the response body for the speech endpoints (which return raw audio). Modified the execute function to wrap execute_raw.
  • Added basic support for the TTS API.
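
For readers skimming the thread, the wrapping described in the first bullet can be sketched roughly as below. This is a std-only illustration, not the code from this PR: `RawResponse`, `ClientError`, the `Decode` trait, and `ModelId` are hypothetical stand-ins for the HTTP response, the crate's error type, and serde deserialization, and the real functions are presumably async methods on the client.

```rust
// Hypothetical stand-ins; names and signatures here are assumptions for illustration.
#[derive(Debug, PartialEq)]
enum ClientError {
    Http(u16),
    Decode(String),
}

/// Stand-in for an HTTP response (status code plus raw body).
struct RawResponse {
    status: u16,
    body: Vec<u8>,
}

/// Stand-in for JSON deserialization of a typed response.
trait Decode: Sized {
    fn decode(body: &[u8]) -> Result<Self, ClientError>;
}

/// Checks the status and returns the body bytes untouched.
/// This is what a raw-audio endpoint would call directly.
fn execute_raw(resp: RawResponse) -> Result<Vec<u8>, ClientError> {
    if !(200..300).contains(&resp.status) {
        return Err(ClientError::Http(resp.status));
    }
    Ok(resp.body)
}

/// Wraps `execute_raw` and only then decodes the bytes into `T`.
fn execute<T: Decode>(resp: RawResponse) -> Result<T, ClientError> {
    let bytes = execute_raw(resp)?;
    T::decode(&bytes)
}

/// Toy typed response that "decodes" by reading the body as UTF-8.
#[derive(Debug, PartialEq)]
struct ModelId(String);

impl Decode for ModelId {
    fn decode(body: &[u8]) -> Result<Self, ClientError> {
        String::from_utf8(body.to_vec())
            .map(ModelId)
            .map_err(|e| ClientError::Decode(e.to_string()))
    }
}
```

The point of the split is that status handling lives in one place, while decoding stays optional for endpoints that return bytes.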

Note that the struct corresponding to the response_format parameter was named SpeechResponseFormat instead of AudioResponseFormat, since the latter already existed for defining Whisper response formats.

I'd love to add an enum for voices as well (with an Other wildcard and marked as non-exhaustive), but refrained since the library's pattern seems to be deferring to strings (same case as model IDs all across).
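
A minimal sketch of what such an enum could look like, assuming the six voice names announced for the TTS API at launch. The type name `Voice`, the `as_str` helper, and the `From<&str>` impl are illustrative assumptions, and the enum eventually added to the crate may differ (for example, it would likely derive serde traits rather than convert by hand).

```rust
/// Hypothetical voice enum: known voices plus a wildcard for anything newer.
/// `#[non_exhaustive]` lets variants be added later without breaking
/// downstream `match` statements.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Voice {
    Alloy,
    Echo,
    Fable,
    Onyx,
    Nova,
    Shimmer,
    /// Any voice name this enum does not know about yet.
    Other(String),
}

impl Voice {
    /// The string sent to the API for this voice.
    pub fn as_str(&self) -> &str {
        match self {
            Voice::Alloy => "alloy",
            Voice::Echo => "echo",
            Voice::Fable => "fable",
            Voice::Onyx => "onyx",
            Voice::Nova => "nova",
            Voice::Shimmer => "shimmer",
            Voice::Other(name) => name.as_str(),
        }
    }
}

impl From<&str> for Voice {
    /// Maps known names to their variants and keeps unknown ones verbatim.
    fn from(name: &str) -> Self {
        match name {
            "alloy" => Voice::Alloy,
            "echo" => Voice::Echo,
            "fable" => Voice::Fable,
            "onyx" => Voice::Onyx,
            "nova" => Voice::Nova,
            "shimmer" => Voice::Shimmer,
            other => Voice::Other(other.to_string()),
        }
    }
}
```

With the `Other` variant, a caller can still pass a voice released after the crate version they depend on, at the cost of losing compile-time checking for that one value.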

@64bit
Owner

64bit commented Nov 6, 2023

Thank you! 🎉

(There's https://crates.io/crates/serde_bytes too; not sure what the difference would be between the bytes crate and serde_bytes.)

> I'd love to add an enum for voices as well (with an Other wildcard and marked as non-exhaustive), but refrained since the library's pattern seems to be deferring to strings (same case as model IDs all across).

I think an enum makes sense for Voice, go for it. I usually defer to the spec when it lists a field of type string + enum. They haven't released the updated spec though.


I'm getting "The model tts-1 does not exist or you do not have access to it." Not sure how to get access to the new models?

@64bit
Owner

64bit commented Nov 6, 2023

Oh, they just released the spec 11 mins ago :D

@m1guelpf
Contributor Author

m1guelpf commented Nov 6, 2023

Not sure what the difference is either, but the Bytes type from the bytes crate is what reqwest uses; we really just import the crate to access the type.

Will add the enum in a sec. Model releases in 1h (supposedly) so not available yet lmeow

@64bit
Owner

64bit commented Nov 6, 2023

Aha here, https://openai.com/blog/new-models-and-developer-products-announced-at-devday

> We’ll begin rolling out new features to OpenAI customers starting at 1pm PT today.

30mins to go 🕐

Owner

@64bit 64bit left a comment

Looking good. Will target releasing this in v0.16.0 with the other new APIs.

@64bit 64bit merged commit e085d30 into 64bit:main Nov 6, 2023
@m1guelpf m1guelpf deleted the speech branch November 6, 2023 20:43
64bit pushed a commit that referenced this pull request Nov 7, 2023
* Add speech endpoint

* Add voice parameter and example

* Add voice enum