Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Audio APIs from updated spec #202

Merged
merged 2 commits into from
Mar 24, 2024
Merged

Conversation

emk
Copy link
Contributor

@emk emk commented Mar 16, 2024

This PR adds support for AudioResponseFormat::VerboseJson and TimestampGranularity, including updated example code. These were defined as types before, but not fully implemented.

Implements #201.

Please let me know if I can improve this PR! And thank you for a very helpful crate.

This PR adds support for `AudioResponseFormat::VerboseJson` and
`TimestampGranularity`, including updated example code. These were
defined as types before, but not fully implemented.

Implements 64bit#201.
@64bit
Copy link
Owner

64bit commented Mar 16, 2024

Hello,

Thank you for the issue and the implementation! I'll leave review in a separate comment

@64bit
Copy link
Owner

64bit commented Mar 16, 2024

It seems they've updated spec for this.

While logistically this implementation looks to fill the gap - however it doesn't adhere to the new spec - all doc comments, type names must match the spec.

A quick look on spec suggest there are multiple new types (CreateTranscriptionResponseJson, CreateTranscriptionResponseJsonVerbose, etc.) and this PR may require a bit more work to incorporate them all.

@emk
Copy link
Contributor Author

emk commented Mar 18, 2024

While logistically this implementation looks to fill the gap - however it doesn't adhere to the new spec - all doc comments, type names must match the spec.

Ah, great! Do you have a tool to auto-generate this code from the spec, or are you synchronizing it by hand?

I was working from these documents here originally, because they contained quite a bit of information that was missing elsewhere.

@64bit
Copy link
Owner

64bit commented Mar 18, 2024

Unfortunately no tool to auto generate and its written by hand, I have full context about it here #163

@64bit
Copy link
Owner

64bit commented Mar 18, 2024

I was working from these documents here originally, because they contained quite a bit of information that was missing elsewhere.

They also seem to generate website from the spec - and so I go with spec as source of truth.

@emk
Copy link
Contributor Author

emk commented Mar 19, 2024

Ah, great! I'll try to get to this either during the week or the coming weekend.

Supporting the return types CreateTranscriptionResponseJson and CreateTranscriptionResponseJsonVerbose is going to be a bit tricky, because the API chooses between them based on the function arguments, and the Rust version can only return a single type. We have two choices here:

  1. Always return CreateTranscriptionResponseJsonVerbose, but make the fields which aren't in CreateTranscriptionResponseJson optional.
  2. Create a new return type, enum CreateTranscriptionResponse { JsonVerbose(CreateTranscriptionResponseJsonVerbose), Json(CreateTranscriptionResponseJson) }, and rely on serde's smart decoding to pick the first enum option that matches.

I think (2) is a bit more future-proof, but it would require introducing a new type CreateTranscriptionResponse which isn't explicitly present in the official API spec at the moment. I'd be inclined to go with (2), but please let me know if you'd prefer a different choice!

@64bit
Copy link
Owner

64bit commented Mar 19, 2024

Thank you for the detailed choices. (2) is a more reasonable option. However:

because the API chooses between them based on the function arguments

For this a pattern is already used in the crate: to have separate functions to return separate types based on different value to a parameter to same request object.

For example, please take a look at Chat::create and Chat::create_stream, and a more recent addition of Embedding::create and Embedding::create_base64 . Same pattern can be applied here too.

@emk
Copy link
Contributor Author

emk commented Mar 24, 2024

I've gotten caught up with emergency work stuff this weekend, and haven't had a chance to redo this PR. It's still very high on my list, and I hope to get to it ASAP!

- Rename `CreateTranscriptionRespose` to `CreateTranscriptionResponseJson` (to match API spec)
- Add `CreateTranscriptionResponseVerboseJson` and `transcribe_verbose_json`
- Add `transcribe_raw` for SRT output
- Add `post_form_raw`
- Update example code
@emk
Copy link
Contributor Author

emk commented Mar 24, 2024

OK, here's an improved version! I focused on two things:

  1. Making the API as consistent as possible with the spec, and
  2. After satisfying (1), trying to make as few breaking changes as possible.

Here are the specific changes I made:

  • Modify comments and type names to match spec as precisely as possible.
  • BREAKING CHANGE: Rename CreateTranscriptionRespose to CreateTranscriptionResponseJson (to match latest API spec)
  • Add CreateTranscriptionResponseVerboseJson and transcribe_verbose_json
  • Add transcribe_raw for SRT output, which is poorly documented in the OpenAPI spec. They document the parameters to ask for this, but not how it gets returned on the wire. (It's returned as raw subtitle bytes, at least for SRT and VTT formats.)
  • Add post_form_raw
  • Update example code

I'm pretty sure this could be improved in various ways by making different tradeoffs, but I hope it's closer. Thank you for your feedback and for a very helpful library!

@64bit
Copy link
Owner

64bit commented Mar 24, 2024

I've gotten caught up with emergency work stuff this weekend, and haven't had a chance to redo this PR. It's still very high on my list, and I hope to get to it ASAP!

No worries, unlike work, here, there's is no pressure or deadline or emergency. Thank you for taking time to contribute!

Copy link
Owner

@64bit 64bit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good, thank you for matching with spec and adding support for all format!

Your observation about raw/SRT etc. is worth documenting as doc comments for these new function(s) because then it becomes available for everyone on docs.rs, I might make edits for it before publishing.

@64bit 64bit changed the title Implement CreateTranscriptRequest::response_granularities Update Audio APIs from updated spec Mar 24, 2024
@64bit 64bit merged commit db4c213 into 64bit:main Mar 24, 2024
@emk
Copy link
Contributor Author

emk commented Mar 25, 2024

Please do feel free to edit the doc comments! I was trying to stick extremely strictly to OpenAPI spec's comments, but it would very likely help users to add more information to the transcribe_raw function in particular.

64bit added a commit that referenced this pull request Apr 10, 2024
* Update to Assistants example (#146)

* Update to Assistants example

* Update examples/assistants/src/main.rs

update api config for consistency and secutity

Co-authored-by: Himanshu Neema <himanshun.iitkgp@gmail.com>

* added assistant creation

* exit, deconstruct assistant, improved readme

---------

Co-authored-by: Himanshu Neema <himanshun.iitkgp@gmail.com>

* Add examples tool-call and tool-call-stream (#153)

* add names (#150)

* Link to openai-func-enums (#152)

* Link to openai-func-enums

* Link to openai-func-enums

* Update async-openai/README.md

---------

Co-authored-by: Himanshu Neema <himanshun.iitkgp@gmail.com>

* In memory files (#154)

* Added ability to use in-memory files (Bytes, vec[u8])

* Removed unnecessary trait impls

* Polished example

* Spec, readme, and crate description updates (#156)

* get latest spec

* update description

* add WASM

* WASM support on experiments branch

* chore: Release

* Make tool choice lower case (#158)

* Fix: post_form to be Sendable (#157)

* changed to allow Send.

* add simple tests for sendable

* fix test name

* chore: Release

* Add support for rustls-webpki-roots (#168)

* Refactor `types` module (#170)

* Document `impl_from!` macro

* Fix up `impl_from!` docs

* Documents `impl_default!` macro

* Document `impl_input!` macro

* Factor out types from `assistants` module in `types`

* Factor out `model`

* Factor out `audio`

* Factor out `image`

* Factor out `file`

* Factor out `fine_tune`

* Factor out `moderation`

* Factor out `edit`

* Factor out `fine_tuning`

* Factor out missed `DeleteModelResponse` into `model`

* Factor out `embedding`

* Factor out `chat`

* Factor out `completion` and eliminate `types`

* Satisfy clippy

---------

Co-authored-by: Sharif Haason <sharif.haason@gmail.com>

* Sync updates from Spec (#171)

* updates to doc comments and types

* deprecated

* update ChatCompletionFunctions to FunctionObject

* More type updates

* add logprobs field

* update from spec

* updated spec

* fixes suggested by cargo clippy

* add query param to list files (#172)

* chore: Release

* Optional model in ModifyAssistantRequest (#174)

All fields (including model) are optional in OpenAI API.

* update contribution guidelines (#182)

* update contribution guidelines

* fix link

* update

* consistency

* Code of conduct

* chore: Release

* fix file test by providing query param

* Added dimensions param to embedding request (#185)

* chore: Release

* fix: CreateTranscriptionRequest language field not convert (#188)

* chore: Release

* Add usage information to the run object (#195)

* Updates from Spec (#196)

* updates from spec

* remove Edits

* remove Fine-Tunes (was deprecated)

* update spec

* cargo fix

* cargo fmt

* chore: Release

* Add Client::build for full customizability during instantiation (#197)

* Change std::sleep to tokio's sleep (#200)

* chore: Release

* add support for base64 embeddings (#190)

* add support for base64 embeddings

* Base64Embedding is an implementation detail

* feat: separate Embeddings::create_base64 method

* chore: use newtype for hosting base64 decoding instead

* chore: remove unused error variant

* Add vision-chat example (#203)

Example matches quickstart from https://platform.openai.com/docs/guides/vision
It showcases struct derived from ChatCompletionRequestMessageContent

* Update Audio APIs from updated spec (#202)

* Implement CreateTranscriptRequest::response_granularities

This PR adds support for `AudioResponseFormat::VerboseJson` and
`TimestampGranularity`, including updated example code. These were
defined as types before, but not fully implemented.

Implements #201.

* Modify transcription API to be more like spec

- Rename `CreateTranscriptionRespose` to `CreateTranscriptionResponseJson` (to match API spec)
- Add `CreateTranscriptionResponseVerboseJson` and `transcribe_verbose_json`
- Add `transcribe_raw` for SRT output
- Add `post_form_raw`
- Update example code

* Upgrade dependencies: Rust crates in Cargo.toml (#204)

* upgrade reqwest

* update reqwest-eventsource

* cargo test working (#207)

* fix: cargo fmt and compiler warnings fixes (#208)

* cargo fmt

* fix imports

* chore: Release

* fixed problems due to code sync

* update worker dependency to resolve build issue

* update test to fix test compilation issue

* add conditional imports

* change default of InputSource and bring back builders of file-related structs

* update doc

---------

Co-authored-by: Gravel Hill <120497873+Strange-Knoll@users.noreply.github.com>
Co-authored-by: Himanshu Neema <himanshun.iitkgp@gmail.com>
Co-authored-by: Frank Fralick <frankfralick@gmail.com>
Co-authored-by: Sam F <43347795+Prosammer@users.noreply.github.com>
Co-authored-by: David Weis <dweis7@gmail.com>
Co-authored-by: yykt <yuuya.yk.katoh@prism.ricoh>
Co-authored-by: XTY <xutianyi1999@live.com>
Co-authored-by: sharif <61516383+sharifhsn@users.noreply.github.com>
Co-authored-by: Sharif Haason <sharif.haason@gmail.com>
Co-authored-by: Sebastian Sosa <37946988+CakeCrusher@users.noreply.github.com>
Co-authored-by: vmg-dev <121135566+vmg-dev@users.noreply.github.com>
Co-authored-by: TAO <38341571+Taoaozw@users.noreply.github.com>
Co-authored-by: turingbuilder <144046780+turingbuilder@users.noreply.github.com>
Co-authored-by: Gabriel Bianconi <1275491+GabrielBianconi@users.noreply.github.com>
Co-authored-by: Santhanagopalan Krishnamoorthy <santhanagopalan.krishnamoorthy@cuanschutz.edu>
Co-authored-by: Adrien Wald <adrien@genei.io>
Co-authored-by: Gabriel <Gabriel2409@users.noreply.github.com>
Co-authored-by: Eric Kidd <git@randomhacks.net>
Co-authored-by: Samuel Batissou Tiburcio <samuelbatissou@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants