Embedding deserialization error when using encoding_format EncodingFormat::Base64 #189

Closed
adri1wald opened this issue Feb 8, 2024 · 0 comments · Fixed by #190
Labels
bug Something isn't working

adri1wald commented Feb 8, 2024

When EncodingFormat::Base64 is specified, OpenAI returns each embedding as a base64-encoded string, which is more compact than a JSON array of floats. However, the deserialization logic in async-openai does not handle this representation, so deserializing the response fails.

IMO, Base64 should also be the default, as it is in the Python client, since much less data is transferred.

Snippet:

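use anyhow::{Context, Result};
use async_openai::{
    config::OpenAIConfig,
    types::{CreateEmbeddingRequestArgs, EncodingFormat},
    Client,
};

// `Embedding` is the caller's own type, assumed here to implement `From<Vec<f32>>`.
async fn embed(openai: &Client<OpenAIConfig>, text: String) -> Result<Embedding> {
    let request = CreateEmbeddingRequestArgs::default()
        .model("text-embedding-3-small")
        .input(text)
        // Ask the API for a base64 string instead of a JSON array of floats.
        .encoding_format(EncodingFormat::Base64)
        .build()
        .context("OpenAI embedder: failed to build text embedding request")?;
    // The deserialization error surfaces when the response is parsed here.
    let mut response = openai
        .embeddings()
        .create(request)
        .await
        .context("OpenAI embedder: failed to get text embedding")?;
    if response.data.len() != 1 {
        anyhow::bail!("Expected 1 embedding, got {}.", response.data.len());
    }
    Ok(response.data.remove(0).embedding.into())
}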
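
For illustration, here is a minimal sketch of the decoding step that the base64 representation requires. It is not the async-openai implementation or the fix in #190. The function names (`sextet`, `decode_base64`, `decode_embedding`) are hypothetical, and the sketch assumes the decoded payload is a packed sequence of little-endian 32-bit floats. The base64 decoder is written by hand only to keep the example dependency-free; real code would more likely use an existing base64 crate.

```rust
/// Map one character of the standard base64 alphabet to its 6-bit value.
fn sextet(c: u8) -> Option<u32> {
    match c {
        b'A'..=b'Z' => Some((c - b'A') as u32),
        b'a'..=b'z' => Some((c - b'a') as u32 + 26),
        b'0'..=b'9' => Some((c - b'0') as u32 + 52),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

/// Decode standard base64 (trailing `=` padding ignored) into raw bytes.
/// Returns `None` if a character outside the alphabet is found.
fn decode_base64(input: &str) -> Option<Vec<u8>> {
    let data = input.trim_end_matches('=').as_bytes();
    let mut out = Vec::with_capacity(data.len() * 3 / 4);
    let (mut buf, mut bits) = (0u32, 0u32);
    for &c in data {
        // Accumulate 6 bits per character and emit a byte once 8 are available.
        buf = (buf << 6) | sextet(c)?;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1 << bits) - 1;
        }
    }
    Some(out)
}

/// Interpret the decoded bytes as little-endian f32 values (assumed layout).
/// Returns `None` if the byte count is not a multiple of 4.
fn decode_embedding(b64: &str) -> Option<Vec<f32>> {
    let bytes = decode_base64(b64)?;
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

Under these assumptions, `decode_embedding("AACAPwAAAMA=")` yields `Some(vec![1.0, -2.0])`, since the string encodes the eight bytes of those two floats in little-endian order.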
adri1wald changed the title from "Deserialization error when using encoding_format: EncodingFormat::Base64" to "Embedding deserialization error when using encoding_format: EncodingFormat::Base64" on Feb 8, 2024
adri1wald changed the title from "Embedding deserialization error when using encoding_format: EncodingFormat::Base64" to "Embedding deserialization error when using encoding_format EncodingFormat::Base64" on Feb 8, 2024
64bit added the bug label on Feb 10, 2024