[BUG] Different embedding 3 vectors in Azure vs. OpenAI #40243

Open
3 tasks done
thai-op opened this issue May 17, 2024 · 3 comments
Labels
customer-reported Issues that are reported by GitHub users external to the Azure organization. OpenAI question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Attention This issue is responsible by Azure service team.

Comments

@thai-op

thai-op commented May 17, 2024

Describe the bug
When you send the same input with the same API version and the same model (text-embedding-3-large) to Azure and to OpenAI, the returned vectors differ slightly. The per-component differences are small (< 0.0001), but in aggregate they are large enough to produce bad ranking results when we mix embeddings from the two services.

I don't see any documentation describing this behavior so I'm just asking here in case anyone from the Azure team knows an answer to this.

curl -H 'Authorization:Bearer <key>' -H 'Content-Type: application/json' https://api.openai.com/v1/embeddings\?api-version\=2024-03-01-preview -d "{\"input\":\"Pancreatitis in Dogs\",\"model\":\"text-embedding-3-large\"}" -o /tmp/embedding-2024-03-01-preview.openai.json

curl -H 'api-key: <key>' -H 'Content-Type: application/json' https://<region_deployment>.azure.com/openai/deployments/<deployment>/embeddings\?api-version\=2024-03-01-preview -d "{\"input\":\"Pancreatitis in Dogs\",\"model\":\"text-embedding-3-large\"}" -o /tmp/embedding-2024-03-01-preview.azure.json

diff /tmp/embedding-2024-03-01-preview.azure.json /tmp/embedding-2024-03-01-preview.openai.json -y | less

{                                                                       {
  "object": "list",                                                       "object": "list",
  "data": [                                                               "data": [
    {                                                                       {
      "object": "embedding",                                                  "object": "embedding",
      "index": 0,                                                             "index": 0,
      "embedding": [                                                          "embedding": [
        -0.01958797,                                            |               -0.019557063,
        0.00020787802,                                          |               0.00021087051,
        -0.013025163,                                           |               -0.013026878,
        0.05607405,                                             |               0.055992138,
        0.02763522,                                             |               0.027616536,
        0.012355488,                                            |               0.012345954,
        0.014163609,                                            |               0.014165475,
        -0.019387066,                                           |               -0.019389622,
        0.03151933,                                             |               0.031568136,
        0.0012730785,                                           |               0.0012551069,
        0.009816307,                                            |               0.009778531,
        0.016228437,                                            |               0.016241739,
        0.036809757,                                            |               0.03685926,
        -0.045493197,                                           |               -0.04547687,
        -0.023371628,                                           |               -0.023374708,
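
To put a number on the gap rather than eyeballing the diff, here is a minimal sketch that compares two embedding vectors by largest per-component difference and by cosine similarity. The function names are hypothetical, and `load_embedding` assumes the response shape shown in the diff (`data[0].embedding`). The sample values are only the 15 components visible above, so results for the full vectors may differ.

```python
import json
import math


def load_embedding(path):
    """Read the first embedding vector from a saved /embeddings response."""
    with open(path) as f:
        return json.load(f)["data"][0]["embedding"]


def max_abs_diff(a, b):
    """Largest per-component absolute difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# First 15 components copied from the diff output (Azure left, OpenAI right).
azure_head = [
    -0.01958797, 0.00020787802, -0.013025163, 0.05607405, 0.02763522,
    0.012355488, 0.014163609, -0.019387066, 0.03151933, 0.0012730785,
    0.009816307, 0.016228437, 0.036809757, -0.045493197, -0.023371628,
]
openai_head = [
    -0.019557063, 0.00021087051, -0.013026878, 0.055992138, 0.027616536,
    0.012345954, 0.014165475, -0.019389622, 0.031568136, 0.0012551069,
    0.009778531, 0.016241739, 0.03685926, -0.04547687, -0.023374708,
]

if __name__ == "__main__":
    print("max abs diff:", max_abs_diff(azure_head, openai_head))
    print("cosine similarity:", cosine_similarity(azure_head, openai_head))
```

To run it on the full vectors, pass the two files saved by the curl commands to `load_embedding` and feed the results to the same two functions.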

Exception or Stack Trace
See above

To Reproduce
See above

Code Snippet
Add the code snippet that causes the issue.

Expected behavior
The vectors should be exactly the same

Setup (please complete the following information):
n/a

Information Checklist
Kindly make sure that you have added all of the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.

  • Bug Description Added
  • Repro Steps Added
  • Setup information Added
github-actions bot added the customer-reported, needs-triage, and question labels May 17, 2024
@alzimmermsft
Member

Hi @thai-op, given your sample in this issue is using cURL, and not the azure-ai-openai SDK, this seems to be a service issue and not an SDK issue, is that correct?

@thai-op
Author

thai-op commented May 17, 2024 via email

alzimmermsft added the Service Attention and OpenAI labels May 17, 2024
github-actions bot removed the needs-triage label May 17, 2024
@alzimmermsft
Member

@brandom-msft @jpalvarezl do you know where this feedback should be re-routed to?

4 participants