Add JSON mode to Gemini (Vertex AI) #2964

Merged
merged 4 commits into from Apr 12, 2024

Conversation

Manouchehri
Collaborator

Fixes #2962.

Example code:

#!/usr/bin/env python3.11
# -*- coding: utf-8 -*-
# Author: David Manouchehri

import os
import asyncio
import openai
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

c_handler = logging.StreamHandler()
logger.addHandler(c_handler)

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
OPENAI_API_BASE = os.getenv("OPENAI_API_BASE") or "https://api.openai.com/v1"

# Use the async client so the request can be awaited inside main().
client = openai.AsyncOpenAI(
    api_key=OPENAI_API_KEY,
    base_url=OPENAI_API_BASE,
)

async def main():
    response = await client.chat.completions.create(
        # Request JSON mode, the feature this PR adds for Gemini.
        response_format={"type": "json_object"},
        model="gemini-1.5-pro-preview-0409",
        messages=[
            {
                "role": "user",
                "content": "Tell me a joke right now about V8 in JSON",
            },
        ],
        stream=False,
        temperature=0.0,
    )
    try:
        logger.debug(response.model_dump_json(indent=2))
        print(response.choices[0].message.content)
    except Exception:
        logger.debug("Failed to print non-stream")

if __name__ == "__main__":
    asyncio.run(main())

Output:

{
  "id": "chatcmpl-cf3cc40e-ebfa-4b73-81b4-0edb24bde4e2",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "[\n  {\n    \"joke\": \"Why did the JavaScript engine go to therapy? Because it had too many unresolved V8 issues!\"\n  }\n]\n",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1712880452,
  "model": "gemini-1.5-pro-preview-0409",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 35,
    "prompt_tokens": 11,
    "total_tokens": 46
  }
}
[
  {
    "joke": "Why did the JavaScript engine go to therapy? Because it had too many unresolved V8 issues!"
  }
]

@Manouchehri
Collaborator Author

You can tell JSON mode is working because, if you remove response_format={"type": "json_object"} from my example code, Gemini starts responding with JSON embedded in Markdown.

{
  "id": "chatcmpl-7d3eac73-f86c-4543-8d85-7532d5429c80",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "```json\n{\n  \"joke\": \"Why did the V8 engine go to therapy? Because it had too many cylinders to deal with!\"\n}\n``` \n",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1712880591,
  "model": "gemini-1.5-pro-preview-0409",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 36,
    "prompt_tokens": 11,
    "total_tokens": 47
  }
}
```json
{
  "joke": "Why did the V8 engine go to therapy? Because it had too many cylinders to deal with!"
}
```
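
A quick way to tell the two cases apart programmatically is to check whether the whole message content parses as JSON. This is only a minimal sketch: `is_raw_json` is a hypothetical helper name, and the two strings are the message contents copied from the responses above.

```python
import json

def is_raw_json(content: str) -> bool:
    """Return True if the entire string parses as JSON (no Markdown wrapper)."""
    try:
        json.loads(content)
    except json.JSONDecodeError:
        return False
    return True

# Three backticks, built at runtime so this example stays easy to embed.
FENCE = "`" * 3

# Content returned with response_format={"type": "json_object"}.
with_json_mode = (
    '[\n  {\n    "joke": "Why did the JavaScript engine go to therapy? '
    'Because it had too many unresolved V8 issues!"\n  }\n]\n'
)

# Content returned without response_format: JSON wrapped in a Markdown fence.
without_json_mode = (
    FENCE + 'json\n{\n  "joke": "Why did the V8 engine go to therapy? '
    'Because it had too many cylinders to deal with!"\n}\n' + FENCE + ' \n'
)

print(is_raw_json(with_json_mode))
print(is_raw_json(without_json_mode))
```

The first call reports that the content is plain JSON, the second that it is not, which matches the difference visible in the two outputs.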

@Manouchehri
Collaborator Author

For some odd reason, response_mime_type is missing from the GenerationConfig class, but it's present in the auto-generated protobuf types. That's why I had to resort to the dirty custom class for now; it can be removed in a few days/weeks/whenever Google fixes it.

https://github.com/googleapis/python-aiplatform/blob/36698f428d9fa93df46b4af677b7400bdb8c0a93/vertexai/generative_models/_generative_models.py#L1180-L1225

https://github.com/googleapis/python-aiplatform/blob/aa918e31fcc40878e9f29affa02a4527d90188aa/google/cloud/aiplatform_v1/types/content.py#L262-L311
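
To illustrate the general shape of that kind of workaround, here is a rough sketch using stand-in classes only. `RawConfig`, `WrapperConfig`, and `ExtendedWrapperConfig` are hypothetical names invented for this example; they are not the Vertex AI SDK classes and not the code in this PR. The idea is simply that a subclass can hand the missing field directly to the underlying generated type.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RawConfig:
    """Hypothetical stand-in for a generated protobuf type that has the field."""
    temperature: Optional[float] = None
    response_mime_type: Optional[str] = None

class WrapperConfig:
    """Hypothetical stand-in for a hand-written wrapper that lacks the field."""
    def __init__(self, *, temperature: Optional[float] = None):
        self._raw = RawConfig(temperature=temperature)

class ExtendedWrapperConfig(WrapperConfig):
    """Temporary subclass that passes the missing field to the raw type."""
    def __init__(
        self,
        *,
        temperature: Optional[float] = None,
        response_mime_type: Optional[str] = None,
    ):
        # Deliberately skip the parent __init__, which would drop the field.
        self._raw = RawConfig(
            temperature=temperature,
            response_mime_type=response_mime_type,
        )

config = ExtendedWrapperConfig(temperature=0.0, response_mime_type="application/json")
print(config._raw.response_mime_type)
```

Because the subclass is still an instance of the wrapper type, code that expects the wrapper keeps working, and the subclass can be deleted once the wrapper itself accepts the parameter.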

@Manouchehri Manouchehri added the enhancement New feature or request label Apr 12, 2024
@Manouchehri
Collaborator Author

Oh, and stop works too; I tested it with stop="did" and stop=["did", "the"]. :) It behaves exactly like gpt-3.5-turbo-0125.

{
  "id": "chatcmpl-f96dc776-4b1a-4c52-8598-9f850c00b215",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "[\n  {\n    \"joke\": \"Why ",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1712882359,
  "model": "gemini-1.5-pro-preview-0409",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 12,
    "prompt_tokens": 13,
    "total_tokens": 25
  }
}
[
  {
    "joke": "Why
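
For anyone unsure why the output ends at "Why ", the stop behaviour can be modelled as cutting the text just before the earliest stop sequence. The sketch below is only a client-side illustration of that semantics. `truncate_at_stop` is a hypothetical helper, not how Gemini or the proxy applies stop sequences.

```python
from typing import List, Optional, Union

def truncate_at_stop(text: str, stop: Optional[Union[str, List[str]]]) -> str:
    """Cut text just before the earliest occurrence of any stop sequence."""
    if stop is None:
        return text
    # Accept either a single string or a list of strings, like the API does.
    sequences = [stop] if isinstance(stop, str) else list(stop)
    cut = len(text)
    for seq in sequences:
        idx = text.find(seq)
        if seq and idx != -1:
            cut = min(cut, idx)
    return text[:cut]

# Full content from the first JSON-mode response in this thread.
full = (
    '[\n  {\n    "joke": "Why did the JavaScript engine go to therapy? '
    'Because it had too many unresolved V8 issues!"\n  }\n]\n'
)

print(repr(truncate_at_stop(full, "did")))
print(repr(truncate_at_stop(full, ["did", "the"])))
```

Both calls yield the same prefix, ending in `"Why `, which is the truncated content shown in the response above.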

@krrishdholakia
Contributor

LGTM - will take care of fixing any issues that show up during ci/cd

@krrishdholakia krrishdholakia merged commit cd834e9 into BerriAI:main Apr 12, 2024
2 checks passed
@Manouchehri
Collaborator Author

@krrishdholakia Hmm, did you do any testing after 77d6b88? JSON mode doesn't seem to work anymore after I merged in v1.35.2.

@krrishdholakia
Contributor

hey @Manouchehri I added a check for whether the sdk supports 'response_mime_type':
[Screenshot 2024-04-12 at 8 28 45 AM]

reason: it was failing ci/cd since the vertex version didn't have 'response_mime_type' as a supported init param
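
The code in the screenshot is not recoverable from this page, but a check of this kind could plausibly be written with `inspect.signature`. The sketch below is an assumption, not the actual commit: `supports_init_param`, `OldConfig`, and `NewConfig` are hypothetical names, and the two classes are stand-ins for SDK versions without and with the parameter.

```python
import inspect

def supports_init_param(cls: type, name: str) -> bool:
    """Return True if cls.__init__ declares a parameter with the given name."""
    return name in inspect.signature(cls.__init__).parameters

class OldConfig:
    """Hypothetical stand-in for a config class without the parameter."""
    def __init__(self, *, temperature=None):
        self.temperature = temperature

class NewConfig:
    """Hypothetical stand-in for a config class that accepts the parameter."""
    def __init__(self, *, temperature=None, response_mime_type=None):
        self.temperature = temperature
        self.response_mime_type = response_mime_type

print(supports_init_param(OldConfig, "response_mime_type"))
print(supports_init_param(NewConfig, "response_mime_type"))
```

A guard like this silently skips the parameter whenever the installed class does not declare it, so JSON mode would be turned off rather than raising an error.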

@Manouchehri
Collaborator Author

Manouchehri commented Apr 12, 2024

Oh. That won't work, since no version of the Vertex AI Python SDK has response_mime_type yet. My hacky workaround does work though, at least for gemini-1.5-pro-preview-0409 and gemini-experimental.

I reverted 77d6b88 and it works as expected; could we find another solution? :)

This is what you should see when it's working:

{
  "id": "chatcmpl-84b126f1-01ab-42a3-a600-fe56856b041b",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "{\n \"joke\": \"Why did the V8 engine stall at 11:06:41? Because it ran out of juice!\"\n}\n",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1712938040,
  "model": "gemini-1.5-pro-preview-0409",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 34,
    "prompt_tokens": 30,
    "total_tokens": 64
  }
}

And this is what you should see when it's not working (note the Markdown-wrapped output):

{
  "id": "chatcmpl-e7c447a7-f047-42ac-a039-becb982b48fb",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "```json\n{\n \"joke\": \"Why did the V8 engine laugh at 11:07:37? Because it was past its bedtime, silly! V8s need their rest to keep running smoothly.\"\n}\n``` \n",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1712938061,
  "model": "gemini-1.5-pro-preview-0409",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 53,
    "prompt_tokens": 30,
    "total_tokens": 83
  }
}

@krrishdholakia
Contributor

@Manouchehri open to suggestions

how're you able to make the call without response_mime_type being supported?

it failed during testing for us - https://app.circleci.com/pipelines/github/BerriAI/litellm/8432/workflows/e75a83e3-7ffe-431c-861e-1da736d0b857/jobs/17175/steps

@Manouchehri
Collaborator Author

Manouchehri commented Apr 12, 2024

Not sure to be honest, but it's definitely working. 😅

That's odd, it looks like only the sync version is failing, while the async one is fine?

@Manouchehri
Collaborator Author

Okay, we can wait until googleapis/python-aiplatform#3639 is merged (hopefully soon), and I'll just make a PR to revert this once it's updated. :)

Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature]: Support responseMimeType for Gemini (aka JSON mode?)
2 participants