Add JSON mode to Gemini (Vertex AI) #2964

Manouchehri · 2024-04-12T00:08:33Z

Example code:

#!/usr/bin/env python3.11
# -*- coding: utf-8 -*-
# Author: David Manouchehri

import os
import asyncio
import openai
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

c_handler = logging.StreamHandler()
logger.addHandler(c_handler)

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
OPENAI_API_BASE = os.getenv("OPENAI_API_BASE") or "https://api.openai.com/v1"

client = openai.OpenAI(
    api_key=OPENAI_API_KEY,
    base_url=OPENAI_API_BASE,
)

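async def main():
    # Note: `client` is the synchronous OpenAI client, so this call blocks the
    # event loop; that is fine for a one-shot script like this one.
    response = client.chat.completions.create(
        response_format={"type": "json_object"},
        model="gemini-1.5-pro-preview-0409",
        messages=[
            {
                "role": "user",
                "content": "Tell me a joke right now about V8 in JSON",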
            },
        ],
        stream=False,
        temperature=0.0,
    )
    try:
        logger.debug(response.model_dump_json(indent=2))
        print(response.choices[0].message.content)
    except Exception:
        logger.debug("Failed to print non-stream")

if __name__ == "__main__":
    asyncio.run(main())

Output:

{
  "id": "chatcmpl-cf3cc40e-ebfa-4b73-81b4-0edb24bde4e2",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "[\n  {\n    \"joke\": \"Why did the JavaScript engine go to therapy? Because it had too many unresolved V8 issues!\"\n  }\n]\n",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1712880452,
  "model": "gemini-1.5-pro-preview-0409",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 35,
    "prompt_tokens": 11,
    "total_tokens": 46
  }
}
[
  {
    "joke": "Why did the JavaScript engine go to therapy? Because it had too many unresolved V8 issues!"
  }
]

…eased in the Python SDK.

Manouchehri · 2024-04-12T00:11:28Z

You can tell JSON mode is working because, if you remove response_format={"type": "json_object"} from my example code, Gemini starts responding with JSON embedded in Markdown:

{
  "id": "chatcmpl-7d3eac73-f86c-4543-8d85-7532d5429c80",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "```json\n{\n  \"joke\": \"Why did the V8 engine go to therapy? Because it had too many cylinders to deal with!\"\n}\n``` \n",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1712880591,
  "model": "gemini-1.5-pro-preview-0409",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 36,
    "prompt_tokens": 11,
    "total_tokens": 47
  }
}
```json
{
  "joke": "Why did the V8 engine go to therapy? Because it had too many cylinders to deal with!"
}
```
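
The difference between the two outputs can also be checked programmatically. The following is a minimal sketch, assuming only that the content is either bare JSON or wrapped in a Markdown code fence; `classify_content` is a hypothetical helper and not part of litellm or the OpenAI SDK.

```python
import json

# Build the Markdown fence marker programmatically so this example can be
# embedded in Markdown without nesting literal fences.
FENCE = "`" * 3


def classify_content(content: str) -> str:
    """Return 'json' if the content parses as JSON as-is, 'markdown' if it
    starts with a Markdown code fence, and 'other' otherwise."""
    text = content.strip()
    if text.startswith(FENCE):
        return "markdown"
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return "other"
    return "json"


# Shapes taken from the two responses above (joke text shortened).
with_json_mode = '[\n  {\n    "joke": "..."\n  }\n]\n'
without_json_mode = FENCE + 'json\n{\n  "joke": "..."\n}\n' + FENCE + ' \n'

print(classify_content(with_json_mode))
print(classify_content(without_json_mode))
```

A check like this could serve as the basis for a unit test of JSON mode, since the fenced form is what the model falls back to when the setting is not applied.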

Manouchehri · 2024-04-12T00:23:02Z

For some odd reason, response_mime_type is missing from the SDK's GenerationConfig class (first link below), but it is present in the auto-generated protobuf types (second link). That's why I had to resort to the dirty custom class for now; it can be removed whenever Google fixes it, whether that takes days or weeks.

https://github.com/googleapis/python-aiplatform/blob/36698f428d9fa93df46b4af677b7400bdb8c0a93/vertexai/generative_models/_generative_models.py#L1180-L1225

https://github.com/googleapis/python-aiplatform/blob/aa918e31fcc40878e9f29affa02a4527d90188aa/google/cloud/aiplatform_v1/types/content.py#L262-L311
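
Independently of the SDK, the translation this PR needs can be sketched as a plain mapping from the OpenAI-style `response_format` to an extra generation-config field. The helper name is hypothetical and this is not litellm's actual code; the `application/json` value is assumed to be the MIME type the protobuf `response_mime_type` field expects for JSON output.

```python
from typing import Any, Dict, Optional


def response_format_to_generation_kwargs(
    response_format: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical helper: map an OpenAI-style response_format to extra
    generation-config fields. Returns an empty dict when JSON mode is not
    requested, so the caller can merge the result unconditionally."""
    if response_format and response_format.get("type") == "json_object":
        # Assumption: "application/json" is the value that enables JSON output.
        return {"response_mime_type": "application/json"}
    return {}


print(response_format_to_generation_kwargs({"type": "json_object"}))
print(response_format_to_generation_kwargs(None))
```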

Manouchehri · 2024-04-12T00:43:26Z

Oh, and stop works too; I tested it with stop="did" and stop=["did", "the"]. :) It behaves exactly like gpt-3.5-turbo-0125.

{
  "id": "chatcmpl-f96dc776-4b1a-4c52-8598-9f850c00b215",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "[\n  {\n    \"joke\": \"Why ",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1712882359,
  "model": "gemini-1.5-pro-preview-0409",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 12,
    "prompt_tokens": 13,
    "total_tokens": 25
  }
}
[
  {
    "joke": "Why

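Since `stop` may be passed either as a single string or as a list, a translation layer has to accept both forms. Below is a minimal sketch with hypothetical helper names (not litellm's code). `truncate_at_stop` only simulates on the client side why the content above ends at "Why "; the real truncation happens inside the model service.

```python
from typing import List, Optional, Union

StopParam = Optional[Union[str, List[str]]]


def normalize_stop(stop: StopParam) -> List[str]:
    """Turn an OpenAI-style stop value (None, str, or list) into a list."""
    if stop is None:
        return []
    if isinstance(stop, str):
        return [stop]
    return list(stop)


def truncate_at_stop(text: str, stop: StopParam) -> str:
    """Cut text just before the earliest occurrence of any stop sequence."""
    cut = len(text)
    for sequence in normalize_stop(stop):
        index = text.find(sequence) if sequence else -1
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


print(normalize_stop("did"))
print(repr(truncate_at_stop("Why did the engine stall?", ["did", "the"])))
```
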
krrishdholakia · 2024-04-12T00:51:24Z

LGTM - will take care of fixing any issues that show up during ci/cd

Manouchehri · 2024-04-12T15:16:09Z

@krrishdholakia Hmm, did you do any testing after 77d6b88? JSON mode doesn't seem to work anymore after I merged in v1.35.2.

krrishdholakia · 2024-04-12T15:29:21Z

Hey @Manouchehri, I added a check for whether the SDK supports 'response_mime_type'.

Reason: it was failing CI/CD, since the Vertex SDK version there didn't have 'response_mime_type' as a supported init param.
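
One way such a check can be written is by inspecting the constructor signature. This is a minimal sketch, not necessarily how the litellm commit does it; `OldConfig` and `NewConfig` are stand-in classes invented for illustration, not the real Vertex AI `GenerationConfig`.

```python
import inspect


def supports_init_param(cls: type, name: str) -> bool:
    """Return True if cls.__init__ declares a parameter called `name`."""
    return name in inspect.signature(cls.__init__).parameters


class OldConfig:
    """Stand-in for an SDK class that does not know response_mime_type."""

    def __init__(self, temperature=None):
        self.temperature = temperature


class NewConfig:
    """Stand-in for an SDK class that accepts response_mime_type."""

    def __init__(self, temperature=None, response_mime_type=None):
        self.temperature = temperature
        self.response_mime_type = response_mime_type


print(supports_init_param(OldConfig, "response_mime_type"))
print(supports_init_param(NewConfig, "response_mime_type"))
```

A check of this kind silently skips the parameter on any SDK version whose constructor does not declare it.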

Manouchehri · 2024-04-12T16:08:25Z

Oh. That won't work, since no version of the Vertex AI Python SDK has response_mime_type yet. My hacky workaround does work though, at least for gemini-1.5-pro-preview-0409 and gemini-experimental.

I reverted 77d6b88 and it works as expected; could we find another solution? :)

This is what you should see when it's working:

{
  "id": "chatcmpl-84b126f1-01ab-42a3-a600-fe56856b041b",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "{\n \"joke\": \"Why did the V8 engine stall at 11:06:41? Because it ran out of juice!\"\n}\n",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1712938040,
  "model": "gemini-1.5-pro-preview-0409",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 34,
    "prompt_tokens": 30,
    "total_tokens": 64
  }
}

And this is what you should see when it's not working (note the Markdown-wrapped output):

{
  "id": "chatcmpl-e7c447a7-f047-42ac-a039-becb982b48fb",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "```json\n{\n \"joke\": \"Why did the V8 engine laugh at 11:07:37? Because it was past its bedtime, silly! V8s need their rest to keep running smoothly.\"\n}\n``` \n",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1712938061,
  "model": "gemini-1.5-pro-preview-0409",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 53,
    "prompt_tokens": 30,
    "total_tokens": 83
  }
}

krrishdholakia · 2024-04-12T17:33:07Z

@Manouchehri open to suggestions

how're you able to make the call without response_mime_type being supported?

it failed during testing for us - https://app.circleci.com/pipelines/github/BerriAI/litellm/8432/workflows/e75a83e3-7ffe-431c-861e-1da736d0b857/jobs/17175/steps

Manouchehri · 2024-04-12T17:35:44Z

Not sure to be honest, but it's definitely working. 😅

That's odd, it looks like only the sync version is failing, while the async one is fine?

Manouchehri · 2024-04-18T21:27:20Z

Okay, we can wait until googleapis/python-aiplatform#3639 is merged (hopefully soon), and I'll make a PR to revert this once the SDK is updated. :)

Manouchehri added 4 commits April 12, 2024 00:03

(feat) - Add support for JSON mode in Vertex AI

649c3bb

(feat) - Dirty hack to get response_mime_type working before it's released in the Python SDK.

d08674b

(feat) - Extreme dirty hack for response_mime_type in Vertex AI.

0535003

(feat) - Bump version for Vertex AI SDK.

9c55be3

Manouchehri requested a review from krrishdholakia April 12, 2024 00:41

Manouchehri added the enhancement New feature or request label Apr 12, 2024

Manouchehri mentioned this pull request Apr 12, 2024

GEMINI-1.5-PRO Main Day-1 support🧵 #2881

Open

krrishdholakia merged commit cd834e9 into BerriAI:main Apr 12, 2024
2 checks passed

Manouchehri mentioned this pull request Apr 16, 2024

Unit tests for json mode (in supported models) #3061

Open

Manouchehri mentioned this pull request Apr 19, 2024

[Bug]: ExtendedGenerationConfig complains about missing stream #3167

Closed

Manouchehri mentioned this pull request Apr 30, 2024

Add JSON mode for Gemini via Google AI Studio #3366

Open
