
Add support for Gemini API #441

Open
AmgadHasan opened this issue Feb 17, 2024 · 16 comments

@AmgadHasan
Contributor

The new Gemini API introduced support for function calling: you define a set of functions with their expected arguments and pass them in the tools argument.

Can we add Gemini support to instructor so it can be used with Gemini Pro models and, later, Gemini Ultra?
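For reference, the OpenAI-style tools payload that instructor already consumes is plain JSON Schema. A minimal sketch of the equivalent declaration (the function name and parameters are illustrative):

```python
# Sketch of an OpenAI-style "tools" entry for a simple `add` function.
# This is the JSON-Schema-based format instructor already understands;
# the Gemini API expresses the same information via glm.Schema objects.
add_tool = {
    "type": "function",
    "function": {
        "name": "add",
        "description": "Returns the sum of two numbers.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number"},
                "b": {"type": "number"},
            },
            "required": ["a", "b"],
        },
    },
}
```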

@jxnl
Owner

jxnl commented Feb 17, 2024

can you link docs and a more complete example?

@AmgadHasan
Contributor Author

can you link docs and a more complete example?

@jxnl
Sure. Just as a heads up, there are two versions of the Gemini API:

  • The Google Cloud version: this one is a service offered on GCP and requires creating a project and a service account. It's similar to Azure OpenAI.
  • The Google AI Dev version: this is targeted at individual developers. You sign up, create an API key, and use it when calling the API. It's similar to the normal OpenAI API. This is the one I used.

Here are the docs for function calling using the Google AI Dev API:

https://ai.google.dev/tutorials/function_calling_python_quickstart

A quick example:

import google.generativeai as genai
import google.ai.generativelanguage as glm

calculator = glm.Tool(
    function_declarations=[
      glm.FunctionDeclaration(
        name='add',
        description="Returns the sum of two numbers.",
        parameters=glm.Schema(
            type=glm.Type.OBJECT,
            properties={
                'a': glm.Schema(type=glm.Type.NUMBER),
                'b': glm.Schema(type=glm.Type.NUMBER)
            },
            required=['a', 'b']
        )
      ),
      glm.FunctionDeclaration(
        name='multiply',
        description="Returns the product of two numbers.",
        parameters=glm.Schema(
            type=glm.Type.OBJECT,
            properties={
                'a': glm.Schema(type=glm.Type.NUMBER),
                'b': glm.Schema(type=glm.Type.NUMBER)
            },
            required=['a', 'b']
        )
      )
    ])
model = genai.GenerativeModel('gemini-pro', tools=[calculator])
chat = model.start_chat()

a, b = 2312371, 234234  # operands for the prompt below
response = chat.send_message(
    f"What's {a} X {b}?",
)
response.candidates
[index: 0
content {
  parts {
    function_call {
      name: "multiply"
      args {
        fields {
          key: "b"
          value {
            number_value: 234234
          }
        }
        fields {
          key: "a"
          value {
            number_value: 2312371
          }
        }
      }
    }
  }
  role: "model"
}
finish_reason: STOP
] 
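Porting this to instructor would mostly be a mechanical translation from JSON Schema (what Pydantic emits) into glm.Schema objects. A rough sketch of that mapping, using plain dicts in place of the glm classes so it runs anywhere; the converter and the type table are hypothetical, not part of any existing library:

```python
# Hypothetical sketch: translate a JSON-Schema dict (as produced by
# Pydantic's model_json_schema()) into the nested structure that
# glm.Schema expects. Plain dicts stand in for the glm classes here;
# only the shape of the mapping is the point.
JSON_TO_GLM_TYPE = {
    "object": "OBJECT",
    "number": "NUMBER",
    "integer": "INTEGER",
    "string": "STRING",
    "boolean": "BOOLEAN",
    "array": "ARRAY",
}

def json_schema_to_glm(schema: dict) -> dict:
    """Recursively map JSON-Schema type names onto glm.Type-style names."""
    out = {"type": JSON_TO_GLM_TYPE[schema["type"]]}
    if "properties" in schema:
        out["properties"] = {
            k: json_schema_to_glm(v) for k, v in schema["properties"].items()
        }
    if "required" in schema:
        out["required"] = list(schema["required"])
    return out
```

For example, feeding in the `add` parameters schema from the snippet above yields `{"type": "OBJECT", "properties": {"a": {"type": "NUMBER"}, "b": {"type": "NUMBER"}}, "required": ["a", "b"]}`.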

@jxnl
Owner

jxnl commented Feb 18, 2024

lol WHY WOULD THEY DO THIS

@AmgadHasan
Contributor Author

lol WHY WOULD THEY DO THIS

I think they want indie developers to hack with their Google AI Dev API while also offering GCP to their enterprise customers.

Just like OpenAI and Azure.

@AmgadHasan
Contributor Author

@jxnl

So is there any hope for this?

@jxnl
Owner

jxnl commented Feb 19, 2024

Def hope!

The nuance is that the API is slightly different: .start_chat() and .send_message().

@jxnl
Owner

jxnl commented Feb 19, 2024

So I want to think about how to give a good matching experience.

@AmgadHasan
Contributor Author

Yes, it's definitely annoying. You have to specify the tools when you create the model object, not when you actually send the prompt.

Like, why even do this? LLMs are fucking stateless; there's no point in defining the functions when creating the model object, unless they're doing something completely different from OpenAI.

@AmgadHasan
Contributor Author

Okay, so after taking a second deep dive into their not-so-great docs, I think there's a different way of doing this:

# Create a client
from google.ai import generativelanguage as glm

GOOGLE_API_KEY = "..."  # your Google AI Dev API key

client = glm.GenerativeServiceClient(
    client_options={'api_key': GOOGLE_API_KEY})

# Create function tools
my_tool = glm.Tool(
    function_declarations=[
      glm.FunctionDeclaration(
        name='add',
        description="Returns the sum of two numbers.",
        parameters=glm.Schema(
            type=glm.Type.OBJECT,
            properties={
                'a': glm.Schema(type=glm.Type.NUMBER),
                'b': glm.Schema(type=glm.Type.NUMBER)
            },
            required=['a','b']
        )
      ),
      glm.FunctionDeclaration(
        name='multiply',
        description="Returns the product of two numbers.",
        parameters=glm.Schema(
            type=glm.Type.OBJECT,
            properties={
                'a':glm.Schema(type=glm.Type.NUMBER),
                'b':glm.Schema(type=glm.Type.NUMBER)
            },
            required=['a','b']
        )
      )
    ])

request = {
    "model": 'models/gemini-1.0-pro-001',
    "contents": [{"parts": [{"text": "Send an email to my friend Oliver wishing them a happy birthday"}], "role": "user"}],
    "tools": [my_tool],
}
response = client.generate_content(request=request)

This is somewhat similar to the conventional way of doing function calling using OpenAI's client.

@jxnl
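Since this path is stateless, an instructor integration could assemble the whole request per call. A rough sketch of that assembly; the helper function is hypothetical, mirroring the request dict above:

```python
# Hypothetical helper: build a stateless generate_content request from a
# prompt plus tool declarations, matching the dict shape shown above.
def build_gemini_request(prompt: str, tools: list,
                         model: str = "models/gemini-1.0-pro-001") -> dict:
    return {
        "model": model,
        "contents": [{"parts": [{"text": prompt}], "role": "user"}],
        "tools": tools,
    }

# Usage: pass the result straight to client.generate_content(request=...)
request = build_gemini_request("What's 2 + 2?", tools=[])
```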

@ankrgyl
Contributor

ankrgyl commented Feb 19, 2024

@AmgadHasan I maintain https://github.com/braintrustdata/braintrust-proxy, which allows you to access Gemini models through the OpenAI format. We haven't yet translated the Gemini tool-call syntax over, but based on your code snippets, my guess is that it is just sending JSON Schema and should be easy to do.

Want to collaborate on that? Then you could just set OPENAI_BASE_URL to "https://braintrustproxy.com/v1" and it'll work out of the box with instructor.
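For anyone trying that route, the proxy approach needs no code changes beyond pointing the OpenAI SDK at the proxy, e.g. via the environment (a minimal sketch; the proxy URL is the one mentioned above):

```python
import os

# Point any OpenAI-SDK-based tool (including instructor) at the proxy.
# The SDK reads OPENAI_BASE_URL when constructing its client.
os.environ["OPENAI_BASE_URL"] = "https://braintrustproxy.com/v1"
```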

@davhin

davhin commented Mar 25, 2024

What is the current state of this? Happy to contribute!

@jxnl
Owner

jxnl commented Mar 25, 2024

No work done yet, would love a contrib.

@oliverbj

Hi!

What's the update on this? :)

@jxnl
Owner

jxnl commented Apr 10, 2024

Hi!

What's the update on this? :)

I won't be working on this anytime soon, so it'll likely have to come from another contributor, unless you go through litellm or the braintrust proxy.

@M-Gonzalo

Can we get Gemini to implement itself here? I'm only half-joking, BTW, lol.
I read it's got a 1,000,000-token window, so it can probably read the whole project in one go, and it understands Elixir...

@ssonal
Contributor

ssonal commented May 19, 2024

Managed to implement Gemini support here. It's not 100% compatible with all of instructor's concepts given the mismatch in API design, but it's a start. Would love some feedback.
