
[Bug]: gemini-1.5-flash causes error #2819

Closed as not planned

@daniel-counto

Description

Describe the bug

Error:

TypeError                                 Traceback (most recent call last)
<ipython-input-15-dbea5d0e63b5> in <cell line: 7>()
      5 user_proxy = UserProxyAgent("user_proxy", human_input_mode="NEVER", max_consecutive_auto_reply=0)
      6 
----> 7 user_proxy.initiate_chat(
      8     image_agent,
      9     message="""Describe what is in this image?

/usr/local/lib/python3.10/dist-packages/autogen/agentchat/conversable_agent.py in initiate_chat(self, recipient, clear_history, silent, cache, max_turns, summary_method, summary_args, message, **kwargs)
   1005             else:
   1006                 msg2send = self.generate_init_message(message, **kwargs)
-> 1007             self.send(msg2send, recipient, silent=silent)
   1008         summary = self._summarize_chat(
   1009             summary_method,

/usr/local/lib/python3.10/dist-packages/autogen/agentchat/conversable_agent.py in send(self, message, recipient, request_reply, silent)
    643         valid = self._append_oai_message(message, "assistant", recipient)
    644         if valid:
--> 645             recipient.receive(message, self, request_reply, silent)
    646         else:
    647             raise ValueError(

/usr/local/lib/python3.10/dist-packages/autogen/agentchat/conversable_agent.py in receive(self, message, sender, request_reply, silent)
    806         if request_reply is False or request_reply is None and self.reply_at_receive[sender] is False:
    807             return
--> 808         reply = self.generate_reply(messages=self.chat_messages[sender], sender=sender)
    809         if reply is not None:
    810             self.send(reply, sender, silent=silent)

/usr/local/lib/python3.10/dist-packages/autogen/agentchat/conversable_agent.py in generate_reply(self, messages, sender, **kwargs)
   1947                 continue
   1948             if self._match_trigger(reply_func_tuple["trigger"], sender):
-> 1949                 final, reply = reply_func(self, messages=messages, sender=sender, config=reply_func_tuple["config"])
   1950                 if logging_enabled():
   1951                     log_event(

/usr/local/lib/python3.10/dist-packages/autogen/agentchat/contrib/multimodal_conversable_agent.py in generate_oai_reply(self, messages, sender, config)
    112 
    113         # TODO: #1143 handle token limit exceeded error
--> 114         response = client.create(context=messages[-1].pop("context", None), messages=messages_with_b64_img)
    115 
    116         # TODO: line 301, line 271 is converting messages to dict. Can be removed after ChatCompletionMessage_to_dict is merged.

/usr/local/lib/python3.10/dist-packages/autogen/oai/client.py in create(self, **config)
    636             try:
    637                 request_ts = get_current_ts()
--> 638                 response = client.create(params)
    639             except APITimeoutError as err:
    640                 logger.debug(f"config {i} timed out", exc_info=True)

/usr/local/lib/python3.10/dist-packages/autogen/oai/gemini.py in create(self, params)
    110         if "vision" not in model_name:
    111             # A. create and call the chat model.
--> 112             gemini_messages = oai_messages_to_gemini_messages(messages)
    113 
    114             # we use chat model by default

/usr/local/lib/python3.10/dist-packages/autogen/oai/gemini.py in oai_messages_to_gemini_messages(messages)
    265 
    266     # handle the last message
--> 267     rst.append(Content(parts=concat_parts(curr_parts), role=role))
    268 
    269     # The Gemini is restrict on order of roles, such that

/usr/local/lib/python3.10/dist-packages/autogen/oai/gemini.py in concat_parts(parts)
    233     for current_part in parts[1:]:
    234         if previous_part.text != "":
--> 235             previous_part.text += current_part.text
    236         else:
    237             concatenated_parts.append(previous_part)

TypeError: can only concatenate str (not "dict") to str
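
From the traceback, the failure is in concat_parts in autogen/oai/gemini.py: since "vision" is not part of the model name "gemini-1.5-flash", the multimodal message is routed through the chat path, whose concat_parts assumes every part's text is a string; the image part arrives as a dict, so the += concatenation raises the TypeError. Below is a minimal sketch of a type-safe merge, operating on plain str/dict values rather than the library's actual Part objects (a hypothetical helper, not autogen's implementation):

# Hedged sketch, not autogen's actual code: merge only adjacent string
# parts and pass non-string parts (e.g. image payload dicts) through.
def concat_parts(parts):
    """parts: a list whose items are either str (text) or dict (image data)."""
    merged = []
    for part in parts:
        if merged and isinstance(merged[-1], str) and isinstance(part, str):
            merged[-1] += part   # both sides are strings, safe to concatenate
        else:
            merged.append(part)  # keep dict payloads as separate parts
    return merged

# Example: adjacent text parts merge, the image payload stays intact.
print(concat_parts(["Describe ", "this image: ", {"mime_type": "image/png", "data": "..."}]))
# -> ['Describe this image: ', {'mime_type': 'image/png', 'data': '...'}]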

Steps to reproduce

I simply used the given notebook cloud-gemini.ipynb.

First, set the OAI_CONFIG_LIST.json file as:

[
    {
        "model": "gemini-1.5-flash",
        "api_key": "<some key>",
        "api_type": "google"
    }
]

Then build the config_list:

import autogen  # imported earlier in the notebook

config_list_gemini_vision = autogen.config_list_from_json(
    "./OAI_CONFIG_LIST.json",
    filter_dict={
        "model": ["gemini-1.5-flash"],
    },
)

Then run this:

from autogen import UserProxyAgent
from autogen.agentchat.contrib.multimodal_conversable_agent import MultimodalConversableAgent

seed = 42  # assumed value; the notebook defines its own seed

image_agent = MultimodalConversableAgent(
    "Gemini Vision",
    llm_config={"config_list": config_list_gemini_vision, "seed": seed},
    max_consecutive_auto_reply=1,
)

user_proxy = UserProxyAgent("user_proxy", human_input_mode="NEVER", max_consecutive_auto_reply=0)

user_proxy.initiate_chat(
    image_agent,
    message="""Describe what is in this image?
<img https://github.com/microsoft/autogen/blob/main/website/static/img/chat_example.png?raw=true>.""",
)
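
A possible interim workaround, assuming the branch visible in the traceback (gemini.py only takes the vision code path when the model name contains "vision") and that the older gemini-pro-vision model is available for the same API key:

# Hedged workaround sketch: a model whose name contains "vision"
# (e.g. the older gemini-pro-vision) takes gemini.py's vision branch
# instead of the failing chat-path message conversion.
config_list_gemini_vision = autogen.config_list_from_json(
    "./OAI_CONFIG_LIST.json",
    filter_dict={"model": ["gemini-pro-vision"]},
)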

Model Used

gemini-1.5-flash

Expected Behavior

It should run fine and return an answer from the agent. The previous Gemini models work, by the way; only the new flash version fails.

Screenshots and logs

(screenshot attached)

Additional Information

No response

Metadata

Labels

0.2 (Issues which are related to the pre 0.4 codebase), needs-triage