Description
Describe the bug
In an app instance wrapped by AdkApp, async_stream_query() does not accept multimodal input when the message argument is built from google.genai.types objects. The call fails with the error: "message must be a string or a dictionary representing a Content object".
code1 (a types.Content object) and code2 (a list of types.Part objects) both raise this error. code3, which builds the message as a dictionary as the error message suggests, works.
code1
artifact_part = types.Part.from_bytes(data=file_bytes, mime_type=mime_type)
text_part = types.Part.from_text(text=f"Please analyze and summarize the uploaded file '{filename}'.")
message_content = types.Content(
    role="user",
    parts=[text_part, artifact_part]
)
async for event in app_instance.async_stream_query(
    user_id=user_id,
    session_id=session_id,
    message=message_content
):
    events.append(event)
return events
code2
artifact_part = types.Part.from_bytes(data=file_bytes, mime_type=mime_type)
text_part = types.Part.from_text(text=f"Please analyze and summarize the uploaded file '{filename}'.")
async for event in app_instance.async_stream_query(
    user_id=user_id,
    session_id=session_id,
    message=[text_part, artifact_part]
):
    events.append(event)
return events
code3
message_content = {
    "role": "user",
    "parts": [
        {"text": f"Please analyze and summarize the uploaded file '{filename}'."},
        {"inline_data": {"data": file_bytes, "mime_type": mime_type}}
    ]
}
async for event in app_instance.async_stream_query(
    user_id=user_id,
    session_id=session_id,
    message=message_content
):
    events.append(event)
return events
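Since only the string and dictionary forms are accepted, a possible stopgap is to convert the typed objects to the dictionary shape of code3 before making the call. The helper below is only a sketch and has not been tested against AdkApp: `to_message_dict` is a hypothetical name, and it assumes that `types.Content` and `types.Part` are pydantic models exposing `model_dump()`, and that the resulting dictionary has the same shape AdkApp accepted in code3.

```python
from typing import Any


def to_message_dict(message: Any, role: str = "user") -> Any:
    """Convert a message to the str/dict form that code3 shows is accepted.

    Assumption: Content/Part objects are pydantic models with model_dump().
    """
    # Strings and dictionaries are already in an accepted form.
    if isinstance(message, (str, dict)):
        return message
    # A list of parts (as in code2): wrap it in a Content-shaped dictionary.
    if isinstance(message, (list, tuple)):
        parts = [
            part if isinstance(part, dict) else part.model_dump(exclude_none=True)
            for part in message
        ]
        return {"role": role, "parts": parts}
    # A single Content-like object (as in code1).
    return message.model_dump(exclude_none=True)
```

With this helper, code1 and code2 would pass `message=to_message_dict(message_content)` and `message=to_message_dict([text_part, artifact_part])` respectively. Whether raw bytes in `inline_data` survive this conversion for a remotely deployed app is not something this sketch verifies.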
Expected behavior
The official ADK documentation states the following, so code1 and code2 should work without errors.
Sending Multimodal Queries
To send multimodal queries (e.g., including images) to your agent, you can construct the message parameter of async_stream_query with a list of types.Part objects. Each part can be text or an image.
To include an image, you can use types.Part.from_uri, providing a Google Cloud Storage (GCS) URI for the image.
from google.genai import types
image_part = types.Part.from_uri(
    file_uri="gs://cloud-samples-data/generative-ai/image/scones.jpg",
    mime_type="image/jpeg",
)
text_part = types.Part.from_text(
    text="What is in this image?",
)
async for event in remote_app.async_stream_query(
    user_id="u_456",
    session_id=remote_session["id"],
    message=[text_part, image_part],
):
    print(event)
Desktop:
- OS: Windows 11
- Python version: 3.12
- ADK version: 1.15.1
Model Information:
- Are you using LiteLLM: No
- Which model is being used: gemini-2.5-flash