[Bug] Unable to Pass Messages into Prediction in dspy #7844

@pretbc

What happened?

Description:
I am trying to pass messages containing audio data into dspy.Predict, but it seems that the model is analyzing the base64 string of the audio instead of properly processing the audio content.

Code Snippet:

import base64
import os
import pathlib

import dspy

lm = dspy.LM(
    "gemini-2.0-flash-exp", api_key=os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
)
dspy.configure(lm=lm)

audio_path = "temp_segment_1894 1.wav"

audio_data = pathlib.Path(audio_path).read_bytes()
audio_data_base64 = base64.b64encode(audio_data).decode("utf-8")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Analyze audio"},
            {
                "type": "image_url",
                "image_url": "data:audio/wav;base64,{}".format(
                    audio_data_base64
                ),
            },
        ],
    }
]
print(lm(messages=messages))  # This works correctly

classify = dspy.Predict('messages -> sentiment')  
# Issue: Cannot pass messages; the output seems to analyze the base64 string instead of the actual audio content.
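
To rule out the encoding step as the cause, here is a minimal sketch (standard library only) that builds the same data URI and message structure as in the snippet above and checks that the payload decodes back to the original bytes. The helper names `audio_data_uri`, `decode_data_uri`, and `build_messages` are hypothetical and not part of dspy; the sketch says nothing about how dspy.Predict renders the field.

```python
import base64


def audio_data_uri(audio_bytes: bytes, mime_type: str = "audio/wav") -> str:
    """Build a base64 data URI in the same shape as the snippet above."""
    encoded = base64.b64encode(audio_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def decode_data_uri(uri: str) -> tuple[str, bytes]:
    """Split a base64 data URI back into (mime_type, raw_bytes)."""
    header, _, payload = uri.partition(",")
    mime_type = header.removeprefix("data:").removesuffix(";base64")
    return mime_type, base64.b64decode(payload)


def build_messages(prompt: str, audio_bytes: bytes) -> list[dict]:
    """Assemble the chat-style message list that is passed to lm(messages=...)."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": audio_data_uri(audio_bytes)},
            ],
        }
    ]
```

If the decoded bytes match the file contents, the base64 payload itself is intact, which points at how the field is rendered into the prompt rather than at the encoding.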

Expected Behavior:

The model should process the audio data and return the sentiment analysis.

Observed Behavior:

dspy.Predict appears to treat the base64 string as text instead of decoding and analyzing the actual audio.

Questions:

How should messages be passed to dspy.Predict correctly?
Is there a way to specify that messages contain audio data so the model processes it correctly?
Should a custom data structure or preprocessing step be added before passing messages?

Environment:

dspy version: 2.6.6
Model: gemini-2.0-flash-exp
Python version: 3.12

Any guidance on passing audio messages into dspy.Predict would be greatly appreciated.

Steps to reproduce

See the code snippet above.

DSPy version

2.6.6
