Images for Vision in the Assistants API #932

andehr · 2024-05-29T15:11:37Z

First check

I added a descriptive title to this issue.
I used the GitHub search to look for a similar issue and didn't find it.
I searched the Marvin documentation for this feature.

Describe the current behavior

Currently, when building a Thread for use with an Assistant, you can add messages with the .add() function, which takes text content, role, code_interpreter_files, and file_search_files. But as of May 9th, OpenAI's Assistants API supports adding images to threads. The images should be uploaded with the purpose "vision" (rather than "assistants", which is used for code_interpreter_files and file_search_files).

Describe the proposed behavior

Perhaps an additional parameter image_files which accepts image files, uploads them, then attaches their file IDs to the thread.

Though it looks like the images may be added as content parts rather than attachments?

Perhaps there should be an option to directly pass known file IDs instead of uploading the file every time?
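
To make the content-parts question concrete, here is a rough sketch of how already-uploaded file IDs could be turned into a message payload without re-uploading anything. The helper name build_message_payload and its parameter names are hypothetical (they are not part of Marvin), and the dictionary shapes are my reading of the Assistants API: images go into content as image_file parts, while code interpreter and file search files go into attachments.

```python
from typing import Optional


def build_message_payload(
    text: str,
    image_file_ids: Optional[list[str]] = None,
    code_interpreter_file_ids: Optional[list[str]] = None,
    file_search_file_ids: Optional[list[str]] = None,
) -> dict:
    """Hypothetical helper: build `content` and `attachments` from known file IDs."""
    # The message text is always the first content part.
    content = [{"type": "text", "text": text}]

    # Images are added as content parts, not as attachments.
    for file_id in image_file_ids or []:
        content.append({"type": "image_file", "image_file": {"file_id": file_id}})

    # Tool files are added as attachments, each tagged with the tool that uses it.
    attachments = []
    for file_id in code_interpreter_file_ids or []:
        attachments.append(
            {"file_id": file_id, "tools": [{"type": "code_interpreter"}]}
        )
    for file_id in file_search_file_ids or []:
        attachments.append({"file_id": file_id, "tools": [{"type": "file_search"}]})

    return {"content": content, "attachments": attachments}
```

The returned content and attachments could then be passed to the message-creation call, so a caller who already holds file IDs would skip the upload step entirely.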

Example Use

# new thread
thread = Thread()

# add message with image attachment
with open("path/to/image.png", "rb") as f:
    thread.add("Ask me questions about this image", image_files=[f])

# run thread
thread.run(assistant=some_assistant)

Additional context

In general I've found that the latest OpenAI models perform very well with threads of images and text. Being able to deal with multiple images, text, and file search together, with all the benefits of thread management, would be super helpful for my use cases.


andehr · 2024-06-18T11:19:59Z

Maybe a change like this to Thread.add_async() would be the minimal one?

    @expose_sync_method("add")
    async def add_async(
        self,
        message: str,
        role: str = "user",
        code_interpreter_files: Optional[list[str]] = None,
        file_search_files: Optional[list[str]] = None,
        image_files: Optional[list[str]] = None,
    ) -> Message:
        """
        Add a user message to the thread.
        """
        client = marvin.utilities.openai.get_openai_client()

        if self.id is None:
            await self.create_async()

        content = [dict(text=message, type="text")]

        # Upload files and collect their IDs
        attachments = []
        for fp in code_interpreter_files or []:
            with open(fp, mode="rb") as file:
                response = await client.files.create(file=file, purpose="assistants")
                attachments.append(
                    dict(file_id=response.id, tools=[dict(type="code_interpreter")])
                )
        for fp in file_search_files or []:
            with open(fp, mode="rb") as file:
                response = await client.files.create(file=file, purpose="assistants")
                attachments.append(
                    dict(file_id=response.id, tools=[dict(type="file_search")])
                )
        for fp in image_files or []:
            with open(fp, mode="rb") as file:
                response = await client.files.create(file=file, purpose="vision")
                content.append(
                    dict(image_file=dict(file_id=response.id), type="image_file")
                )

        # Create the message with the attached files
        response = await client.beta.threads.messages.create(
            thread_id=self.id, role=role, content=content, attachments=attachments
        )
        return response

zzstoatzz · 2024-06-26T03:46:55Z

thanks for the great issue @andehr! looking at the PR now

andehr added the enhancement New feature or request label May 29, 2024

andehr mentioned this issue Jun 18, 2024

add image files to thread via add method (supports Assistant API vision) #939

Merged

zzstoatzz closed this as completed in #939 Jun 26, 2024
