Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Images for Vision in the Assistants API #932

Closed
3 tasks done
andehr opened this issue May 29, 2024 · 2 comments · Fixed by #939
Closed
3 tasks done

Images for Vision in the Assistants API #932

andehr opened this issue May 29, 2024 · 2 comments · Fixed by #939
Labels
enhancement New feature or request

Comments

@andehr
Copy link
Contributor

andehr commented May 29, 2024

First check

  • I added a descriptive title to this issue.
  • I used the GitHub search to look for a similar issue and didn't find it.
  • I searched the Marvin documentation for this feature.

Describe the current behavior

Currently when building a Thread for use with an Assistant assistant, you can add messages with the .add() function, which takes text content, role, code_interpreter_files, and file_search_files. But as of May 9th, Open AI's Assistant API supports adding images to threads. The images should be uploaded with the purpose "vision" (rather than "assistants" used for code_interpreter_files and file_search_files).

Describe the proposed behavior

Perhaps an additional parameter image_files which accepts image files, uploads them, then attaches their file IDs to the thread.

Though it looks like the images may be added as content parts rather than attachments?

Perhaps there should be an option to directly pass known file IDs instead of uploading the file every time?

Example Use

# new thread
thread = Thread()

# add message with image attachment
with open("path/to/image.png", "rb") as f:
    thread.add("Ask me questions about this image", image_files=[f])

# run thread
thread.run(assistant=some_assistant)

Additional context

In general I've found that the latest open ai models perform very well with threads of images and text. Being able to deal with multiple image, texts, file search together with all the benefits of thread management, would be super helpful for my use cases.

@andehr andehr added the enhancement New feature or request label May 29, 2024
@andehr
Copy link
Contributor Author

andehr commented Jun 18, 2024

Maybe a change like this to the Thread.add_async() would be the minimal?

    @expose_sync_method("add")
    async def add_async(
        self,
        message: str,
        role: str = "user",
        code_interpreter_files: Optional[list[str]] = None,
        file_search_files: Optional[list[str]] = None,
        image_files: Optional[list[str]] = None,
    ) -> Message:
        """
        Add a user message to the thread.
        """
        client = marvin.utilities.openai.get_openai_client()

        if self.id is None:
            await self.create_async()

        content = [dict(text=message, type="text")]

        # Upload files and collect their IDs
        attachments = []
        for fp in code_interpreter_files or []:
            with open(fp, mode="rb") as file:
                response = await client.files.create(file=file, purpose="assistants")
                attachments.append(
                    dict(file_id=response.id, tools=[dict(type="code_interpreter")])
                )
        for fp in file_search_files or []:
            with open(fp, mode="rb") as file:
                response = await client.files.create(file=file, purpose="assistants")
                attachments.append(
                    dict(file_id=response.id, tools=[dict(type="file_search")])
                )
        for fp in image_files or []:
            with open(fp, mode="rb") as file:
                response = await client.files.create(file=file, purpose="vision")
                content.append(
                    dict(image_file=dict(file_id=response.id), type="image_file")
                )

        # Create the message with the attached files
        response = await client.beta.threads.messages.create(
            thread_id=self.id, role=role, content=content, attachments=attachments
        )
        return response

@zzstoatzz
Copy link
Collaborator

thanks for the great issue @andehr! looking at the PR now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants