fix(cua): nest python gemini screenshots in FunctionResponse, drop unused openai dep #157
Merged
masnwilliams merged 1 commit into hypeship/unified-cua-template from May 5, 2026
…enai dep
- Python gemini provider was sending screenshots as separate `Part(inline_data=...)` entries after the `FunctionResponse` part. With multiple function calls per turn the model can't bind a screenshot to its originating call. Match the standalone gemini-computer-use template (and the TS unified template) by nesting the screenshot as a `FunctionResponsePart` inside `FunctionResponse.parts`, gated on the predefined-actions allowlist.
- Drop `openai` from `pyproject.toml`: the provider uses `httpx` directly against the Responses API, and the SDK was never imported.
Firetiger deploy monitoring skipped
This PR didn't match the auto-monitor filter configured on your GitHub connection:
Reason: PR modifies Python Gemini provider and dependencies in the CUA template, not kernel API endpoints or Temporal workflows as specified in the filter. To monitor this PR anyway, reply with |
Summary
Two bugbot findings on commit `c684ca7`:
Medium — Python Gemini provider sent screenshots as a separate `Part(inline_data=...)` entry in the user content after the `FunctionResponse` part. With multiple function calls per turn the model can't bind a screenshot to its originating call. The standalone `python/gemini-computer-use` template and the TS unified template both nest the screenshot as a `FunctionResponsePart` inside `FunctionResponse.parts`. This PR matches that structure and adds the predefined-actions allowlist that gates screenshot inclusion.
Low — `openai` was listed in `pyproject.toml` but never imported. The OpenAI provider uses raw `httpx` against the Responses API. Removed.
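To make the structural change in the first finding concrete, here is a minimal sketch of nesting a screenshot inside each tool response rather than appending it as a sibling part. It uses stdlib-only stand-in dataclasses (`Blob`, `ResponsePart`, `ToolResponse`) that only mirror the shape described above; they are not the SDK's `FunctionResponse`, `FunctionResponsePart`, or `FunctionResponseBlob` types. The helper names and the contents of `PREDEFINED_ACTIONS` are assumptions for illustration, since the PR's actual allowlist is not shown here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


# Stand-in types: they mirror the nesting described in this PR, not the SDK classes.
@dataclass
class Blob:
    mime_type: str
    data: bytes


@dataclass
class ResponsePart:
    inline_data: Blob


@dataclass
class ToolResponse:
    name: str
    response: dict
    parts: List[ResponsePart] = field(default_factory=list)


# Hypothetical allowlist; the real PREDEFINED_ACTIONS entries may differ.
PREDEFINED_ACTIONS = frozenset({"click_at", "type_text_at", "navigate"})


def build_tool_response(
    name: str, result: dict, screenshot_png: Optional[bytes]
) -> ToolResponse:
    """Attach the screenshot to the response of the call that produced it."""
    parts: List[ResponsePart] = []
    # Only allowlisted actions carry a screenshot.
    if name in PREDEFINED_ACTIONS and screenshot_png is not None:
        parts.append(ResponsePart(inline_data=Blob("image/png", screenshot_png)))
    return ToolResponse(name=name, response=result, parts=parts)


def build_turn(calls: List[Tuple[str, dict, Optional[bytes]]]) -> List[ToolResponse]:
    """One response per function call, each owning its own screenshot."""
    return [build_tool_response(n, r, s) for n, r, s in calls]
```

Because each screenshot lives inside the response for its own call, a turn with several function calls no longer relies on part ordering to pair images with actions.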
Test plan
Note
Medium Risk
Moderate risk because it changes the structure of Gemini tool-call response parts, which could affect how multi-call turns are interpreted by the model or SDK. Dependency removal is low risk but may impact downstream installs if they relied on the extra package.
Overview
Gemini Python CUA now nests screenshots inside each tool call response. Instead of sending a standalone `Part(inline_data=...)` after the `FunctionResponse`, screenshots are attached as `FunctionResponse.parts` (as `FunctionResponsePart`/`FunctionResponseBlob`) so multi-call turns can reliably associate images with the correct action; screenshot inclusion is gated by a `PREDEFINED_ACTIONS` allowlist.
Template deps cleanup. Removes the unused `openai` dependency from `pyproject.toml`.
Reviewed by Cursor Bugbot for commit ee48a5c. Bugbot is set up for automated code reviews on this repo.