@pentiminax pentiminax commented Sep 3, 2025

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| Docs?         | no
| Issues        | Fix #422
| License       | MIT

This PR fixes a crash in the Gemini ResultConverter when handling responses that only contained executableCode and codeExecutionResult parts.

Before:

  • RuntimeException: Unsupported finish reason "STOP"

After:

  • Proper conversion into a TextResult containing formatted code blocks and the last successful OUTCOME_OK output.
  • No more exceptions for valid Gemini responses.

What’s Changed

  • Enhanced choice conversion logic in ResultConverter::convertChoice():
      • Single part → keep legacy behavior (functionCall, text, or one code/output part).
      • Multiple parts → aggregate:
          • executableCode → Markdown code blocks.
          • codeExecutionResult (OUTCOME_OK) → output block.
      • Parts flagged as thought: true are ignored for text/code, but their successful execution results are preserved.
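
A minimal sketch of the aggregation rules listed above, written as a standalone function rather than the actual ResultConverter::convertChoice() code. The function name aggregateParts and the exact Markdown layout are assumptions made for illustration; the part keys (executableCode.language, executableCode.code, codeExecutionResult.outcome, codeExecutionResult.output) follow the Gemini response format.

```php
<?php

/**
 * Hypothetical helper, not the bridge's real implementation.
 *
 * @param list<array<string, mixed>> $parts Decoded "parts" of one Gemini candidate
 */
function aggregateParts(array $parts): string
{
    $fence = str_repeat('`', 3); // Markdown code fence
    $blocks = [];
    $lastOutput = null;

    foreach ($parts as $part) {
        if (isset($part['codeExecutionResult'])) {
            // Successful results are kept even when the part is flagged as a thought.
            if ('OUTCOME_OK' === ($part['codeExecutionResult']['outcome'] ?? null)) {
                $lastOutput = $part['codeExecutionResult']['output'] ?? '';
            }
            continue;
        }

        if (true === ($part['thought'] ?? false)) {
            continue; // thought-flagged text and code are ignored
        }

        if (isset($part['executableCode'])) {
            $language = strtolower($part['executableCode']['language'] ?? '');
            $blocks[] = $fence.$language."\n".($part['executableCode']['code'] ?? '')."\n".$fence;
        } elseif (isset($part['text'])) {
            $blocks[] = $part['text'];
        }
    }

    // Only the last successful output is appended, as a plain fenced block.
    if (null !== $lastOutput) {
        $blocks[] = $fence."\n".rtrim($lastOutput)."\n".$fence;
    }

    return implode("\n\n", $blocks);
}
```

In the real converter the returned string would presumably be wrapped in a TextResult; the sketch returns a plain string to stay self-contained.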

New unit tests:

  • Covers the failing payload where all parts were marked thought: true.
  • Ensures the result is a readable TextResult instead of an exception.

Why

  • Prevents crashes when Gemini outputs reasoning steps as thought: true.
  • Keeps only useful information (code + last valid output).
  • Maintains backward compatibility for simpler responses.

Impact

  • No BC breaks — existing behavior for single-part responses unchanged.
  • Improved UX — users now see clean Markdown code + output instead of runtime errors.

@carsonbot carsonbot added the Bug, Platform, and Status: Needs Review labels Sep 3, 2025
@pentiminax pentiminax force-pushed the feature/result-converter-enhancement- branch 4 times, most recently from 94dac63 to fa4614e on September 3, 2025 17:14
@OskarStark OskarStark changed the title from "[Platform] Enhance choice conversion logic and add unit tests" to "[Platform][Gemini] Fix choice conversion logic for executableCode and codeExecutionResult" Sep 4, 2025
@OskarStark
Contributor

friendly ping @valtzu

@OskarStark OskarStark requested a review from Copilot September 4, 2025 05:55
@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes a crash in the Gemini ResultConverter when handling responses containing only executableCode and codeExecutionResult parts, particularly when all parts are marked as thought: true. The fix prevents RuntimeException: Unsupported finish reason "STOP" by implementing proper aggregation logic for multi-part responses.

  • Enhanced choice conversion logic to handle both single-part (legacy) and multi-part responses
  • Added aggregation of executable code into formatted content and preservation of successful execution results
  • Comprehensive test coverage for the previously failing scenario with thought-flagged parts

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description
| ---- | ---
| src/platform/src/Bridge/Gemini/Gemini/ResultConverter.php | Enhanced convertChoice method to aggregate multiple parts containing executable code and execution results
| src/platform/tests/Bridge/Gemini/CodeExecution/ResultConverterTest.php | Added test case covering the crash scenario with thought-flagged parts containing code execution

@OskarStark OskarStark changed the title [Platform][Gemini] Fix choice conversion logic for executableCode and codeExecutionResult [Platform][Gemini] Fix choice conversion logic for executableCode and codeExecutionResult Sep 4, 2025
"thought": true
},
{
"codeExecutionResult": {
Contributor

So this response contains 2 failed and one OK?

Contributor Author

@pentiminax pentiminax Sep 4, 2025

Yes, this response matches the requested prompt:

'Count how many times each word appears in this text using the Python code execution tool:
"Symfony est un framework PHP. Symfony est utilisé dans de nombreux projets. PHP est un langage puissant."
Display the result as a dictionary sorted by frequency, then explain the result.'

It seems Gemini iteratively fixed its code until it executed correctly.

Contributor

So it would be good to have a fixture (and test) for the failed case, too.

Contributor Author

@pentiminax pentiminax Sep 4, 2025

In the end, I removed the isSuccessfulCodeExecution condition; we want to display all of the LLM’s thoughts to the user, right?

@pentiminax
Contributor Author

For information: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/code-execution-api

@valtzu
Contributor

valtzu commented Sep 4, 2025

First of all, thanks for working on this – when the server_tools were implemented, I only ever tested google_search and url_context.

In my opinion the output should not contain the generated & executed code by default, since the last message/part is a human readable message which already contains the answer to the asked question.

The code_execution is a server tool, like google_search or url_context, and those don't show up in the output either – even though, e.g., search results / "sources" are available as "grounding" in the response.

If I'm building a customer-facing chat bot which should be able to answer "Calculate 20th fibonacci number. Then find the nearest palindrome to it." (an example from the code_execution docs), I don't think the customer should see the Python code.


Somewhere we have this feature: keepToolMessages. Maybe we could even use that, or add a new one, like keepServerToolParts to control if those will be included or not 🤔

@pentiminax pentiminax force-pushed the feature/result-converter-enhancement- branch from 7b50efb to a2f4c4d on September 4, 2025 16:30
},
{
"codeExecutionResult": {
"outcome": "OUTCOME_OK",
Contributor

When I tried it via the API, after codeExecutionResult.outcome: "OUTCOME_OK" there is a normal text part, i.e.

          {
            "text": "The 20th Fibonacci number is 6765.\n\nThe nearest palindrome to 6765 is 6776."
          }

Did you not get it in the response? If you did, let's add it to the fixture?

Contributor Author

@pentiminax pentiminax Sep 4, 2025

It's weird, I never saw this part before, but you're right, I do have it in the API response. I will take this part into account.
I think the text part is missing from the response when some parts carry the thought key.

@pentiminax
Contributor Author

pentiminax commented Sep 4, 2025

> First of all, thanks for working on this – when the server_tools were implemented, I only ever tested google_search and url_context.
>
> In my opinion the output should not contain the generated & executed code by default, since the last message/part is a human readable message which already contains the answer to the asked question.
>
> The code_execution is a server tool, like google_search or url_context, and those don't show up in the output either – even though, e.g., search results / "sources" are available as "grounding" in the response.
>
> If I'm building a customer-facing chat bot which should be able to answer "Calculate 20th fibonacci number. Then find the nearest palindrome to it." (an example from the code_execution docs), I don't think the customer should see the Python code.
>
> Somewhere we have this feature: keepToolMessages. Maybe we could even use that, or add a new one, like keepServerToolParts to control if those will be included or not 🤔

Thanks a lot for clarifying! I see your point now. Since code_execution is indeed just another server tool like google_search or url_context, it makes sense to treat it consistently and not expose the raw code by default. The final human-readable message is usually enough for the end user, while the tool details can stay in the background.

I like the idea of making this configurable.

@OskarStark
Contributor

The rebase looks wrong

@pentiminax pentiminax force-pushed the feature/result-converter-enhancement- branch from a2f4c4d to 4ea8008 on September 4, 2025 19:22
@OskarStark OskarStark requested a review from valtzu September 4, 2025 20:22
@pentiminax
Contributor Author

At the moment the code iterates through the contentParts array and looks for a successful code execution marker.

  • Once a successful execution is detected, the flag $successfulCodeExecutionDetected is set to true.
  • From that point onward, it concatenates any following text parts into the $content string.
  • If any text was collected, it returns a TextResult containing that aggregated content.
  • If no content was collected, it throws a RuntimeException("Code execution failed.").
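
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: the function name collectTextAfterSuccessfulExecution is hypothetical, and it returns a plain string where the converter would return a TextResult. The variable names $contentParts, $content and $successfulCodeExecutionDetected and the exception message are taken from the description.

```php
<?php

/**
 * Hypothetical standalone version of the described loop.
 *
 * @param list<array<string, mixed>> $contentParts Decoded parts of one Gemini candidate
 *
 * @throws \RuntimeException When no text follows a successful code execution
 */
function collectTextAfterSuccessfulExecution(array $contentParts): string
{
    $content = '';
    $successfulCodeExecutionDetected = false;

    foreach ($contentParts as $contentPart) {
        // A codeExecutionResult part with OUTCOME_OK is the success marker.
        if ('OUTCOME_OK' === ($contentPart['codeExecutionResult']['outcome'] ?? null)) {
            $successfulCodeExecutionDetected = true;
            continue;
        }

        // Only text parts that come after the marker are collected.
        if ($successfulCodeExecutionDetected && isset($contentPart['text'])) {
            $content .= $contentPart['text'];
        }
    }

    if ('' === $content) {
        throw new \RuntimeException('Code execution failed.');
    }

    return $content;
}
```

With this shape, generated code and intermediate failed attempts never reach the output; only the model's final human-readable text does.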

Contributor

@OskarStark OskarStark left a comment

Only one minor comment; afterwards it's good to merge. Thanks.

@OskarStark OskarStark force-pushed the feature/result-converter-enhancement- branch from 8fe9603 to bbff633 on September 5, 2025 09:01
@OskarStark
Contributor

Thanks for fixing this bug @pentiminax.

@OskarStark OskarStark merged commit 38afc15 into symfony:main Sep 5, 2025
7 checks passed
@pentiminax
Contributor Author

Thanks for reviewing @OskarStark.
I think we will have the same problem with the Vertex bridge, since it also uses Google models?

@OskarStark
Contributor

Yes, can you have a look?

OskarStark added a commit that referenced this pull request Sep 5, 2025
This PR was squashed before being merged into the main branch.

Discussion
----------

[Platform][VertexAI] Update `ResultConverter`

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| Docs?         | no
| License       | MIT

Same as #421 but for VertexAI bridge

This PR fixes a crash in the Gemini ResultConverter when handling responses that only contained executableCode and codeExecutionResult parts.

Rather than duplicating the code, I created a trait that is used in both bridges.
I also reused the same test fixtures.

Commits
-------

a46ddcf [Platform][VertexAI] Update `ResultConverter`
Labels: Bug, Platform, Status: Reviewed

Linked issue: RuntimeException: Unsupported finish reason "STOP" in ResultConverter (Gemini bridge)