fix: dedupe duplicated tool call fields by bugkeep · Pull Request #7765 · AstrBotDevs/AstrBot

bugkeep · 2026-04-24T07:39:45Z

Some OpenAI-compatible proxies can duplicate streaming chunks; when that happens, tool_call.id and tool_call.function.name can end up as a self-concatenated string (s + s). We now defensively de-duplicate those fields during completion parsing, and add a unit test covering the regression.
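To make the failure mode concrete, here is a minimal sketch of how a replayed chunk could yield `s + s`. It assumes a client that builds each field by concatenating string fragments in arrival order; `accumulate_tool_call` and the chunk dictionaries are hypothetical stand-ins, not AstrBot's or the OpenAI SDK's actual accumulation code.

```python
def accumulate_tool_call(chunks: list[dict[str, str]]) -> dict[str, str]:
    """Concatenate per-field fragments in arrival order (hypothetical accumulator)."""
    merged = {"id": "", "name": ""}
    for chunk in chunks:
        for field in merged:
            merged[field] += chunk.get(field, "")
    return merged


chunk = {"id": "call_95fae017db5b4a91b1259aba", "name": "astr_kb_search"}

clean = accumulate_tool_call([chunk])
replayed = accumulate_tool_call([chunk, chunk])  # the proxy delivered the chunk twice

print(clean["id"])
print(replayed["id"])
```

Under that assumption, a single replayed chunk is enough to double both fields, which is the shape the fix looks for.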

Summary by Sourcery

Handle duplicated tool call metadata from OpenAI-compatible proxies during completion parsing and add coverage for this regression.

Bug Fixes:

Normalize self-concatenated tool call IDs and function names when parsing OpenAI-style completions to avoid duplicated metadata from streaming proxies.

Tests:

Add an async unit test verifying that self-concatenated tool call IDs and names are de-duplicated when parsing OpenAI completions.

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

- Consider moving `_dedupe_self_concatenated` to a module-level helper (or shared utility) so it can be reused and more easily unit-tested in isolation rather than as an inner function.
- The `min_len` thresholds (16 for IDs and 8 for names) are embedded as magic numbers; it would be clearer to lift these into named constants or document why these particular values are appropriate for the expected formats.
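One possible shape for both suggestions is sketched below: the helper lives at module level and the thresholds become named constants. The constant names and the public function name are assumptions made for illustration; only the values 16 and 8 and the helper's logic come from this PR.

```python
# Hypothetical constant names; the values 16 and 8 are the thresholds used in the PR.
TOOL_CALL_ID_MIN_DEDUPE_LEN = 16
TOOL_NAME_MIN_DEDUPE_LEN = 8


def dedupe_self_concatenated(value: str, *, min_len: int) -> str:
    """Return the first half of `value` if it is exactly two copies of one string."""
    # Skip empty values, values shorter than the threshold, and odd lengths,
    # since an odd-length string cannot be two equal halves.
    if not value or len(value) < min_len or len(value) % 2 != 0:
        return value
    half = len(value) // 2
    return value[:half] if value[:half] == value[half:] else value


tool_call_id = "call_95fae017db5b4a91b1259aba"
fixed_id = dedupe_self_concatenated(
    tool_call_id * 2, min_len=TOOL_CALL_ID_MIN_DEDUPE_LEN
)
fixed_name = dedupe_self_concatenated(
    "astr_kb_search" * 2, min_len=TOOL_NAME_MIN_DEDUPE_LEN
)
```

Note that `min_len` is compared against the length of the full (possibly doubled) string, so a short value such as `"abab"` is left alone under either threshold.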

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- Consider moving `_dedupe_self_concatenated` to a module-level helper (or shared utility) so it can be reused and more easily unit-tested in isolation rather than as an inner function.
- The `min_len` thresholds (16 for IDs and 8 for names) are embedded as magic numbers; it would be clearer to lift these into named constants or document why these particular values are appropriate for the expected formats.

## Individual Comments

### Comment 1
<location path="tests/test_openai_source.py" line_range="1183-1185" />
<code_context>
+async def test_parse_openai_completion_dedupes_self_concatenated_tool_call_fields():
+    provider = _make_provider()
+    try:
+        tool_call_id = "call_95fae017db5b4a91b1259aba"
+        tool_name = "astr_kb_search"
+        completion = ChatCompletion.model_validate(
+            {
+                "id": "chatcmpl-toolcall-dup",
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test case where only one of `tool_call.id` or `tool_call.function.name` is duplicated to verify they are handled independently.

The current test only covers the case where both ID and function name are duplicated together. Please add two more cases: (a) duplicated ID with a normal function name, and (b) duplicated function name with a normal ID, to confirm each field’s deduping behavior is independent.

Suggested implementation:

```python
@pytest.mark.asyncio
async def test_parse_openai_completion_dedupes_self_concatenated_tool_call_id_only():
    provider = _make_provider()
    try:
        tool_call_id = "call_95fae017db5b4a91b1259aba"
        tool_name = "astr_kb_search"
        completion = ChatCompletion.model_validate(
            {
                "id": "chatcmpl-toolcall-dup-id-only",
                "object": "chat.completion",
                "created": 0,
                "model": "gpt-4o-mini",
                "choices": [
                    {
                        "index": 0,
                        "message": {
                            "role": "assistant",
                            "content": None,
                            "refusal": None,
                            "tool_calls": [
                                {
                                    "id": f"{tool_call_id}{tool_call_id}",
                                    "type": "function",
                                    "function": {
                                        "name": tool_name,
                                        "arguments": '{"query": "test"}',
                                    },
                                }
                            ],
                        },
                        "logprobs": None,
                        "finish_reason": "tool_calls",
                    }
                ],
                "usage": {
                    "prompt_tokens": 0,
                    "completion_tokens": 0,
                    "total_tokens": 0,
                },
            }
        )

        events = [e async for e in provider._parse_openai_completion(completion)]
        assert len(events) == 1

        output_message = events[0].output_message
        assert output_message is not None
        assert output_message.tool_calls is not None
        assert len(output_message.tool_calls) == 1

        tool_call = output_message.tool_calls[0]
        assert tool_call.id == tool_call_id
        assert tool_call.function.name == tool_name
    finally:
        await provider.terminate()


@pytest.mark.asyncio
async def test_parse_openai_completion_dedupes_self_concatenated_tool_call_function_name_only():
    provider = _make_provider()
    try:
        tool_call_id = "call_95fae017db5b4a91b1259aba"
        tool_name = "astr_kb_search"
        completion = ChatCompletion.model_validate(
            {
                "id": "chatcmpl-toolcall-dup-name-only",
                "object": "chat.completion",
                "created": 0,
                "model": "gpt-4o-mini",
                "choices": [
                    {
                        "index": 0,
                        "message": {
                            "role": "assistant",
                            "content": None,
                            "refusal": None,
                            "tool_calls": [
                                {
                                    "id": tool_call_id,
                                    "type": "function",
                                    "function": {
                                        "name": f"{tool_name}{tool_name}",
                                        "arguments": '{"query": "test"}',
                                    },
                                }
                            ],
                        },
                        "logprobs": None,
                        "finish_reason": "tool_calls",
                    }
                ],
                "usage": {
                    "prompt_tokens": 0,
                    "completion_tokens": 0,
                    "total_tokens": 0,
                },
            }
        )

        events = [e async for e in provider._parse_openai_completion(completion)]
        assert len(events) == 1

        output_message = events[0].output_message
        assert output_message is not None
        assert output_message.tool_calls is not None
        assert len(output_message.tool_calls) == 1

        tool_call = output_message.tool_calls[0]
        assert tool_call.id == tool_call_id
        assert tool_call.function.name == tool_name
    finally:
        await provider.terminate()


@pytest.mark.asyncio
async def test_parse_openai_completion_dedupes_self_concatenated_tool_call_fields():
    provider = _make_provider()
    try:
        tool_call_id = "call_95fae017db5b4a91b1259aba"
        tool_name = "astr_kb_search"
        completion = ChatCompletion.model_validate(
            {
                "id": "chatcmpl-toolcall-dup",
                "object": "chat.completion",
                "created": 0,
                "model": "gpt-4o-mini",
                "choices": [
                    {
                        "index": 0,
                        "message": {
                            "role": "assistant",
                            "content": None,
                            "refusal": None,

```

The new tests assume:
1. The helper under test is `provider._parse_openai_completion` and that it yields events with an `output_message.tool_calls` structure identical to the existing test.
2. The existing test constructs `tool_calls` as shown (a list with `id`, `type: "function"`, and `function: {name, arguments}`).

Please:
- Ensure the signature and usage of `provider._parse_openai_completion` and the event/`output_message` shape match those used in `test_parse_openai_completion_dedupes_self_concatenated_tool_call_fields`. If the existing test uses a different helper or field path, mirror that in the two new tests.
- Align the `tool_calls` payload shape (keys like `"tool_calls"`, `"type"`, `"function"`, `"arguments"`) with whatever is used in the existing dedupe test; update field names if they differ.
</issue_to_address>


sourcery-ai · 2026-04-24T07:41:06Z

+        tool_call_id = "call_95fae017db5b4a91b1259aba"
+        tool_name = "astr_kb_search"
+        completion = ChatCompletion.model_validate(


suggestion (testing): Add a test case where only one of tool_call.id or tool_call.function.name is duplicated to verify they are handled independently.

The current test only covers the case where both ID and function name are duplicated together. Please add two more cases: (a) duplicated ID with a normal function name, and (b) duplicated function name with a normal ID, to confirm each field’s deduping behavior is independent.

Suggested implementation:

```python
@pytest.mark.asyncio
async def test_parse_openai_completion_dedupes_self_concatenated_tool_call_id_only():
    provider = _make_provider()
    try:
        tool_call_id = "call_95fae017db5b4a91b1259aba"
        tool_name = "astr_kb_search"
        completion = ChatCompletion.model_validate(
            {
                "id": "chatcmpl-toolcall-dup-id-only",
                "object": "chat.completion",
                "created": 0,
                "model": "gpt-4o-mini",
                "choices": [
                    {
                        "index": 0,
                        "message": {
                            "role": "assistant",
                            "content": None,
                            "refusal": None,
                            "tool_calls": [
                                {
                                    "id": f"{tool_call_id}{tool_call_id}",
                                    "type": "function",
                                    "function": {
                                        "name": tool_name,
                                        "arguments": '{"query": "test"}',
                                    },
                                }
                            ],
                        },
                        "logprobs": None,
                        "finish_reason": "tool_calls",
                    }
                ],
                "usage": {
                    "prompt_tokens": 0,
                    "completion_tokens": 0,
                    "total_tokens": 0,
                },
            }
        )

        events = [e async for e in provider._parse_openai_completion(completion)]
        assert len(events) == 1

        output_message = events[0].output_message
        assert output_message is not None
        assert output_message.tool_calls is not None
        assert len(output_message.tool_calls) == 1

        tool_call = output_message.tool_calls[0]
        assert tool_call.id == tool_call_id
        assert tool_call.function.name == tool_name
    finally:
        await provider.terminate()


@pytest.mark.asyncio
async def test_parse_openai_completion_dedupes_self_concatenated_tool_call_function_name_only():
    provider = _make_provider()
    try:
        tool_call_id = "call_95fae017db5b4a91b1259aba"
        tool_name = "astr_kb_search"
        completion = ChatCompletion.model_validate(
            {
                "id": "chatcmpl-toolcall-dup-name-only",
                "object": "chat.completion",
                "created": 0,
                "model": "gpt-4o-mini",
                "choices": [
                    {
                        "index": 0,
                        "message": {
                            "role": "assistant",
                            "content": None,
                            "refusal": None,
                            "tool_calls": [
                                {
                                    "id": tool_call_id,
                                    "type": "function",
                                    "function": {
                                        "name": f"{tool_name}{tool_name}",
                                        "arguments": '{"query": "test"}',
                                    },
                                }
                            ],
                        },
                        "logprobs": None,
                        "finish_reason": "tool_calls",
                    }
                ],
                "usage": {
                    "prompt_tokens": 0,
                    "completion_tokens": 0,
                    "total_tokens": 0,
                },
            }
        )

        events = [e async for e in provider._parse_openai_completion(completion)]
        assert len(events) == 1

        output_message = events[0].output_message
        assert output_message is not None
        assert output_message.tool_calls is not None
        assert len(output_message.tool_calls) == 1

        tool_call = output_message.tool_calls[0]
        assert tool_call.id == tool_call_id
        assert tool_call.function.name == tool_name
    finally:
        await provider.terminate()


@pytest.mark.asyncio
async def test_parse_openai_completion_dedupes_self_concatenated_tool_call_fields():
    provider = _make_provider()
    try:
        tool_call_id = "call_95fae017db5b4a91b1259aba"
        tool_name = "astr_kb_search"
        completion = ChatCompletion.model_validate(
            {
                "id": "chatcmpl-toolcall-dup",
                "object": "chat.completion",
                "created": 0,
                "model": "gpt-4o-mini",
                "choices": [
                    {
                        "index": 0,
                        "message": {
                            "role": "assistant",
                            "content": None,
                            "refusal": None,
```

The new tests assume:

1. The helper under test is `provider._parse_openai_completion` and that it yields events with an `output_message.tool_calls` structure identical to the existing test.
2. The existing test constructs `tool_calls` as shown (a list with `id`, `type: "function"`, and `function: {name, arguments}`).

Please:

- Ensure the signature and usage of `provider._parse_openai_completion` and the event/`output_message` shape match those used in `test_parse_openai_completion_dedupes_self_concatenated_tool_call_fields`. If the existing test uses a different helper or field path, mirror that in the two new tests.
- Align the `tool_calls` payload shape (keys like `"tool_calls"`, `"type"`, `"function"`, `"arguments"`) with whatever is used in the existing dedupe test; update field names if they differ.
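Since the three scenarios differ only in the raw `id` and `name`, one way to keep the payload shape aligned is to build it in a single place. The sketch below is only an illustration: `build_tool_call_payload` and `CASES` are hypothetical names, and the dictionary mirrors the payload from the suggested tests rather than anything confirmed in the repository.

```python
from typing import Any


def build_tool_call_payload(tool_call_id: str, tool_name: str) -> dict[str, Any]:
    """Build the chat-completion dict used by the dedupe tests (hypothetical helper)."""
    return {
        "id": "chatcmpl-toolcall-dup",
        "object": "chat.completion",
        "created": 0,
        "model": "gpt-4o-mini",
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": None,
                    "refusal": None,
                    "tool_calls": [
                        {
                            "id": tool_call_id,
                            "type": "function",
                            "function": {
                                "name": tool_name,
                                "arguments": '{"query": "test"}',
                            },
                        }
                    ],
                },
                "logprobs": None,
                "finish_reason": "tool_calls",
            }
        ],
        "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
    }


TOOL_CALL_ID = "call_95fae017db5b4a91b1259aba"
TOOL_NAME = "astr_kb_search"

# (raw id, raw name) for the three scenarios: id only, name only, both duplicated.
CASES = [
    (TOOL_CALL_ID * 2, TOOL_NAME),
    (TOOL_CALL_ID, TOOL_NAME * 2),
    (TOOL_CALL_ID * 2, TOOL_NAME * 2),
]
payloads = [build_tool_call_payload(raw_id, raw_name) for raw_id, raw_name in CASES]
```

Each payload could then be validated with `ChatCompletion.model_validate(...)` and fed through `provider._parse_openai_completion` as in the suggested tests, for example from a single test driven by `pytest.mark.parametrize`. That step is not exercised here because it depends on the project's test fixtures.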

gemini-code-assist

Code Review

This pull request introduces a defensive de-duplication mechanism for tool call fields to handle issues where certain OpenAI-compatible proxies duplicate streaming chunks. It adds a helper function to detect and fix self-concatenated strings in tool IDs and function names, along with a corresponding test case. The review feedback suggests refactoring the nested helper function into a static method on the class to maintain consistency with the existing codebase structure and improve organization.

gemini-code-assist · 2026-04-24T07:42:40Z

+        def _dedupe_self_concatenated(value: str, *, min_len: int) -> str:
+            if not value or len(value) < min_len or (len(value) % 2) != 0:
+                return value
+            half = len(value) // 2
+            return value[:half] if value[:half] == value[half:] else value


For better code organization and consistency with other helper methods in this class (like _safe_json_dump), consider moving this nested function out of _parse_openai_completion and defining it as a staticmethod on the ProviderOpenAIOfficial class. This improves discoverability and aligns with the existing structure of the file.

After moving it, you would update the calls to self._dedupe_self_concatenated(...).
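A rough sketch of that refactor follows. `ProviderSketch` is a hypothetical stand-in for `ProviderOpenAIOfficial`, and `normalize_tool_call` is an invented call site used only to show the `self.` access; the helper body is the one from the diff above.

```python
class ProviderSketch:
    """Hypothetical stand-in for ProviderOpenAIOfficial; only the helper is shown."""

    @staticmethod
    def _dedupe_self_concatenated(value: str, *, min_len: int) -> str:
        # Same logic as the nested helper in the diff, lifted to class level.
        if not value or len(value) < min_len or (len(value) % 2) != 0:
            return value
        half = len(value) // 2
        return value[:half] if value[:half] == value[half:] else value

    def normalize_tool_call(self, tool_call_id: str, name: str) -> tuple[str, str]:
        # Invented call site: a static method is reachable through self.
        return (
            self._dedupe_self_concatenated(tool_call_id, min_len=16),
            self._dedupe_self_concatenated(name, min_len=8),
        )


provider = ProviderSketch()
result = provider.normalize_tool_call(
    "call_95fae017db5b4a91b1259aba" * 2, "astr_kb_search"
)
halved = ProviderSketch._dedupe_self_concatenated("toolnametoolname", min_len=8)
```

As a static method, the helper can be called on the class without constructing a provider, which makes it easy to test in isolation. The last call also shows a limitation of the check itself: any even-length value at or above `min_len` whose two halves match is shortened, so a name that legitimately repeats itself would be halved too.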

References

Refactor logic into shared helper functions to improve code organization and reusability.

fix: dedupe duplicated tool call fields

4d2c2f5

auto-assign Bot requested review from LIghtJUNction and Raven95676 April 24, 2026 07:39

dosubot Bot added the size:S (This PR changes 10-29 lines, ignoring generated files.) and area:provider (The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner.) labels Apr 24, 2026

sourcery-ai Bot reviewed Apr 24, 2026


gemini-code-assist Bot reviewed Apr 24, 2026

bugkeep wants to merge 1 commit into AstrBotDevs:master from bugkeep:bugfix/7694-dedupe-toolcall
