When using MCP tools, the model's final response after tool calling gets lost somewhere on the way to the conversation endpoint: the assistant message comes back empty.
Llama Stack config:
version: '2'
image_name: ollama-llama-stack-config
apis:
- agents
- inference
- safety
- telemetry
- tool_runtime
- vector_io
logging:
  level: DEBUG
  category_levels:
    llama_stack: DEBUG # Enable DEBUG for all llama_stack modules
    llama_stack.providers.remote.inference.vllm: DEBUG
    llama_stack.providers.inline.agents.meta_reference: DEBUG
    llama_stack.providers.inline.agents.meta_reference.agent_instance: DEBUG
    llama_stack.providers.inline.vector_io.faiss: DEBUG
    llama_stack.providers.inline.telemetry.meta_reference: DEBUG
    llama_stack.core: DEBUG
    llama_stack.apis: DEBUG
    uvicorn: DEBUG
    uvicorn.access: INFO
    fastapi: DEBUG
providers:
  vector_io:
  - config:
      kvstore:
        db_path: /tmp/faiss_store.db
        type: sqlite
    provider_id: faiss
    provider_type: inline::faiss
  agents:
  - config:
      persistence_store:
        db_path: /tmp/agents_store.db
        namespace: null
        type: sqlite
      responses_store:
        db_path: /tmp/responses_store.db
        type: sqlite
    provider_id: meta-reference
    provider_type: inline::meta-reference
  inference:
  - provider_id: vllm-inference
    provider_type: remote::vllm
    config:
      url: ${env.VLLM_URL:=http://localhost:8000/v1}
      max_tokens: ${env.VLLM_MAX_TOKENS:=4096}
      api_token: ${env.VLLM_API_TOKEN:=fake}
      tls_verify: ${env.VLLM_TLS_VERIFY:=false}
  tool_runtime:
  - provider_id: model-context-protocol
    provider_type: remote::model-context-protocol
    config: {}
    module: null
  telemetry:
  - config:
      service_name: 'llama-stack'
      sinks: console,sqlite
      sqlite_db_path: /tmp/trace_store.db
    provider_id: meta-reference
    provider_type: inline::meta-reference
metadata_store:
  type: sqlite
  db_path: /tmp/registry.db
  namespace: null
inference_store:
  type: sqlite
  db_path: /tmp/inference_store.db
Lightspeed Stack config:
name: Test LSCore
service:
  host: 0.0.0.0
  port: 8080
  auth_enabled: false
  workers: 1
  color_log: true
  access_log: true
llama_stack:
  use_as_library_client: false
  # url: http://llama-stack:8321
  url: "http://host.docker.internal:8321"
user_data_collection:
  feedback_enabled: true
  feedback_storage: "/tmp/data/feedback"
  transcripts_enabled: true
  transcripts_storage: "/tmp/data/transcripts"
authentication:
  module: "noop"
mcp_servers:
- name: "server1"
  provider_id: "model-context-protocol"
  url: "http://host.docker.internal:8000/mcp/"
inference:
  default_model: vllm-inference/gemma3:27b-it-qat
  default_provider: vllm-inference
Initial call and conversation response:
$ -> ls::test
[DEBUG] POST http://localhost:8080/v1/query
{
  "conversation_id": "13120c75-fd5f-4182-bf94-49a65315bcf1",
  "response": ""
}
$ -> ls::conversation "13120c75-fd5f-4182-bf94-49a65315bcf1"
[DEBUG] GET http://localhost:8080/v1/conversations/13120c75-fd5f-4182-bf94-49a65315bcf1
{
  "conversation_id": "13120c75-fd5f-4182-bf94-49a65315bcf1",
  "chat_history": [
    {
      "messages": [
        {
          "content": "I need to inform my manager about how many orchestator workflows we have currently registered",
          "type": "user"
        },
        {
          "content": "",
          "type": "assistant"
        }
      ],
      "started_at": "2025-08-14T14:46:47.209657Z",
      "completed_at": "2025-08-14T14:46:49.752601Z"
    }
  ]
}
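For what it's worth, the two calls above can be scripted like this (a minimal sketch; the "query" field name in the POST body is an assumption on my part, the endpoints and IDs come from the debug output):

# Minimal sketch of the two Lightspeed calls above.
# Assumption: /v1/query accepts a JSON body with a "query" field
# (plus an optional "conversation_id"); endpoints match the debug output.
import requests

LIGHTSPEED_URL = "http://localhost:8080"

resp = requests.post(
    f"{LIGHTSPEED_URL}/v1/query",
    json={"query": "I need to inform my manager about how many "
                   "orchestrator workflows we have currently registered"},
    timeout=60,
)
resp.raise_for_status()
conversation_id = resp.json()["conversation_id"]
print(resp.json())  # "response" comes back empty here

# Fetch the stored conversation and print each message
conv = requests.get(
    f"{LIGHTSPEED_URL}/v1/conversations/{conversation_id}", timeout=60
).json()
for entry in conv["chat_history"]:
    for message in entry["messages"]:
        print(message["type"], repr(message["content"]))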
Llama Stack agents and sessions
Agents:
$ -> curl -s http://localhost:8321/v1/agents | jq .
{
  "data": [
    {
      "agent_id": "be50344d-c67f-493f-9010-d7e08da09c1c",
      "agent_config": {
        "sampling_params": {
          "strategy": {
            "type": "greedy"
          },
          "max_tokens": 0,
          "repetition_penalty": 1.0,
          "stop": null
        },
        "input_shields": [],
        "output_shields": [],
        "toolgroups": [],
        "client_tools": [],
        "tool_choice": null,
        "tool_prompt_format": null,
        "tool_config": {
          "tool_choice": "auto",
          "tool_prompt_format": null,
          "system_message_behavior": "append"
        },
        "max_infer_iters": 10,
        "model": "vllm-inference/qwen2.5-coder:7b",
        "instructions": "You are a helpful assistant",
        "name": null,
        "enable_session_persistence": true,
        "response_format": null
      },
      "created_at": "2025-08-14T14:46:47.113391Z"
    }
  ],
  "has_more": false,
  "url": "/v1/agents"
}
Session:
$ -> curl -s http://localhost:8321/v1/agents/be50344d-c67f-493f-9010-d7e08da09c1c/session/13120c75-fd5f-4182-bf94-49a65315bcf1 | jq .
{
  "session_id": "13120c75-fd5f-4182-bf94-49a65315bcf1",
  "session_name": "a8187279-b6c2-4979-83b7-a1156191cb90",
  "turns": [
    {
      "turn_id": "cc01590b-4129-4ceb-9a95-137a7000e8e6",
      "session_id": "13120c75-fd5f-4182-bf94-49a65315bcf1",
      "input_messages": [
        {
          "role": "user",
          "content": "I need to inform my manager about how many orchestator workflows we have currently registered",
          "context": null
        }
      ],
      "steps": [
        {
          "turn_id": "cc01590b-4129-4ceb-9a95-137a7000e8e6",
          "step_id": "7f6697b0-24e4-4197-a386-4c7a659eaa7d",
          "started_at": "2025-08-14T14:46:47.212324Z",
          "completed_at": "2025-08-14T14:46:47.918493Z",
          "step_type": "inference",
          "model_response": {
            "role": "assistant",
            "content": "",
            "stop_reason": "end_of_turn",
            "tool_calls": [
              {
                "call_id": "call_1g7qh2ks",
                "tool_name": "get_orchestrator_instances",
                "arguments": {
                  "session_id": "your_session_id"
                },
                "arguments_json": "{\"session_id\": \"your_session_id\"}"
              }
            ]
          }
        },
        {
          "turn_id": "cc01590b-4129-4ceb-9a95-137a7000e8e6",
          "step_id": "e902b473-ecda-42d1-8e11-b384cd978b6c",
          "started_at": "2025-08-14T14:46:47.955033Z",
          "completed_at": "2025-08-14T14:46:48.037260Z",
          "step_type": "tool_execution",
          "tool_calls": [
            {
              "call_id": "call_1g7qh2ks",
              "tool_name": "get_orchestrator_instances",
              "arguments": {
                "session_id": "your_session_id"
              },
              "arguments_json": "{\"session_id\": \"your_session_id\"}"
            }
          ],
          "tool_responses": [
            {
              "call_id": "call_1g7qh2ks",
              "tool_name": "get_orchestrator_instances",
              "content": [
                {
                  "type": "text",
                  "text": "{\"total\":5,\"data\":[{\"name\":\"process_payment_refunds\",\"status\":\"RUNNING\"},{\"name\":\"sync_inventory_updates\",\"status\":\"READY\"},{\"name\":\"generate_monthly_reports\",\"status\":\"RUNNING\"},{\"name\":\"backup_customer_data\",\"status\":\"FAILED\"},{\"name\":\"send_welcome_emails\",\"status\":\"COMPLETED\"}]}"
                }
              ],
              "metadata": null
            }
          ]
        },
        {
          "turn_id": "cc01590b-4129-4ceb-9a95-137a7000e8e6",
          "step_id": "3a069c6c-0e02-4a47-b90e-d65296923663",
          "started_at": "2025-08-14T14:46:48.049073Z",
          "completed_at": "2025-08-14T14:46:49.741719Z",
          "step_type": "inference",
          "model_response": {
            "role": "assistant",
            "content": "",
            "stop_reason": "end_of_turn",
            "tool_calls": []
          }
        }
      ],
      "output_message": {
        "role": "assistant",
        "content": "",
        "stop_reason": "end_of_turn",
        "tool_calls": []
      },
      "output_attachments": [],
      "started_at": "2025-08-14T14:46:47.209657Z",
      "completed_at": "2025-08-14T14:46:49.752601Z"
    }
  ],
  "started_at": "2025-08-14T14:46:47.127526Z"
}
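To make the mismatch easier to see, here is a small sketch that walks the session dump above (assuming it was saved to session.json, e.g. by piping the curl command to a file) and prints the tool_execution output next to the empty inference and output_message content:

# Sketch: compare the persisted tool response with the (empty) model output.
import json

with open("session.json") as f:
    session = json.load(f)

for turn in session["turns"]:
    for step in turn["steps"]:
        if step["step_type"] == "tool_execution":
            for tool_response in step["tool_responses"]:
                # The MCP tool result is clearly stored in the session...
                print("tool response:", tool_response["content"][0]["text"][:80], "...")
        elif step["step_type"] == "inference":
            # ...but both inference steps carry an empty content string.
            print("inference content:", repr(step["model_response"]["content"]))
    print("final output_message:", repr(turn["output_message"]["content"]))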
However, using tcpdump on the Ollama side I can see that it is replying correctly; it looks like Lightspeed is simply not receiving an event with the response content.
T +0.198405 REDACTED:39176 -> REDACTED:11434 [A] #104
POST /v1/chat/completions HTTP/1.1'
Host: REDACTED:11434'
Accept-Encoding: gzip, deflate'
Connection: keep-alive'
Accept: application/json'
Content-Type: application/json'
User-Agent: AsyncOpenAI/Python 1.98.0'
X-Stainless-Lang: python'
X-Stainless-Package-Version: 1.98.0'
X-Stainless-OS: Linux'
X-Stainless-Arch: x64'
X-Stainless-Runtime: CPython'
X-Stainless-Runtime-Version: 3.12.11'
Authorization: Bearer test'
X-Stainless-Async: async:asyncio'
x-stainless-retry-count: 0'
x-stainless-read-timeout: 600'
Content-Length: 1116'
'
{"messages":[{"role":"system","content":[{"type":"text","text":"You are a helpful assistant"}]},{"role":"user","content":[{"type":"text","text":"I need to inform my manager about how many orchestator workflows we have currently registered"}]},{"role":"assistant","content":[{"type":"text","text":""}],"tool_calls":[{"id":"call_1g7qh2ks","type":"function","function":{"name":"get_orchestrator_instances","arguments":"{\"session_id\": \"your_session_id\"}"}}]},{"role":"tool","content":[{"type":"text","text":"{\"total\":5,\"data\":[{\"name\":\"process_payment_refunds\",\"status\":\"RUNNING\"},{\"name\":\"sync_inventory_updates\",\"status\":\"READY\"},{\"name\":\"generate_monthly_reports\",\"status\":\"RUNNING\"},{\"name\":\"backup_customer_data\",\"status\":\"FAILED\"},{\"name\":\"send_welcome_emails\",\"status\":\"COMPLETED\"}]}"}]}],"model":"qwen2.5-coder:7b","max_tokens":4096,"stream":true,"temperature"
##
T +0.000459 REDACTED:39176 -> REDACTED:11434 [AP] #106
:0.0,"tools":[{"type":"function","function":{"name":"get_orchestrator_instances","description":"","parameters":{"type":"object","properties":{"session_id":{"type":"string"}},"required":["session_id"]}}}]}
##
T +1.601402 REDACTED:11434 -> REDACTED:39176 [AP] #108
HTTP/1.1 200 OK'
Content-Type: text/event-stream'
Date: Thu, 14 Aug 2025 14:46:49 GMT'
Transfer-Encoding: chunked'
'
23d'
data: {"id":"chatcmpl-630","object":"chat.completion.chunk","created":1755182809,"model":"qwen2.5-coder:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"There are currently 5 orchestator workflows registered. Here is the status of each:\n\n1. **process_payment_refunds** - Status: RUNNING\n2. **sync_inventory_updates** - Status: READY\n3. **generate_monthly_reports** - Status: RUNNING\n4. **backup_customer_data** - Status: FAILED\n5. **send_welcome_emails** - Status: COMPLETED"},"finish_reason":"stop"}]}
data: [DONE]
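To double-check the model side, this is roughly how I replay the captured request directly against Ollama's OpenAI-compatible endpoint and print the streamed deltas (the base URL and token are assumptions matching the capture; the tool result string is truncated here):

# Sketch: replay the captured chat/completions request against Ollama directly.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:11434/v1", api_key="test")

# Truncated version of the MCP tool output seen in the capture
TOOL_RESULT = '{"total":5,"data":[{"name":"process_payment_refunds","status":"RUNNING"}]}'

async def main():
    stream = await client.chat.completions.create(
        model="qwen2.5-coder:7b",
        stream=True,
        temperature=0.0,
        max_tokens=4096,
        messages=[
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": "I need to inform my manager about how many "
                                        "orchestrator workflows we have currently registered"},
            {"role": "assistant", "content": "", "tool_calls": [{
                "id": "call_1g7qh2ks",
                "type": "function",
                "function": {"name": "get_orchestrator_instances",
                             "arguments": '{"session_id": "your_session_id"}'},
            }]},
            {"role": "tool", "tool_call_id": "call_1g7qh2ks", "content": TOOL_RESULT},
        ],
    )
    async for chunk in stream:
        # Print the content deltas as they arrive from the event stream
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)

asyncio.run(main())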
- I'm running the latest image versions from Docker/Quay.
- Nothing relevant shows up in the logs on either side.
- FastMCP is working and I can see the tool response correctly:
[{"id":"call_1g7qh2ks","type":"function","function":{"name":"get_orchestrator_instances","arguments":"{\"session_id\": \"your_session_id\"}"}}]},{"role":"tool","content":[{"type":"text","text":"{\"total\":5,\"data\":[{\"name\":\"process_payment_refunds\",\"status\":\"RUNNING\"},{\"name\":\"sync_inventory_updates\",\"status\":\"READY\"},{\"name\":\"generate_monthly_reports\",\"status\":\"RUNNING\"},{\"name\":\"backup_customer_data\",\"status\":\"FAILED\"},{\"name\":\"send_welcome_emails\",\"status\":\"COMPLETED\"}]}"}]}]
- I checked the communication from Llama Stack to the Lightspeed service; there is no response event at all.
- Is this a bug? Is this the intended way of using MCP tools, or should I define the MCP tools on Llama Stack instead (see the sketch below)?
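In case registering the MCP server on the Llama Stack side is the expected setup, this is my rough understanding of what that would look like with llama-stack-client (the toolgroup_id and the exact kwargs are assumptions on my part, not something I've verified against the docs):

# Hedged sketch: register the MCP server as a toolgroup directly on Llama Stack.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

client.toolgroups.register(
    toolgroup_id="mcp::server1",  # hypothetical name
    provider_id="model-context-protocol",
    mcp_endpoint={"uri": "http://host.docker.internal:8000/mcp/"},
)

# List registered toolgroups to confirm the MCP tools show up
print([tg.identifier for tg in client.toolgroups.list()])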
Regards