
[BUG] MCP responses issue #402

@eloycoto

Description

When using MCP responses, something is lost between the tool call and the conversation endpoint: the tool executes correctly, but the assistant's final answer comes back empty.

Llamastack config:

version: '2'
image_name: ollama-llama-stack-config
apis:
  - agents
  - inference
  - safety
  - telemetry
  - tool_runtime
  - vector_io
logging:
  level: DEBUG
  category_levels:
    llama_stack: DEBUG  # Enable DEBUG for all llama_stack modules
    llama_stack.providers.remote.inference.vllm: DEBUG
    llama_stack.providers.inline.agents.meta_reference: DEBUG
    llama_stack.providers.inline.agents.meta_reference.agent_instance: DEBUG
    llama_stack.providers.inline.vector_io.faiss: DEBUG
    llama_stack.providers.inline.telemetry.meta_reference: DEBUG
    llama_stack.core: DEBUG
    llama_stack.apis: DEBUG
    uvicorn: DEBUG
    uvicorn.access: INFO
    fastapi: DEBUG

providers:
  vector_io:
    - config:
        kvstore:
          db_path: /tmp/faiss_store.db
          type: sqlite
      provider_id: faiss
      provider_type: inline::faiss

  agents:
    - config:
        persistence_store:
          db_path: /tmp/agents_store.db
          namespace: null
          type: sqlite
        responses_store:
          db_path: /tmp/responses_store.db
          type: sqlite
      provider_id: meta-reference
      provider_type: inline::meta-reference

  inference:
    - provider_id: vllm-inference
      provider_type: remote::vllm
      config:
        url: ${env.VLLM_URL:=http://localhost:8000/v1}
        max_tokens: ${env.VLLM_MAX_TOKENS:=4096}
        api_token: ${env.VLLM_API_TOKEN:=fake}
        tls_verify: ${env.VLLM_TLS_VERIFY:=false}

  tool_runtime:
    - provider_id: model-context-protocol
      provider_type: remote::model-context-protocol
      config: {}
      module: null

  telemetry:
    - config:
        service_name: 'llama-stack'
        sinks: console,sqlite
        sqlite_db_path: /tmp/trace_store.db
      provider_id: meta-reference
      provider_type: inline::meta-reference

metadata_store:
  type: sqlite
  db_path: /tmp/registry.db
  namespace: null

inference_store:
  type: sqlite
  db_path: /tmp/inference_store.db

lightspeed-stack config:

name: Test LSCore
service:
  host: 0.0.0.0
  port: 8080
  auth_enabled: false
  workers: 1
  color_log: true
  access_log: true
llama_stack:
  use_as_library_client: false
  # url: http://llama-stack:8321
  url: "http://host.docker.internal:8321"
user_data_collection:
  feedback_enabled: true
  feedback_storage: "/tmp/data/feedback"
  transcripts_enabled: true
  transcripts_storage: "/tmp/data/transcripts"
authentication:
  module: "noop"

mcp_servers:
  - name: "server1"
    provider_id: "model-context-protocol"
    url: "http://host.docker.internal:8000/mcp/"
inference:
  default_model: vllm-inference/gemma3:27b-it-qat
  default_provider: vllm-inference
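
A quick way to confirm that the MCP server from this config is actually visible on the Llama Stack side is to list the registered toolgroups. A minimal sketch, assuming the llama-stack-client Python SDK:

from llama_stack_client import LlamaStackClient

# Point the client at the Llama Stack server used above.
client = LlamaStackClient(base_url="http://localhost:8321")

# List every registered toolgroup; if lightspeed registered "server1",
# it should show up here with the model-context-protocol provider.
for tg in client.toolgroups.list():
    print(tg.identifier, tg.provider_id)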

Initial call and conversation response:

$ -> ls::test
[DEBUG] POST http://localhost:8080/v1/query
{
  "conversation_id": "13120c75-fd5f-4182-bf94-49a65315bcf1",
  "response": ""
}
$ -> ls::conversation "13120c75-fd5f-4182-bf94-49a65315bcf1"
[DEBUG] GET http://localhost:8080/v1/conversations/13120c75-fd5f-4182-bf94-49a65315bcf1
{
  "conversation_id": "13120c75-fd5f-4182-bf94-49a65315bcf1",
  "chat_history": [
    {
      "messages": [
        {
          "content": "I need to inform my manager about how many orchestator workflows we have currently registered",
          "type": "user"
        },
        {
          "content": "",
          "type": "assistant"
        }
      ],
      "started_at": "2025-08-14T14:46:47.209657Z",
      "completed_at": "2025-08-14T14:46:49.752601Z"
    }
  ]
}
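
For reference, a minimal Python equivalent of the two calls above (assuming the lightspeed /v1/query endpoint accepts a JSON body with a "query" field; the ls::test helper is just a wrapper around these requests):

import requests

BASE = "http://localhost:8080"

# Ask the question; the "query" field name is an assumption about the
# /v1/query payload. The typo in the question is verbatim from the run.
r = requests.post(f"{BASE}/v1/query", json={
    "query": "I need to inform my manager about how many orchestator "
             "workflows we have currently registered",
})
conv_id = r.json()["conversation_id"]
print(r.json())  # observed: "response" is empty here

# Fetch the stored conversation; the assistant message content is also
# empty, even though the tool call succeeded (see the session dump below).
print(requests.get(f"{BASE}/v1/conversations/{conv_id}").json())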

Llamastack agents/sessions

Agents:


$ -> curl -s http://localhost:8321/v1/agents | jq .
{
  "data": [
    {
      "agent_id": "be50344d-c67f-493f-9010-d7e08da09c1c",
      "agent_config": {
        "sampling_params": {
          "strategy": {
            "type": "greedy"
          },
          "max_tokens": 0,
          "repetition_penalty": 1.0,
          "stop": null
        },
        "input_shields": [],
        "output_shields": [],
        "toolgroups": [],
        "client_tools": [],
        "tool_choice": null,
        "tool_prompt_format": null,
        "tool_config": {
          "tool_choice": "auto",
          "tool_prompt_format": null,
          "system_message_behavior": "append"
        },
        "max_infer_iters": 10,
        "model": "vllm-inference/qwen2.5-coder:7b",
        "instructions": "You are a helpful assistant",
        "name": null,
        "enable_session_persistence": true,
        "response_format": null
      },
      "created_at": "2025-08-14T14:46:47.113391Z"
    }
  ],
  "has_more": false,
  "url": "/v1/agents"
}

Session:

$ -> curl -s http://localhost:8321/v1/agents/be50344d-c67f-493f-9010-d7e08da09c1c/session/13120c75-fd5f-4182-bf94-49a65315bcf1 | jq .
{
  "session_id": "13120c75-fd5f-4182-bf94-49a65315bcf1",
  "session_name": "a8187279-b6c2-4979-83b7-a1156191cb90",
  "turns": [
    {
      "turn_id": "cc01590b-4129-4ceb-9a95-137a7000e8e6",
      "session_id": "13120c75-fd5f-4182-bf94-49a65315bcf1",
      "input_messages": [
        {
          "role": "user",
          "content": "I need to inform my manager about how many orchestator workflows we have currently registered",
          "context": null
        }
      ],
      "steps": [
        {
          "turn_id": "cc01590b-4129-4ceb-9a95-137a7000e8e6",
          "step_id": "7f6697b0-24e4-4197-a386-4c7a659eaa7d",
          "started_at": "2025-08-14T14:46:47.212324Z",
          "completed_at": "2025-08-14T14:46:47.918493Z",
          "step_type": "inference",
          "model_response": {
            "role": "assistant",
            "content": "",
            "stop_reason": "end_of_turn",
            "tool_calls": [
              {
                "call_id": "call_1g7qh2ks",
                "tool_name": "get_orchestrator_instances",
                "arguments": {
                  "session_id": "your_session_id"
                },
                "arguments_json": "{\"session_id\": \"your_session_id\"}"
              }
            ]
          }
        },
        {
          "turn_id": "cc01590b-4129-4ceb-9a95-137a7000e8e6",
          "step_id": "e902b473-ecda-42d1-8e11-b384cd978b6c",
          "started_at": "2025-08-14T14:46:47.955033Z",
          "completed_at": "2025-08-14T14:46:48.037260Z",
          "step_type": "tool_execution",
          "tool_calls": [
            {
              "call_id": "call_1g7qh2ks",
              "tool_name": "get_orchestrator_instances",
              "arguments": {
                "session_id": "your_session_id"
              },
              "arguments_json": "{\"session_id\": \"your_session_id\"}"
            }
          ],
          "tool_responses": [
            {
              "call_id": "call_1g7qh2ks",
              "tool_name": "get_orchestrator_instances",
              "content": [
                {
                  "type": "text",
                  "text": "{\"total\":5,\"data\":[{\"name\":\"process_payment_refunds\",\"status\":\"RUNNING\"},{\"name\":\"sync_inventory_updates\",\"status\":\"READY\"},{\"name\":\"generate_monthly_reports\",\"status\":\"RUNNING\"},{\"name\":\"backup_customer_data\",\"status\":\"FAILED\"},{\"name\":\"send_welcome_emails\",\"status\":\"COMPLETED\"}]}"
                }
              ],
              "metadata": null
            }
          ]
        },
        {
          "turn_id": "cc01590b-4129-4ceb-9a95-137a7000e8e6",
          "step_id": "3a069c6c-0e02-4a47-b90e-d65296923663",
          "started_at": "2025-08-14T14:46:48.049073Z",
          "completed_at": "2025-08-14T14:46:49.741719Z",
          "step_type": "inference",
          "model_response": {
            "role": "assistant",
            "content": "",
            "stop_reason": "end_of_turn",
            "tool_calls": []
          }
        }
      ],
      "output_message": {
        "role": "assistant",
        "content": "",
        "stop_reason": "end_of_turn",
        "tool_calls": []
      },
      "output_attachments": [],
      "started_at": "2025-08-14T14:46:47.209657Z",
      "completed_at": "2025-08-14T14:46:49.752601Z"
    }
  ],
  "started_at": "2025-08-14T14:46:47.127526Z"
}
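
In the session dump above, the tool_execution step carries the full tool output, yet the final inference step's model_response.content is already empty, even though (per the tcpdump below) Ollama did stream the text back. A quick way to see what the persistence stores actually recorded, using plain stdlib sqlite3 (the table layout is whatever the store created, so this just dumps a few rows per table):

import sqlite3

# Paths come from the Llama Stack config above.
for db in ("/tmp/agents_store.db", "/tmp/responses_store.db"):
    con = sqlite3.connect(db)
    print("==", db)
    tables = con.execute(
        "SELECT name FROM sqlite_master WHERE type='table'"
    ).fetchall()
    for (name,) in tables:
        print("--", name)
        for row in con.execute(f"SELECT * FROM {name} LIMIT 3"):
            print("  ", row)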

But using tcpdump on the Ollama side I can see that it replies correctly; it looks like lightspeed is not getting an event with the response.

T +0.198405 REDACTED:39176 -> REDACTED:11434 [A] #104
POST /v1/chat/completions HTTP/1.1'
Host: REDACTED:11434'
Accept-Encoding: gzip, deflate'
Connection: keep-alive'
Accept: application/json'
Content-Type: application/json'
User-Agent: AsyncOpenAI/Python 1.98.0'
X-Stainless-Lang: python'
X-Stainless-Package-Version: 1.98.0'
X-Stainless-OS: Linux'
X-Stainless-Arch: x64'
X-Stainless-Runtime: CPython'
X-Stainless-Runtime-Version: 3.12.11'
Authorization: Bearer test'
X-Stainless-Async: async:asyncio'
x-stainless-retry-count: 0'
x-stainless-read-timeout: 600'
Content-Length: 1116'
'
{"messages":[{"role":"system","content":[{"type":"text","text":"You are a helpful assistant"}]},{"role":"user","content":[{"type":"text","text":"I need to inform my manager about how many orchestator workflows we have currently registered"}]},{"role":"assistant","content":[{"type":"text","text":""}],"tool_calls":[{"id":"call_1g7qh2ks","type":"function","function":{"name":"get_orchestrator_instances","arguments":"{\"session_id\": \"your_session_id\"}"}}]},{"role":"tool","content":[{"type":"text","text":"{\"total\":5,\"data\":[{\"name\":\"process_payment_refunds\",\"status\":\"RUNNING\"},{\"name\":\"sync_inventory_updates\",\"status\":\"READY\"},{\"name\":\"generate_monthly_reports\",\"status\":\"RUNNING\"},{\"name\":\"backup_customer_data\",\"status\":\"FAILED\"},{\"name\":\"send_welcome_emails\",\"status\":\"COMPLETED\"}]}"}]}],"model":"qwen2.5-coder:7b","max_tokens":4096,"stream":true,"temperature"
##
T +0.000459 REDACTED:39176 -> REDACTED:11434 [AP] #106
:0.0,"tools":[{"type":"function","function":{"name":"get_orchestrator_instances","description":"","parameters":{"type":"object","properties":{"session_id":{"type":"string"}},"required":["session_id"]}}}]}
##
T +1.601402 REDACTED:11434 -> REDACTED:39176 [AP] #108
HTTP/1.1 200 OK'
Content-Type: text/event-stream'
Date: Thu, 14 Aug 2025 14:46:49 GMT'
Transfer-Encoding: chunked'
'
23d'
data: {"id":"chatcmpl-630","object":"chat.completion.chunk","created":1755182809,"model":"qwen2.5-coder:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"There are currently 5 orchestator workflows registered. Here is the status of each:\n\n1. **process_payment_refunds** - Status: RUNNING\n2. **sync_inventory_updates** - Status: READY\n3. **generate_monthly_reports** - Status: RUNNING\n4. **backup_customer_data** - Status: FAILED\n5. **send_welcome_emails** - Status: COMPLETED"},"finish_reason":"stop"}]}

data: [DONE]

Additional notes:

  • Latest versions from docker/quay.
  • Nothing relevant in the logs on either side.
  • FastMCP is working and I can see the tool response correctly.
  • I checked the communication from llamastack to the lightspeed service; there is no response event at all.
  • Is this a bug? Is this the intended usage of MCP tools? Should I define the MCP tools on llamastack instead? (See the sketch after this list.)
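
In case the tools are supposed to live on the llamastack side, here is a minimal sketch of registering the FastMCP server as a toolgroup there directly, assuming the llama-stack-client Python SDK (the "mcp::server1" identifier is hypothetical, and the exact register signature may differ across SDK versions):

from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Register the FastMCP server from the lightspeed config as a toolgroup
# on Llama Stack itself, via the model-context-protocol provider.
client.toolgroups.register(
    toolgroup_id="mcp::server1",
    provider_id="model-context-protocol",
    mcp_endpoint={"uri": "http://host.docker.internal:8000/mcp/"},
)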

Regards
