bug: chat-ui (Docker) crashes after reasoning-enabled QwQ model inference (ECONNREFUSED 0.0.0.0:443) #1777

Open · dosomethingbyme opened this issue Mar 27, 2025 · 0 comments
Labels: bug (Something isn't working)


dosomethingbyme commented Mar 27, 2025

Bug description

When using the QwQ model via Ollama in Chat UI with reasoning enabled (<think>...</think>), the chat-ui Docker container crashes after completing a single inference. The logs report a fetch failed: connect ECONNREFUSED 0.0.0.0:443 error, followed by a fatal Generation failed exception.

The model successfully returns the response, but then the container exits unexpectedly. This behavior is consistent and reproducible.

This appears to stem from post-inference logic rather than the inference itself: the non-fatal ECONNREFUSED 0.0.0.0:443 comes from the PlaywrightBlocker initialization, and the fatal Generation failed is thrown from generateSummaryOfReasoning (see the logs below). 0.0.0.0 is not a valid destination address, which suggests some hostname is resolving to 0.0.0.0 inside the container.
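
Some context on why a failing background step can take the whole container down: since Node 15, an unhandled promise rejection terminates the process by default. The sketch below is my assumption about the failure mode, not chat-ui's actual code; it reproduces the same "respond, then die" pattern (run with npx tsx crash-mode.ts):

// crash-mode.ts: minimal sketch of the suspected failure mode (an assumption,
// not chat-ui's actual code). A post-inference task that rejects after the
// response has been delivered crashes the whole Node process.
async function postInferenceStep(): Promise<never> {
  // Stands in for the summary/moderation step; an unreachable endpoint throws.
  throw new Error("Generation failed");
}

function handleRequest(): void {
  console.log("model response streamed to the client"); // inference succeeded

  // Fire-and-forget: the rejection is never caught, Node (>= 15) exits with a
  // non-zero code, and Docker reports the container as crashed.
  void postInferenceStep();
}

handleRequest();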

Steps to reproduce

  1. Configure .env.local with the QwQ model and reasoning enabled:

    MODELS=[{
      "name": "QwQ 32B",
      "chatPromptTemplate": "{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n{{/ifUser}}{{#ifAssistant}}<|im_start|>assistant\n{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}<|im_start|>assistant\n",
      "parameters": {
        "temperature": 0.7,
        "top_p": 0.9,
        "repetition_penalty": 1.1,
        "top_k": 50,
        "truncate": 3072,
        "max_new_tokens": 10240,
        "stop": ["<|im_end|>"]
      },
      "endpoints": [{
        "type": "ollama",
        "url": "http://172.30.0.1:11434",
        "ollamaName": "qwq:32b"
      }],
      "reasoning": {
        "type": "tokens",
        "beginToken": "<think>",
        "endToken": "</think>"
      }
    }]
  2. Launch the chat-ui and mongo containers using docker-compose up.

  3. Open the web UI at http://localhost:3000 and send a reasoning-style prompt (e.g., a logical puzzle or multistep deduction).

  4. Observe that the model returns a response successfully, then the chat-ui container logs an error and exits. (An optional endpoint sanity check follows these steps.)
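
For completeness: the Ollama endpoint itself is healthy, since the model reply streams back before the crash. An optional independent check against Ollama's model-listing route (URL taken from my config in step 1; run with npx tsx check-ollama.ts):

// check-ollama.ts: optional sanity check that the Ollama endpoint from my
// config is reachable. /api/tags is Ollama's standard model-listing route.
const res = await fetch("http://172.30.0.1:11434/api/tags");
const body = (await res.json()) as { models?: { name: string }[] };
console.log(res.status, body.models?.map((m) => m.name));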

Screenshots

(two screenshots attached)

Context

Logs

(base) renmin@renmin-ThinkStation-P920:~/dockercomposes/chat-ui$ docker logs -f chat-ui 
{"level":30,"time":1743073937640,"pid":22,"hostname":"c88c59dc2d52","msg":"Starting server..."}
Listening on http://0.0.0.0:3000
{"level":30,"time":1743073938199,"pid":22,"hostname":"c88c59dc2d52","msg":"[MIGRATIONS] Begin check..."}
{"level":30,"time":1743073938217,"pid":22,"hostname":"c88c59dc2d52","msg":"[MIGRATIONS] \"Update search assistants\" already applied. Skipping..."}
{"level":30,"time":1743073938217,"pid":22,"hostname":"c88c59dc2d52","msg":"[MIGRATIONS] \"Update deprecated models in assistants with the default model\" should not be applied for this run. Skipping..."}
{"level":30,"time":1743073938217,"pid":22,"hostname":"c88c59dc2d52","msg":"[MIGRATIONS] \"Add empty 'tools' record in settings\" already applied. Skipping..."}
{"level":30,"time":1743073938217,"pid":22,"hostname":"c88c59dc2d52","msg":"[MIGRATIONS] \"Convert message updates to the new schema\" already applied. Skipping..."}
{"level":30,"time":1743073938217,"pid":22,"hostname":"c88c59dc2d52","msg":"[MIGRATIONS] \"Convert message files to the new schema\" already applied. Skipping..."}
{"level":30,"time":1743073938217,"pid":22,"hostname":"c88c59dc2d52","msg":"[MIGRATIONS] \"Trim message updates to reduce stored size\" already applied. Skipping..."}
{"level":30,"time":1743073938217,"pid":22,"hostname":"c88c59dc2d52","msg":"[MIGRATIONS] \"Reset tools to empty\" already applied. Skipping..."}
{"level":30,"time":1743073938217,"pid":22,"hostname":"c88c59dc2d52","msg":"[MIGRATIONS] \"Update featured to review\" already applied. Skipping..."}
{"level":30,"time":1743073938217,"pid":22,"hostname":"c88c59dc2d52","msg":"[MIGRATIONS] \"Delete conversations with no user or assistant messages or valid sessions\" should not be applied for this run. Skipping..."}
{"level":30,"time":1743073938217,"pid":22,"hostname":"c88c59dc2d52","msg":"[MIGRATIONS] All migrations applied. Releasing lock"}
{"level":50,"time":1743073960409,"pid":22,"hostname":"c88c59dc2d52","err":{"type":"TypeError","message":"fetch failed: connect ECONNREFUSED 0.0.0.0:443","stack":"TypeError: fetch failed\n    at node:internal/deps/undici/undici:13502:13\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async Promise.all (index 0)\n    at async Promise.all (index 0)\n    at async file:///app/build/server/chunks/index3-nzCYc910.js:152:63\ncaused by: Error: connect ECONNREFUSED 0.0.0.0:443\n    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1611:16)"},"msg":"Failed to initialize PlaywrightBlocker from prebuilt lists"}
file:///app/build/server/chunks/index3-nzCYc910.js:1053
  throw new Error("Generation failed");
        ^

Error: Generation failed
    at generateFromDefaultEndpoint (file:///app/build/server/chunks/index3-nzCYc910.js:1053:9)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async getReturnFromGenerator (file:///app/build/server/chunks/index3-nzCYc910.js:1058:14)
    at async generateSummaryOfReasoning (file:///app/build/server/chunks/_server.ts-BrpIp4z7.js:275:15)
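
The fatal throw originates in generateSummaryOfReasoning, i.e. the reasoning-summary step, not the user-facing generation. A defensive fix could treat that summary as best-effort. Minimal sketch below (the function name is taken from the stack trace above; its body is stubbed here, so this is an illustration, not a patch against the real chat-ui sources):

// guard-summary.ts: hedged sketch of a possible fix, not actual chat-ui code.
// generateSummaryOfReasoning is the name from the stack trace; here it is
// stubbed to fail exactly the way the logs show.
async function generateSummaryOfReasoning(_reasoning: string): Promise<string> {
  throw new Error("Generation failed"); // stands in for the failing endpoint call
}

async function postProcess(reasoning: string): Promise<void> {
  let summary: string | undefined;
  try {
    summary = await generateSummaryOfReasoning(reasoning);
  } catch (err) {
    // Degrade gracefully: log and keep serving instead of letting the
    // rejection escape and terminate the Node process.
    console.error("reasoning summary failed; continuing without it:", err);
  }
  console.log("summary:", summary ?? "(none)");
}

await postProcess("<think>...</think>");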

Specs

  • OS:
  • Browser:
  • chat-ui commit:

Config

.env.local

MONGODB_URL=mongodb://mongo-chatui:27017
HF_TOKEN=hf_VXxxxxxxxxx
PUBLIC_ORIGIN=http://chat.aicxxx.xxx

ALLOW_INSECURE_COOKIES=true


MODELS=[{"name":"Qwen 2.5 7B","chatPromptTemplate":"{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n{{/ifUser}}{{#ifAssistant}}<|im_start|>assistant\n{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}<|im_start|>assistant\n","parameters":{"temperature":0.7,"top_p":0.9,"repetition_penalty":1.1,"top_k":50,"truncate":3072,"max_new_tokens":10240,"stop":["<|im_end|>"]},"endpoints":[{"type":"ollama","url":"http://172.30.0.1:11434","ollamaName":"qwen2.5:7b"}]},{"name":"Qwen 2.5 14B","chatPromptTemplate":"{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n{{/ifUser}}{{#ifAssistant}}<|im_start|>assistant\n{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}<|im_start|>assistant\n","parameters":{"temperature":0.7,"top_p":0.9,"repetition_penalty":1.1,"top_k":50,"truncate":3072,"max_new_tokens":10240,"stop":["<|im_end|>"]},"endpoints":[{"type":"ollama","url":"http://172.30.0.1:11434","ollamaName":"qwen2.5:14b"}]},{"name":"Qwen 2.5 32B","chatPromptTemplate":"{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n{{/ifUser}}{{#ifAssistant}}<|im_start|>assistant\n{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}<|im_start|>assistant\n","parameters":{"temperature":0.7,"top_p":0.9,"repetition_penalty":1.1,"top_k":50,"truncate":3072,"max_new_tokens":10240,"stop":["<|im_end|>"]},"endpoints":[{"type":"ollama","url":"http://172.30.0.1:11434","ollamaName":"qwen2.5:32b"}]},{"name":"QwQ 32B","chatPromptTemplate":"{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n{{/ifUser}}{{#ifAssistant}}<|im_start|>assistant\n{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}<|im_start|>assistant\n","parameters":{"temperature":0.7,"top_p":0.9,"repetition_penalty":1.1,"top_k":50,"truncate":3072,"max_new_tokens":10240,"stop":["<|im_end|>"]},"endpoints":[{"type":"ollama","url":"http://172.30.0.1:11434","ollamaName":"qwq:32b"}],"reasoning":{"type":"tokens","beginToken":"<think>","endToken":"</think>"}}]

# MODELS=[{"name":"Qwen 2.5 7B","chatPromptTemplate":"{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n{{/ifUser}}{{#ifAssistant}}<|im_start|>assistant\n{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}<|im_start|>assistant\n","parameters":{"temperature":0.7,"top_p":0.9,"repetition_penalty":1.1,"top_k":50,"truncate":3072,"max_new_tokens":10240,"stop":["<|im_end|>"]},"endpoints":[{"type":"ollama","url":"http://172.30.0.1:11434","ollamaName":"qwen2.5:7b"}]},{"name":"Qwen 2.5 14B","chatPromptTemplate":"{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n{{/ifUser}}{{#ifAssistant}}<|im_start|>assistant\n{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}<|im_start|>assistant\n","parameters":{"temperature":0.7,"top_p":0.9,"repetition_penalty":1.1,"top_k":50,"truncate":3072,"max_new_tokens":10240,"stop":["<|im_end|>"]},"endpoints":[{"type":"ollama","url":"http://172.30.0.1:11434","ollamaName":"qwen2.5:14b"}]},{"name":"Qwen 2.5 32B","chatPromptTemplate":"{{#each messages}}{{#ifUser}}<|im_start|>user\n{{content}}<|im_end|>\n{{/ifUser}}{{#ifAssistant}}<|im_start|>assistant\n{{content}}<|im_end|>\n{{/ifAssistant}}{{/each}}<|im_start|>assistant\n","parameters":{"temperature":0.7,"top_p":0.9,"repetition_penalty":1.1,"top_k":50,"truncate":3072,"max_new_tokens":10240,"stop":["<|im_end|>"]},"endpoints":[{"type":"ollama","url":"http://172.30.0.1:11434","ollamaName":"qwen2.5:32b"}]}]

docker-compose.yml (relevant parts)

services:
  mongo:
    image: mongo:latest
    container_name: mongo-chatui
    ports:
      - "27017:27017"
    volumes:
      - ./mongo-data:/data/db
    networks:
      chat-net:
        ipv4_address: 172.30.0.2

  chat-ui:
    image: ghcr.io/huggingface/chat-ui
    container_name: chat-ui
    ports:
      - "3000:3000"
    env_file:
      - .env.local
    environment:
      HF_ENDPOINT: https://hf-mirror.com
    depends_on:
      - mongo
    networks:
      chat-net:
        ipv4_address: 172.30.0.3

networks:
  chat-net:
    driver: bridge
    ipam:
      config:
        - subnet: 172.30.0.0/16
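
Since the failing connections all target port 443, it may be worth testing whether the chat-ui container has any outbound HTTPS at all. A small script to run inside the container (the hostname is just the HF_ENDPOINT mirror from my compose file; any HTTPS host would do):

// net-check.ts: run inside the chat-ui container (e.g. via docker exec) to
// test outbound HTTPS. If this also fails with ECONNREFUSED 0.0.0.0:443, the
// root cause is container networking/DNS (a hostname resolving to 0.0.0.0),
// not chat-ui's post-processing logic itself.
try {
  const res = await fetch("https://hf-mirror.com", { method: "HEAD" });
  console.log("outbound HTTPS ok:", res.status);
} catch (err) {
  console.error("outbound HTTPS failed:", err);
}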

Notes

  • The issue only occurs when using models with "reasoning" enabled (<think>...</think>).
  • The model (QwQ 32B via Ollama) returns a valid response before the crash happens.
  • Two distinct failures show up in the logs: a non-fatal ECONNREFUSED 0.0.0.0:443 while initializing PlaywrightBlocker from prebuilt lists, and the fatal "Generation failed" thrown from generateSummaryOfReasoning via generateFromDefaultEndpoint.
  • Both look like post-inference steps that depend on external resources the container cannot reach; a possible workaround sketch follows this list.
  • The same environment works fine when reasoning is disabled or when using simpler models (i.e., without the summary/think step).
  • The behavior is consistent and reproducible across restarts and clean deployments.
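
A workaround that follows directly from the notes above: removing the "reasoning" block from the QwQ model entry avoids the crash, at the cost of losing the <think> handling. As an additional untested idea (I am assuming these flags exist in the chat-ui build in use), the post-inference features that need outbound network access could be disabled in .env.local:

LLM_SUMMARIZATION=false
PLAYWRIGHT_ADBLOCKER=false

Neither flag is confirmed to gate generateSummaryOfReasoning itself, so this may only silence the non-fatal 443 error.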