auto tools #114

qBrabus · 2025-12-16T14:47:58Z

qBrabus
Dec 16, 2025

Hello,

I am running tests on this model, which I find excellent. However, I am encountering a few issues and would like to know whether it is possible to fix them or if I am simply asking for the impossible.

First of all, here is my vLLM configuration:

docker run -d \ --name vllm-llm \ --gpus '"device=4,5,6,7"' \ -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \ -e VLLM_OBJECT_STORAGE_SHM_BUFFER_NAME="${SHM_NAME}" \ -v /raid/workspace/qladane/vllm/hf-cache:/root/.cache/huggingface \ --env "HF_TOKEN=${HF_TOKEN:-}" \ -p 8003:8000 \ --ipc=host \ --restart unless-stopped \ vllm-openai:glm46v \ zai-org/GLM-4.6V-FP8 \ --tensor-parallel-size 4 \ --enforce-eager \ --served-model-name ImagineAI \ --allowed-local-media-path / \ --limit-mm-per-prompt '{"image": 1, "video": 0}' \ --max-model-len 131072 \ --dtype auto \ --kv-cache-dtype fp8 \ --gpu-memory-utilization 0.85 \ --reasoning-parser glm45 \ --tool-call-parser glm45 \ --enable-auto-tool-choice \ --enable-expert-parallel \ --mm-encoder-tp-mode data \ --mm-processor-cache-type shm

Next, here is my OpenWebUI configuration:
[Image 1] [Image 2] [Image 3]

I would like to know whether, with GLM-4.6V and OpenWebUI, it is possible to make the model choose and execute tools autonomously when it considers them relevant.

At the moment:

If it is an internet search, I have to manually activate the button, even though access is already available.

If it is Python code, I have to click “execute”; it does not run it by itself, even though it clearly has access to Jupyter, etc.

If anyone has already encountered this issue.

Thank you very much in advance for your help.

Kind regards

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

auto tools #114

Uh oh!

{{title}}

Uh oh!

Replies: 0 comments

Select a reply

Uh oh!

auto tools #114

Uh oh!

qBrabus Dec 16, 2025

Replies: 0 comments

qBrabus
Dec 16, 2025