You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I am running tests on this model, which I find excellent. However, I am encountering a few issues and would like to know whether it is possible to fix them or if I am simply asking for the impossible.
Next, here is my OpenWebUI configuration:
[Image 1] [Image 2] [Image 3]
I would like to know whether, with GLM-4.6V and OpenWebUI, it is possible to make the model choose and execute tools autonomously when it considers them relevant.
At the moment:
If it is an internet search, I have to manually activate the button, even though access is already available.
If it is Python code, I have to click “execute”; it does not run it by itself, even though it clearly has access to Jupyter, etc.
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
Uh oh!
There was an error while loading. Please reload this page.
-
Hello,
I am running tests on this model, which I find excellent. However, I am encountering a few issues and would like to know whether it is possible to fix them or if I am simply asking for the impossible.
First of all, here is my vLLM configuration:
docker run -d \ --name vllm-llm \ --gpus '"device=4,5,6,7"' \ -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \ -e VLLM_OBJECT_STORAGE_SHM_BUFFER_NAME="${SHM_NAME}" \ -v /raid/workspace/qladane/vllm/hf-cache:/root/.cache/huggingface \ --env "HF_TOKEN=${HF_TOKEN:-}" \ -p 8003:8000 \ --ipc=host \ --restart unless-stopped \ vllm-openai:glm46v \ zai-org/GLM-4.6V-FP8 \ --tensor-parallel-size 4 \ --enforce-eager \ --served-model-name ImagineAI \ --allowed-local-media-path / \ --limit-mm-per-prompt '{"image": 1, "video": 0}' \ --max-model-len 131072 \ --dtype auto \ --kv-cache-dtype fp8 \ --gpu-memory-utilization 0.85 \ --reasoning-parser glm45 \ --tool-call-parser glm45 \ --enable-auto-tool-choice \ --enable-expert-parallel \ --mm-encoder-tp-mode data \ --mm-processor-cache-type shm
Next, here is my OpenWebUI configuration:




[Image 1] [Image 2] [Image 3]
I would like to know whether, with GLM-4.6V and OpenWebUI, it is possible to make the model choose and execute tools autonomously when it considers them relevant.
At the moment:
If it is an internet search, I have to manually activate the button, even though access is already available.
If it is Python code, I have to click “execute”; it does not run it by itself, even though it clearly has access to Jupyter, etc.
If anyone has already encountered this issue.
Thank you very much in advance for your help.
Kind regards
Beta Was this translation helpful? Give feedback.
All reactions