I tried to load an auto-gptq model with the latest LocalAI v2.10.0 Docker image, which I rebuilt from a custom Dockerfile. I have downloaded the model internlm/internlm-xcomposer2-vl-7b-4bit from Hugging Face and created a model config file for LocalAI as below:
```yaml
- name: gpt-4-vision-preview
  # Default model parameters.
  # These options can also be specified in the API calls
  parameters:
    model: internlm-xcomposer2-vl-7b-4bit/
    temperature: 0.2
    top_k: 85
    top_p: 0.7
  # Default context size
  context_size: 4096
  # Default number of threads
  threads: 16
  backend: autogptq
  trust_remote_code: true
  # define chat roles
  roles:
    user: "user:"
    assistant: "assistant:"
  template:
    chat: &template |
      Instruct: {{.Input}}
      Output:
    completion: *template
  # Enable F16 if backend supports it
  f16: true
  embeddings: false
  # Enable debugging
  debug: true
  # GPU Layers (only used when built with cublas)
  gpu_layers: -1
  # Diffusers/transformers
  cuda: true
```
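If I read the config right, the `model:` value is resolved against MODELS_PATH inside the container (the logs below show it resolving to /opt/models/internlm-xcomposer2-vl-7b-4bit). A quick sanity check that the weights are actually visible there, with the container name as a placeholder:

```bash
# Hypothetical sanity check; /opt/models is the container-side mount of $PWD/models.
docker exec -it <container-name> ls /opt/models/internlm-xcomposer2-vl-7b-4bit
# Expect config.json, tokenizer files, and the quantized *.safetensors here.
```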
Then I started the container with:

```bash
docker run --gpus all -p 8080:8080 \
  -v $PWD/models:/opt/models \
  -e DEBUG=true \
  -e MODELS_PATH=/opt/models \
  -e CLIP_VISION_MODEL=/opt/models/clip-vit-large-patch14-336 \
  -e HF_HOME=/opt/models \
  -e TRANSFORMERS_OFFLINE=1 \
  localai:v2.10.0-autogptq-5 --config-file /opt/models/intern-vl.yml
```
The service seems to start successfully:
```
10:25AM DBG Template found, input modified to: Instruct: user:[img-0]Describe the image?
Output:
10:25AM DBG Prompt (after templating): Instruct: user:[img-0]Describe the image?
Output:
10:25AM INF Loading model 'internlm-xcomposer2-vl-7b-4bit/' with backend autogptq
10:25AM DBG Loading model in memory from file: /opt/models/internlm-xcomposer2-vl-7b-4bit
10:25AM DBG Loading Model internlm-xcomposer2-vl-7b-4bit/ with gRPC (file: /opt/models/internlm-xcomposer2-vl-7b-4bit) (backend: autogptq): {backendString:autogptq model:internlm-xcomposer2-vl-7b-4bit/ threads:16 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc00048c200 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
10:25AM DBG Loading external backend: /build/backend/python/autogptq/run.sh
10:25AM DBG Loading GRPC Process: /build/backend/python/autogptq/run.sh
10:25AM DBG GRPC Service for internlm-xcomposer2-vl-7b-4bit/ will be running at: '127.0.0.1:43603'
```
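For reference, the vision call uses the standard OpenAI-compatible chat format that LocalAI exposes; the exact request body isn't shown above, so this is a representative sketch (the image URL is a placeholder):

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-vision-preview",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe the image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}}
      ]
    }]
  }'
```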
But when I call the vision API, it always returns a 500 error with this message:

```
could not load model (no success): Unexpected err=OSError("Incorrect path_or_model_id: 'internlm-xcomposer2-vl-7b-4bit/'. Please provide either the path to a local folder or the repo_id of a model on the Hub."), type(err)=<class 'OSError'>
```
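The same OSError is easy to reproduce outside LocalAI: transformers rejects an identifier with a trailing slash because it is neither an existing local folder nor a valid Hub repo_id. A minimal sketch, assuming transformers is installed and the relative path does not exist in the current directory:

```bash
# Minimal repro sketch (not the LocalAI code path): the trailing slash makes the
# string fail repo_id validation once it is not found as a local directory.
python3 -c "from transformers import AutoConfig; AutoConfig.from_pretrained('internlm-xcomposer2-vl-7b-4bit/')"
# OSError: Incorrect path_or_model_id: 'internlm-xcomposer2-vl-7b-4bit/'.
# Please provide either the path to a local folder or the repo_id of a model on the Hub.
```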
Can anyone tell whether this is a bug in auto-gptq or LocalAI, or a mistake in my configuration?