I tried to load an auto-gptq model with the latest LocalAI v2.10.0 Docker image, which I rebuilt from a custom Dockerfile. I have downloaded the model internlm/internlm-xcomposer2-vl-7b-4bit from Hugging Face and created a model config file for LocalAI as below:
```yaml
- name: gpt-4-vision-preview
  # Default model parameters.
  # These options can also be specified in the API calls
  parameters:
    model: internlm-xcomposer2-vl-7b-4bit/
    temperature: 0.2
    top_k: 85
    top_p: 0.7
  # Default context size
  context_size: 4096
  # Default number of threads
  threads: 16
  backend: autogptq
  trust_remote_code: true
  # define chat roles
  roles:
    user: "user:"
    assistant: "assistant:"
  template:
    chat: &template |
      Instruct: {{.Input}}
      Output:
    completion: *template
  # Enable F16 if backend supports it
  f16: true
  embeddings: false
  # Enable debugging
  debug: true
  # GPU Layers (only used when built with cublas)
  gpu_layers: -1
  # Diffusers/transformers
  cuda: true
```
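If I read the config right, the `model:` value is resolved against MODELS_PATH inside the container (the logs below show it resolving to /opt/models/internlm-xcomposer2-vl-7b-4bit). A quick sanity check that the weights are actually visible there, with the container name as a placeholder:

```bash
# Hypothetical sanity check; /opt/models is the container-side mount of $PWD/models.
docker exec -it <container-name> ls /opt/models/internlm-xcomposer2-vl-7b-4bit
# Expect config.json, tokenizer files, and the quantized *.safetensors here.
```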
Then I started the container with:

```bash
docker run --gpus all -p 8080:8080 \
  -v $PWD/models:/opt/models \
  -e DEBUG=true \
  -e MODELS_PATH=/opt/models \
  -e CLIP_VISION_MODEL=/opt/models/clip-vit-large-patch14-336 \
  -e HF_HOME=/opt/models \
  -e TRANSFORMERS_OFFLINE=1 \
  localai:v2.10.0-autogptq-5 --config-file /opt/models/intern-vl.yml
```
The service seems to start successfully:
```
10:25AM DBG Template found, input modified to: Instruct: user:[img-0]Describe the image?
Output:
10:25AM DBG Prompt (after templating): Instruct: user:[img-0]Describe the image?
Output:
10:25AM INF Loading model 'internlm-xcomposer2-vl-7b-4bit/' with backend autogptq
10:25AM DBG Loading model in memory from file: /opt/models/internlm-xcomposer2-vl-7b-4bit
10:25AM DBG Loading Model internlm-xcomposer2-vl-7b-4bit/ with gRPC (file: /opt/models/internlm-xcomposer2-vl-7b-4bit) (backend: autogptq): {backendString:autogptq model:internlm-xcomposer2-vl-7b-4bit/ threads:16 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc00048c200 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
10:25AM DBG Loading external backend: /build/backend/python/autogptq/run.sh
10:25AM DBG Loading GRPC Process: /build/backend/python/autogptq/run.sh
10:25AM DBG GRPC Service for internlm-xcomposer2-vl-7b-4bit/ will be running at: '127.0.0.1:43603'
```
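For reference, the vision call uses the standard OpenAI-compatible chat format that LocalAI exposes; the exact request body isn't shown above, so this is a representative sketch (the image URL is a placeholder):

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-vision-preview",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe the image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}}
      ]
    }]
  }'
```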
But when I call the vision API, it always returns a 500 error with this message:

```
could not load model (no success): Unexpected err=OSError("Incorrect path_or_model_id: 'internlm-xcomposer2-vl-7b-4bit/'. Please provide either the path to a local folder or the repo_id of a model on the Hub."), type(err)=<class 'OSError'>
```
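The same OSError is easy to reproduce outside LocalAI: transformers rejects an identifier with a trailing slash because it is neither an existing local folder nor a valid Hub repo_id. A minimal sketch, assuming transformers is installed and the relative path does not exist in the current directory:

```bash
# Minimal repro sketch (not the LocalAI code path): the trailing slash makes the
# string fail repo_id validation once it is not found as a local directory.
python3 -c "from transformers import AutoConfig; AutoConfig.from_pretrained('internlm-xcomposer2-vl-7b-4bit/')"
# OSError: Incorrect path_or_model_id: 'internlm-xcomposer2-vl-7b-4bit/'.
# Please provide either the path to a local folder or the repo_id of a model on the Hub.
```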
Can anyone tell whether this is a bug in auto-gptq or LocalAI, or a mistake in my configuration?