
Error on image generation with diffusers backend on Intel GPU #2332

Closed
Xav-v opened this issue May 16, 2024 · 5 comments
Labels
bug (Something isn't working), unconfirmed

Comments

Xav-v commented May 16, 2024

LocalAI version:
quay.io/go-skynet/local-ai:master-sycl-f16-ffmpeg
Additional test on git master version (self build) - see Additional context below.

Environment, CPU architecture, OS, and Version:
Docker environment on Debian, with Arc A380 GPU passthrough

Describe the bug
Running a model with the diffusers backend, such as DreamShaper_8_pruned.safetensors, fails with the error No module named 'setuptools'.
I also tested installing setuptools via backend/python/diffusers/requirements-intel.txt (see Additional context below), but that also fails.

To Reproduce
In the Text to Image WebUI, select Dreamshaper as the model, enter a prompt, and launch.
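
The same request can also be sent straight to the API; a minimal sketch equivalent to the WebUI call (the port is LocalAI's default and an assumption here; the payload mirrors the request visible in the logs below):

  curl http://localhost:8080/v1/images/generations \
    -H "Content-Type: application/json" \
    -d '{"model": "dreamshaper", "prompt": "A cat on a bench", "size": "512x512"}'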

Expected behavior
The model loads without error and an image is generated according to the prompt.

Logs

local-ai    | 7:29AM INF Success ip=[XXX] latency="531.499µs" method=GET status=200 url=/text2image/dreamshaper
local-ai    | 7:29AM DBG Request received: {"model":"dreamshaper","language":"","n":1,"top_p":null,"top_k":null,"temperature":null,"max_tokens":null,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","response_format":{},"size":"512x512","prompt":"A cat on a bench","instruction":"","input":null,"stop":null,"messages":null,"functions":null,"function_call":null,"stream":false,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"grammar_json_name":null,"backend":"","model_base_name":""}
local-ai    | 7:29AM DBG Loading model: dreamshaper
local-ai    | 7:29AM DBG Parameter Config: &{PredictionOptions:{Model:DreamShaper_8_pruned.safetensors Language: N:0 TopP:0xc0000146e8 TopK:0xc0000146f0 Temperature:0xc0000146f8 Maxtokens:0xc0000148d8 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0000148d0 TypicalP:0xc000014728 Seed:0xc0000148f0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:dreamshaper F16:0xc0000146ca Threads:0xc0000146d8 Debug:0xc000af8060 Roles:map[] Embeddings:false Backend:diffusers TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil>} PromptStrings:[A cat on a bench] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName: ParallelCalls:false NoGrammar:false ResponseRegex: JSONRegexMatch: FunctionName:false} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc000014720 MirostatTAU:0xc000014718 Mirostat:0xc000014710 NGPULayers:0xc0000148e0 MMap:0xc0000148e9 MMlock:0xc0000148e9 LowVRAM:0xc0000148e9 Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0000146d0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:true PipelineType:StableDiffusionPipeline SchedulerType:k_dpmpp_2m EnableParameters:negative_prompt,num_inference_steps CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:25 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:}
local-ai    | 7:29AM INF Loading model 'DreamShaper_8_pruned.safetensors' with backend diffusers
local-ai    | 7:29AM DBG Loading model in memory from file: /models/DreamShaper_8_pruned.safetensors
local-ai    | 7:29AM DBG Loading Model DreamShaper_8_pruned.safetensors with gRPC (file: /models/DreamShaper_8_pruned.safetensors) (backend: diffusers): {backendString:diffusers model:DreamShaper_8_pruned.safetensors threads:20 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0009c0488 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
local-ai    | 7:29AM DBG Loading external backend: /build/backend/python/diffusers/run.sh
local-ai    | 7:29AM DBG Loading GRPC Process: /build/backend/python/diffusers/run.sh
local-ai    | 7:29AM DBG GRPC Service for DreamShaper_8_pruned.safetensors will be running at: '127.0.0.1:41527'
local-ai    | 7:29AM DBG GRPC Service state dir: /tmp/go-processmanager509139886
local-ai    | 7:29AM DBG GRPC Service Started
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stdout Initializing libbackend for build
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stdout virtualenv activated
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stdout activated virtualenv has been ensured
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stderr /build/backend/python/diffusers/venv/lib/python3.10/site-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stderr   warnings.warn(
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stderr Traceback (most recent call last):
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stderr   File "/build/backend/python/diffusers/backend.py", line 41, in <module>
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stderr     import intel_extension_for_pytorch as ipex
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stderr   File "/build/backend/python/diffusers/venv/lib/python3.10/site-packages/intel_extension_for_pytorch/__init__.py", line 111, in <module>
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stderr     from . import xpu
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stderr   File "/build/backend/python/diffusers/venv/lib/python3.10/site-packages/intel_extension_for_pytorch/xpu/__init__.py", line 27, in <module>
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stderr     from .cpp_extension import *
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stderr   File "/build/backend/python/diffusers/venv/lib/python3.10/site-packages/intel_extension_for_pytorch/xpu/cpp_extension.py", line 4, in <module>
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stderr     import setuptools
local-ai    | 7:29AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:41527): stderr ModuleNotFoundError: No module named 'setuptools'
local-ai    | 7:29AM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:41527: connect: connection refused\""
local-ai    | 7:30AM DBG GRPC Service NOT ready
local-ai    | 7:30AM ERR Server error error="grpc service not ready" ip=[XXX] latency=40.031790214s method=POST status=500 url=/v1/images/generations

Additional context
I pulled the current git master, added setuptools to backend/python/diffusers/requirements-intel.txt, and built the Docker image.
The new log is below; the setuptools error is gone, but a Torch not compiled with CUDA enabled error now appears.
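
For reference, the change tested here is a one-line addition to the requirements file (a sketch; the file's existing entries are unchanged):

  # appended to backend/python/diffusers/requirements-intel.txt
  setuptools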

local-ai    | 8:40AM INF Success ip=[xxx] latency="641.456µs" method=GET status=200 url=/text2image/dreamshaper
local-ai    | 8:40AM DBG Request received: {"model":"dreamshaper","language":"","n":1,"top_p":null,"top_k":null,"temperature":null,"max_tokens":null,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","response_format":{},"size":"512x512","prompt":"A cat on a tree","instruction":"","input":null,"stop":null,"messages":null,"functions":null,"function_call":null,"stream":false,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"grammar_json_name":null,"backend":"","model_base_name":""}
local-ai    | 8:40AM DBG Loading model: dreamshaper
local-ai    | 8:40AM DBG Parameter Config: &{PredictionOptions:{Model:DreamShaper_8_pruned.safetensors Language: N:0 TopP:0xc00067cb98 TopK:0xc00067cba0 Temperature:0xc00067cba8 Maxtokens:0xc00067cbd8 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc00067cbd0 TypicalP:0xc00067cbc8 Seed:0xc00067cbf0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:dreamshaper F16:0xc00067cb7a Threads:0xc00067cb88 Debug:0xc0004a03b0 Roles:map[] Embeddings:false Backend:diffusers TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil>} PromptStrings:[A cat on a tree] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName: ParallelCalls:false NoGrammar:false ResponseRegex: JSONRegexMatch: FunctionName:false} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc00067cbc0 MirostatTAU:0xc00067cbb8 Mirostat:0xc00067cbb0 NGPULayers:0xc00067cbe0 MMap:0xc00067cbe9 MMlock:0xc00067cbe9 LowVRAM:0xc00067cbe9 Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc00067cb80 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:true PipelineType:StableDiffusionPipeline SchedulerType:k_dpmpp_2m EnableParameters:negative_prompt,num_inference_steps CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:25 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:}
local-ai    | 8:40AM INF Loading model 'DreamShaper_8_pruned.safetensors' with backend diffusers
local-ai    | 8:40AM DBG Loading model in memory from file: /models/DreamShaper_8_pruned.safetensors
local-ai    | 8:40AM DBG Loading Model DreamShaper_8_pruned.safetensors with gRPC (file: /models/DreamShaper_8_pruned.safetensors) (backend: diffusers): {backendString:diffusers model:DreamShaper_8_pruned.safetensors threads:20 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc00067efc8 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
local-ai    | 8:40AM DBG Loading external backend: /build/backend/python/diffusers/run.sh
local-ai    | 8:40AM DBG Loading GRPC Process: /build/backend/python/diffusers/run.sh
local-ai    | 8:40AM DBG GRPC Service for DreamShaper_8_pruned.safetensors will be running at: '127.0.0.1:44683'
local-ai    | 8:40AM DBG GRPC Service state dir: /tmp/go-processmanager1256271674
local-ai    | 8:40AM DBG GRPC Service Started
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stdout Initializing libbackend for build
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stdout virtualenv activated
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stdout activated virtualenv has been ensured
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr /build/backend/python/diffusers/venv/lib/python3.10/site-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr   warnings.warn(
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr Server started. Listening on: 127.0.0.1:44683
local-ai    | 8:40AM DBG GRPC Service Ready
local-ai    | 8:40AM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:DreamShaper_8_pruned.safetensors ContextSize:512 Seed:129115138 NBatch:512 F16Memory:true MLock:false MMap:false VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:20 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/DreamShaper_8_pruned.safetensors Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType:StableDiffusionPipeline SchedulerType:k_dpmpp_2m CUDA:true CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false}
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr Loading model DreamShaper_8_pruned.safetensors...
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr Request Model: "DreamShaper_8_pruned.safetensors"
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr ContextSize: 512
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr Seed: 129115138
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr NBatch: 512
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr F16Memory: true
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr NGPULayers: 99999999
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr Threads: 20
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr ModelFile: "/models/DreamShaper_8_pruned.safetensors"
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr PipelineType: "StableDiffusionPipeline"
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr SchedulerType: "k_dpmpp_2m"
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr CUDA: true
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr 
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr /build/backend/python/diffusers/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr   warnings.warn(
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr Some weights of the model checkpoint were not used when initializing CLIPTextModel: 
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr  ['text_model.embeddings.position_ids']
local-ai    | 8:40AM DBG GRPC(DreamShaper_8_pruned.safetensors-127.0.0.1:44683): stderr You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
local-ai    | 8:40AM ERR Server error error="could not load model (no success): Unexpected err=AssertionError('Torch not compiled with CUDA enabled'), type(err)=<class 'AssertionError'>" ip=[xxx] latency=8.850052993s method=POST status=500 url=/v1/images/generations
cryptk (Collaborator) commented May 16, 2024

@Xav-v just had a commit merged to resolve the missing setuptools. For the CUDA error, can you please provide the exact command you used to build the docker image? That error makes me think that the image wasn't built correctly for an Intel platform.

Xav-v (Author) commented May 16, 2024

> @Xav-v just had a commit merged to resolve the missing setuptools. For the CUDA error, can you please provide the exact command you used to build the docker image? That error makes me think that the image wasn't built correctly for an Intel platform.

Many thanks, I compiled again from master and the first issue disappeared.
The second "CUDA error" persists.

Here's the docker-compose I used for the build:


  local-ai:
    container_name: local-ai
    image: quay.io/go-skynet/local-ai:master-aio-gpu-intel-f32
    build:
      context: ./LocalAI
      dockerfile: Dockerfile
      args:
      - BASE_IMAGE=intel/oneapi-basekit:2024.1.0-devel-ubuntu22.04
      - FFMPEG=true
      - BUILD_TYPE=sycl_f32
      - GO_TAGS="none"
    volumes:
      - ./models:/models:cached
      - ./images/:/tmp/generated/images/
    devices:
      - /dev/dri:/dev/dri
    group_add:
      - "105"

cryptk (Collaborator) commented May 16, 2024

Based on your logs, it looks like your diffusers model definition has CUDA: true in it. Can you try either removing the CUDA setting or changing it to false and see if that helps with the CUDA error?
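
For context, a minimal sketch of such a model definition, with the diffusers fields taken from the options visible in the logs above (the file name and exact layout are assumptions):

  # models/dreamshaper.yaml (sketch)
  name: dreamshaper
  backend: diffusers
  f16: true
  step: 25
  parameters:
    model: DreamShaper_8_pruned.safetensors
  diffusers:
    cuda: false  # was true in the failing config; remove or set to false on Intel/SYCL builds
    pipeline_type: StableDiffusionPipeline
    scheduler_type: k_dpmpp_2m
    enable_parameters: negative_prompt,num_inference_steps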

EDIT:

You may also need to add a devices section to your docker-compose file to map in /dev/dri as follows:

  local-ai:
    container_name: local-ai
    image: quay.io/go-skynet/local-ai:master-aio-gpu-intel-f32
    build:
      context: ./LocalAI
      dockerfile: Dockerfile
      args:
      - BASE_IMAGE=intel/oneapi-basekit:2024.1.0-devel-ubuntu22.04
      - FFMPEG=true
      - BUILD_TYPE=sycl_f32
      - GO_TAGS="none"
    devices:                  # note: devices is a service-level key, not a build option
      - /dev/dri:/dev/dri
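
Once the container is up, it can help to confirm the GPU is actually visible inside it; sycl-ls ships with the oneAPI base image (container name taken from the compose file above):

  docker exec -it local-ai ls -l /dev/dri   # the card/render nodes should be present
  docker exec -it local-ai sycl-ls          # the Arc A380 should appear as a Level Zero GPU device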

Xav-v (Author) commented May 17, 2024

Many thanks; now it's working! Modifying to CUDA: false in the yaml model file was the trick.
For the GPU, I was lazy and didn't give the full yaml file, but yes, it included the GPU device mapping too :) my comment is updated above.

Xav-v closed this as completed May 17, 2024
TheDom42 commented Jul 6, 2024

Sorry to resurrect this old thread, but I was running into the same CUDA error.
I just started using the latest-aio-gpu-intel-f32 container on an Intel i5-12600K with /dev/dri passed through.

As I have no prior experience with running local AI models, I have not changed any of the default settings.
I then wanted to try the T2I functionality with the default /build/models/runwayml/stable-diffusion-v1-5 model, and it did not work due to the above-mentioned CUDA error.
I also tried downloading /build/models/v2ray/stable-diffusion-3-medium-diffusers afterwards, with the same error.

I have manually modified the yaml files for now. However, would it be possible to modify the container so that it automatically sets CUDA: false when downloading models with the Intel GPU images?
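
A hedged one-liner workaround in the meantime, assuming the downloaded model definitions live under the container's models directory:

  # flips the flag in every model definition; the path is an assumption based on the AIO image defaults
  sed -i 's/cuda: true/cuda: false/' /build/models/*.yaml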
