Description
Hello,
I followed the instructions below exactly to try to run Ollama on the iGPU of my i7-13700K, but I got an error when running ./ollama run llama3.1:8b.
https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.md
Below is the output from ollama serve before and after ollama run.
Can anyone help?
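For context, this is roughly the launch sequence I used, following the quickstart (a sketch from memory; the oneAPI path is the default install location and may differ on other machines):

conda activate ollama
init-ollama
export OLLAMA_NUM_GPU=999
export no_proxy=localhost,127.0.0.1
export ZES_ENABLE_SYSMAN=1
source /opt/intel/oneapi/setvars.sh
./ollama serve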
(ollama) llm@fanlessLinux:~$ hwinfo --display
27: PCI 02.0: 0300 VGA compatible controller (VGA)
[Created at pci.386]
Unique ID: _Znp.iTfrQPXqtxB
SysFS ID: /devices/pci0000:00/0000:00:02.0
SysFS BusID: 0000:00:02.0
Hardware Class: graphics card
Device Name: "Onboard - Video"
Model: "Intel VGA compatible controller"
Vendor: pci 0x8086 "Intel Corporation"
Device: pci 0xa780
SubVendor: pci 0x1458 "Gigabyte Technology Co., Ltd"
SubDevice: pci 0xd000
Revision: 0x04
Driver: "i915"
Driver Modules: "i915"
Memory Range: 0x6002000000-0x6002ffffff (rw,non-prefetchable)
Memory Range: 0x4000000000-0x403fffffff (ro,non-prefetchable)
I/O Ports: 0x5000-0x503f (rw)
Memory Range: 0x000c0000-0x000dffff (rw,non-prefetchable,disabled)
IRQ: 220 (11520 events)
Module Alias: "pci:v00008086d0000A780sv00001458sd0000D000bc03sc00i00"
Driver Info #0:
Driver Status: i915 is active
Driver Activation Cmd: "modprobe i915"
Config Status: cfg=new, avail=yes, need=no, active=unknown
Primary display adapter: #27
sudo dmesg | grep i915
[ 8.042895] i915 0000:00:02.0: Using 24 cores (0-23) for kthreads
[ 8.043501] i915 0000:00:02.0: vgaarb: deactivate vga console
[ 8.043512] i915 0000:00:02.0: Using Transparent Hugepages
[ 8.053594] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/adls_dmc_ver2_01.bin (v2.1)
[ 8.085845] i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
[ 8.085848] i915 0000:00:02.0: GT0: GuC firmware i915/tgl_guc_70.26.4.bin version 70.26.4
[ 8.085851] i915 0000:00:02.0: GT0: HuC firmware i915/tgl_huc_7.9.3.bin version 7.9.3
[ 8.118088] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[ 8.152002] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
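The i915 driver is clearly loaded, so the kernel side looks fine. One thing I still want to verify (this check is my own assumption, not something from the quickstart) is that my user can actually open the render node that the compute runtime uses:

ls -l /dev/dri          # renderD128 is typically owned by group 'render'
groups                  # my user should be in the 'render' (or 'video') group

If the user is missing from the render group, OpenCL/Level-Zero cannot access /dev/dri/renderD128 even though the display itself works.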
==========ollama serve output========
time=2024-11-20T15:05:56.394-05:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2]" <== does this mean the GPU was not recognized? (see the check after the log below)
[GIN] 2024/11/20 - 15:06:09 | 200 | 82.264µs | 127.0.0.1 | HEAD "/"
[GIN] 2024/11/20 - 15:06:09 | 200 | 19.279993ms | 127.0.0.1 | POST "/api/show"
#########
ollama run from here
#########
time=2024-11-20T15:06:09.925-05:00 level=INFO source=gpu.go:168 msg="looking for compatible GPUs"
time=2024-11-20T15:06:09.925-05:00 level=WARN source=gpu.go:560 msg="unable to locate gpu dependency libraries" <== this might be the real reason.
time=2024-11-20T15:06:09.925-05:00 level=WARN source=gpu.go:560 msg="unable to locate gpu dependency libraries"
time=2024-11-20T15:06:09.928-05:00 level=WARN source=gpu.go:560 msg="unable to locate gpu dependency libraries"
time=2024-11-20T15:06:09.929-05:00 level=INFO source=gpu.go:280 msg="no compatible GPUs were discovered"
time=2024-11-20T15:06:09.952-05:00 level=INFO source=memory.go:309 msg="offload to cpu" layers.requested=-1 layers.model=33 layers.offload=0 layers.split="" memory.available="[91.5 GiB]" memory.required.full="5.8 GiB" memory.required.partial="0 B" memory.required.kv="1.0 GiB" memory.required.allocations="[5.8 GiB]" memory.weights.total="4.7 GiB" memory.weights.repeating="4.3 GiB" memory.weights.nonrepeating="411.0 MiB" memory.graph.full="560.0 MiB" memory.graph.partial="677.5 MiB"
time=2024-11-20T15:06:09.953-05:00 level=INFO source=server.go:395 msg="starting llama server" cmd="/tmp/ollama829815146/runners/cpu_avx2/ollama_llama_server --model /home/llm/.ollama/models/blobs/sha256-8eeb52dfb3bb9aefdf9d1ef24b3bdbcfbe82238798c4b918278320b6fcef18fe --ctx-size 8192 --batch-size 512 --embedding --log-disable --n-gpu-layers 999 --no-mmap --parallel 4 --port 41231"
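Since the serve log only lists CPU runners, a basic check is whether the GPU is visible to the SYCL runtime at all. A minimal check, assuming the default oneAPI install path:

source /opt/intel/oneapi/setvars.sh
sycl-ls

If the iGPU is set up correctly, sycl-ls should list a level_zero GPU device for the Intel UHD Graphics 770; if only OpenCL CPU devices show up, my guess is that the compute runtime packages (e.g. intel-level-zero-gpu, intel-opencl-icd) are missing or outdated.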