Description
Hello,
I followed the instructions below exactly to try to run Ollama on the iGPU of my i7-13700K, but I got an error when running ./ollama run llama3.1:8b.
https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.md
Below is the output from ollama serve before and after ollama run.
Can anyone help?
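For context, this is roughly the launch sequence I used, following the quickstart (a sketch from memory; the oneAPI path is the default install location and may differ on other machines):

conda activate ollama
init-ollama
export OLLAMA_NUM_GPU=999
export no_proxy=localhost,127.0.0.1
export ZES_ENABLE_SYSMAN=1
source /opt/intel/oneapi/setvars.sh
./ollama serve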
(ollama) llm@fanlessLinux:~$ hwinfo --display
27: PCI 02.0: 0300 VGA compatible controller (VGA)
[Created at pci.386]
Unique ID: _Znp.iTfrQPXqtxB
SysFS ID: /devices/pci0000:00/0000:00:02.0
SysFS BusID: 0000:00:02.0
Hardware Class: graphics card
Device Name: "Onboard - Video"
Model: "Intel VGA compatible controller"
Vendor: pci 0x8086 "Intel Corporation"
Device: pci 0xa780
SubVendor: pci 0x1458 "Gigabyte Technology Co., Ltd"
SubDevice: pci 0xd000
Revision: 0x04
Driver: "i915"
Driver Modules: "i915"
Memory Range: 0x6002000000-0x6002ffffff (rw,non-prefetchable)
Memory Range: 0x4000000000-0x403fffffff (ro,non-prefetchable)
I/O Ports: 0x5000-0x503f (rw)
Memory Range: 0x000c0000-0x000dffff (rw,non-prefetchable,disabled)
IRQ: 220 (11520 events)
Module Alias: "pci:v00008086d0000A780sv00001458sd0000D000bc03sc00i00"
Driver Info #0:
Driver Status: i915 is active
Driver Activation Cmd: "modprobe i915"
Config Status: cfg=new, avail=yes, need=no, active=unknown
Primary display adapter: #27
sudo dmesg | grep i915
[ 8.042895] i915 0000:00:02.0: Using 24 cores (0-23) for kthreads
[ 8.043501] i915 0000:00:02.0: vgaarb: deactivate vga console
[ 8.043512] i915 0000:00:02.0: Using Transparent Hugepages
[ 8.053594] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/adls_dmc_ver2_01.bin (v2.1)
[ 8.085845] i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
[ 8.085848] i915 0000:00:02.0: GT0: GuC firmware i915/tgl_guc_70.26.4.bin version 70.26.4
[ 8.085851] i915 0000:00:02.0: GT0: HuC firmware i915/tgl_huc_7.9.3.bin version 7.9.3
[ 8.118088] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[ 8.152002] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
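The i915 driver is clearly loaded, so the kernel side looks fine. One thing I still want to verify (this check is my own assumption, not something from the quickstart) is that my user can actually open the render node that the compute runtime uses:

ls -l /dev/dri          # renderD128 is typically owned by group 'render'
groups                  # my user should be in the 'render' (or 'video') group

If the user is missing from the render group, OpenCL/Level-Zero cannot access /dev/dri/renderD128 even though the display itself works.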
==========ollama serve output========
time=2024-11-20T15:05:56.394-05:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2]" <== does this mean the GPU was not recognized? (see the check after the log below)
[GIN] 2024/11/20 - 15:06:09 | 200 | 82.264µs | 127.0.0.1 | HEAD "/"
[GIN] 2024/11/20 - 15:06:09 | 200 | 19.279993ms | 127.0.0.1 | POST "/api/show"
#########
ollama run from here
#########
time=2024-11-20T15:06:09.925-05:00 level=INFO source=gpu.go:168 msg="looking for compatible GPUs"
time=2024-11-20T15:06:09.925-05:00 level=WARN source=gpu.go:560 msg="unable to locate gpu dependency libraries" <== this might be the real reason.
time=2024-11-20T15:06:09.925-05:00 level=WARN source=gpu.go:560 msg="unable to locate gpu dependency libraries"
time=2024-11-20T15:06:09.928-05:00 level=WARN source=gpu.go:560 msg="unable to locate gpu dependency libraries"
time=2024-11-20T15:06:09.929-05:00 level=INFO source=gpu.go:280 msg="no compatible GPUs were discovered"
time=2024-11-20T15:06:09.952-05:00 level=INFO source=memory.go:309 msg="offload to cpu" layers.requested=-1 layers.model=33 layers.offload=0 layers.split="" memory.available="[91.5 GiB]" memory.required.full="5.8 GiB" memory.required.partial="0 B" memory.required.kv="1.0 GiB" memory.required.allocations="[5.8 GiB]" memory.weights.total="4.7 GiB" memory.weights.repeating="4.3 GiB" memory.weights.nonrepeating="411.0 MiB" memory.graph.full="560.0 MiB" memory.graph.partial="677.5 MiB"
time=2024-11-20T15:06:09.953-05:00 level=INFO source=server.go:395 msg="starting llama server" cmd="/tmp/ollama829815146/runners/cpu_avx2/ollama_llama_server --model /home/llm/.ollama/models/blobs/sha256-8eeb52dfb3bb9aefdf9d1ef24b3bdbcfbe82238798c4b918278320b6fcef18fe --ctx-size 8192 --batch-size 512 --embedding --log-disable --n-gpu-layers 999 --no-mmap --parallel 4 --port 41231"
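Since the serve log only lists CPU runners, a basic check is whether the GPU is visible to the SYCL runtime at all. A minimal check, assuming the default oneAPI install path:

source /opt/intel/oneapi/setvars.sh
sycl-ls

If the iGPU is set up correctly, sycl-ls should list a level_zero GPU device for the Intel UHD Graphics 770; if only OpenCL CPU devices show up, my guess is that the compute runtime packages (e.g. intel-level-zero-gpu, intel-opencl-icd) are missing or outdated.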