Describe the bug
When I try to run the example LLM TextGeneration code I get an assertion error.
(Sorry for any formatting errors; if you have tips to make it more readable, please tell me.)
Expected behavior
Run the LLM and get the output.
Environment
Include all relevant environment information:
OS: Ubuntu Server 22.04.3 LTS
Python version: 3.11.7
DeepSparse version or commit hash [e.g. 0.1.0, f7245c8]: deepsparse-nightly 1.7.0.20240103
CPU info - output of deepsparse/src/deepsparse/arch.bin or output of cpu_architecture() as follows: This fails with the same error, so I can't provide the output. The CPU is an AMD EPYC 7282 and the CPU type is x86-64-v3, which has AVX2 support. The VM should have access to 32 cores.
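Since the built-in CPU report hits the same assertion, a rough substitute is to read what the Linux kernel exposes in /proc/cpuinfo. The sketch below is not part of DeepSparse; `parse_cpuinfo` is a hypothetical helper, and it assumes the x86 layout of /proc/cpuinfo (`model name` and `flags` keys). It only shows what the guest kernel sees, which may differ from what DeepSparse reads through CPUID.

```python
from pathlib import Path


def parse_cpuinfo(text):
    """Return (model_name, flags) for the first processor entry in /proc/cpuinfo text.

    Hypothetical helper for this report; assumes the x86 key names.
    """
    model_name, flags = None, set()
    for line in text.splitlines():
        if not line.strip():
            if model_name or flags:
                break  # a blank line ends the first processor block
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "model name":
            model_name = value.strip()
        elif key == "flags":
            flags = set(value.split())
    return model_name, flags


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        name, flags = parse_cpuinfo(path.read_text())
        print("model name:", name)
        print("avx2:", "avx2" in flags)
```

Running this inside the guest and pasting the two printed lines would at least confirm whether the CPU model string and the AVX2 flag are visible there.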
To Reproduce
Exact steps to reproduce the behavior:
Install Ubuntu Server 22.04.3
Install pyenv and create virtualenv with python version 3.11.3
Activate virtual environment
Install deepsparse via pip install -U deepsparse-nightly[llm]
Use the following example code and run it with python main.py
from deepsparse import TextGeneration
#construct a pipeline
model_path = "zoo:mpt-7b-dolly_mpt_pretrain-pruned50_quantized"
pipeline = TextGeneration(model=model_path)
#generate text
prompt = "Below is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: What is Kubernetes? ### Response:"
output = pipeline(prompt=prompt)
print(output.generations[0].text)
Errors
DeepSparse, Copyright 2021-present / Neuralmagic, Inc. version: 1.7.0.20240103 (b4c5ec70) (release) (optimized) (system=avx2, binary=avx2)
Date: 01-14-2024 @ 13:40:59 UTC
OS: Linux ubuntu-test-ai 5.15.0-91-generic #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023
Arch: x86_64
CPU:
Vendor:
Cores/sockets/threads: [0, 0, 0]
Available cores/sockets/threads: [0, 0, 0]
L1 cache size data/instruction: 0k/0k
L2 cache size: 0Mb
L3 cache size: 0Mb
Total memory: 39.3345G
Free memory: 29.7219G
Thread: 0x7fe7aca2cb80
Assertion at src/lib/core/topology.cpp:627
Backtrace:
0# 0x00007fe6ad018b5d:
[41b90100000031f66a0041b801000000b973020000488d15e19004fee882b190]
[01488b3dbb5a9801585ae884b1900148833ddc5598010074084c89e7e852b090]
1# 0x00007fe6ad01908b:
[0f1f4400004883c3184839dd741c8b0385c074f1488b7c24284889dee8f4f0ff]
[ff4883c3184839dd75e44881c4880000004c89ef5b5d415c415d415e415fe9f2]
2# (deepsparse)
Additional context
The system runs as a container on a Proxmox server. I also tried a Debian 12 system earlier and it has the same problem, so the cause may be related to Proxmox or to the CPU.
Hey @Zorgosto, that error is due to our hardware topology detection being unable to detect the cache size from CPUID. We are unable to diagnose this ourselves since we do not have access to a virtual environment like that. Please re-open this issue if you know of a publicly available instance with this software, thanks!
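For anyone hitting the same assertion, one way to gather more detail is to check whether the kernel reports any cache sizes at all inside the guest. The sketch below is an assumption-laden diagnostic, not DeepSparse code: `read_cache_sizes` is a hypothetical helper that reads the standard Linux sysfs cache entries for cpu0. It does not query CPUID directly, so it is only a proxy for what the topology detection sees, and inside a container sysfs may reflect the host rather than the guest.

```python
from pathlib import Path


def read_cache_sizes(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Return {(level, type): size} for each cache listed under cpu_dir/cache.

    Hypothetical helper; returns an empty dict if no cache info is exposed.
    """
    caches = {}
    cache_dir = Path(cpu_dir) / "cache"
    if not cache_dir.is_dir():
        return caches
    for index in sorted(cache_dir.glob("index*")):
        try:
            level = (index / "level").read_text().strip()
            ctype = (index / "type").read_text().strip()
            size = (index / "size").read_text().strip()
        except OSError:
            continue  # skip entries with missing or unreadable files
        caches[(level, ctype)] = size
    return caches


if __name__ == "__main__":
    sizes = read_cache_sizes()
    if not sizes:
        print("no cache information exposed")
    for (level, ctype), size in sizes.items():
        print(f"L{level} {ctype}: {size}")
```

An empty result or missing levels here would be consistent with the zeroed cache fields in the error output above, though it would not prove the same cause.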