Assertion at src/lib/core/topology.cpp:627 #1527

Closed
Zorgosto opened this issue Jan 14, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@Zorgosto

Describe the bug
When I try to run the example LLM TextGeneration code, I get an assertion error.
(Sorry for any formatting errors; if you have tips to make this more readable, please tell me.)

Expected behavior
Run the LLM and get the output.

Environment
Include all relevant environment information:

  1. OS: Ubuntu Server 22.04.3 LTS
  2. Python version: 3.11.7
  3. DeepSparse version or commit hash [e.g. 0.1.0, f7245c8]: deepsparse-nightly 1.7.0.20240103
  4. ML framework version(s): torch 2.1.2
  5. Other Python package versions [e.g. SparseML, Sparsify, numpy, ONNX]: ONNX 1.14.1
  6. CPU info - output of deepsparse/src/deepsparse/arch.bin or output of cpu_architecture() as follows: This fails with the same error, so there is no output to include. The CPU is an AMD EPYC 7282 and the CPU level is x86-64-v3, which includes AVX2 support. The VM should have access to 32 cores.
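
Since arch.bin and cpu_architecture() hit the same assertion, one way to still collect CPU details is to read /proc/cpuinfo directly. The following is a minimal sketch, not part of DeepSparse: the helper names `parse_cpu_flags` and `model_name` are made up for illustration, and it assumes an x86 Linux guest where the kernel lists features on a `flags` line.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags of the first CPU found in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def model_name(cpuinfo_text: str) -> str:
    """Return the first 'model name' value, or an empty string if absent."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            return value.strip()
    return ""


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():  # only present on Linux
        text = cpuinfo.read_text()
        flags = parse_cpu_flags(text)
        print("model name:", model_name(text) or "<not reported>")
        for feature in ("avx2", "avx512f", "avx512_vnni"):
            print(f"{feature}: {'yes' if feature in flags else 'no'}")
```

Pasting this output into the report shows what the guest actually sees, which may differ from the host CPU.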

To Reproduce
Exact steps to reproduce the behavior:

  1. Install Ubuntu Server 22.04.3
  2. Install pyenv and create virtualenv with python version 3.11.3
  3. Activate virtual environment
  4. Install deepsparse via pip install -U deepsparse-nightly[llm]
  5. Use the following example code and run it with python main.py

from deepsparse import TextGeneration

# Construct a pipeline
model_path = "zoo:mpt-7b-dolly_mpt_pretrain-pruned50_quantized"
pipeline = TextGeneration(model=model_path)

# Generate text
prompt = "Below is an instruction that describes a task. Write a response that appropriately completes the request. ### Instruction: What is Kubernetes? ### Response:"
output = pipeline(prompt=prompt)
print(output.generations[0].text)

Errors

DeepSparse, Copyright 2021-present / Neuralmagic, Inc. version: 1.7.0.20240103 (b4c5ec70) (release) (optimized) (system=avx2, binary=avx2)
Date: 01-14-2024 @ 13:40:59 UTC
OS: Linux ubuntu-test-ai 5.15.0-91-generic #101-Ubuntu SMP Tue Nov 14 13:30:08 UTC 2023
Arch: x86_64
CPU:
Vendor:
Cores/sockets/threads: [0, 0, 0]
Available cores/sockets/threads: [0, 0, 0]
L1 cache size data/instruction: 0k/0k
L2 cache size: 0Mb
L3 cache size: 0Mb
Total memory: 39.3345G
Free memory: 29.7219G
Thread: 0x7fe7aca2cb80
Assertion at src/lib/core/topology.cpp:627
Backtrace:
0# 0x00007fe6ad018b5d:
[41b90100000031f66a0041b801000000b973020000488d15e19004fee882b190]
[01488b3dbb5a9801585ae884b1900148833ddc5598010074084c89e7e852b090]
1# 0x00007fe6ad01908b:
[0f1f4400004883c3184839dd741c8b0385c074f1488b7c24284889dee8f4f0ff]
[ff4883c3184839dd75e44881c4880000004c89ef5b5d415c415d415e415fe9f2]
2# (deepsparse)

Additional context
The system runs as a container on a Proxmox server. I also tried a Debian 12 system earlier and hit the same problem, so the cause may be related to Proxmox or to the CPU.

@Zorgosto Zorgosto added the bug Something isn't working label Jan 14, 2024
@mgoin
Member

mgoin commented Apr 24, 2024

Hey @Zorgosto, that error is due to our hardware topology detection being unable to detect the cache size from CPUID. We are unable to diagnose this ourselves since we do not have access to a virtualized environment like that. Please reopen this issue if you know of a publicly available instance running this software. Thanks!
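
To see which cache information the guest exposes at all, the sizes the Linux kernel publishes under sysfs can be listed. This is only a diagnostic sketch: it does not reproduce DeepSparse's CPUID-based detection, and the function name `read_cache_sizes` is hypothetical. It assumes the standard sysfs layout /sys/devices/system/cpu/cpu0/cache/index*/ with `level`, `type` and `size` files.

```python
from pathlib import Path


def read_cache_sizes(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Map cache names such as 'L1 Data' to the size string found in sysfs."""
    sizes = {}
    cache_dir = Path(cpu_dir, "cache")
    if not cache_dir.is_dir():
        return sizes  # no cache information exposed for this CPU
    for index_dir in sorted(cache_dir.glob("index*")):
        try:
            level = (index_dir / "level").read_text().strip()
            kind = (index_dir / "type").read_text().strip()
            size = (index_dir / "size").read_text().strip()
        except OSError:
            continue  # skip entries that are missing or unreadable
        sizes[f"L{level} {kind}"] = size
    return sizes


if __name__ == "__main__":
    found = read_cache_sizes()
    if not found:
        print("No cache sizes exposed under sysfs")
    for name, size in found.items():
        print(f"{name}: {size}")
```

An empty result or missing entries would be consistent with the zeroed cache fields in the crash log above, and would be worth including in a follow-up report.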

@mgoin mgoin closed this as completed Apr 24, 2024