Description
When I run weclone-cli make-dataset during data preprocessing, I get the following error:
INFO 05-16 18:43:01 [loader.py:447] Loading weights took 2.60 seconds
INFO 05-16 18:43:01 [gpu_model_runner.py:1186] Model loading took 14.2487 GB and 2.777985 seconds
INFO 05-16 18:43:07 [backends.py:415] Using cache directory: /home/118/.cache/vllm/torch_compile_cache/834147faf0/rank_0_0 for vLLM's torch.compile
INFO 05-16 18:43:07 [backends.py:425] Dynamo bytecode transform time: 5.80 s
INFO 05-16 18:43:07 [backends.py:115] Directly load the compiled graph for shape None from the cache
INFO 05-16 18:43:12 [monitor.py:33] torch.compile takes 5.80 s in total
ERROR 05-16 18:43:14 [core.py:343] EngineCore hit an exception: Traceback (most recent call last):
ERROR 05-16 18:43:14 [core.py:343] File "/home/118/WeClone/.venv/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 335, in run_engine_core
ERROR 05-16 18:43:14 [core.py:343] engine_core = EngineCoreProc(*args, **kwargs)
ERROR 05-16 18:43:14 [core.py:343] File "/home/118/WeClone/.venv/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 290, in __init__
ERROR 05-16 18:43:14 [core.py:343] super().__init__(vllm_config, executor_class, log_stats)
ERROR 05-16 18:43:14 [core.py:343] File "/home/118/WeClone/.venv/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 63, in __init__
ERROR 05-16 18:43:14 [core.py:343] num_gpu_blocks, num_cpu_blocks = self._initialize_kv_caches(
ERROR 05-16 18:43:14 [core.py:343] File "/home/118/WeClone/.venv/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 126, in _initialize_kv_caches
ERROR 05-16 18:43:14 [core.py:343] kv_cache_configs = [
ERROR 05-16 18:43:14 [core.py:343] File "/home/118/WeClone/.venv/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 127, in <listcomp>
ERROR 05-16 18:43:14 [core.py:343] get_kv_cache_config(vllm_config, kv_cache_spec_one_worker,
ERROR 05-16 18:43:14 [core.py:343] File "/home/118/WeClone/.venv/lib/python3.10/site-packages/vllm/v1/core/kv_cache_utils.py", line 604, in get_kv_cache_config
ERROR 05-16 18:43:14 [core.py:343] check_enough_kv_cache_memory(vllm_config, kv_cache_spec, available_memory)
ERROR 05-16 18:43:14 [core.py:343] File "/home/118/WeClone/.venv/lib/python3.10/site-packages/vllm/v1/core/kv_cache_utils.py", line 468, in check_enough_kv_cache_memory
ERROR 05-16 18:43:14 [core.py:343] raise ValueError("No available memory for the cache blocks. "
ERROR 05-16 18:43:14 [core.py:343] ValueError: No available memory for the cache blocks. Try increasing `gpu_memory_utilization` when initializing the engine.
ERROR 05-16 18:43:14 [core.py:343]
CRITICAL 05-16 18:43:14 [core_client.py:269] Got fatal signal from worker processes, shutting down. See stack trace above for root cause issue.
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.20             Driver Version: 570.133.20     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:31:00.0  On |                  N/A |
| 36%   43C    P8             30W /  350W |    1317MiB /  24576MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            1926      G   /usr/lib/xorg/Xorg                      574MiB |
|    0   N/A  N/A            2875    C+G   ...c/gnome-remote-desktop-daemon        258MiB |
|    0   N/A  N/A            2925      G   /usr/bin/gnome-shell                    150MiB |
|    0   N/A  N/A           53165      G   /opt/google/chrome/chrome                 4MiB |
|    0   N/A  N/A           53215      G   ...ersion=20250515-180047.882000        257MiB |
+-----------------------------------------------------------------------------------------+
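For reference, a rough back-of-the-envelope check of the numbers above. This is only a sketch under stated assumptions, not vLLM's actual accounting: it treats `gpu_memory_utilization` as the fraction of total GPU memory the engine may use, reads the logged "14.2487 GB" as GiB, and ignores activation and other overhead unless a guess is passed in. The helper `kv_cache_budget_mib` and the utilization values tried are hypothetical; the value WeClone actually passes is not shown in the log.

```python
MIB_PER_GIB = 1024

def kv_cache_budget_mib(total_mib, utilization, weights_gib, overhead_gib=0.0):
    """Rough estimate (in MiB) of memory left for KV cache blocks.

    Hypothetical helper: budget = total * utilization, minus model weights
    and an optional guess for activation / runtime overhead.
    """
    budget_mib = total_mib * utilization
    used_mib = (weights_gib + overhead_gib) * MIB_PER_GIB
    return budget_mib - used_mib

TOTAL_MIB = 24576      # RTX 3090 total memory, from the nvidia-smi output
WEIGHTS_GIB = 14.2487  # "Model loading took 14.2487 GB", assumed to mean GiB

# Illustrative utilization values only; not taken from the WeClone config.
for util in (0.5, 0.7, 0.9):
    left = kv_cache_budget_mib(TOTAL_MIB, util, WEIGHTS_GIB)
    status = "no room for cache blocks" if left <= 0 else "some room left"
    print(f"gpu_memory_utilization={util}: {left:.0f} MiB ({status})")
```

Under these assumptions, a low utilization setting leaves a negative budget once the weights are loaded, which would be consistent with the ValueError in the log. The roughly 1.3 GiB already held by desktop processes also reduces what is actually free on the card.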