[TensorRT-LLM] TensorRT-LLM version: 0.10.0.dev2024050700
[TensorRT-LLM][INFO] Engine version 0.10.0.dev2024050700 found in the config file, assuming engine(s) built by new builder API.
[TensorRT-LLM][WARNING] [json.exception.out_of_range.403] key 'cross_attention' not found
[TensorRT-LLM][WARNING] Optional value for parameter cross_attention will not be set.
[TensorRT-LLM][WARNING] Parameter layer_types cannot be read from json:
[TensorRT-LLM][WARNING] [json.exception.out_of_range.403] key 'layer_types' not found
[TensorRT-LLM][WARNING] [json.exception.type_error.302] type must be string, but is null
[TensorRT-LLM][WARNING] Optional value for parameter kv_cache_quant_algo will not be set.
[TensorRT-LLM][WARNING] [json.exception.out_of_range.403] key 'num_medusa_heads' not found
[TensorRT-LLM][WARNING] Optional value for parameter num_medusa_heads will not be set.
[TensorRT-LLM][WARNING] [json.exception.out_of_range.403] key 'max_draft_len' not found
[TensorRT-LLM][WARNING] Optional value for parameter max_draft_len will not be set.
[TensorRT-LLM][INFO] MPI size: 1, rank: 0
[TensorRT-LLM][WARNING] Device 0 peer access Device 1 is not available.
[TensorRT-LLM][WARNING] Device 0 peer access Device 2 is not available.
[TensorRT-LLM][WARNING] Device 0 peer access Device 3 is not available.
[TensorRT-LLM][WARNING] Device 0 peer access Device 4 is not available.
[TensorRT-LLM][WARNING] Device 0 peer access Device 5 is not available.
[TensorRT-LLM][WARNING] Device 0 peer access Device 6 is not available.
[TensorRT-LLM][WARNING] Device 0 peer access Device 7 is not available.
[TensorRT-LLM][INFO] Loaded engine size: 14495 MiB
[TensorRT-LLM][ERROR] 1: [defaultAllocator.cpp::allocate::19] Error Code 1: Cuda Runtime (out of memory)
[TensorRT-LLM][WARNING] Requested amount of GPU memory (13908726432 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
[TensorRT-LLM][ERROR] 2: [safeDeserialize.cpp::load::269] Error Code 2: OutOfMemory (no further information)
[01d177e8bded:22501] *** Process received signal ***
[01d177e8bded:22501] Signal: Segmentation fault (11)
[01d177e8bded:22501] Signal code: Address not mapped (1)
[01d177e8bded:22501] Failing at address: 0x8
[01d177e8bded:22501] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x7f02d1a61420]
[01d177e8bded:22501] [ 1] /opt/conda/envs/tensorRT/lib/python3.10/site-packages/tensorrt_llm/libs/libtensorrt_llm.so(_ZN12tensorrt_llm7runtime11TllmRuntimeC2EPKvmRN8nvinfer17ILoggerE+0x19d)[0x7f01372db78d]
[01d177e8bded:22501] [ 2] /opt/conda/envs/tensorRT/lib/python3.10/site-packages/tensorrt_llm/libs/libtensorrt_llm.so(_ZN12tensorrt_llm7runtime10GptSessionC2ERKNS1_6ConfigERKNS0_11ModelConfigERKNS0_11WorldConfigEPKvmSt10shared_ptrIN8nvinfer17ILoggerEE+0x395)[0x7f0137288d25]
[01d177e8bded:22501] [ 3] /opt/conda/envs/tensorRT/lib/python3.10/site-packages/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so(+0xc5459)[0x7f01ab0f0459]
[01d177e8bded:22501] [ 4] /opt/conda/envs/tensorRT/lib/python3.10/site-packages/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so(+0x71b99)[0x7f01ab09cb99]
[01d177e8bded:22501] [ 5] /opt/conda/envs/tensorRT/lib/python3.10/site-packages/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so(+0x54a6c)[0x7f01ab07fa6c]
[01d177e8bded:22501] [ 6] python3[0x4fc697]
[01d177e8bded:22501] [ 7] python3(_PyObject_MakeTpCall+0x25b)[0x4f614b]
[01d177e8bded:22501] [ 8] python3[0x50819f]
[01d177e8bded:22501] [ 9] python3(PyVectorcall_Call+0xb9)[0x508bb9]
[01d177e8bded:22501] [10] python3[0x50607f]
[01d177e8bded:22501] [11] python3[0x4f64b6]
[01d177e8bded:22501] [12] /opt/conda/envs/tensorRT/lib/python3.10/site-packages/tensorrt_llm/bindings.cpython-310-x86_64-linux-gnu.so(+0x540d9)[0x7f01ab07f0d9]
[01d177e8bded:22501] [13] python3(_PyObject_MakeTpCall+0x25b)[0x4f614b]
[01d177e8bded:22501] [14] python3(_PyEval_EvalFrameDefault+0x5757)[0x4f26f7]
[01d177e8bded:22501] [15] python3[0x507eae]
[01d177e8bded:22501] [16] python3(PyObject_Call+0xb8)[0x508858]
[01d177e8bded:22501] [17] python3(_PyEval_EvalFrameDefault+0x2b79)[0x4efb19]
[01d177e8bded:22501] [18] python3[0x591d92]
[01d177e8bded:22501] [19] python3(PyEval_EvalCode+0x87)[0x591cd7]
[01d177e8bded:22501] [20] python3[0x5c2967]
[01d177e8bded:22501] [21] python3[0x5bdad0]
[01d177e8bded:22501] [22] python3[0x45956b]
[01d177e8bded:22501] [23] python3(_PyRun_SimpleFileObject+0x19f)[0x5b805f]
[01d177e8bded:22501] [24] python3(_PyRun_AnyFileObject+0x43)[0x5b7dc3]
[01d177e8bded:22501] [25] python3(Py_RunMain+0x38d)[0x5b4b7d]
[01d177e8bded:22501] [26] python3(Py_BytesMain+0x39)[0x584e49]
[01d177e8bded:22501] [27] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f02d1725083]
[01d177e8bded:22501] [28] python3[0x584cfe]
[01d177e8bded:22501] *** End of error message ***
Segmentation fault (core dumped)
The run is hitting an out-of-memory (OOM) error while deserializing the engine (`[TensorRT-LLM][ERROR] 2: [safeDeserialize.cpp::load::269] Error Code 2: OutOfMemory (no further information)`). Could you please share your system or VM configuration?
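To put the numbers in the log into comparable units, here is a minimal sketch that pulls the engine size and the failed allocation size out of the two relevant log lines and converts both to MiB. The helper names `parse_oom_log` and `fits` are hypothetical and not part of TensorRT-LLM, and the free-memory figure passed to `fits` is a placeholder that has to come from the actual machine.

```python
import re

# Two lines copied verbatim from the log in this issue.
LOG = """\
[TensorRT-LLM][INFO] Loaded engine size: 14495 MiB
[TensorRT-LLM][WARNING] Requested amount of GPU memory (13908726432 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
"""

MIB = 1024 ** 2  # bytes per MiB


def parse_oom_log(text):
    """Return (engine_size_mib, requested_mib); a value is None if its line is missing."""
    engine = re.search(r"Loaded engine size: (\d+) MiB", text)
    requested = re.search(r"Requested amount of GPU memory \((\d+) bytes\)", text)
    engine_mib = int(engine.group(1)) if engine else None
    requested_mib = int(requested.group(1)) / MIB if requested else None
    return engine_mib, requested_mib


def fits(requested_mib, free_mib):
    """True if the reported free memory covers the failed allocation (hypothetical check)."""
    return free_mib >= requested_mib


engine_mib, requested_mib = parse_oom_log(LOG)
print(f"engine file: {engine_mib} MiB, failed allocation: {requested_mib:.0f} MiB")
```

The failed request works out to roughly 13264 MiB on device 0, so the free memory on that GPU at load time (for example as reported by `nvidia-smi`) is the figure worth including with the system configuration.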