OSError: (External) CUDA error(719), unspecified launch failure. #5372

@ykallan

FastDeploy reports an error when deploying the model. The launch command was:

export CUDA_VISIBLE_DEVICES=0
python -m fastdeploy.entrypoints.openai.api_server \
       --model PaddlePaddle/ERNIE-4.5-0.3B-Paddle \
       --port 8180 \
       --metrics-port 8181 \
       --engine-worker-queue-port 8182 \
       --tensor-parallel-size 1 \
       --max-model-len 1024 \
       --max-num-seqs 80 \
       --enable-prefix-caching \
       --swap-space 10

The complete workerlog.0 log follows:

which: no ccache in (/root/miniconda3/envs/fastdeploy/bin:/root/miniconda3/condabin:/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin)
/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/utils/cpp_extension/extension_utils.py:718: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
[2025-12-04 18:18:17,261] [    INFO] distributed_strategy.py:335 - distributed strategy initialized
======================= Modified FLAGS detected =======================
FLAGS(name='FLAGS_cudnn_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/cudnn/lib', default_value='')
FLAGS(name='FLAGS_enable_pir_in_executor', current_value=True, default_value=False)
FLAGS(name='FLAGS_pir_interpreter_record_stream_for_gc_cache', current_value=True, default_value=False)
FLAGS(name='FLAGS_nvidia_package_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia', default_value='')
FLAGS(name='FLAGS_cusparse_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/cusparse/lib', default_value='')
FLAGS(name='FLAGS_selected_gpus', current_value='0', default_value='')
FLAGS(name='FLAGS_nccl_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/nccl/lib', default_value='')
FLAGS(name='FLAGS_parameters_persistent_mode_in_dy2st', current_value=True, default_value=False)
FLAGS(name='FLAGS_cublas_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/cublas/lib', default_value='')
FLAGS(name='FLAGS_specialize_device_in_dy2st', current_value=True, default_value=False)
FLAGS(name='FLAGS_cusolver_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/cusolver/lib', default_value='')
FLAGS(name='FLAGS_cupti_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/cuda_cupti/lib', default_value='')
FLAGS(name='FLAGS_curand_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/curand/lib', default_value='')
FLAGS(name='FLAGS_cuda_cccl_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/cuda_cccl/include/', default_value='')
=======================================================================
/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/distributed/parallel.py:1062: UserWarning: Currently not a parallel execution environment, `paddle.distributed.init_parallel_env` will not do anything.
  warnings.warn(
[2025-12-04 18:18:17,264] [    INFO] topology.py:526 - Total 1 pipe comm group(s) create successfully!
/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/distributed/communication/group.py:145: UserWarning: Current global rank 0 is not in group _default_pg10
  warnings.warn(
[2025-12-04 18:18:17,623] [    INFO] topology.py:526 - Total 1 data comm group(s) create successfully!
[2025-12-04 18:18:17,624] [    INFO] topology.py:526 - Total 1 model comm group(s) create successfully!
[2025-12-04 18:18:17,624] [    INFO] topology.py:526 - Total 1 sharding comm group(s) create successfully!
[2025-12-04 18:18:17,624] [    INFO] topology.py:440 - HybridParallelInfo: rank_id: 0, mp_degree: 1, sharding_degree: 1, pp_degree: 1, dp_degree: 1, sep_degree: 1, mp_group: [0],  sharding_group: [0], pp_group: [0], dp_group: [0], sep:group: None, check/clip group: [0]
[2025-12-04 18:18:17,624] [    INFO] - Using download source: huggingface
[2025-12-04 18:18:17,624] [    INFO] - Loading configuration file PaddlePaddle/ERNIE-4.5-0.3B-Paddle/config.json
[2025-12-04 18:18:17,624] [ WARNING] - You are using a model of type ernie4_5 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/graph_optimization/utils.py:21: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
[2025-12-04 18:18:18,667] [ WARNING] - import w4afp8_gemm_scale_permute Failed!
[2025-12-04 18:18:20,210] [    INFO] - Enabled logits processors: []
INFO     2025-12-04 18:18:20,270 297823 cuda.py[line:59] Using APPEND ATTN backend.
[2025-12-04 18:18:20,270] [    INFO] - queue id is 8182
[2025-12-04 18:18:20,270] [    INFO] - Starting to load model Ernie4_5_ForCausalLM
[2025-12-04 18:18:20,272] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,274] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,276] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,278] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,279] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,281] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,283] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,284] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,286] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,288] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,289] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,291] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,293] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,294] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,296] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,298] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,299] [    INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,301] [    INFO] - Attention is running in cache kv bfloat16 mode
Loading safetensors checkpoint shards: 100%|██████████| 1/1 [00:00<00:00,  2.10it/s]
[2025-12-04 18:18:20,783] [    INFO] - Model loading took 0.479 seconds
[2025-12-04 18:18:20,783] [    INFO] - Skip saving ,cache disabled
[2025-12-04 18:18:20,785] [    INFO] - Initializing kv cache for all layers. [0]
[2025-12-04 18:18:20,785] [    INFO] - ..creating kv cache for layer 0: (150, 2, 64, 128)
[2025-12-04 18:18:20,786] [    INFO] - ..creating kv cache for layer 1: (150, 2, 64, 128)
[2025-12-04 18:18:20,788] [    INFO] - ..creating kv cache for layer 2: (150, 2, 64, 128)
[2025-12-04 18:18:20,789] [    INFO] - ..creating kv cache for layer 3: (150, 2, 64, 128)
[2025-12-04 18:18:20,790] [    INFO] - ..creating kv cache for layer 4: (150, 2, 64, 128)
[2025-12-04 18:18:20,792] [    INFO] - ..creating kv cache for layer 5: (150, 2, 64, 128)
[2025-12-04 18:18:20,793] [    INFO] - ..creating kv cache for layer 6: (150, 2, 64, 128)
[2025-12-04 18:18:20,795] [    INFO] - ..creating kv cache for layer 7: (150, 2, 64, 128)
[2025-12-04 18:18:20,796] [    INFO] - ..creating kv cache for layer 8: (150, 2, 64, 128)
[2025-12-04 18:18:20,797] [    INFO] - ..creating kv cache for layer 9: (150, 2, 64, 128)
[2025-12-04 18:18:20,799] [    INFO] - ..creating kv cache for layer 10: (150, 2, 64, 128)
[2025-12-04 18:18:20,800] [    INFO] - ..creating kv cache for layer 11: (150, 2, 64, 128)
[2025-12-04 18:18:20,802] [    INFO] - ..creating kv cache for layer 12: (150, 2, 64, 128)
[2025-12-04 18:18:20,803] [    INFO] - ..creating kv cache for layer 13: (150, 2, 64, 128)
[2025-12-04 18:18:20,804] [    INFO] - ..creating kv cache for layer 14: (150, 2, 64, 128)
[2025-12-04 18:18:20,806] [    INFO] - ..creating kv cache for layer 15: (150, 2, 64, 128)
[2025-12-04 18:18:20,807] [    INFO] - ..creating kv cache for layer 16: (150, 2, 64, 128)
[2025-12-04 18:18:20,809] [    INFO] - ..creating kv cache for layer 17: (150, 2, 64, 128)
CUDA error 209 [/paddle/third_party/cccl/cub/cub/util_device.cuh, 83]: no kernel image is available for execution on the device
CUDA error 101 [/paddle/third_party/cccl/cub/cub/util_device.cuh, 102]: invalid device ordinal
/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/utils/decorator_utils.py:420: Warning: 
Non compatible API. Please refer to https://www.paddlepaddle.org.cn/documentation/docs/en/develop/guides/model_convert/convert_from_pytorch/api_difference/torch/torch.max.html first.
CUDA error 209 [/paddle/third_party/cccl/cub/cub/util_device.cuh, 83]: no kernel image is available for execution on the device
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Traceback (most recent call last):
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/engine/../worker/worker_process.py", line 868, in <module>
    run_worker_proc()
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/engine/../worker/worker_process.py", line 853, in run_worker_proc
    worker_proc.initialize_kv_cache()
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/engine/../worker/worker_process.py", line 374, in initialize_kv_cache
    available_kv_cache_memory = self.worker.determine_available_memory()
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/worker/gpu_worker.py", line 136, in determine_available_memory
    self.model_runner.profile_run()
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/worker/gpu_model_runner.py", line 2249, in profile_run
    self._dummy_run(
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/worker/gpu_model_runner.py", line 1745, in _dummy_run
    model_output = self.model(
                   ^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/nn/layer/layers.py", line 1580, in __call__
    return self.forward(*inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/models/ernie4_5_moe.py", line 657, in forward
    hidden_states = self.ernie(ids_remove_padding=ids_remove_padding, forward_meta=forward_meta)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/graph_optimization/decorator.py", line 68, in __call__
    return self.graph_opt_backend(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/graph_optimization/graph_optimization_backend.py", line 145, in __call__
    return self.dy_runnable(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/models/ernie4_5_moe.py", line 473, in forward
    hidden_states, residual = self.layers[i](forward_meta, hidden_states, residual)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/nn/layer/layers.py", line 1580, in __call__
    return self.forward(*inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/models/ernie4_5_moe.py", line 366, in forward
    hidden_states = self.self_attn(
                    ^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/nn/layer/layers.py", line 1580, in __call__
    return self.forward(*inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/models/ernie4_5_moe.py", line 286, in forward
    qkv_out = self.qkv_proj(hidden_states)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/nn/layer/layers.py", line 1580, in __call__
    return self.forward(*inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/layers/linear.py", line 244, in forward_cuda
    linear_out = self.quant_method.apply(self, x)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/layers/linear.py", line 72, in apply
    linear_out = paddle.matmul(x, layer.weight)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/base/dygraph/generated_tensor_methods_patch.py", line 67, in _matmul
    return _C_ops.matmul(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: (External) CUDA error(719), unspecified launch failure. 
  [Hint: 'cudaErrorLaunchFailure'. An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointerand accessing out of bounds shared memory. Less common cases can be system specific - more information about these cases canbe found in the system specific user guide. This leaves the process in an inconsistent state and any further CUDA work willreturn the same error. To continue using CUDA, the process must be terminated and relaunched.] (at /paddle/paddle/phi/core/platform/device/gpu/gpu_info.cc:127)

/root/miniconda3/envs/fastdeploy/lib/python3.12/multiprocessing/resource_tracker.py:279: UserWarning: resource_tracker: There appear to be 2 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
