Hi @arjunsuresh,
I was able to build the Docker container by running the command below, but the run still reports the errors shown further down.
mlcr run-mlperf,inference,_find-performance,_full,_r5.0-dev \
  --model=retinanet \
  --implementation=nvidia \
  --framework=tensorrt \
  --category=edge \
  --scenario=Offline \
  --execution_mode=test \
  --device=cuda \
  --docker --quiet \
  --test_query_count=500
I'm not sure whether the content below marks the actual start of all the failures. If it does, how can I run this inference benchmark on multiple GPUs, e.g. 2x NVIDIA RTX A6000? Thanks.
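For context, the first warning in the log ("Using an engine plan file across different models of devices") typically indicates that an engine is being deserialized on a GPU other than the one it was built on. Both cards here are RTX A6000s, so as a sanity check the devices visible inside the container can be listed with a small script (a generic sketch assuming the nvidia-ml-py package is available in the container, which I have not verified; it is not part of the MLC workflow):

```python
# Quick sanity check that both GPUs are visible and report the same model.
# Assumes the nvidia-ml-py package (import name: pynvml) is installed.
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        mem_mib = pynvml.nvmlDeviceGetMemoryInfo(handle).total // (1024 ** 2)
        print(f"GPU {i}: {name}, {mem_mib} MiB")
finally:
    pynvml.nvmlShutdown()
```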
[W] [TRT] Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
***************************************************************************
CM script::benchmark-program/run.sh
Run Directory: /home/mlcuser/MLC/repos/local/cache/get-git-repo_918692f6/repo/closed/NVIDIA
CMD: make run_harness RUN_ARGS=' --benchmarks=retinanet --scenarios=offline --test_mode=PerformanceOnly --offline_expected_qps=1 --user_conf_path=/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/f29caee5e0e047f3811f7c0ce5f822ff.conf --mlperf_conf_path=/home/mlcuser/MLC/repos/local/cache/get-git-repo_da4c73f6/inference/mlperf.conf --gpu_batch_size=4 --use_deque_limit --no_audit_verify ' 2>&1 | tee '/home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-results-dir_c369e3b3/test_results/2e6ba58d1633-nvidia_original-gpu-tensorrt-vdefault-default_config/retinanet/offline/performance/run_1/console.out'; echo \${PIPESTATUS[0]} > exitstatus
[2025-03-17 09:54:45,591 module.py:5098 DEBUG] - - Running native script "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/benchmark-program/run-ubuntu.sh" from temporal script "tmp-run.sh" in "/home/mlcuser" ...
[2025-03-17 09:54:45,591 module.py:5105 INFO] - ! cd /home/mlcuser
[2025-03-17 09:54:45,591 module.py:5106 INFO] - ! call /home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/benchmark-program/run-ubuntu.sh from tmp-run.sh
make run_harness RUN_ARGS=' --benchmarks=retinanet --scenarios=offline --test_mode=PerformanceOnly --offline_expected_qps=1 --user_conf_path=/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/f29caee5e0e047f3811f7c0ce5f822ff.conf --mlperf_conf_path=/home/mlcuser/MLC/repos/local/cache/get-git-repo_da4c73f6/inference/mlperf.conf --gpu_batch_size=4 --use_deque_limit --no_audit_verify ' 2>&1 | tee '/home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-results-dir_c369e3b3/test_results/2e6ba58d1633-nvidia_original-gpu-tensorrt-vdefault-default_config/retinanet/offline/performance/run_1/console.out'; echo ${PIPESTATUS[0]} > exitstatus
[2025-03-17 09:54:50,687 main.py:229 INFO] Detected system ID: KnownSystem.Nvidia_2e6ba58d1633
[2025-03-17 09:54:50,840 harness.py:249 INFO] The harness will load 2 plugins: ['build/plugins/NMSOptPlugin/libnmsoptplugin.so', 'build/plugins/retinanetConcatPlugin/libretinanetconcatplugin.so']
[2025-03-17 09:54:50,840 generate_conf_files.py:107 INFO] Generated measurements/ entries for Nvidia_2e6ba58d1633_TRT/retinanet/Offline
[2025-03-17 09:54:50,840 __init__.py:46 INFO] Running command: ./build/bin/harness_default --plugins="build/plugins/NMSOptPlugin/libnmsoptplugin.so,build/plugins/retinanetConcatPlugin/libretinanetconcatplugin.so" --logfile_outdir="/home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-results-dir_c369e3b3/test_results/2e6ba58d1633-nvidia_original-gpu-tensorrt-vdefault-default_config/retinanet/offline/performance/run_1" --logfile_prefix="mlperf_log_" --performance_sample_count=64 --test_mode="PerformanceOnly" --use_deque_limit=true --gpu_batch_size=4 --map_path="data_maps/open-images-v6-mlperf/val_map.txt" --mlperf_conf_path="/home/mlcuser/MLC/repos/local/cache/get-git-repo_da4c73f6/inference/mlperf.conf" --tensor_path="build/preprocessed_data/open-images-v6-mlperf/validation/Retinanet/int8_linear" --use_graphs=false --user_conf_path="/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/f29caee5e0e047f3811f7c0ce5f822ff.conf" --gpu_engines="./build/engines/Nvidia_2e6ba58d1633/retinanet/Offline/retinanet-Offline-gpu-b4-int8.lwis_k_99_MaxP.plan" --max_dlas=0 --scenario Offline --model retinanet --response_postprocess openimageeffnms
[2025-03-17 09:54:50,840 __init__.py:53 INFO] Overriding Environment
benchmark : Benchmark.Retinanet
buffer_manager_thread_count : 0
data_dir : /home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-nvidia-scratch-space_17c7d3bd/data
gpu_batch_size : 4
input_dtype : int8
input_format : linear
log_dir : /home/mlcuser/MLC/repos/local/cache/get-git-repo_918692f6/repo/closed/NVIDIA/build/logs/2025.03.17-09.54.48
map_path : data_maps/open-images-v6-mlperf/val_map.txt
mlperf_conf_path : /home/mlcuser/MLC/repos/local/cache/get-git-repo_da4c73f6/inference/mlperf.conf
offline_expected_qps : 1.0
precision : int8
preprocessed_data_dir : /home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-nvidia-scratch-space_17c7d3bd/preprocessed_data
scenario : Scenario.Offline
system : SystemConfiguration(host_cpu_conf=CPUConfiguration(layout={CPU(name='Intel(R) Xeon(R) Platinum 8480+', architecture=<CPUArchitecture.x86_64: AliasedName(name='x86_64', aliases=(), patterns=())>, core_count=56, threads_per_core=2): 2}), host_mem_conf=MemoryConfiguration(host_memory_capacity=Memory(quantity=1.056300396, byte_suffix=<ByteSuffix.TB: (1000, 4)>, _num_bytes=1056300396000), comparison_tolerance=0.05), accelerator_conf=AcceleratorConfiguration(layout=defaultdict(<class 'int'>, {GPU(name='NVIDIA RTX A6000', accelerator_type=<AcceleratorType.Discrete: AliasedName(name='Discrete', aliases=(), patterns=())>, vram=Memory(quantity=47.98828125, byte_suffix=<ByteSuffix.GiB: (1024, 3)>, _num_bytes=51527024640), max_power_limit=300.0, pci_id='0x223010DE', compute_sm=86): 2})), numa_conf=None, system_id='Nvidia_2e6ba58d1633')
tensor_path : build/preprocessed_data/open-images-v6-mlperf/validation/Retinanet/int8_linear
test_mode : PerformanceOnly
use_deque_limit : True
use_graphs : False
user_conf_path : /home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/f29caee5e0e047f3811f7c0ce5f822ff.conf
system_id : Nvidia_2e6ba58d1633
config_name : Nvidia_2e6ba58d1633_retinanet_Offline
workload_setting : WorkloadSetting(HarnessType.LWIS, AccuracyTarget.k_99, PowerSetting.MaxP)
optimization_level : plugin-enabled
num_profiles : 1
config_ver : lwis_k_99_MaxP
accuracy_level : 99%
inference_server : lwis
skip_file_checks : False
power_limit : None
cpu_freq : None
&&&& RUNNING Default_Harness # ./build/bin/harness_default
[I] mlperf.conf path: /home/mlcuser/MLC/repos/local/cache/get-git-repo_da4c73f6/inference/mlperf.conf
[I] user.conf path: /home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/f29caee5e0e047f3811f7c0ce5f822ff.conf
Creating QSL.
Finished Creating QSL.
Setting up SUT.
[I] [TRT] Loaded engine size: 74 MiB
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +6, GPU +10, now: CPU 134, GPU 1085 (MiB)
[I] [TRT] [MemUsageChange] Init cuDNN: CPU +2, GPU +10, now: CPU 136, GPU 1095 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +68, now: CPU 0, GPU 68 (MiB)
[I] Device:0.GPU: [0] ./build/engines/Nvidia_2e6ba58d1633/retinanet/Offline/retinanet-Offline-gpu-b4-int8.lwis_k_99_MaxP.plan has been successfully loaded.
[I] [TRT] Loaded engine size: 74 MiB
[W] [TRT] Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +6, GPU +10, now: CPU 175, GPU 374 (MiB)
[I] [TRT] [MemUsageChange] Init cuDNN: CPU +1, GPU +10, now: CPU 176, GPU 384 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +69, now: CPU 0, GPU 137 (MiB)
[I] Device:1.GPU: [0] ./build/engines/Nvidia_2e6ba58d1633/retinanet/Offline/retinanet-Offline-gpu-b4-int8.lwis_k_99_MaxP.plan has been successfully loaded.
[E] [TRT] 3: [runtime.cpp::~Runtime::401] Error Code 3: API Usage Error (Parameter check failed at: runtime/rt/runtime.cpp::~Runtime::401, condition: mEngineCounter.use_count() == 1 Destroying a runtime before destroying deserialized engines created by the runtime leads to undefined behavior.)
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 102, GPU 1097 (MiB)
[I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 102, GPU 1105 (MiB)
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +1, GPU +3056, now: CPU 1, GPU 3193 (MiB)
[I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 103, GPU 4169 (MiB)
[I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 103, GPU 4179 (MiB)
[I] [TRT] Could not set default profile 0 for execution context. Profile index must be set explicitly.
[I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +3055, now: CPU 1, GPU 6248 (MiB)
[E] [TRT] 3: [executionContext.cpp::setOptimizationProfileInternal::1328] Error Code 3: Internal Error (Profile 0 has been chosen by another IExecutionContext. Use another profileIndex or destroy the IExecutionContext that use this profile.)
F0317 09:54:52.402531 8377 lwis.cpp:245] Check failed: context->setOptimizationProfile(profileIdx) == true (0 vs. 1)
*** Check failure stack trace: ***
@ 0x79b8bdfa81c3 google::LogMessage::Fail()
@ 0x79b8bdfad25b google::LogMessage::SendToLog()
@ 0x79b8bdfa7ebf google::LogMessage::Flush()
@ 0x79b8bdfa86ef google::LogMessageFatal::~LogMessageFatal()
@ 0x5619743e2b1c lwis::Device::Setup()
@ 0x5619743e4ceb lwis::Server::Setup()
@ 0x5619743409d0 doInference()
@ 0x56197433e190 main
@ 0x79b8abb74083 __libc_start_main
@ 0x56197433e71e _start
Aborted (core dumped)
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/mlcuser/MLC/repos/local/cache/get-git-repo_918692f6/repo/closed/NVIDIA/code/main.py", line 231, in <module>
main(main_args, DETECTED_SYSTEM)
File "/home/mlcuser/MLC/repos/local/cache/get-git-repo_918692f6/repo/closed/NVIDIA/code/main.py", line 144, in main
dispatch_action(main_args, config_dict, workload_setting)
File "/home/mlcuser/MLC/repos/local/cache/get-git-repo_918692f6/repo/closed/NVIDIA/code/main.py", line 202, in dispatch_action
handler.run()
File "/home/mlcuser/MLC/repos/local/cache/get-git-repo_918692f6/repo/closed/NVIDIA/code/actionhandler/base.py", line 82, in run
self.handle_failure()
File "/home/mlcuser/MLC/repos/local/cache/get-git-repo_918692f6/repo/closed/NVIDIA/code/actionhandler/run_harness.py", line 193, in handle_failure
raise RuntimeError("Run harness failed!")
RuntimeError: Run harness failed!
Traceback (most recent call last):
File "/home/mlcuser/MLC/repos/local/cache/get-git-repo_918692f6/repo/closed/NVIDIA/code/actionhandler/run_harness.py", line 161, in handle
result_data = self.harness.run_harness(flag_dict=self.harness_flag_dict, skip_generate_measurements=True)
File "/home/mlcuser/MLC/repos/local/cache/get-git-repo_918692f6/repo/closed/NVIDIA/code/common/harness.py", line 352, in run_harness
output = run_command(self._construct_terminal_command(argstr), get_output=True, custom_env=self.env_vars)
File "/home/mlcuser/MLC/repos/local/cache/get-git-repo_918692f6/repo/closed/NVIDIA/code/common/__init__.py", line 67, in run_command
raise subprocess.CalledProcessError(ret, cmd)
subprocess.CalledProcessError: Command './build/bin/harness_default --plugins="build/plugins/NMSOptPlugin/libnmsoptplugin.so,build/plugins/retinanetConcatPlugin/libretinanetconcatplugin.so" --logfile_outdir="/home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-results-dir_c369e3b3/test_results/2e6ba58d1633-nvidia_original-gpu-tensorrt-vdefault-default_config/retinanet/offline/performance/run_1" --logfile_prefix="mlperf_log_" --performance_sample_count=64 --test_mode="PerformanceOnly" --use_deque_limit=true --gpu_batch_size=4 --map_path="data_maps/open-images-v6-mlperf/val_map.txt" --mlperf_conf_path="/home/mlcuser/MLC/repos/local/cache/get-git-repo_da4c73f6/inference/mlperf.conf" --tensor_path="build/preprocessed_data/open-images-v6-mlperf/validation/Retinanet/int8_linear" --use_graphs=false --user_conf_path="/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/f29caee5e0e047f3811f7c0ce5f822ff.conf" --gpu_engines="./build/engines/Nvidia_2e6ba58d1633/retinanet/Offline/retinanet-Offline-gpu-b4-int8.lwis_k_99_MaxP.plan" --max_dlas=0 --scenario Offline --model retinanet --response_postprocess openimageeffnms' returned non-zero exit status 134.
make: *** [Makefile:45: run_harness] Error 1
Traceback (most recent call last):
File "/home/mlcuser/.local/bin/mlcr", line 8, in <module>
sys.exit(mlcr())
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/main.py", line 86, in mlcr
main()
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/main.py", line 173, in main
res = method(run_args)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 141, in run
return self.call_script_module_function("run", run_args)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 121, in call_script_module_function
result = automation_instance.run(run_args) # Pass args to the run method
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 225, in run
r = self._run(i)
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 1850, in _run
r = self._call_run_deps(prehook_deps, self.local_env_keys, local_env_keys_from_meta, env, state, const, const_state, add_deps_recursive,
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3300, in _call_run_deps
r = script._run_deps(deps, local_env_keys, env, state, const, const_state, add_deps_recursive, recursion_spaces,
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3470, in _run_deps
r = self.action_object.access(ii)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/action.py", line 56, in access
result = method(options)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 141, in run
return self.call_script_module_function("run", run_args)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 121, in call_script_module_function
result = automation_instance.run(run_args) # Pass args to the run method
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 225, in run
r = self._run(i)
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 1893, in _run
r = self._run_deps(post_deps, clean_env_keys_post_deps, env, state, const, const_state, add_deps_recursive, recursion_spaces,
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3470, in _run_deps
r = self.action_object.access(ii)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/action.py", line 56, in access
result = method(options)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 141, in run
return self.call_script_module_function("run", run_args)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 121, in call_script_module_function
result = automation_instance.run(run_args) # Pass args to the run method
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 225, in run
r = self._run(i)
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 1893, in _run
r = self._run_deps(post_deps, clean_env_keys_post_deps, env, state, const, const_state, add_deps_recursive, recursion_spaces,
File "/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/automation/script/module.py", line 3470, in _run_deps
r = self.action_object.access(ii)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/action.py", line 56, in access
result = method(options)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 141, in run
return self.call_script_module_function("run", run_args)
File "/home/mlcuser/.local/lib/python3.8/site-packages/mlc/script_action.py", line 131, in call_script_module_function
raise ScriptExecutionError(f"Script {function_name} execution failed. Error : {error}")
mlc.script_action.ScriptExecutionError: Script run execution failed. Error : MLC script failed (name = benchmark-program, return code = 512)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Please file an issue at https://github.com/mlcommons/mlperf-automations/issues along with the full MLC command being run and the relevant
or full console log.
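If it helps with triage: the config dump above shows num_profiles : 1 while the detected system has two RTX A6000s, and the fatal check in lwis.cpp:245 fires right after "Profile 0 has been chosen by another IExecutionContext". My (unverified) reading is that more than one execution context ends up trying to bind the engine's single optimization profile. Below is a minimal TensorRT Python sketch of the underlying constraint, i.e. every live IExecutionContext must use a distinct profile index; the engine path and loop are illustrative only, not the harness's actual code:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize a prebuilt engine (placeholder path, for illustration only).
runtime = trt.Runtime(TRT_LOGGER)
with open("retinanet.plan", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

# One context per optimization profile: TensorRT requires each live
# IExecutionContext to bind a different profile index.
contexts = []
for profile_idx in range(engine.num_optimization_profiles):
    ctx = engine.create_execution_context()
    ctx.set_optimization_profile_async(profile_idx, 0)  # 0 = default CUDA stream
    contexts.append(ctx)

# Creating an extra context on an engine that exposes only one profile and
# binding it to profile 0 again is what the error in the log above describes.
```

If that reading is right, the engine would need at least as many optimization profiles as execution contexts the harness creates, but I would appreciate confirmation on whether that is controlled by the num_profiles setting shown above or by something else in the Nvidia implementation.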