Note that this README is automatically generated - don't edit!
See extra notes from the authors and contributors.
- Category: Modular MLPerf inference benchmark pipeline.
- CM GitHub repository: mlcommons@ck
- GitHub directory for this script: GitHub
- CM meta description for this script: _cm.yaml
- CM "database" tags to find this script: run-mlperf-inference
- Output cached? False
Pull the CM repository:

```bash
cm pull repo mlcommons@ck
```

Run this script via the CM command line:

```bash
cm run script --tags=run-mlperf-inference[,variations] [--input_flags]
```

or using the shorter `cmr` alias:

```bash
cmr "run-mlperf-inference[ variations]" [--input_flags]
```
Run this script from Python:

```python
import cmind

r = cmind.access({'action':'run',
                  'automation':'script',
                  'tags':'run-mlperf-inference',
                  'out':'con',
                  ...
                  (other input keys for this script)
                  ...
                 })

if r['return']>0:
    print(r['error'])
```
cmr "cm gui" --script="run-mlperf-inference"
Use this online GUI to generate CM CMD.
Run this script via Docker:

```bash
cm docker script "run-mlperf-inference[ variations]" [--input_flags]
```
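An illustrative containerized invocation (the model and flags are example values; the Docker mode accepts the same variations and input flags as the native run):

```bash
# Illustrative: same interface as `cm run script`, executed inside Docker
cm docker script "run-mlperf-inference _r4.0,_short" --model=bert-99 --device=cpu --quiet
```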
Variations

No group (any variation can be selected):

* _all-scenarios
  - Environment variables:
    - CM_MLPERF_LOADGEN_ALL_SCENARIOS: `yes`
  - Workflow:
* _compliance
  - Environment variables:
    - CM_MLPERF_LOADGEN_COMPLIANCE: `yes`
  - Workflow:
* _dashboard
  - Environment variables:
    - CM_MLPERF_DASHBOARD: `on`
  - Workflow:
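Ungrouped variations can be combined with grouped ones in the same tag list; a hedged sketch:

```bash
# Illustrative: enable all LoadGen scenarios on top of the r4.0 defaults
cmr "run-mlperf-inference _all-scenarios,_r4.0" --model=resnet50 --quiet
```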
Group "benchmark-version":

* _r2.1
  - Environment variables:
    - CM_MLPERF_INFERENCE_VERSION: `2.1`
    - CM_RUN_MLPERF_INFERENCE_APP_DEFAULTS: `r2.1_default`
  - Workflow:
* _r3.0
  - Environment variables:
    - CM_MLPERF_INFERENCE_VERSION: `3.0`
    - CM_RUN_MLPERF_INFERENCE_APP_DEFAULTS: `r3.0_default`
  - Workflow:
* _r3.1
  - Environment variables:
    - CM_MLPERF_INFERENCE_VERSION: `3.1`
    - CM_RUN_MLPERF_INFERENCE_APP_DEFAULTS: `r3.1_default`
  - Workflow:
* _r4.0 (default)
  - Environment variables:
    - CM_MLPERF_INFERENCE_VERSION: `4.0`
    - CM_RUN_MLPERF_INFERENCE_APP_DEFAULTS: `r4.0_default`
  - Workflow:
Group "mode":

* _all-modes
  - Environment variables:
    - CM_MLPERF_LOADGEN_ALL_MODES: `yes`
  - Workflow:
Group "submission-generation":

* _accuracy-only
  - Environment variables:
    - CM_MLPERF_LOADGEN_MODE: `accuracy`
    - CM_MLPERF_SUBMISSION_RUN: `yes`
    - CM_RUN_MLPERF_ACCURACY: `on`
    - CM_RUN_SUBMISSION_CHECKER: `no`
  - Workflow:
* _find-performance (default)
  - Environment variables:
    - CM_MLPERF_FIND_PERFORMANCE_MODE: `yes`
    - CM_MLPERF_LOADGEN_ALL_MODES: `no`
    - CM_MLPERF_LOADGEN_MODE: `performance`
    - CM_MLPERF_RESULT_PUSH_TO_GITHUB: `False`
  - Workflow:
* _performance-only
  - Environment variables:
    - CM_MLPERF_LOADGEN_MODE: `performance`
    - CM_MLPERF_SUBMISSION_RUN: `yes`
    - CM_RUN_SUBMISSION_CHECKER: `no`
  - Workflow:
* _populate-readme
  - Environment variables:
    - CM_MLPERF_README: `yes`
    - CM_MLPERF_SUBMISSION_RUN: `yes`
    - CM_RUN_SUBMISSION_CHECKER: `no`
  - Workflow:
* _submission
  - Environment variables:
    - CM_MLPERF_LOADGEN_COMPLIANCE: `yes`
    - CM_MLPERF_SUBMISSION_RUN: `yes`
    - CM_RUN_MLPERF_ACCURACY: `on`
    - CM_RUN_SUBMISSION_CHECKER: `yes`
    - CM_TAR_SUBMISSION_DIR: `yes`
  - Workflow:
    1. Read "post_deps" on other CM scripts:
       * generate,mlperf,inference,submission
         - if (CM_MLPERF_SKIP_SUBMISSION_GENERATION not in ['yes', 'True'])
         - CM names: --adr.['submission-generator']...
         - CM script: generate-mlperf-inference-submission
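For instance, a submission-style run could look like the following sketch (flag values are examples, not a prescribed configuration):

```bash
# Illustrative: submission run with compliance tests and the submission checker
cmr "run-mlperf-inference _submission,_full" \
    --model=resnet50 --device=cpu --division=open --category=edge \
    --submitter=CTuning --quiet
```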
Group "submission-generation-style":

* _full
  - Environment variables:
    - CM_MLPERF_SUBMISSION_GENERATION_STYLE: `full`
  - Workflow:
* _short (default)
  - Environment variables:
    - CM_MLPERF_SUBMISSION_GENERATION_STYLE: `short`
  - Workflow:
Default variations (used when none is selected from a group): `_find-performance,_r4.0,_short`
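Under these defaults, the two invocations in the sketch below should be equivalent:

```bash
# Illustrative: explicit default variations vs. relying on the defaults
cmr "run-mlperf-inference _find-performance,_r4.0,_short" --model=resnet50 --quiet
cmr "run-mlperf-inference" --model=resnet50 --quiet
```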
Script inputs (default values shown in parentheses):

- --division MLPerf division {open,closed} (open)
- --category MLPerf category {edge,datacenter,network} (edge)
- --device MLPerf device {cpu,cuda,rocm,qaic} (cpu)
- --model MLPerf model {resnet50,retinanet,bert-99,bert-99.9,3d-unet-99,3d-unet-99.9,rnnt,dlrm-v2-99,dlrm-v2-99.9,gptj-99,gptj-99.9,sdxl,llama2-70b-99,llama2-70b-99.9,mobilenet,efficientnet} (resnet50)
- --precision MLPerf model precision {float32,float16,bfloat16,int8,uint8}
- --implementation MLPerf implementation {reference,mil,nvidia-original,intel-original,qualcomm,tflite-cpp} (reference)
- --backend MLPerf framework (backend) {onnxruntime,tf,pytorch,deepsparse,tensorrt,glow,tvm-onnx} (onnxruntime)
- --scenario MLPerf scenario {Offline,Server,SingleStream,MultiStream} (Offline)
- --mode MLPerf benchmark mode {,accuracy,performance}
- --execution_mode MLPerf execution mode {test,fast,valid} (test)
- --submitter Submitter name (without space) (CTuning)
- --results_dir Folder path to store results (defaults to the current working directory)
- --submission_dir Folder path to store MLPerf submission tree
- --adr.compiler.tags Compiler for loadgen and any C/C++ part of implementation
- --adr.inference-src-loadgen.env.CM_GIT_URL Git URL for MLPerf inference sources to build LoadGen (to enable non-reference implementations)
- --adr.inference-src.env.CM_GIT_URL Git URL for MLPerf inference sources to run benchmarks (to enable non-reference implementations)
- --adr.mlperf-inference-implementation.max_batchsize Maximum batchsize to be used
- --adr.mlperf-inference-implementation.num_threads Number of threads (reference & C++ implementation only)
- --adr.python.name Python virtual environment name (optional)
- --adr.python.version Force Python version (must have all system deps)
- --adr.python.version_min Minimal Python version (3.8)
- --power Measure power {yes,no} (no)
- --adr.mlperf-power-client.power_server MLPerf Power server IP address (192.168.0.15)
- --adr.mlperf-power-client.port MLPerf Power server port (4950)
- --clean Clean run (False)
- --compliance Whether to run compliance tests (applicable only for closed division) {yes,no} (no)
- --dashboard_wb_project W&B dashboard project (cm-mlperf-dse-testing)
- --dashboard_wb_user W&B dashboard user (cmind)
- --hw_name MLPerf hardware name (for example "gcp.c3_standard_8", "nvidia_orin", "lenovo_p14s_gen_4_windows_11", "macbook_pro_m1_2", "thundercomm_rb6" ...)
- --multistream_target_latency Set MultiStream target latency
- --offline_target_qps Set LoadGen Offline target QPS
- --quiet Quiet run (select default values for all questions) (True)
- --server_target_qps Set Server target QPS
- --singlestream_target_latency Set SingleStream target latency
- --target_latency Set Target latency
- --target_qps Set LoadGen target QPS
- --j Print results dictionary to console at the end of the run (True)
- --jf Record results dictionary to file at the end of the run (mlperf-inference-results)
- --time Print script execution time at the end of the run (True)
The above CLI flags can also be passed to the Python CM API as dictionary keys: `r=cm.access({... , "division":...})`
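A minimal sketch of that mapping (the flag values below are illustrative):

```python
import cmind as cm

# Illustrative sketch: CLI flags such as --division or --model become
# dictionary keys in the CM Python API (values below are examples).
r = cm.access({'action': 'run',
               'automation': 'script',
               'tags': 'run-mlperf-inference,_performance-only',
               'out': 'con',
               'division': 'open',
               'model': 'resnet50',
               'device': 'cpu',
               'quiet': True})
if r['return'] > 0:
    print(r['error'])
```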
Script flags mapped to environment variables:
- `--backend=value` → `CM_MLPERF_BACKEND=value`
- `--batch_size=value` → `CM_MLPERF_LOADGEN_MAX_BATCHSIZE=value`
- `--category=value` → `CM_MLPERF_SUBMISSION_SYSTEM_TYPE=value`
- `--clean=value` → `CM_MLPERF_CLEAN_ALL=value`
- `--compliance=value` → `CM_MLPERF_LOADGEN_COMPLIANCE=value`
- `--dashboard_wb_project=value` → `CM_MLPERF_DASHBOARD_WANDB_PROJECT=value`
- `--dashboard_wb_user=value` → `CM_MLPERF_DASHBOARD_WANDB_USER=value`
- `--debug=value` → `CM_DEBUG_SCRIPT_BENCHMARK_PROGRAM=value`
- `--device=value` → `CM_MLPERF_DEVICE=value`
- `--division=value` → `CM_MLPERF_SUBMISSION_DIVISION=value`
- `--docker=value` → `CM_MLPERF_USE_DOCKER=value`
- `--dump_version_info=value` → `CM_DUMP_VERSION_INFO=value`
- `--execution_mode=value` → `CM_MLPERF_RUN_STYLE=value`
- `--find_performance=value` → `CM_MLPERF_FIND_PERFORMANCE_MODE=value`
- `--gpu_name=value` → `CM_NVIDIA_GPU_NAME=value`
- `--hw_name=value` → `CM_HW_NAME=value`
- `--hw_notes_extra=value` → `CM_MLPERF_SUT_SW_NOTES_EXTRA=value`
- `--imagenet_path=value` → `IMAGENET_PATH=value`
- `--implementation=value` → `CM_MLPERF_IMPLEMENTATION=value`
- `--lang=value` → `CM_MLPERF_IMPLEMENTATION=value`
- `--mode=value` → `CM_MLPERF_LOADGEN_MODE=value`
- `--model=value` → `CM_MLPERF_MODEL=value`
- `--multistream_target_latency=value` → `CM_MLPERF_LOADGEN_MULTISTREAM_TARGET_LATENCY=value`
- `--network=value` → `CM_NETWORK_LOADGEN=value`
- `--offline_target_qps=value` → `CM_MLPERF_LOADGEN_OFFLINE_TARGET_QPS=value`
- `--output_dir=value` → `OUTPUT_BASE_DIR=value`
- `--output_summary=value` → `MLPERF_INFERENCE_SUBMISSION_SUMMARY=value`
- `--output_tar=value` → `MLPERF_INFERENCE_SUBMISSION_TAR_FILE=value`
- `--performance_sample_count=value` → `CM_MLPERF_LOADGEN_PERFORMANCE_SAMPLE_COUNT=value`
- `--power=value` → `CM_SYSTEM_POWER=value`
- `--precision=value` → `CM_MLPERF_MODEL_PRECISION=value`
- `--preprocess_submission=value` → `CM_RUN_MLPERF_SUBMISSION_PREPROCESSOR=value`
- `--push_to_github=value` → `CM_MLPERF_RESULT_PUSH_TO_GITHUB=value`
- `--readme=value` → `CM_MLPERF_README=value`
- `--regenerate_accuracy_file=value` → `CM_MLPERF_REGENERATE_ACCURACY_FILE=value`
- `--regenerate_files=value` → `CM_REGENERATE_MEASURE_FILES=value`
- `--rerun=value` → `CM_RERUN=value`
- `--results_dir=value` → `OUTPUT_BASE_DIR=value`
- `--results_git_url=value` → `CM_MLPERF_RESULTS_GIT_REPO_URL=value`
- `--run_checker=value` → `CM_RUN_SUBMISSION_CHECKER=value`
- `--run_style=value` → `CM_MLPERF_RUN_STYLE=value`
- `--save_console_log=value` → `CM_SAVE_CONSOLE_LOG=value`
- `--scenario=value` → `CM_MLPERF_LOADGEN_SCENARIO=value`
- `--server_target_qps=value` → `CM_MLPERF_LOADGEN_SERVER_TARGET_QPS=value`
- `--singlestream_target_latency=value` → `CM_MLPERF_LOADGEN_SINGLESTREAM_TARGET_LATENCY=value`
- `--skip_submission_generation=value` → `CM_MLPERF_SKIP_SUBMISSION_GENERATION=value`
- `--skip_truncation=value` → `CM_SKIP_TRUNCATE_ACCURACY=value`
- `--submission_dir=value` → `CM_MLPERF_INFERENCE_SUBMISSION_DIR=value`
- `--submitter=value` → `CM_MLPERF_SUBMITTER=value`
- `--sut_servers=value` → `CM_NETWORK_LOADGEN_SUT_SERVERS=value`
- `--sw_notes_extra=value` → `CM_MLPERF_SUT_SW_NOTES_EXTRA=value`
- `--system_type=value` → `CM_MLPERF_SUBMISSION_SYSTEM_TYPE=value`
- `--target_latency=value` → `CM_MLPERF_LOADGEN_TARGET_LATENCY=value`
- `--target_qps=value` → `CM_MLPERF_LOADGEN_TARGET_QPS=value`
- `--test_query_count=value` → `CM_TEST_QUERY_COUNT=value`
- `--threads=value` → `CM_NUM_THREADS=value`
The above CLI flags can also be passed to the Python CM API as dictionary keys: `r=cm.access({... , "backend":...})`
Default environment (these keys can be updated via `--env.KEY=VALUE`, the `env` dictionary in `@input.json`, or the script flags above; see the sketch after this list):
- CM_MLPERF_IMPLEMENTATION: `reference`
- CM_MLPERF_MODEL: `resnet50`
- CM_MLPERF_RUN_STYLE: `test`
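A minimal sketch of overriding one of these defaults (the value is illustrative):

```bash
# Illustrative: set the model via the environment key directly
cm run script --tags=run-mlperf-inference --env.CM_MLPERF_MODEL=retinanet --quiet
```

Equivalently, `--model=retinanet` sets the same key through the flag-to-environment mapping shown above.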
Versions:
- master
- r2.1
Dependencies on other CM scripts (workflow):

1. Read "deps" on other CM scripts from meta:
   * detect,os
     - CM script: detect-os
   * detect,cpu
     - CM script: detect-cpu
   * get,python3
     - CM names: --adr.['python', 'python3']...
     - CM script: get-python3
   * get,mlcommons,inference,src
     - CM names: --adr.['inference-src']...
     - CM script: get-mlperf-inference-src
   * get,sut,description
     - CM script: get-mlperf-inference-sut-description
   * get,mlperf,inference,results,dir
     - if (OUTPUT_BASE_DIR != True)
     - CM names: --adr.['get-mlperf-inference-results-dir']...
     - CM script: get-mlperf-inference-results-dir
   * install,pip-package,for-cmind-python,_package.tabulate
     - CM script: install-pip-package-for-cmind-python
   * get,mlperf,inference,utils
     - CM script: get-mlperf-inference-utils
2. Run "preprocess" function from customize.py
3. Read "prehook_deps" on other CM scripts from meta
4. Run native script if exists
5. Read "posthook_deps" on other CM scripts from meta
6. Run "postprocess" function from customize.py
7. Read "post_deps" on other CM scripts from meta
cmr "run-mlperf-inference[,variations]" [--input_flags] -j