
Note that this README is automatically generated - don't edit!

About

See extra notes from the authors and contributors.

Summary

  • Category: Modular MLPerf inference benchmark pipeline.
  • CM GitHub repository: mlcommons@ck
  • GitHub directory for this script: GitHub
  • CM meta description for this script: _cm.yaml
  • CM "database" tags to find this script: run-mlperf-inference
  • Output cached? False

Reuse this script in your project

Install CM automation language

Pull CM repository with this automation

cm pull repo mlcommons@ck

Run this script from command line

  1. cm run script --tags=run-mlperf-inference[,variations] [--input_flags]

  2. cmr "run-mlperf-inference[ variations]" [--input_flags]

  • variations are listed in the "Variations" section below

  • input_flags are listed in the "Input description" and "Script flags mapped to environment" sections below
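
For example, an illustrative quick test run that selects the _r4.0 variation and passes a few of the input flags documented below (the model, device and backend values here are placeholders for your own setup, not a recommended configuration):

cm run script --tags=run-mlperf-inference,_r4.0 --model=resnet50 --device=cpu --backend=onnxruntime --quiet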

Run this script from Python

import cmind

r = cmind.access({'action':'run',
                  'automation':'script',
                  'tags':'run-mlperf-inference',
                  'out':'con',
                  ...
                  (other input keys for this script)
                  ...
                 })

if r['return']>0:
    print (r['error'])
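
The snippet below extends the template above with a few of the input keys documented in the "Input description" section; the values are illustrative placeholders rather than a recommended configuration:

import cmind

# Run the script with a few documented input keys (placeholder values)
r = cmind.access({'action':'run',
                  'automation':'script',
                  'tags':'run-mlperf-inference',
                  'out':'con',
                  'model':'resnet50',
                  'device':'cpu',
                  'backend':'onnxruntime',
                  'quiet':True})

if r['return']>0:
    print (r['error'])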

Run this script via GUI

cmr "cm gui" --script="run-mlperf-inference"

Use this online GUI to generate CM CMD.

Run this script via Docker (beta)

cm docker script "run-mlperf-inference[ variations]" [--input_flags]
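
For instance, an illustrative Docker-based run selecting the _r4.0 variation with flags from the "Input description" section below (values are placeholders):

cm docker script "run-mlperf-inference _r4.0" --model=resnet50 --backend=onnxruntime --quiet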


Customization

Variations

  • No group (any variation can be selected)

    • _all-scenarios
      • Environment variables:
        • CM_MLPERF_LOADGEN_ALL_SCENARIOS: yes
      • Workflow:
    • _compliance
      • Environment variables:
        • CM_MLPERF_LOADGEN_COMPLIANCE: yes
      • Workflow:
    • _dashboard
      • Environment variables:
        • CM_MLPERF_DASHBOARD: on
      • Workflow:
  • Group "benchmark-version"

    • _r2.1
      • Environment variables:
        • CM_MLPERF_INFERENCE_VERSION: 2.1
        • CM_RUN_MLPERF_INFERENCE_APP_DEFAULTS: r2.1_default
      • Workflow:
    • _r3.0
      • Environment variables:
        • CM_MLPERF_INFERENCE_VERSION: 3.0
        • CM_RUN_MLPERF_INFERENCE_APP_DEFAULTS: r3.0_default
      • Workflow:
    • _r3.1
      • Environment variables:
        • CM_MLPERF_INFERENCE_VERSION: 3.1
        • CM_RUN_MLPERF_INFERENCE_APP_DEFAULTS: r3.1_default
      • Workflow:
    • _r4.0 (default)
      • Environment variables:
        • CM_MLPERF_INFERENCE_VERSION: 4.0
        • CM_RUN_MLPERF_INFERENCE_APP_DEFAULTS: r4.0_default
      • Workflow:
  • Group "mode"

    • _all-modes
      • Environment variables:
        • CM_MLPERF_LOADGEN_ALL_MODES: yes
      • Workflow:
  • Group "submission-generation"

    • _accuracy-only
      • Environment variables:
        • CM_MLPERF_LOADGEN_MODE: accuracy
        • CM_MLPERF_SUBMISSION_RUN: yes
        • CM_RUN_MLPERF_ACCURACY: on
        • CM_RUN_SUBMISSION_CHECKER: no
      • Workflow:
    • _find-performance (default)
      • Environment variables:
        • CM_MLPERF_FIND_PERFORMANCE_MODE: yes
        • CM_MLPERF_LOADGEN_ALL_MODES: no
        • CM_MLPERF_LOADGEN_MODE: performance
        • CM_MLPERF_RESULT_PUSH_TO_GITHUB: False
      • Workflow:
    • _performance-only
      • Environment variables:
        • CM_MLPERF_LOADGEN_MODE: performance
        • CM_MLPERF_SUBMISSION_RUN: yes
        • CM_RUN_SUBMISSION_CHECKER: no
      • Workflow:
    • _populate-readme
      • Environment variables:
        • CM_MLPERF_README: yes
        • CM_MLPERF_SUBMISSION_RUN: yes
        • CM_RUN_SUBMISSION_CHECKER: no
      • Workflow:
    • _submission
      • Environment variables:
        • CM_MLPERF_LOADGEN_COMPLIANCE: yes
        • CM_MLPERF_SUBMISSION_RUN: yes
        • CM_RUN_MLPERF_ACCURACY: on
        • CM_RUN_SUBMISSION_CHECKER: yes
        • CM_TAR_SUBMISSION_DIR: yes
      • Workflow:
        1. Read "post_deps" on other CM scripts
          • generate,mlperf,inference,submission
            • if (CM_MLPERF_SKIP_SUBMISSION_GENERATION not in ['yes', 'True'])
            • CM names: --adr.['submission-generator']...
  • Group "submission-generation-style"

    • _full
      • Environment variables:
        • CM_MLPERF_SUBMISSION_GENERATION_STYLE: full
      • Workflow:
    • _short (default)
      • Environment variables:
        • CM_MLPERF_SUBMISSION_GENERATION_STYLE: short
      • Workflow:

Default variations

_find-performance,_r4.0,_short
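
If no variations are selected, the defaults above are applied. An explicit, illustrative selection from the groups above (for example, an r3.1 submission run with the full generation style across all scenarios) would be:

cmr "run-mlperf-inference _r3.1,_submission,_full,_all-scenarios" [--input_flags]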

Input description

  • --division MLPerf division {open,closed} (open)
  • --category MLPerf category {edge,datacenter,network} (edge)
  • --device MLPerf device {cpu,cuda,rocm,qaic} (cpu)
  • --model MLPerf model {resnet50,retinanet,bert-99,bert-99.9,3d-unet-99,3d-unet-99.9,rnnt,dlrm-v2-99,dlrm-v2-99.9,gptj-99,gptj-99.9,sdxl,llama2-70b-99,llama2-70b-99.9,mobilenet,efficientnet} (resnet50)
  • --precision MLPerf model precision {float32,float16,bfloat16,int8,uint8}
  • --implementation MLPerf implementation {reference,mil,nvidia-original,intel-original,qualcomm,tflite-cpp} (reference)
  • --backend MLPerf framework (backend) {onnxruntime,tf,pytorch,deepsparse,tensorrt,glow,tvm-onnx} (onnxruntime)
  • --scenario MLPerf scenario {Offline,Server,SingleStream,MultiStream} (Offline)
  • --mode MLPerf benchmark mode {,accuracy,performance}
  • --execution_mode MLPerf execution mode {test,fast,valid} (test)
  • --submitter Submitter name (without space) (CTuning)
  • --results_dir Folder path to store results (defaults to the current working directory)
  • --submission_dir Folder path to store MLPerf submission tree
  • --adr.compiler.tags Compiler for loadgen and any C/C++ part of implementation
  • --adr.inference-src-loadgen.env.CM_GIT_URL Git URL for MLPerf inference sources to build LoadGen (to enable non-reference implementations)
  • --adr.inference-src.env.CM_GIT_URL Git URL for MLPerf inference sources to run benchmarks (to enable non-reference implementations)
  • --adr.mlperf-inference-implementation.max_batchsize Maximum batchsize to be used
  • --adr.mlperf-inference-implementation.num_threads Number of threads (reference & C++ implementation only)
  • --adr.python.name Python virtual environment name (optional)
  • --adr.python.version Force Python version (must have all system deps)
  • --adr.python.version_min Minimal Python version (3.8)
  • --power Measure power {yes,no} (no)
  • --adr.mlperf-power-client.power_server MLPerf Power server IP address (192.168.0.15)
  • --adr.mlperf-power-client.port MLPerf Power server port (4950)
  • --clean Clean run (False)
  • --compliance Whether to run compliance tests (applicable only for closed division) {yes,no} (no)
  • --dashboard_wb_project W&B dashboard project (cm-mlperf-dse-testing)
  • --dashboard_wb_user W&B dashboard user (cmind)
  • --hw_name MLPerf hardware name (for example "gcp.c3_standard_8", "nvidia_orin", "lenovo_p14s_gen_4_windows_11", "macbook_pro_m1_2", "thundercomm_rb6" ...)
  • --multistream_target_latency Set MultiStream target latency
  • --offline_target_qps Set LoadGen Offline target QPS
  • --quiet Quiet run (select default values for all questions) (True)
  • --server_target_qps Set Server target QPS
  • --singlestream_target_latency Set SingleStream target latency
  • --target_latency Set Target latency
  • --target_qps Set LoadGen target QPS
  • --j Print results dictionary to console at the end of the run (True)
  • --jf Record results dictionary to file at the end of the run (mlperf-inference-results)
  • --time Print script execution time at the end of the run (True)
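
Putting several of these flags together, an illustrative submission-style run (the values are placeholders rather than a recommended configuration) could look like:

cmr "run-mlperf-inference _submission" --division=open --category=edge --device=cpu --model=resnet50 --implementation=reference --backend=onnxruntime --scenario=Offline --execution_mode=valid --submitter=CTuning --quiet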

The above CLI flags can be used in the Python CM API as follows:

r=cm.access({..., "division":...})

Script flags mapped to environment

  • --backend=value → CM_MLPERF_BACKEND=value
  • --batch_size=value → CM_MLPERF_LOADGEN_MAX_BATCHSIZE=value
  • --category=value → CM_MLPERF_SUBMISSION_SYSTEM_TYPE=value
  • --clean=value → CM_MLPERF_CLEAN_ALL=value
  • --compliance=value → CM_MLPERF_LOADGEN_COMPLIANCE=value
  • --dashboard_wb_project=value → CM_MLPERF_DASHBOARD_WANDB_PROJECT=value
  • --dashboard_wb_user=value → CM_MLPERF_DASHBOARD_WANDB_USER=value
  • --debug=value → CM_DEBUG_SCRIPT_BENCHMARK_PROGRAM=value
  • --device=value → CM_MLPERF_DEVICE=value
  • --division=value → CM_MLPERF_SUBMISSION_DIVISION=value
  • --docker=value → CM_MLPERF_USE_DOCKER=value
  • --dump_version_info=value → CM_DUMP_VERSION_INFO=value
  • --execution_mode=value → CM_MLPERF_RUN_STYLE=value
  • --find_performance=value → CM_MLPERF_FIND_PERFORMANCE_MODE=value
  • --gpu_name=value → CM_NVIDIA_GPU_NAME=value
  • --hw_name=value → CM_HW_NAME=value
  • --hw_notes_extra=value → CM_MLPERF_SUT_SW_NOTES_EXTRA=value
  • --imagenet_path=value → IMAGENET_PATH=value
  • --implementation=value → CM_MLPERF_IMPLEMENTATION=value
  • --lang=value → CM_MLPERF_IMPLEMENTATION=value
  • --mode=value → CM_MLPERF_LOADGEN_MODE=value
  • --model=value → CM_MLPERF_MODEL=value
  • --multistream_target_latency=value → CM_MLPERF_LOADGEN_MULTISTREAM_TARGET_LATENCY=value
  • --network=value → CM_NETWORK_LOADGEN=value
  • --offline_target_qps=value → CM_MLPERF_LOADGEN_OFFLINE_TARGET_QPS=value
  • --output_dir=value → OUTPUT_BASE_DIR=value
  • --output_summary=value → MLPERF_INFERENCE_SUBMISSION_SUMMARY=value
  • --output_tar=value → MLPERF_INFERENCE_SUBMISSION_TAR_FILE=value
  • --performance_sample_count=value → CM_MLPERF_LOADGEN_PERFORMANCE_SAMPLE_COUNT=value
  • --power=value → CM_SYSTEM_POWER=value
  • --precision=value → CM_MLPERF_MODEL_PRECISION=value
  • --preprocess_submission=value → CM_RUN_MLPERF_SUBMISSION_PREPROCESSOR=value
  • --push_to_github=value → CM_MLPERF_RESULT_PUSH_TO_GITHUB=value
  • --readme=value → CM_MLPERF_README=value
  • --regenerate_accuracy_file=value → CM_MLPERF_REGENERATE_ACCURACY_FILE=value
  • --regenerate_files=value → CM_REGENERATE_MEASURE_FILES=value
  • --rerun=value → CM_RERUN=value
  • --results_dir=value → OUTPUT_BASE_DIR=value
  • --results_git_url=value → CM_MLPERF_RESULTS_GIT_REPO_URL=value
  • --run_checker=value → CM_RUN_SUBMISSION_CHECKER=value
  • --run_style=value → CM_MLPERF_RUN_STYLE=value
  • --save_console_log=value → CM_SAVE_CONSOLE_LOG=value
  • --scenario=value → CM_MLPERF_LOADGEN_SCENARIO=value
  • --server_target_qps=value → CM_MLPERF_LOADGEN_SERVER_TARGET_QPS=value
  • --singlestream_target_latency=value → CM_MLPERF_LOADGEN_SINGLESTREAM_TARGET_LATENCY=value
  • --skip_submission_generation=value → CM_MLPERF_SKIP_SUBMISSION_GENERATION=value
  • --skip_truncation=value → CM_SKIP_TRUNCATE_ACCURACY=value
  • --submission_dir=value → CM_MLPERF_INFERENCE_SUBMISSION_DIR=value
  • --submitter=value → CM_MLPERF_SUBMITTER=value
  • --sut_servers=value → CM_NETWORK_LOADGEN_SUT_SERVERS=value
  • --sw_notes_extra=value → CM_MLPERF_SUT_SW_NOTES_EXTRA=value
  • --system_type=value → CM_MLPERF_SUBMISSION_SYSTEM_TYPE=value
  • --target_latency=value → CM_MLPERF_LOADGEN_TARGET_LATENCY=value
  • --target_qps=value → CM_MLPERF_LOADGEN_TARGET_QPS=value
  • --test_query_count=value → CM_TEST_QUERY_COUNT=value
  • --threads=value → CM_NUM_THREADS=value

The above CLI flags can be used in the Python CM API as follows:

r=cm.access({..., "backend":...})

Default environment


These keys can be updated via --env.KEY=VALUE, via the env dictionary in @input.json, or via the script flags described above.

  • CM_MLPERF_IMPLEMENTATION: reference
  • CM_MLPERF_MODEL: resnet50
  • CM_MLPERF_RUN_STYLE: test
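
For example, an illustrative override of these defaults from the command line (using keys from the list above) would be:

cmr "run-mlperf-inference" --env.CM_MLPERF_RUN_STYLE=valid --env.CM_MLPERF_IMPLEMENTATION=reference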

Versions

  • master
  • r2.1

Script workflow, dependencies and native scripts

  1. Read "deps" on other CM scripts from meta
  2. Run "preprocess" function from customize.py
  3. Read "prehook_deps" on other CM scripts from meta
  4. Run native script if it exists
  5. Read "posthook_deps" on other CM scripts from meta
  6. Run "postprocess" function from customize.py
  7. Read "post_deps" on other CM scripts from meta

Script output

cmr "run-mlperf-inference[,variations]" [--input_flags] -j

New environment keys (filter)

New environment keys auto-detected from customize


Maintainers