Note that this README is automatically generated - don't edit!

About

Summary

  • Category: Modular MLPerf benchmarks.
  • CM GitHub repository: mlcommons@ck
  • GitHub directory for this script: GitHub
  • CM meta description for this script: _cm.yaml
  • CM "database" tags to find this script: reproduce,mlcommons,mlperf,inference,harness,qualcomm-harness,qualcomm,kilt-harness,kilt
  • Output cached? False

Reuse this script in your project

Install CM automation language

Pull CM repository with this automation

cm pull repo mlcommons@ck

Run this script from command line

  1. cm run script --tags=reproduce,mlcommons,mlperf,inference,harness,qualcomm-harness,qualcomm,kilt-harness,kilt[,variations] [--input_flags]

  2. cmr "reproduce mlcommons mlperf inference harness qualcomm-harness qualcomm kilt-harness kilt[ variations]" [--input_flags]

  • variations can be seen here

  • input_flags can be seen here

Run this script from Python

import cmind

r = cmind.access({'action': 'run',
                  'automation': 'script',
                  'tags': 'reproduce,mlcommons,mlperf,inference,harness,qualcomm-harness,qualcomm,kilt-harness,kilt',
                  'out': 'con',
                  # ...
                  # (other input keys for this script)
                  # ...
                 })

if r['return'] > 0:
    print(r['error'])
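
As a minimal sketch of how the call above might be parameterized, the helper below joins the script's base tags with variation names (each prefixed with `_`, as in the Variations section) and forwards script flags as extra input keys. `build_tags` and `run_kilt` are hypothetical names introduced here for illustration; only `cmind.access`, its `return`/`error` keys, and the tags come from this README.

```python
BASE_TAGS = ("reproduce,mlcommons,mlperf,inference,harness,"
             "qualcomm-harness,qualcomm,kilt-harness,kilt")


def build_tags(variations=()):
    """Append variations (with or without a leading '_') to the base tags."""
    parts = [BASE_TAGS]
    for v in variations:
        parts.append(v if v.startswith("_") else "_" + v)
    return ",".join(parts)


def run_kilt(variations=(), **flags):
    """Hypothetical wrapper: run this script through the CM Python API."""
    import cmind  # imported lazily so build_tags works without CM installed

    inputs = {"action": "run", "automation": "script",
              "tags": build_tags(variations), "out": "con"}
    inputs.update(flags)  # script flags, e.g. scenario="Offline", count=100
    r = cmind.access(inputs)
    if r["return"] > 0:
        raise RuntimeError(r["error"])
    return r
```

For example, `run_kilt(["resnet50", "onnxruntime", "cpu"], scenario="Offline")` would request the default model, framework and device explicitly, assuming CM and this repository are installed.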

Run this script via GUI

cmr "cm gui" --script="reproduce,mlcommons,mlperf,inference,harness,qualcomm-harness,qualcomm,kilt-harness,kilt"

Use this online GUI to generate CM CMD.

Run this script via Docker (beta)

cm docker script "reproduce mlcommons mlperf inference harness qualcomm-harness qualcomm kilt-harness kilt[ variations]" [--input_flags]


Customization

Variations

  • Internal group (variations should not be selected manually)

    • _bert_
      • Environment variables:
        • CM_BENCHMARK: STANDALONE_BERT
        • kilt_model_name: bert
        • kilt_model_seq_length: 384
        • kilt_model_bert_variant: BERT_PACKED
        • kilt_input_format: INT64,1,384:INT64,1,8:INT64,1,384:INT64,1,384
        • kilt_output_format: FLOAT32,1,384:FLOAT32,1,384
        • dataset_squad_tokenized_max_seq_length: 384
        • loadgen_buffer_size: 10833
        • loadgen_dataset_size: 10833
      • Workflow:
        1. Read "deps" on other CM scripts
  • No group (any variation can be selected)

    • _activation-count.#
      • Environment variables:
        • CM_MLPERF_QAIC_ACTIVATION_COUNT: #
      • Workflow:
    • _bert-99,offline
      • Workflow:
    • _bert-99,qaic
      • Workflow:
        1. Read "deps" on other CM scripts
          • compile,qaic,model,_bert-99,_pc.99.9980
            • if (CM_MLPERF_SKIP_RUN != True)
            • CM names: --adr.['qaic-model-compiler', 'bert-99-compiler']...
    • _bert-99.9,offline
      • Workflow:
    • _bert-99.9,qaic
      • Workflow:
        1. Read "deps" on other CM scripts
          • compile,qaic,model,_bert-99.9
            • if (CM_MLPERF_SKIP_RUN != True)
            • CM names: --adr.['qaic-model-compiler', 'bert-99.9-compiler']...
    • _bert_,network-client
      • Environment variables:
        • CM_BENCHMARK: NETWORK_BERT_CLIENT
      • Workflow:
    • _bert_,network-server
      • Environment variables:
        • CM_BENCHMARK: NETWORK_BERT_SERVER
      • Workflow:
    • _bert_,qaic
      • Environment variables:
        • kilt_model_batch_size: 1
        • kilt_input_format: UINT32,1,384:UINT32,1,8:UINT32,1,384:UINT32,1,384
        • kilt_input_formata: UINT32,1,384:UINT32,1,384:UINT32,1,384
        • kilt_output_formatia: UINT8,1,384:UINT8,1,384
        • kilt_device_qaic_skip_stage: convert
      • Workflow:
    • _bert_,singlestream
      • Environment variables:
        • kilt_model_batch_size: 1
      • Workflow:
    • _dl2q.24xlarge,bert-99.9,offline
      • Environment variables:
        • qaic_activation_count: 14
      • Workflow:
    • _dl2q.24xlarge,bert-99.9,server
      • Environment variables:
        • qaic_activation_count: 14
      • Workflow:
    • _dl2q.24xlarge,resnet50,offline
      • Environment variables:
        • qaic_activation_count: 3
      • Workflow:
    • _dl2q.24xlarge,resnet50,server
      • Environment variables:
        • qaic_activation_count: 3
      • Workflow:
    • _dl2q.24xlarge,retinanet,offline
      • Environment variables:
        • qaic_activation_count: 14
      • Workflow:
    • _dl2q.24xlarge,retinanet,server
      • Environment variables:
        • qaic_activation_count: 14
      • Workflow:
    • _dl2q.24xlarge,singlestream
      • Environment variables:
        • CM_QAIC_DEVICES: 0
        • qaic_activation_count: 1
      • Workflow:
    • _nsp.16
      • Workflow:
    • _num-devices.4
      • Environment variables:
        • CM_QAIC_DEVICES: 0,1,2,3
      • Workflow:
    • _pro
      • Environment variables:
        • qaic_queue_length: 10
      • Workflow:
    • _pro,num-devices.4,bert-99,offline
      • Environment variables:
        • qaic_activation_count: 16
      • Workflow:
        1. Read "deps" on other CM scripts
    • _pro,num-devices.4,bert-99.9,offline
      • Environment variables:
        • qaic_activation_count: 8
      • Workflow:
        1. Read "deps" on other CM scripts
    • _pro,num-devices.4,bert-99.9,server
      • Environment variables:
        • qaic_activation_count: 16
      • Workflow:
    • _pro,num-devices.4,resnet50,offline
      • Environment variables:
        • qaic_activation_count: 4
      • Workflow:
        1. Read "deps" on other CM scripts
    • _pro,num-devices.4,resnet50,server
      • Environment variables:
        • qaic_activation_count: 4
      • Workflow:
    • _pro,num-devices.4,retinanet,offline
      • Environment variables:
        • qaic_activation_count: 16
      • Workflow:
        1. Read "deps" on other CM scripts
    • _pro,num-devices.4,retinanet,server
      • Environment variables:
        • qaic_activation_count: 16
      • Workflow:
    • _pro,num-devices.4,singlestream
      • Environment variables:
        • CM_QAIC_DEVICES: 0
        • qaic_activation_count: 1
      • Workflow:
    • _rb6,bert-99,offline
      • Environment variables:
        • qaic_activation_count: 9
      • Workflow:
    • _rb6,resnet50,multistream
      • Environment variables:
        • qaic_activation_count: 2
      • Workflow:
    • _rb6,resnet50,offline
      • Environment variables:
        • qaic_activation_count: 2
      • Workflow:
    • _rb6,retinanet,multistream
      • Environment variables:
        • qaic_activation_count: 8
      • Workflow:
    • _rb6,retinanet,offline
      • Environment variables:
        • qaic_activation_count: 9
      • Workflow:
    • _rb6,singlestream
      • Environment variables:
        • qaic_activation_count: 1
      • Workflow:
    • _resnet50,uint8
      • Environment variables:
        • kilt_input_format: UINT8,-1,224,224,3
        • kilt_device_qaic_skip_stage: convert
        • CM_IMAGENET_ACCURACY_DTYPE: int8
      • Workflow:
    • _retinanet,qaic,uint8
      • Environment variables:
        • kilt_device_qaic_skip_stage: convert
        • kilt_input_format: UINT8,1,3,800,800
        • kilt_output_format: INT8,1,1000:INT8,1,1000:INT8,1,1000:INT8,1,1000:INT8,1,1000:INT8,1,1000:INT8,1,1000:INT8,1,1000:INT8,1,1000:INT8,1,1000:INT8,1,4,1000:INT8,14,1000:INT8,1,4,1000:INT8,1,4,1000:INT8,1,4,1000
      • Workflow:
    • _singlestream,resnet50
      • Workflow:
    • _singlestream,retinanet
      • Workflow:
  • Group "batch-size"

    • _bs.#
      • Environment variables:
        • kilt_model_batch_size: #
      • Workflow:
    • _bs.0
      • Environment variables:
        • kilt_model_batch_size: 1
      • Workflow:
  • Group "device"

    • _cpu (default)
      • Environment variables:
        • CM_MLPERF_DEVICE: cpu
        • kilt_backend_type: cpu
      • Workflow:
    • _cuda
      • Environment variables:
        • CM_MLPERF_DEVICE: gpu
        • CM_MLPERF_DEVICE_LIB_NAMESPEC: cudart
        • kilt_backend_type: gpu
      • Workflow:
    • _qaic
      • Environment variables:
        • CM_MLPERF_DEVICE: qaic
        • CM_MLPERF_DEVICE_LIB_NAMESPEC: QAic
        • kilt_backend_type: qaic
      • Workflow:
        1. Read "deps" on other CM scripts
  • Group "framework"

    • _glow
      • Environment variables:
        • device: qaic
        • CM_MLPERF_BACKEND: glow
        • CM_MLPERF_BACKEND_LIB_NAMESPEC: QAic
      • Workflow:
    • _onnxruntime (default)
      • Environment variables:
        • device: onnxrt
        • CM_MLPERF_BACKEND: onnxruntime
        • CM_MLPERF_BACKEND_LIB_NAMESPEC: onnxruntime
      • Workflow:
    • _tensorrt
      • Environment variables:
        • CM_MLPERF_BACKEND: tensorrt
        • device: tensorrt
        • CM_MLPERF_BACKEND_NAME: TensorRT
      • Workflow:
  • Group "loadgen-batch-size"

    • _loadgen-batch-size.#
      • Environment variables:
        • CM_MLPERF_LOADGEN_BATCH_SIZE: #
      • Workflow:
  • Group "loadgen-scenario"

    • _multistream
      • Environment variables:
        • CM_MLPERF_LOADGEN_SCENARIO: MultiStream
      • Workflow:
    • _offline
      • Environment variables:
        • CM_MLPERF_LOADGEN_SCENARIO: Offline
      • Workflow:
    • _server
      • Environment variables:
        • CM_MLPERF_LOADGEN_SCENARIO: Server
      • Workflow:
    • _singlestream
      • Environment variables:
        • CM_MLPERF_LOADGEN_SCENARIO: SingleStream
      • Workflow:
  • Group "model"

    • _bert-99
      • Environment variables:
        • CM_MODEL: bert-99
        • CM_SQUAD_ACCURACY_DTYPE: float32
        • CM_NOT_ML_MODEL_STARTING_WEIGHTS_FILENAME: https://zenodo.org/record/3750364/files/bert_large_v1_1_fake_quant.onnx
      • Workflow:
    • _bert-99.9
      • Environment variables:
        • CM_MODEL: bert-99.9
        • CM_NOT_ML_MODEL_STARTING_WEIGHTS_FILENAME: https://zenodo.org/record/3733910/files/model.onnx
      • Workflow:
    • _resnet50 (default)
      • Environment variables:
        • CM_MODEL: resnet50
        • kilt_model_name: resnet50
        • kilt_input_count: 1
        • kilt_output_count: 1
        • kilt_input_format: FLOAT32,-1,224,224,3
        • kilt_output_format: INT64,-1
        • dataset_imagenet_preprocessed_input_square_side: 224
        • ml_model_has_background_class: YES
        • ml_model_image_height: 224
        • loadgen_buffer_size: 1024
        • loadgen_dataset_size: 50000
        • CM_BENCHMARK: STANDALONE_CLASSIFICATION
      • Workflow:
    • _retinanet
      • Environment variables:
        • CM_MODEL: retinanet
        • CM_ML_MODEL_STARTING_WEIGHTS_FILENAME: https://zenodo.org/record/6617981/files/resnext50_32x4d_fpn.pth
        • kilt_model_name: retinanet
        • kilt_input_count: 1
        • kilt_model_max_detections: 600
        • kilt_output_count: 1
        • kilt_input_format: FLOAT32,-1,3,800,800
        • kilt_output_format: INT64,-1
        • dataset_imagenet_preprocessed_input_square_side: 224
        • ml_model_image_height: 800
        • ml_model_image_width: 800
        • loadgen_buffer_size: 64
        • loadgen_dataset_size: 24576
        • CM_BENCHMARK: STANDALONE_OBJECT_DETECTION
      • Workflow:
        1. Read "deps" on other CM scripts
  • Group "nsp"

    • _nsp.#
      • Workflow:
    • _nsp.14
      • Workflow:
  • Group "power-mode"

    • _maxn
      • Environment variables:
        • CM_MLPERF_NVIDIA_HARNESS_MAXN: True
      • Workflow:
    • _maxq
      • Environment variables:
        • CM_MLPERF_NVIDIA_HARNESS_MAXQ: True
      • Workflow:
  • Group "precision"

    • _fp16
      • Workflow:
    • _fp32
      • Environment variables:
        • CM_IMAGENET_ACCURACY_DTYPE: float32
      • Workflow:
    • _uint8
      • Workflow:
  • Group "run-mode"

    • _network-client
      • Environment variables:
        • CM_RUN_MODE: network-client
      • Workflow:
    • _network-server
      • Environment variables:
        • CM_RUN_MODE: network-server
      • Workflow:
    • _standalone (default)
      • Environment variables:
        • CM_RUN_MODE: standalone
      • Workflow:
  • Group "sut"

    • _dl2q.24xlarge
      • Environment variables:
        • CM_QAIC_DEVICES: 0,1,2,3,4,5,6,7
        • qaic_queue_length: 4
      • Workflow:
    • _rb6
      • Environment variables:
        • CM_QAIC_DEVICES: 0
        • qaic_queue_length: 6
      • Workflow:

Default variations

_cpu,_onnxruntime,_resnet50,_standalone
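
One way to picture these defaults: each of the four groups that declares a default falls back to it only when no variation from that group was selected. The sketch below assumes grouped variations are mutually exclusive; `GROUPS` and `resolve` are illustrative names, and the logic is a simplification rather than CM's implementation.

```python
# Groups with a default, as listed in this README.
GROUPS = {
    "device": {"default": "cpu",
               "members": {"cpu", "cuda", "qaic"}},
    "framework": {"default": "onnxruntime",
                  "members": {"glow", "onnxruntime", "tensorrt"}},
    "model": {"default": "resnet50",
              "members": {"bert-99", "bert-99.9", "resnet50", "retinanet"}},
    "run-mode": {"default": "standalone",
                 "members": {"network-client", "network-server", "standalone"}},
}


def resolve(selected):
    """Return the selected variations plus the default of every unused group."""
    names = [s.lstrip("_") for s in selected]
    resolved = list(names)
    for group, info in GROUPS.items():
        picked = [s for s in names if s in info["members"]]
        if len(picked) > 1:
            # Assumption: only one variation per group may be active.
            raise ValueError(f"group '{group}' allows one variation, got {picked}")
        if not picked:
            resolved.append(info["default"])
    return ["_" + s for s in resolved]
```

Calling `resolve([])` reproduces the default list above, and `resolve(["_qaic", "_bert-99"])` keeps the two explicit choices while filling in `_onnxruntime` and `_standalone`.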

Script flags mapped to environment

  • --count=value → CM_MLPERF_LOADGEN_QUERY_COUNT=value
  • --devices=value → CM_QAIC_DEVICES=value
  • --max_batchsize=value → CM_MLPERF_LOADGEN_MAX_BATCHSIZE=value
  • --mlperf_conf=value → CM_MLPERF_CONF=value
  • --mode=value → CM_MLPERF_LOADGEN_MODE=value
  • --multistream_target_latency=value → CM_MLPERF_LOADGEN_MULTISTREAM_TARGET_LATENCY=value
  • --offline_target_qps=value → CM_MLPERF_LOADGEN_OFFLINE_TARGET_QPS=value
  • --output_dir=value → CM_MLPERF_OUTPUT_DIR=value
  • --performance_sample_count=value → CM_MLPERF_LOADGEN_PERFORMANCE_SAMPLE_COUNT=value
  • --rerun=value → CM_RERUN=value
  • --scenario=value → CM_MLPERF_LOADGEN_SCENARIO=value
  • --server_target_qps=value → CM_MLPERF_LOADGEN_SERVER_TARGET_QPS=value
  • --singlestream_target_latency=value → CM_MLPERF_LOADGEN_SINGLESTREAM_TARGET_LATENCY=value
  • --skip_preprocess=value → CM_SKIP_PREPROCESS_DATASET=value
  • --skip_preprocessing=value → CM_SKIP_PREPROCESS_DATASET=value
  • --target_latency=value → CM_MLPERF_LOADGEN_TARGET_LATENCY=value
  • --target_qps=value → CM_MLPERF_LOADGEN_TARGET_QPS=value
  • --user_conf=value → CM_MLPERF_USER_CONF=value

Above CLI flags can be used in the Python CM API as follows:

r = cmind.access({... , "count": ...})
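
To make the flag-to-environment mapping concrete, here is a small sketch that translates a dictionary of script flags into the environment keys listed above. Only a subset of the table is reproduced; `FLAG_TO_ENV` and `flags_to_env` are hypothetical names, and rejecting unknown flags is a choice made for this example, not documented CM behaviour.

```python
# Subset of the flag table above; note that two flags share one key.
FLAG_TO_ENV = {
    "count": "CM_MLPERF_LOADGEN_QUERY_COUNT",
    "devices": "CM_QAIC_DEVICES",
    "mode": "CM_MLPERF_LOADGEN_MODE",
    "scenario": "CM_MLPERF_LOADGEN_SCENARIO",
    "offline_target_qps": "CM_MLPERF_LOADGEN_OFFLINE_TARGET_QPS",
    "skip_preprocess": "CM_SKIP_PREPROCESS_DATASET",
    "skip_preprocessing": "CM_SKIP_PREPROCESS_DATASET",
}


def flags_to_env(flags):
    """Translate script flags (without the leading '--') into env keys."""
    env = {}
    for flag, value in flags.items():
        if flag not in FLAG_TO_ENV:
            raise KeyError(f"flag not in this excerpt: --{flag}")
        env[FLAG_TO_ENV[flag]] = str(value)
    return env
```

For instance, `flags_to_env({"count": 100, "scenario": "Offline"})` gives the two `CM_MLPERF_LOADGEN_*` keys that `--count=100 --scenario=Offline` would set on the command line.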

Default environment


These keys can be updated via --env.KEY=VALUE, via the env dictionary in @input.json, or via script flags.

  • CM_BATCH_COUNT: 1
  • CM_BATCH_SIZE: 1
  • CM_FAST_COMPILATION: yes
  • CM_MLPERF_LOADGEN_SCENARIO: Offline
  • CM_MLPERF_LOADGEN_MODE: performance
  • CM_SKIP_PREPROCESS_DATASET: no
  • CM_SKIP_MODEL_DOWNLOAD: no
  • CM_MLPERF_SUT_NAME_IMPLEMENTATION_PREFIX: kilt
  • CM_MLPERF_SKIP_RUN: no
  • CM_KILT_REPO_URL: https://github.com/GATEOverflow/kilt-mlperf
  • CM_QAIC_DEVICES: 0
  • kilt_max_wait_abs: 10000
  • verbosity: 0
  • loadgen_trigger_cold_run: 0
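
The `--env.KEY=VALUE` form mentioned above amounts to overlaying user values on these defaults. The sketch below shows that overlay for command-line style arguments; `DEFAULT_ENV` copies the list above, while `apply_env_overrides` is a hypothetical helper and handles only the `--env.` form, not script flags or `@input.json`.

```python
DEFAULT_ENV = {
    "CM_BATCH_COUNT": "1",
    "CM_BATCH_SIZE": "1",
    "CM_FAST_COMPILATION": "yes",
    "CM_MLPERF_LOADGEN_SCENARIO": "Offline",
    "CM_MLPERF_LOADGEN_MODE": "performance",
    "CM_SKIP_PREPROCESS_DATASET": "no",
    "CM_SKIP_MODEL_DOWNLOAD": "no",
    "CM_MLPERF_SUT_NAME_IMPLEMENTATION_PREFIX": "kilt",
    "CM_MLPERF_SKIP_RUN": "no",
    "CM_KILT_REPO_URL": "https://github.com/GATEOverflow/kilt-mlperf",
    "CM_QAIC_DEVICES": "0",
    "kilt_max_wait_abs": "10000",
    "verbosity": "0",
    "loadgen_trigger_cold_run": "0",
}


def apply_env_overrides(argv, defaults=DEFAULT_ENV):
    """Overlay --env.KEY=VALUE arguments on a copy of the default environment."""
    env = dict(defaults)
    prefix = "--env."
    for arg in argv:
        if not arg.startswith(prefix):
            continue  # other arguments are ignored in this sketch
        key, sep, value = arg[len(prefix):].partition("=")
        if not key or not sep:
            raise ValueError(f"expected --env.KEY=VALUE, got {arg!r}")
        env[key] = value
    return env
```

As an example, `apply_env_overrides(["--env.CM_QAIC_DEVICES=0,1,2,3"])` changes only `CM_QAIC_DEVICES` and leaves the remaining defaults untouched.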

Script workflow, dependencies and native scripts

  1. Read "deps" on other CM scripts from meta
    • detect,os
    • detect,cpu
    • get,sys-utils-cm
    • get,git,repo
      • CM names: --adr.['kilt-repo']...
    • get,mlcommons,inference,src
      • CM names: --adr.['inference-src']...
    • get,mlcommons,inference,loadgen
      • CM names: --adr.['inference-loadgen']...
    • generate,user-conf,mlperf,inference
      • CM names: --adr.['user-conf-generator']...
    • get,generic-python-lib,_mlperf_logging
      • CM names: --adr.['mlperf-logging']...
    • get,ml-model,resnet50,_fp32,_onnx,_from-tf
      • if (CM_MODEL == resnet50) AND (CM_MLPERF_DEVICE != qaic)
      • CM names: --adr.['resnet50-model', 'ml-model']...
    • compile,qaic,model,_resnet50
      • if (CM_MODEL == resnet50 AND CM_MLPERF_DEVICE == qaic) AND (CM_MLPERF_SKIP_RUN != True)
      • CM names: --adr.['qaic-model-compiler', 'resnet50-compiler']...
    • get,dataset,imagenet,preprocessed,_for.resnet50,_NHWC,_full
      • if (CM_MODEL == resnet50) AND (CM_MLPERF_SKIP_RUN != True)
      • CM names: --adr.['imagenet-preprocessed', 'dataset-preprocessed']...
    • get,squad-vocab
      • if (CM_MODEL in ['bert-99', 'bert-99.9']) AND (CM_MLPERF_SKIP_RUN != True)
      • CM names: --adr.['bert-vocab']...
    • get,dataset,tokenized,squad,_raw
      • if (CM_MODEL in ['bert-99', 'bert-99.9']) AND (CM_MLPERF_SKIP_RUN != True)
      • CM names: --adr.['squad-tokenized']...
    • compile,qaic,model,_retinanet
      • if (CM_MODEL == retinanet AND CM_MLPERF_DEVICE == qaic) AND (CM_MLPERF_SKIP_RUN != True)
      • CM names: --adr.['qaic-model-compiler', 'retinanet-compiler']...
    • get,dataset,preprocessed,openimages,_for.retinanet.onnx,_NCHW,_validation,_custom-annotations
      • if (CM_MODEL == retinanet) AND (CM_MLPERF_SKIP_RUN != True)
      • CM names: --adr.['openimages-preprocessed', 'dataset-preprocessed']...
    • get,lib,onnxruntime,lang-cpp,_cpu
      • if (CM_MLPERF_BACKEND == onnxruntime AND CM_MLPERF_DEVICE == cpu)
    • get,lib,onnxruntime,lang-cpp,_cuda
      • if (CM_MLPERF_BACKEND == onnxruntime AND CM_MLPERF_DEVICE == gpu)
  2. Run "preprocess" function from customize.py
  3. Read "prehook_deps" on other CM scripts from meta
  4. Run native script if exists
  5. Read "posthook_deps" on other CM scripts from meta
  6. Run "postprocess" function from customize.py
  7. Read "post_deps" on other CM scripts from meta
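
The `if (...)` conditions attached to the dependencies in step 1 can be read as predicates over the environment. The sketch below encodes a few of them as Python functions to show which dependencies would be pulled for a given configuration. The data structure, the names `CONDITIONAL_DEPS`, `skip_run` and `active_deps`, and the set of strings treated as true for `CM_MLPERF_SKIP_RUN` are all assumptions for illustration, not CM's meta format.

```python
def skip_run(env):
    # Assumption: these strings count as "True" for CM_MLPERF_SKIP_RUN.
    value = str(env.get("CM_MLPERF_SKIP_RUN", "no")).lower()
    return value in ("yes", "true", "1", "on")


# (dependency tags, condition) pairs copied from a few entries of step 1.
CONDITIONAL_DEPS = [
    ("get,ml-model,resnet50,_fp32,_onnx,_from-tf",
     lambda e: e.get("CM_MODEL") == "resnet50"
     and e.get("CM_MLPERF_DEVICE") != "qaic"),
    ("compile,qaic,model,_resnet50",
     lambda e: e.get("CM_MODEL") == "resnet50"
     and e.get("CM_MLPERF_DEVICE") == "qaic" and not skip_run(e)),
    ("get,squad-vocab",
     lambda e: e.get("CM_MODEL") in ("bert-99", "bert-99.9")
     and not skip_run(e)),
    ("get,lib,onnxruntime,lang-cpp,_cpu",
     lambda e: e.get("CM_MLPERF_BACKEND") == "onnxruntime"
     and e.get("CM_MLPERF_DEVICE") == "cpu"),
]


def active_deps(env):
    """Return the tags of the conditional dependencies enabled by env."""
    return [tags for tags, condition in CONDITIONAL_DEPS if condition(env)]
```

Under these assumptions, the default configuration (resnet50, onnxruntime, cpu) enables the ONNX model download and the CPU onnxruntime library, whereas switching `CM_MLPERF_DEVICE` to `qaic` replaces the model download with the QAIC compile step.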

Script output

cmr "reproduce mlcommons mlperf inference harness qualcomm-harness qualcomm kilt-harness kilt[,variations]" [--input_flags] -j

New environment keys (filter)

  • CM_DATASET_*
  • CM_HW_NAME
  • CM_IMAGENET_ACCURACY_DTYPE
  • CM_MAX_EXAMPLES
  • CM_MLPERF_*
  • CM_ML_MODEL_*
  • CM_SQUAD_ACCURACY_DTYPE
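
Assuming the entries above are shell-style wildcard patterns, the filter can be sketched with Python's standard `fnmatch` module: only keys matching one of the patterns are passed on as new environment keys. `NEW_ENV_KEYS` copies the list above; `filter_new_env` is a hypothetical name, and the matching rule is an assumption about how CM interprets the `*`.

```python
from fnmatch import fnmatchcase

NEW_ENV_KEYS = [
    "CM_DATASET_*",
    "CM_HW_NAME",
    "CM_IMAGENET_ACCURACY_DTYPE",
    "CM_MAX_EXAMPLES",
    "CM_MLPERF_*",
    "CM_ML_MODEL_*",
    "CM_SQUAD_ACCURACY_DTYPE",
]


def filter_new_env(env, patterns=NEW_ENV_KEYS):
    """Keep only the keys that match one of the wildcard patterns."""
    return {key: value for key, value in env.items()
            if any(fnmatchcase(key, pattern) for pattern in patterns)}
```

With this reading, a key such as `CM_MLPERF_DEVICE` passes the filter, while `kilt_backend_type` and `CM_MODEL` do not.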

New environment keys auto-detected from customize

  • CM_DATASET_LIST
  • CM_MLPERF_CONF
  • CM_MLPERF_DEVICE
  • CM_MLPERF_USER_CONF

Maintainers