V0.2.0 Release Plan #62

Closed
33 tasks done
TobeyQin opened this issue Apr 20, 2021 · 0 comments
TobeyQin commented Apr 20, 2021

Release Manager

@TobeyQin

Endgame

Feature freeze: May 10
Code freeze: May 28
Demo date: May 28
Bug Bash date: May 28
Release date: Jun 4

Main Features

SuperBench Framework Implementation

SB Benchmarks Implementation -- @guoshzhao

    • Design Doc
  1. SB Benchmark Base
    • Benchmark Base
    • Model Base
    • Microbenchmark Base
  2. Environment Build Pipeline ETA:
    • Design Doc
    • Implementation

SB Agent Implementation -- @abuccts

    • Design Doc
  1. SB CLI
    • SB CLI Implementation
    • Integration with SB Runner
    • Integration with SB Executor
  2. SB Executor
    • SB Executor Implementation
  3. SB Runner ETA:
    • SB Runner Implementation
    • Integration with SB Executor

Benchmark Tasks

E2E Benchmarks (including metrics: _float, _half, _float_throughput, _half_throughput) -- @guoshzhao

  • CNN models -- Use PyTorch torchvision.models sub-package
    • ResNet: ResNet-50, ResNet-101, ResNet-152
    • DenseNet: DenseNet-169, DenseNet-201
    • VGG: VGG-11, VGG-13, VGG-16, VGG-19
  • BERT -- Use Hugging Face Transformers
    • BERT
    • BERT LARGE
  • LSTM -- Use PyTorch torch.nn sub-package
  • GPT-2 -- Use Hugging Face Transformers
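
The four metric suffixes listed for the E2E benchmarks could be produced along these lines. This is only a sketch: the plan does not say what `_float` and `_half` measure, so the code assumes they hold the average step time in milliseconds, and that the `_throughput` variants hold samples per second. The function name `e2e_metrics` and its parameters are hypothetical and not part of SuperBench.

```python
from statistics import mean


def e2e_metrics(model, precision, step_times_ms, batch_size):
    """Build the two metrics for one model at one precision.

    Assumption: "<model>_<precision>" is the mean step time in ms and
    "<model>_<precision>_throughput" is samples per second.
    """
    if not step_times_ms:
        raise ValueError("step_times_ms must not be empty")
    avg_ms = mean(step_times_ms)
    return {
        f"{model}_{precision}": avg_ms,
        # samples per step divided by seconds per step
        f"{model}_{precision}_throughput": batch_size / (avg_ms / 1000.0),
    }


# Hypothetical step times for one model, measured at both precisions.
results = {}
results.update(e2e_metrics("resnet50", "float", [100.0, 110.0, 90.0], batch_size=32))
results.update(e2e_metrics("resnet50", "half", [50.0, 50.0], batch_size=32))
for name, value in sorted(results.items()):
    print(name, value)
```

Reporting step time and throughput side by side keeps the result comparable across batch sizes, since throughput normalizes by the number of samples per step.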

Micro Benchmarks

    • GEMM FLOPS (Tool: NVIDIA CUTLASS) -- @guoshzhao ETA: May 21
    Metric      Unit      Description
    FP64        GFLOPS    FP64 FLOPS without TensorCore
    FP32        GFLOPS    FP32 FLOPS without TensorCore
    FP16        GFLOPS    FP16 FLOPS without TensorCore
    FP64(TC)    GFLOPS    FP64 FLOPS with TensorCore
    TF32(TC)    GFLOPS    TF32 FLOPS with TensorCore
    FP16(TC)    GFLOPS    FP16 FLOPS with TensorCore
    BF16(TC)    GFLOPS    BF16 FLOPS with TensorCore
    INT8(TC)    GOPS      INT8 OPS with TensorCore
    INT4(TC)    GOPS      INT4 OPS with TensorCore
    • KernelLaunch (Tool: MSR-A build) -- @guoshzhao ETA: May 21
    Metric                      Unit    Description
    Kernel_Launch_Event_Time    ms      Dispatch latency measured in GPU time using cudaEventRecord()/hipEventRecord()
    Kernel_Launch_Wall_Time     ms      Dispatch latency measured in CPU time
    • Kernel (Tool: MSR-A build) -- @yukirora ETA: May 21
    Metric                           Unit    Description
    cublasSgemm                      ms      cuBLAS kernel processing time for cublasSgemm
    cublasSgemmStridedBatched        ms      Time for cublasSgemmStridedBatched
    cublasGemmStridedBatchedEx       ms      Time for cublasGemmStridedBatchedEx
    cublasGemmEx                     ms      Time for cublasGemmEx
    cublasCgemm3mStridedBatched      ms      Time for cublasCgemm3mStridedBatched
    cublasCgemm                      ms      Time for cublasCgemm
    Mul_During_NCCL                  ms      Time for Mul_During_NCCL
    MatMul_During_NCCL               ms      Time for MatMul_During_NCCL
    MM_AllReduce_Opsharding          ms      Time for MM_AllReduce_Opsharding
    MM_AllGather_Concat_Opsharing    ms      Time for MM_AllGather_Concat_Opsharing
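
For reference, the GFLOPS and GOPS figures in the GEMM table are conventionally derived from a measured GEMM time by counting 2·M·N·K operations (one multiply and one add per inner-product term). The sketch below shows only that arithmetic. It is not the CUTLASS tool's code, and the function name `gemm_gflops` is hypothetical.

```python
def gemm_gflops(m, n, k, elapsed_s):
    """Rate in GFLOPS (or GOPS for integer types) of one (m x k) @ (k x n) GEMM.

    Uses the conventional operation count of 2 * m * n * k.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    operations = 2.0 * m * n * k
    return operations / elapsed_s / 1e9


# Hypothetical measurement: an 8192 x 8192 x 8192 GEMM that took 120 ms.
print(f"{gemm_gflops(8192, 8192, 8192, 0.120):.1f} GFLOPS")
```

The same formula applies to every row of the table. Only the data type and whether TensorCore is used change, and both of those affect the measured time rather than the operation count.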

Documentation refinement -- @TobeyQin ETA: May 28

  1. Update README file
  • Add SuperBenchmark architecture and refine the goals
  • Define results format

Test Plan

Test Overall Pipeline

  • Environment setup and SuperBench installation test
  • Run through the entire benchmarking process
  • CLI command execution test
  • E2E model benchmark test
  • Micro benchmark test

Utils

  1. Unit-test platform
  • Add pipelines for CPU/GPU tests

@TobeyQin self-assigned this Apr 29, 2021
@TobeyQin pinned this issue Apr 29, 2021
@TobeyQin unpinned this issue Jul 1, 2021
@TobeyQin closed this as completed Jul 8, 2021