This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

[Benchmark] Improve NLP Backbone Benchmark #1473

Open
1 of 12 tasks
sxjscience opened this issue Jan 9, 2021 · 0 comments
Labels
enhancement (New feature or request) · help wanted (Extra attention is needed) · performance (Performance issues)

Comments

sxjscience (Member) commented on Jan 9, 2021

Description

In GluonNLP, we introduced the benchmarking script in https://github.com/dmlc/gluon-nlp/tree/master/scripts/benchmarks.

The goal is to track the training + inference latency of common NLP backbones so that we can choose the appropriate ones for a given task. This will help users train + deploy models on AWS.
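The core of such a benchmark is a small timing harness: warm up, then time repeated calls and report mean latency. A minimal pure-Python sketch of the idea (the real script times actual backbones on GPU; `dummy_backbone` and the harness below are stand-ins, not the benchmark script's API):

```python
import statistics
import time

def measure_latency(fn, *args, warmup=3, repeat=10):
    """Run `fn` a few times untimed to warm up caches/JIT, then time
    `repeat` calls. Returns (mean, stdev) latency in milliseconds."""
    for _ in range(warmup):
        fn(*args)
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn(*args)
        timings.append((time.perf_counter() - start) * 1e3)
    return statistics.mean(timings), statistics.stdev(timings)

# Stand-in "backbone": a small pure-Python matrix multiply.
def dummy_backbone(n=64):
    a = [[1.0] * n for _ in range(n)]
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*a)]
            for row in a]

mean_ms, std_ms = measure_latency(dummy_backbone)
print(f"latency: {mean_ms:.2f} +/- {std_ms:.2f} ms")
```

For GPU backbones the timing loop additionally needs a device synchronization before reading the clock, otherwise asynchronous kernel launches make the numbers meaningless.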

Currently, we cover:

  • Huggingface/Transformers-based backbones with FP32 + FP16 training / inference. For FP16 training, we are not profiling against the AMP-based solution, which gives PyTorch an edge; we need to fix this.
  • MXNet 2.0-nightly version (only for community use) + GluonNLP 1.0 with FP32 + FP16 (amp) training / inference.
  • TVM FP32 inference. This is currently broken due to recent upgrades of the code base.

The following action items seem worthwhile:

Short-term Bug-fix + Improvement

Automation + Visualization

  • Support launching benchmark job with AWS Batch. Currently tracked in Fix Benchmark #1471.
  • Automate benchmarking process via Github actions.
  • Support visualization of benchmark results.
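For the visualization item, one lightweight option is to render the raw benchmark results as a GitHub-flavored markdown table that a CI job can post back to an issue or PR. A sketch using only the standard library (the CSV columns and numbers below are hypothetical, not the benchmark script's actual output format):

```python
import csv
import io

# Hypothetical result format: one row per (backbone, precision) configuration.
RAW = """\
backbone,precision,latency_ms
bert-base,fp32,12.4
bert-base,fp16,6.1
electra-small,fp32,4.8
"""

def to_markdown_table(csv_text):
    """Render benchmark CSV results as a GitHub-flavored markdown table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    lines = ["| " + " | ".join(header) + " |",
             "|" + "|".join(["---"] * len(header)) + "|"]
    lines += ["| " + " | ".join(r) + " |" for r in body]
    return "\n".join(lines)

print(to_markdown_table(RAW))
```

A GitHub Actions workflow could run the benchmark, pipe the CSV through a helper like this, and attach the table as a comment, which keeps the results visible without any extra dashboard infrastructure.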

Longer-term Backbone Benchmarking Effort

  • Add a JAX/Flax-based solution, which internally uses XLA.
  • Support AutoScheduler in TVM benchmark
  • Enable ONNX + TensorRT. This is often considered the fastest solution for NLP inference.

Other longer-term efforts

  • Support benchmarks for data loaders.
  • Support common end-to-end training benchmarks such as SQuAD 2.0 fine-tuning. We may focus on single-instance benchmarks.
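A data-loader benchmark mostly amounts to draining batches without running a model and reporting throughput. A minimal sketch in pure Python (`dummy_loader` is a stand-in for a real data pipeline; the function names here are illustrative, not an existing API):

```python
import time

def benchmark_loader(loader, max_batches=100):
    """Iterate over `loader` and report throughput in samples/second.

    `loader` yields batches (any sized sequence); draining batches without
    a model isolates the data pipeline's cost from compute.
    """
    n_samples = 0
    start = time.perf_counter()
    for i, batch in enumerate(loader):
        if i >= max_batches:
            break
        n_samples += len(batch)
    elapsed = time.perf_counter() - start
    return n_samples / elapsed

# Stand-in loader: a generator of fixed-size "batches".
def dummy_loader(batch_size=32, n_batches=50):
    for _ in range(n_batches):
        yield [0] * batch_size

throughput = benchmark_loader(dummy_loader())
print(f"{throughput:.0f} samples/sec")
```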

@dmlc/gluon-nlp-committers

@sxjscience added the enhancement (New feature or request), help wanted (Extra attention is needed), and performance (Performance issues) labels on Jan 9, 2021