
# PyTorch Transformer model BERT-base-uncased for Natural Language Classification and Question Answering

This document describes evaluation of optimized checkpoints for the transformer model BERT-base-uncased on natural language classification (GLUE) and question answering (SQuAD) tasks.

## AIMET installation and setup

Please install and set up AIMET (Torch GPU variant) before proceeding further.

NOTE

- All AIMET releases are available here: https://github.com/quic/aimet/releases
- This model has been tested using AIMET version 1.23.0 (i.e. set `release_tag="1.23.0"` in the above instructions).
- This model is compatible with the PyTorch GPU variant of AIMET (i.e. set `AIMET_VARIANT="torch_gpu"` in the above instructions).

## Additional Setup Dependencies

```bash
pip install datasets==2.4.0
pip install transformers==4.11.3
```
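
An optional sanity check that the pinned versions are the ones Python actually imports (this snippet is illustrative, not part of the evaluation script):

```python
import datasets
import transformers

# A mismatch here usually points to a stale environment or a
# conflicting installation elsewhere on the Python path.
assert datasets.__version__ == "2.4.0", datasets.__version__
assert transformers.__version__ == "4.11.3", transformers.__version__
```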

## Model checkpoint

- Original full-precision checkpoints without downstream training are downloaded through Hugging Face (see the sketch after this list).
- Full-precision model weight files with downstream training are downloaded automatically by the evaluation script.
- Quantization-optimized model weight files are downloaded automatically by the evaluation script.
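
For illustration, the original full-precision checkpoint can be fetched from Hugging Face with the pinned transformers release; the sequence-classification head and `num_labels=2` below are assumptions matching a binary GLUE task such as RTE, not the evaluation script's exact code:

```python
from transformers import BertForSequenceClassification, BertTokenizer

# bert-base-uncased without downstream training; the classification
# head is randomly initialized until task-specific weights are loaded.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
model.eval()
```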

## Dataset

The GLUE tasks and the SQuAD dataset are downloaded automatically by the evaluation script via the Hugging Face datasets library; no manual download is required.
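For reference, this is roughly how the data is obtained with the pinned datasets release (the task names mirror the model_config values; treat this as an illustrative sketch rather than the script's exact code):

```python
from datasets import load_dataset

# GLUE tasks (e.g. RTE, matching the bert_w8a8_rte config) and SQuAD
# are fetched and cached automatically on first use.
rte = load_dataset("glue", "rte")
squad = load_dataset("squad")

print(rte["validation"][0])    # one GLUE validation example
print(squad["validation"][0])  # one SQuAD validation example
```
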
## Usage

To run evaluation with QuantSim in AIMET, use the following command:

```bash
python bert_quanteval.py \
        --model_config <MODEL_CONFIGURATION> \
        --per_device_eval_batch_size 4 \
        --output_dir <OUT_DIR>
```
- Example:

  ```bash
  python bert_quanteval.py --model_config bert_w8a8_rte --per_device_eval_batch_size 4 --output_dir ./evaluation_result
  ```

- Supported values of `model_config` are `bert_w8a8_rte`, `bert_w8a8_stsb`, `bert_w8a8_mrpc`, `bert_w8a8_cola`, `bert_w8a8_sst2`, `bert_w8a8_qnli`, `bert_w8a8_qqp`, `bert_w8a8_mnli`, and `bert_w8a8_squad`.
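
bert_quanteval.py drives all of this internally; the sketch below only illustrates the general QuantSim flow in the AIMET torch API. The model wiring, dummy-input shape, and calibration callback are simplifying assumptions, not the script's exact code:

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer
from aimet_torch.quantsim import QuantizationSimModel

# Stand-in model; the real script loads task-specific, optimized weights.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

# Dummy input used to trace the model; 128 is an assumed sequence length.
enc = tokenizer("a calibration sentence", return_tensors="pt",
                padding="max_length", max_length=128, truncation=True)
dummy_input = (enc["input_ids"], enc["attention_mask"])

# W8A8: 8-bit parameters and 8-bit activations.
sim = QuantizationSimModel(model,
                           dummy_input=dummy_input,
                           default_param_bw=8,
                           default_output_bw=8)

# Calibrate quantizer encodings with a small forward pass.
def forward_pass(m, _):
    with torch.no_grad():
        m(*dummy_input)

sim.compute_encodings(forward_pass, forward_pass_callback_args=None)

# sim.model is the quantization-simulated model; it is evaluated with
# the same metrics as the FP32 baseline.
with torch.no_grad():
    logits = sim.model(*dummy_input).logits
```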

## Quantization Configuration

The above models use INT8 (W8A8) quantization: 8-bit weights and 8-bit activations, as reflected in the model_config names.

## Results

Below are the results of the PyTorch transformer model BERT on the GLUE benchmark:

| Configuration | CoLA (corr) | SST-2 (acc) | MRPC (f1) | STS-B (corr) | QQP (acc) | MNLI (acc) | QNLI (acc) | RTE (acc) | GLUE |
|---------------|-------------|-------------|-----------|--------------|-----------|------------|------------|-----------|------|
| FP32          | 58.76       | 93.12       | 89.93     | 88.84        | 90.94     | 85.19      | 91.63      | 66.43     | 83.11 |
| W8A8          | 56.93       | 91.28       | 90.34     | 89.13        | 90.78     | 81.68      | 91.14      | 68.23     | 82.44 |