<a href="https://colab.research.google.com/github/GabeRichmond/tagalog-bert-comparative-analysis/blob/main/Quantization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Quantization Notebook**

*An Analysis of Quantized and Non-Quantized Tagalog BERT Model Performance using Benchmark Datasets*

Group NRS
*   NGO, Gabriel Richmond R.
*   REYES, Aramis Faye D.
*   SANTIAGO, Spencer Ivan S.

**STEP 1: Specify a BERT Model**

Enter the name of the BERT model to be quantized

In [None]:
model_name = 'jcblaise/bert-tagalog-base-uncased'

**STEP 2: Select a Quantization Method Below**

Choose one quantization method you would like to use on the BERT model

In [None]:
dynamic_quantization = True          # Post-Training Dynamic Quantization
static_quantization = False           # Post-Training Static Quantization
quantization_aware = False            # Quantization-Aware Training
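
Because exactly one of these flags should be `True`, a small guard can catch an accidental double selection before any model is downloaded. The sketch below is only an illustration: `exactly_one` is a hypothetical helper, not part of the original notebook.

```python
# Hypothetical helper (not part of the original notebook): verify that
# exactly one option in a group of boolean flags is enabled.
def exactly_one(**flags):
    """Return the name of the single enabled flag, or raise ValueError."""
    enabled = [name for name, value in flags.items() if value]
    if len(enabled) != 1:
        raise ValueError(
            f"Expected exactly one of {list(flags)} to be True, got {enabled}"
        )
    return enabled[0]


# Example with the default values used in this notebook
method = exactly_one(
    dynamic_quantization=True,
    static_quantization=False,
    quantization_aware=False,
)
print(method)  # dynamic_quantization
```

The same check could be applied to the dataset flags in Step 3.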

**STEP 3: Select a Dataset Below**

Choose one dataset. Static Quantization uses it for calibration and QAT uses it for training; for Dynamic Quantization it only determines where the quantized model is saved.

In [None]:
hatespeech = True      # Hate Speech Dataset (Binary Text Classification)
dengue = False          # Dengue Dataset (Multilabel Text Classification)
newsph = False          # NewsPH-NLI Dataset (Sentence Entailment)



---



To avoid wasting time or overwriting existing quantized models, the following **File Check** section raises an error if a quantized model already exists for the selected method and dataset.

#### **File Check**

In [None]:
# Google Drive
from google.colab import drive
drive.mount('/content/drive/')

import os

Mounted at /content/drive/


In [None]:
# File Check for Hate Speech (Binary Text Classification)
if hatespeech == True:
  if dynamic_quantization == True:
    if os.path.exists('/content/drive/MyDrive/BERT/hatespeech/models/quantized/dynamic/dynamic_model.pt') == True:
        raise Exception('A dynamic quantized model already exists for the Hate Speech Dataset! Ignore this warning if you wish to requantize')
  elif static_quantization == True:
    if os.path.exists('/content/drive/MyDrive/BERT/hatespeech/models/quantized/static/best_model.pt') == True:
        raise Exception('A static quantized model already exists for the Hate Speech Dataset! Ignore this warning if you wish to requantize')
  elif quantization_aware == True:
    if os.path.exists('/content/drive/MyDrive/BERT/hatespeech/models/quantized/qat/pytorch_model.bin') == True:
        raise Exception('A quantization-aware model already exists for the Hate Speech Dataset! Ignore this warning if you wish to requantize')

In [None]:
# File Check for Dengue (Multilabel Text Classification)
if dengue == True:
  if dynamic_quantization == True:
    if os.path.exists('/content/drive/MyDrive/BERT/dengue/models/quantized/dynamic/dynamic_model.pt') == True:
        raise Exception('A dynamic quantized model already exists for the Dengue Dataset! Ignore this warning if you wish to requantize')
  elif static_quantization == True:
    if os.path.exists('/content/drive/MyDrive/BERT/dengue/models/quantized/static/best_model.pt') == True:
        raise Exception('A static quantized model already exists for the Dengue Dataset! Ignore this warning if you wish to requantize')
  elif quantization_aware == True:
    if os.path.exists('/content/drive/MyDrive/BERT/dengue/models/quantized/qat/pytorch_model.bin') == True:
        raise Exception('A quantization-aware model already exists for the Dengue Dataset! Ignore this warning if you wish to requantize')

In [None]:
# File Check for NewsPH-NLI (Sentence Entailment)
if newsph == True:
  if dynamic_quantization == True:
    if os.path.exists('/content/drive/MyDrive/BERT/newsph/models/quantized/dynamic/dynamic_model.pt') == True:
        raise Exception('A dynamic quantized model already exists for the NewsPH-NLI Dataset! Ignore this warning if you wish to requantize')
  elif static_quantization == True:
    if os.path.exists('/content/drive/MyDrive/BERT/newsph/models/quantized/static/best_model.pt') == True:
        raise Exception('A static quantized model already exists for the NewsPH-NLI Dataset! Ignore this warning if you wish to requantize')
  elif quantization_aware == True:
    if os.path.exists('/content/drive/MyDrive/BERT/newsph/models/quantized/qat/pytorch_model.bin') == True:
        raise Exception('A quantization-aware model already exists for the NewsPH-NLI Dataset! Ignore this warning if you wish to requantize')
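
The three File Check cells differ only in the dataset folder, the method folder, and the model file name, so the same check can be driven by a small lookup table. This is a sketch under the assumption that the Drive layout matches the paths used above; `quantized_model_path` and `check_not_quantized` are hypothetical names.

```python
import os

# Assumption: the folder layout mirrors the paths in the File Check cells.
BASE_DIR = '/content/drive/MyDrive/BERT'
MODEL_FILES = {
    'dynamic': 'dynamic_model.pt',
    'static': 'best_model.pt',
    'qat': 'pytorch_model.bin',
}


def quantized_model_path(dataset, method, base_dir=BASE_DIR):
    """Build the expected path of the quantized model file."""
    return f'{base_dir}/{dataset}/models/quantized/{method}/{MODEL_FILES[method]}'


def check_not_quantized(dataset, method, base_dir=BASE_DIR):
    """Raise if a quantized model already exists; otherwise return its path."""
    path = quantized_model_path(dataset, method, base_dir)
    if os.path.exists(path):
        raise FileExistsError(f'A {method} quantized model already exists at {path}')
    return path


print(quantized_model_path('hatespeech', 'dynamic'))
```

Calling `check_not_quantized('dengue', 'qat')` would then replace the nested `if`/`elif` chain for that combination.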

#### **Imports**

**NOTE:** Huggingface Optimum with Intel Neural Compressor acceleration (used for Quantization-Aware Training) requires a runtime restart after installation to avoid 'torch.mps' errors. **After restarting, rerun Steps 1-3 before continuing (the File Check can be skipped)**.

In [None]:
if quantization_aware == True:
  # Huggingface Optimum with Intel Neural Compressor acceleration
  !pip install optimum[neural-compressor]

##### Imports for Quantization-Aware Training

In [None]:
if quantization_aware == True:
  # Intel Neural Compressor
  !pip install neural-compressor
  from neural_compressor.config import PostTrainingQuantConfig, TuningCriterion, AccuracyCriterion
  from neural_compressor.quantization import fit

  # Huggingface Datasets
  !pip install datasets
  import datasets
  from datasets import DatasetDict    # `metric` is assigned later via evaluate.load("accuracy")

  # Huggingface Evaluate
  !pip install evaluate
  import evaluate

  # Huggingface Optimum
  from optimum.intel.neural_compressor import INCTrainer

  # Huggingface Transformers
  from transformers import default_data_collator, Trainer, TrainingArguments

  # Intel Neural Compressor
  from neural_compressor import QuantizationAwareTrainingConfig

##### Imports for Static Quantization

In [None]:
if static_quantization == True:
  # Intel Neural Compressor
  !pip install neural-compressor
  from neural_compressor.config import PostTrainingQuantConfig, TuningCriterion, AccuracyCriterion
  from neural_compressor.quantization import fit

  # PyTorch Lightning
  !pip install lightning
  from lightning.pytorch import LightningModule

##### Imports for Dynamic Quantization

In [None]:
if dynamic_quantization == True:
  import torch

##### General Imports

In [None]:
# Huggingface Transformers
!pip install transformers
import transformers
from transformers import BertConfig, BertForSequenceClassification, BertModel, BertTokenizer

# Pandas
import pandas as pd

Collecting transformers
  Downloading transformers-4.30.2-py3-none-any.whl (7.2 MB)
Collecting huggingface-hub<1.0,>=0.14.1 (from transformers)
  Downloading huggingface_hub-0.15.1-py3-none-any.whl (236 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers)
  Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
Collecting safetensors>=0.3.1 (from transformers)
  Downloading safetensors-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)

In [None]:
# Warnings (to disable)
import warnings
warnings.filterwarnings('ignore')
pd.options.mode.chained_assignment = None

### **Load Datasets**

#### **Static Quantization**

In [None]:
if static_quantization == True:
  if hatespeech == True:
    train_df = pd.read_csv('/content/drive/MyDrive/BERT/hatespeech/datasets/train.csv', lineterminator='\n', index_col = 0)        # Training
    val_df = pd.read_csv('/content/drive/MyDrive/BERT/hatespeech/datasets/valid.csv', lineterminator='\n', index_col = 0)          # Validation
    test_df = pd.read_csv('/content/drive/MyDrive/BERT/hatespeech/datasets/test.csv', lineterminator='\n', index_col = 0)          # Testing

  elif dengue == True:
    train_df = pd.read_csv('/content/drive/MyDrive/BERT/dengue/datasets/train.csv', lineterminator='\n', index_col = 0)            # Training
    val_df = pd.read_csv('/content/drive/MyDrive/BERT/dengue/datasets/valid.csv', lineterminator='\n', index_col = 0)              # Validation
    test_df = pd.read_csv('/content/drive/MyDrive/BERT/dengue/datasets/test.csv', lineterminator='\n', index_col = 0)              # Testing

  elif newsph == True:
    train_df = pd.read_csv('/content/drive/MyDrive/BERT/newsph/datasets/train.csv', lineterminator='\n', index_col = 0)            # Training
    val_df = pd.read_csv('/content/drive/MyDrive/BERT/newsph/datasets/valid.csv', lineterminator='\n', index_col = 0)              # Validation
    test_df = pd.read_csv('/content/drive/MyDrive/BERT/newsph/datasets/test.csv', lineterminator='\n', index_col = 0)              # Testing

  else:
    raise Exception('No dataset selected! Set hatespeech, dengue, or newsph to True in Step 3')

In [None]:
# Double-check for and Delete N/A Entries
if static_quantization == True:

  # Training Dataset
  train_df = train_df.dropna()
  train_df = train_df.reset_index(drop = True)

  # Validation Dataset
  val_df = val_df.dropna()
  val_df = val_df.reset_index(drop = True)

  # Testing Dataset
  test_df = test_df.dropna()
  test_df = test_df.reset_index(drop = True)

#### **Quantization-Aware Training**

In [None]:
if quantization_aware == True:
  if hatespeech == True:
    dataDict = DatasetDict.load_from_disk('/content/drive/MyDrive/BERT/hatespeech/datasets/dataDict')

  elif dengue == True:
    dataDict = DatasetDict.load_from_disk('/content/drive/MyDrive/BERT/dengue/datasets/dataDict')

  elif newsph == True:
    dataDict = DatasetDict.load_from_disk('/content/drive/MyDrive/BERT/newsph/datasets/dataDict')

  else:
    raise Exception('No dataset selected! Set hatespeech, dengue, or newsph to True in Step 3')

### **Quantization**

#### **Dynamic Quantization**

##### Base Model Preparation

In [None]:
if dynamic_quantization == True:
  bert_model = BertModel.from_pretrained(model_name, return_dict = False)


Some weights of the model checkpoint at jcblaise/bert-tagalog-base-uncased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


##### Begin Quantization

In [None]:
if dynamic_quantization == True:
  quantized_model = torch.quantization.quantize_dynamic(bert_model, {torch.nn.Linear}, dtype = torch.qint8)
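
Dynamic quantization stores the weights of the selected layers (here `torch.nn.Linear`) as 8-bit integers and quantizes activations on the fly at inference time. The arithmetic behind mapping floats to `qint8` can be illustrated with a simplified affine scheme. This is a pure-Python sketch for intuition only, not PyTorch's actual implementation, and all function names are hypothetical.

```python
# Simplified illustration of affine int8 quantization (an assumption-laden
# sketch for intuition; PyTorch's real kernels differ in detail).
QMIN, QMAX = -128, 127  # value range of a signed 8-bit integer


def choose_qparams(values):
    """Pick a scale and zero point covering the range of `values` and 0.0."""
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (QMAX - QMIN) or 1.0  # avoid a zero scale for all-zero input
    zero_point = round(QMIN - lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point):
    """Map floats to int8 codes, clamping to the representable range."""
    return [max(QMIN, min(QMAX, round(v / scale) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    """Map int8 codes back to approximate floats."""
    return [(q - zero_point) * scale for q in codes]


weights = [-0.52, -0.1, 0.0, 0.37, 1.2]
scale, zp = choose_qparams(weights)
codes = quantize(weights, scale, zp)
restored = dequantize(codes, scale, zp)

# Each restored weight is within one quantization step of the original
print(codes)
print([round(w, 3) for w in restored])
```

The rounding error per weight is bounded by the step size `scale`, which is why a model with a narrow weight range tends to lose little accuracy under this kind of mapping.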

##### Save Quantized Model

In [None]:
if dynamic_quantization == True:
  if hatespeech == True:
    torch.save(quantized_model, '/content/drive/MyDrive/BERT/hatespeech/models/quantized/dynamic/dynamic_model.pt')

  if dengue == True:
    torch.save(quantized_model, '/content/drive/MyDrive/BERT/dengue/models/quantized/dynamic/dynamic_model.pt')

  if newsph == True:
    torch.save(quantized_model, '/content/drive/MyDrive/BERT/newsph/models/quantized/dynamic/dynamic_model.pt')

#### **Static Quantization**

##### Base Model Preparation

In [None]:
if static_quantization == True:
  class ModelForSQ(LightningModule):
      def __init__(self):
          super(ModelForSQ, self).__init__()
          self.config = BertConfig.from_pretrained(model_name, num_labels = 5 if dengue == True else 2)
          self.model = BertForSequenceClassification.from_pretrained(model_name, config = self.config)

      def forward(self, **inputs):
          return self.model(**inputs)

  model_to_quant = ModelForSQ()

##### Configuration

In [None]:
# Tokenizer
tokenizer = BertTokenizer.from_pretrained(model_name)

# Batch Size and Maximum Length
BATCH_SIZE = 16
MAX_LEN = 128 if dengue == True else 64


In [None]:
if static_quantization == True:
  accuracy_criterion = AccuracyCriterion(tolerable_loss=0.01)
  tuning_criterion = TuningCriterion(max_trials=600)

  conf = PostTrainingQuantConfig(
      approach="static", backend="default", tuning_criterion = tuning_criterion, accuracy_criterion = accuracy_criterion
  )
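
A tolerable loss of `0.01` can be read as allowing the quantized model to lose at most 1% of the baseline accuracy. The sketch below illustrates that reading in plain Python, assuming a relative criterion where higher accuracy is better; `within_tolerance` is a hypothetical helper and does not reproduce Intel Neural Compressor's internal logic.

```python
# Hypothetical illustration of a tolerable-loss check; assumes accuracy is
# "higher is better" and that `baseline` is greater than zero.
def within_tolerance(baseline, quantized, tolerable_loss=0.01, relative=True):
    """Return True if the accuracy drop stays within the tolerated loss."""
    drop = baseline - quantized
    if relative:
        drop = drop / baseline  # express the drop as a fraction of the baseline
    return drop <= tolerable_loss


print(within_tolerance(0.80, 0.795))  # True: relative drop is about 0.6%
print(within_tolerance(0.80, 0.78))   # False: relative drop is 2.5%
```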

In [None]:
# Calibration Dataloader with the Hate Speech or Dengue Dataset
if static_quantization == True and (hatespeech == True or dengue == True):
  class CalibrationDataloader(object):
      def __init__(self):
          self.tokenizer = tokenizer
          self.sequence = val_df.text.tolist()
          self.encoded_input = self.tokenizer(
              self.sequence,
              max_length = MAX_LEN,
              padding = 'max_length',        # replaces the deprecated pad_to_max_length = True
              truncation = True,
              return_tensors = 'pt'
          )
          self.label = val_df.labels if dengue == True else val_df.label
          self.batch_size = BATCH_SIZE

      def __iter__(self):
          yield self.encoded_input, self.label

  calib_dataloader = CalibrationDataloader()

# Calibration Dataloader with the NewsPH-NLI Dataset
# (elif instead of else, so this branch is skipped when Static Quantization is not selected)
elif static_quantization == True and newsph == True:
  class CalibrationDataloader(object):
      def __init__(self):
          self.tokenizer = tokenizer
          # Lists of sentences; str(val_df.s1) would tokenize the printed Series instead
          self.sequence1 = val_df.s1.tolist()
          self.sequence2 = val_df.s2.tolist()
          self.encoded_input = self.tokenizer(
              self.sequence1,
              self.sequence2,
              max_length = MAX_LEN,
              padding = 'max_length',        # equal-length rows are required for return_tensors = 'pt'
              truncation = True,
              return_tensors = 'pt'
          )
          self.label = val_df.label
          self.batch_size = BATCH_SIZE

      def __iter__(self):
          yield self.encoded_input, self.label

  calib_dataloader = CalibrationDataloader()
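
The calibration dataloaders above yield the whole validation set as a single batch. If calibrating in batches of `BATCH_SIZE` is preferred, the iteration could be sketched as follows. This pure-Python sketch uses plain lists in place of tokenized tensors, and `iter_batches` is a hypothetical helper.

```python
# Hypothetical batching helper; plain lists stand in for tokenized tensors.
def iter_batches(inputs, labels, batch_size):
    """Yield (inputs, labels) slices of at most `batch_size` items."""
    if len(inputs) != len(labels):
        raise ValueError('inputs and labels must have the same length')
    for start in range(0, len(inputs), batch_size):
        end = start + batch_size
        yield inputs[start:end], labels[start:end]


texts = [f'sample {i}' for i in range(5)]
labels = [0, 1, 0, 1, 1]
for batch_texts, batch_labels in iter_batches(texts, labels, batch_size=2):
    print(len(batch_texts), batch_labels)
```

The last batch is simply shorter when the dataset size is not a multiple of the batch size.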

##### Begin Quantization

In [None]:
if static_quantization == True:
  quantized_model = fit(model = model_to_quant.model, conf = conf, calib_dataloader = calib_dataloader)

##### Save Quantized Model

In [None]:
if static_quantization == True:
  if hatespeech == True:
    quantized_model.save('/content/drive/MyDrive/BERT/hatespeech/models/quantized/static')

  if dengue == True:
    quantized_model.save('/content/drive/MyDrive/BERT/dengue/models/quantized/static')

  if newsph == True:
    quantized_model.save('/content/drive/MyDrive/BERT/newsph/models/quantized/static')

#### **Quantization-Aware Training**

##### Base Model Preparation

In [None]:
if quantization_aware == True:
  model_to_quant = BertForSequenceClassification.from_pretrained(
      model_name,
      num_labels = 5 if dengue == True else 2,
      problem_type = "multi_label_classification" if dengue == True else None)


Some weights of the model checkpoint at jcblaise/bert-tagalog-base-uncased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.decoder.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForSequenceClassification were not initialized from the mo

##### Configuration

In [None]:
# Tokenizer
tokenizer = BertTokenizer.from_pretrained(model_name)

# Batch Size and Maximum Length
BATCH_SIZE = 16
MAX_LEN = 128 if dengue == True else 64


In [None]:
if quantization_aware == True:
  if hatespeech == True:
    output_dir = "/content/drive/MyDrive/BERT/hatespeech/models/quantized/qat"
  elif dengue == True:
    output_dir = "/content/drive/MyDrive/BERT/dengue/models/quantized/qat"
  elif newsph == True:
    output_dir = "/content/drive/MyDrive/BERT/newsph/models/quantized/qat"

In [None]:
# Training Arguments and Function for Computing Metrics
if quantization_aware == True:
  args = TrainingArguments(
      output_dir = output_dir,
      do_train = True,
      do_eval = True,
      evaluation_strategy = "epoch",
      save_strategy = "epoch",
      learning_rate = 2e-5,
      per_device_train_batch_size = BATCH_SIZE,
      per_device_eval_batch_size = BATCH_SIZE,
      num_train_epochs = 1,
      weight_decay = 0.01,
      load_best_model_at_end = True,
      metric_for_best_model = "accuracy",
  )

  def compute_metrics(eval_pred):
      predictions, labels = eval_pred
      # Predicted class is the index of the largest logit (single-label tasks)
      predictions = predictions.argmax(axis = -1)
      return metric.compute(predictions = predictions, references = labels)
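
For single-label classification, accuracy is typically computed by taking the index of the largest logit in each row as the predicted class and comparing it with the reference label. A minimal pure-Python sketch of that computation follows; the helper names are hypothetical and it does not depend on `evaluate`.

```python
# Hypothetical helpers illustrating accuracy from raw logits.
def argmax_rows(logits):
    """Return the index of the largest value in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class matches the label."""
    predictions = argmax_rows(logits)
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


logits = [[2.1, -0.3], [0.2, 1.5], [-1.0, 0.4], [0.9, 0.8]]
labels = [0, 1, 0, 0]
print(accuracy(logits, labels))  # 0.75
```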

In [None]:
# Load Preset Quantization Configuration for QAT and Define Training Parameters
if quantization_aware == True:
  quantization_config = QuantizationAwareTrainingConfig()
  metric = evaluate.load("accuracy")

  trainer = INCTrainer(
      model = model_to_quant,
      quantization_config = quantization_config,
      args = args,
      train_dataset = dataDict["train"].select(range(300)),
      eval_dataset = dataDict["val"].select(range(300)),
      compute_metrics = compute_metrics,
      tokenizer = tokenizer,
      data_collator = default_data_collator,
  )


##### Begin Quantization

In [None]:
# Begin Training
if quantization_aware:
  trainer.train()

The following columns in the training set don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running training *****
  Num examples = 300
  Num Epochs = 1
  Instantaneous batch size per device = 16
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 19
2023-06-21 07:38:00 [INFO] Fx trace of the entire model failed. We will conduct auto quantization


Epoch,Training Loss,Validation Loss


Evaluation of quantized models is not supported by the CUDA backend.
The following columns in the evaluation set don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.
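
The `Total optimization steps = 19` reported in the log follows from the 300 selected training examples, the batch size of 16, and the single epoch. A quick check of that arithmetic, assuming one device and no gradient accumulation as the log indicates:

```python
import math

num_examples = 300   # dataDict["train"].select(range(300))
batch_size = 16      # BATCH_SIZE
num_epochs = 1       # num_train_epochs

# The last, partially filled batch still counts as one optimization step
steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
print(total_steps)  # 19
```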


In [None]:
# Begin Evaluation (OPTIONAL)
if quantization_aware == True:
  trainer.evaluate()

Evaluation of quantized models is not supported by the CUDA backend.
The following columns in the evaluation set don't have a corresponding argument in `BertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `BertForSequenceClassification.forward`,  you can safely ignore this message.


Epoch,Training Loss,Validation Loss


##### Save Quantized Model

In [None]:
if quantization_aware == True:
  trainer.save_model()

Saving model checkpoint to /content/drive/MyDrive/BERT/dengue/models/quantized/qat
Configuration saved in /content/drive/MyDrive/BERT/dengue/models/quantized/qat/inc_config.json
Model weights saved in /content/drive/MyDrive/BERT/dengue/models/quantized/qat/pytorch_model.bin
