## How-to Guide: Using a PIP package for fine-tuning a BERT model

Author: [Chen Chen](https://github.com/chenGitHuber)

In this example, we will work through fine-tuning a BERT model using the TensorFlow Model Garden PIP package (`tf-models-nightly`).

## License

Copyright 2020 The TensorFlow Authors. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

## Learning objectives

In this Colab notebook, you will learn how to fine-tune a BERT model using the TensorFlow Model Garden PIP package.

## Enable GPU acceleration

Enable a GPU for better performance:
*   Navigate to Edit 🡒 Notebook settings
*   Select GPU from the "Hardware Accelerator" drop-down list


## Install the Model Garden PIP package

Install the Model Garden PIP package (`tf-models-nightly`). As the output below shows, pip also collects the packages it depends on.

In [0]:
pip install tf-models-nightly

Collecting tf-models-nightly
  Downloading https://files.pythonhosted.org/packages/bd/7c/1390d4e05d4d370e91d32dd9700d3a462dbc560c7f4e95a6477592b17def/tf_models_nightly-2.2.0.dev20200326-py2.py3-none-any.whl (710kB)
Collecting opencv-python-headless
  Downloading https://files.pythonhosted.org/packages/0b/23/5f10b30a48b218a4884bc84188c14381ac71288b210f6f8079a54f7a05e8/opencv_python_headless-4.2.0.32-cp36-cp36m-manylinux1_x86_64.whl (21.6MB)
Collecting tensorflow-model-optimization>=0.2.1
  Downloading https://files.pythonhosted.org/packages/8f/c4/4c3d011e432bd9c19f0323f7da7d3f783402615e4c3b5a98416c7da9cb05/tensorflow_model_optimization-0.2.1-py2.py3-none-any.whl (93kB)
Collecting mlperf-compliance==0.0.10
  Downloading https://files.pythonhosted.org/packages/f4/08/f2febd8cbd5c9371f7dab311e90400d8

## BERT Fine-tuning

The following code imports the modules needed to fine-tune a BERT model on a classification task.



In [0]:
%tensorflow_version 2.x
import tensorflow as tf

import json
import math

from official.utils.misc import distribution_utils
from official.nlp import optimization
from official.nlp.bert import bert_models
from official.nlp.bert import configs as bert_configs
from official.nlp.bert import run_classifier
from official.modeling import activations
from official.nlp.modeling import networks
from official.nlp.modeling.models import bert_classifier

This section of code performs the following tasks:
* Load data for fine-tuning
* Fine-tune a BERT model
* Save the fine-tuned model to a TensorFlow SavedModel file

To see how the train/eval data are created, refer to [create_finetuning_data.py](https://github.com/tensorflow/models/blob/master/official/nlp/data/create_finetuning_data.py).
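
The cell that follows reads a metadata file and derives its step counts from it. As a rough sketch of that arithmetic, the snippet below uses a hypothetical metadata dictionary with the same keys the notebook reads. The numbers are illustrative assumptions, not values read from `mrpc_meta_data`, and `training_plan` is a helper name made up for this example.

```python
# Hypothetical metadata with the keys the notebook reads from the JSON file.
# The numbers are illustrative assumptions, not the contents of the real file.
example_meta_data = {
    "max_seq_length": 128,
    "num_labels": 2,
    "train_data_size": 3668,
    "eval_data_size": 408,
}


def training_plan(meta, batch_size, eval_batch_size, epochs, warmup_fraction=0.1):
    """Derive step counts with the same formulas as the fine-tuning cell."""
    steps_per_epoch = int(meta["train_data_size"] / batch_size)
    warmup_steps = int(
        epochs * meta["train_data_size"] * warmup_fraction / batch_size)
    return {
        "steps_per_epoch": steps_per_epoch,
        "num_train_steps": steps_per_epoch * epochs,
        "num_warmup_steps": warmup_steps,
        "validation_steps": int(meta["eval_data_size"] / eval_batch_size),
    }


plan = training_plan(example_meta_data, batch_size=32, eval_batch_size=32, epochs=3)
print(plan)
```

Because the step counts are truncated to integers, any examples left over after the last full batch of an epoch are not counted in `steps_per_epoch` or `validation_steps`.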

In [0]:

train_data_path = "gs://cloud-tpu-checkpoints/bert/classification/mrpc_train.tf_record"
eval_data_path = "gs://cloud-tpu-checkpoints/bert/classification/mrpc_eval.tf_record"
input_meta_path = "gs://cloud-tpu-checkpoints/bert/classification/mrpc_meta_data"

bert_config_file = "gs://cloud-tpu-checkpoints/bert/keras_bert/uncased_L-12_H-768_A-12/bert_config.json"
ckpt_path = 'gs://cloud-tpu-checkpoints/bert/keras_bert/uncased_L-12_H-768_A-12/bert_model.ckpt'

# Read the dataset metadata (sequence length, label count, dataset sizes).
with tf.io.gfile.GFile(input_meta_path, 'rb') as reader:
  input_meta_data = json.loads(reader.read().decode('utf-8'))

max_seq_length = input_meta_data['max_seq_length']
num_classes = input_meta_data['num_labels']
batch_size = 32
eval_batch_size = 32

# Build input functions that create the training and evaluation datasets.
train_input_fn = run_classifier.get_dataset_fn(
    train_data_path, max_seq_length, batch_size, is_training=True)
eval_input_fn = run_classifier.get_dataset_fn(
    eval_data_path, max_seq_length, eval_batch_size, is_training=False)

strategy = distribution_utils.get_distribution_strategy(
      distribution_strategy='one_device', num_gpus=1)

with strategy.scope():
  training_dataset = train_input_fn()
  evaluation_dataset = eval_input_fn()
  bert_config = bert_configs.BertConfig.from_json_file(bert_config_file)
  classifier_model, encoder = bert_models.classifier_model(
      bert_config, num_classes, max_seq_length)

  # Initialize the encoder weights from the pretrained checkpoint.
  checkpoint = tf.train.Checkpoint(model=encoder)
  checkpoint.restore(ckpt_path).assert_consumed()

  epochs = 3
  train_data_size = input_meta_data['train_data_size']
  eval_data_size = input_meta_data['eval_data_size']
  steps_per_epoch = int(train_data_size / batch_size)
  # Warm up the learning rate over roughly the first 10% of training steps.
  warmup_steps = int(epochs * train_data_size * 0.1 / batch_size)
  optimizer = optimization.create_optimizer(
      2e-5, num_train_steps=steps_per_epoch * epochs, num_warmup_steps=warmup_steps)

  def metric_fn():
    return tf.keras.metrics.SparseCategoricalAccuracy(
        'test_accuracy', dtype=tf.float32)

  # Use the label count from the metadata instead of a hard-coded value.
  classifier_model.compile(optimizer=optimizer,
                           loss=run_classifier.get_loss_fn(num_classes=num_classes),
                           metrics=[metric_fn()])
  classifier_model.fit(
      x=training_dataset,
      validation_data=evaluation_dataset,
      steps_per_epoch=steps_per_epoch,
      epochs=epochs,
      validation_steps=int(eval_data_size / eval_batch_size))

  classifier_model.save('/tmp/saved_model', include_optimizer=False, save_format='tf')

Note that input tensors are instantiated via `tensor = tf.keras.Input(shape)`.
The tensor that caused the issue was: input_mask:0
Note that input tensors are instantiated via `tensor = tf.keras.Input(shape)`.
The tensor that caused the issue was: input_type_ids:0
Epoch 1/3
Epoch 2/3
Epoch 3/3
INFO:tensorflow:Assets written to: /tmp/saved_model/assets
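
The fine-tuning cell passes a peak learning rate, a total step count, and a warmup step count to `optimization.create_optimizer`. The sketch below shows one plausible shape for such a schedule in plain Python, assuming linear warmup followed by linear decay to zero over the total step count. That shape is an assumption made for illustration and is not taken from the library's source. `learning_rate_at` is a hypothetical helper, and the step counts in the usage lines are example values.

```python
def learning_rate_at(step, peak_lr, num_train_steps, num_warmup_steps):
    """Assumed schedule: linear warmup, then linear decay toward zero."""
    if num_warmup_steps > 0 and step < num_warmup_steps:
        # Ramp up proportionally to the fraction of warmup completed.
        return peak_lr * step / num_warmup_steps
    # After warmup, decay linearly with the fraction of training completed.
    remaining = max(0.0, 1.0 - step / num_train_steps)
    return peak_lr * remaining


# Example values only: 342 total steps with 34 warmup steps.
for step in (0, 17, 34, 171, 342):
    print(step, learning_rate_at(step, 2e-5, 342, 34))
```

Under this assumption the rate starts at zero, climbs during warmup, and returns to zero at the final step, so the largest updates happen early in fine-tuning.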

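The `compile` call pairs a classification loss with `SparseCategoricalAccuracy`. As a plain-Python sketch of what those two quantities measure, the snippet below assumes the loss is a mean softmax cross-entropy over integer labels. The function names and sample logits are hypothetical and do not come from the Model Garden package.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax for one example."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum for x in logits]


def mean_cross_entropy(labels, batch_logits):
    """Average negative log-probability assigned to the true class."""
    losses = [-log_softmax(logits)[label]
              for label, logits in zip(labels, batch_logits)]
    return sum(losses) / len(losses)


def sparse_accuracy(labels, batch_logits):
    """Fraction of examples whose highest logit matches the integer label."""
    hits = sum(
        1 for label, logits in zip(labels, batch_logits)
        if max(range(len(logits)), key=logits.__getitem__) == label)
    return hits / len(labels)


# Made-up labels and two-class logits for three examples.
labels = [1, 0, 1]
batch_logits = [[0.2, 1.5], [2.0, -1.0], [0.9, 0.1]]
print(mean_cross_entropy(labels, batch_logits))
print(sparse_accuracy(labels, batch_logits))
```

The number of logits per example must equal the number of classes in the dataset, which is why the label count from the metadata is the natural source for `num_classes`.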