<a href="https://colab.research.google.com/github/hoangtheanhhp/ZaloQA/blob/bert/Zalo_AI_anhht.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# BERT Fine-Tuning on Quora Question Pairs

---

In this Colab notebook, we will try to reproduce state-of-the-art results on Quora Question Pairs by fine-tuning a BERT model.

If you are not familiar with BERT, please see [The Illustrated BERT](http://jalammar.github.io/illustrated-bert/), the [BERT Research Paper](https://arxiv.org/abs/1810.04805) and the [BERT Github Repo](https://github.com/google-research/bert).

<table class="tfo-notebook-buttons" align="left" >
 <td>
    <a target="_blank" href="https://colab.research.google.com/drive/1dCbs4Th3hzJfWEe6KT-stIVDMqHZSA5V"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/drc10723/bert_quora_question_pairs"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View Copy on GitHub</a>
  </td>
</table>

This Colab notebook supports both TPU and GPU runtimes.

In [0]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Setting Up Environment 


**USE_TPU:** Set to `True` to use the TPU runtime. First change the Colab notebook runtime type to TPU.

**BERT_MODEL:** Choose the BERT model:
1.   **uncased_L-12_H-768_A-12**: uncased BERT base model
2.   **uncased_L-24_H-1024_A-16**: uncased BERT large model
3.   **cased_L-12_H-768_A-12**: cased BERT base model

**BUCKET:** Name of the GCS bucket. A bucket is required for TPU. On a GPU runtime, if `BUCKET` is empty, the local disk is used.

For more details on how to set up a TPU, follow this [Colab Notebook](https://colab.research.google.com/github/tensorflow/tpu/blob/master/tools/colab/bert_finetuning_with_cloud_tpus.ipynb#scrollTo=RRu1aKO1D7-Z).
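
The bucket rule above can be sketched as a small function. This is only an illustration of the decision made in the setup cell; the function name `choose_output_dir` and its `subdir` parameter are hypothetical and not part of the notebook.

```python
def choose_output_dir(bucket, use_tpu, subdir="zaloAI/bert_0106"):
    """Return the output location implied by BUCKET and USE_TPU.

    Hypothetical helper: it mirrors the if/elif/else of the setup cell
    but does not create any directory.
    """
    if bucket:
        # A bucket is given: write checkpoints and outputs to GCS.
        return "gs://{}/{}".format(bucket, subdir)
    if use_tpu:
        # A TPU cannot write to the local Colab disk.
        raise ValueError(
            "Must specify an existing GCS bucket name for running on TPU")
    # GPU/CPU runtime without a bucket: fall back to the local disk.
    return "out_dir"


print(choose_output_dir("bertquora", use_tpu=True))
print(choose_output_dir("", use_tpu=False))
```

With an empty bucket and `use_tpu=True` the function raises, which is the same failure the setup cell reports.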



In [0]:
%tensorflow_version 1.x
import os
import sys
import json
import datetime
import pprint
import tensorflow as tf

# Authenticate, so we can access storage bucket and TPU
from google.colab import auth
auth.authenticate_user()

# If you want to use TPU, first switch to tpu runtime in colab
USE_TPU = True #@param{type:"boolean"}

# Sub-directory of the checkpoint bucket holding the chosen BERT model
# (empty here, so the bucket root is used). Large models require a TPU.
BERT_MODEL = '' #@param {type:"string"}

# BERT checkpoint bucket
BERT_PRETRAINED_DIR = 'gs://bertquora/bert_multi/' + BERT_MODEL
print('***** BERT pretrained directory: {} *****'.format(BERT_PRETRAINED_DIR))
!gsutil ls $BERT_PRETRAINED_DIR

# Bucket for saving checkpoints and outputs
BUCKET = 'bertquora' #@param {type:"string"}
if BUCKET!="":
  OUTPUT_DIR = 'gs://{}/zaloAI/bert_0106'.format(BUCKET)
  tf.gfile.MakeDirs(OUTPUT_DIR)
elif USE_TPU:
  raise ValueError('Must specify an existing GCS bucket name for running on TPU')
else:
  OUTPUT_DIR = 'out_dir'
  os.makedirs(OUTPUT_DIR, exist_ok=True)  # does not fail when the cell is re-run
print('***** Model output directory: {} *****'.format(OUTPUT_DIR))

if USE_TPU:
  # getting info on TPU runtime
  assert 'COLAB_TPU_ADDR' in os.environ, 'ERROR: Not connected to a TPU runtime; Change notebook runtype to TPU'
  TPU_ADDRESS = 'grpc://' + os.environ['COLAB_TPU_ADDR']
  print('TPU address is', TPU_ADDRESS)


TensorFlow 1.x selected.
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

***** BERT pretrained directory: gs://bertquora/bert_multi/ *****
gs://bertquora/bert_multi/
gs://bertquora/bert_multi/bert_config.json
gs://bertquora/bert_multi/bert_model.ckpt.data-00000-of-00001
gs://bertquora/bert_multi/bert_model.ckpt.index
gs://bertquora/bert_multi/bert_model.ckpt.meta
gs://bertquora/bert_multi/vocab.txt
***** Model output directory: gs://bertquora/zaloAI/bert_0106 *****
TPU address is grpc://10.119.22.218:8470


## Clone BERT Repo and Download Zalo Dataset 


In [0]:
# Clone the BERT repo and add it to the system path
!test -d bert || git clone -q https://github.com/google-research/bert.git
if 'bert' not in sys.path:
  sys.path += ['bert']
# The Zalo dataset is read from the mounted Google Drive folder.
TASK_DATA_DIR = 'drive/My Drive/ZaloAI/dataset/zalo'

## Model Configs and Hyper Parameters


In [0]:
import modeling
import optimization
import tokenization
import run_classifier
from tqdm import tqdm

# Model Hyper Parameters
TRAIN_BATCH_SIZE = 128 # For GPU, reduce to 16
EVAL_BATCH_SIZE = 8
PREDICT_BATCH_SIZE = 8
LEARNING_RATE = 2e-5
NUM_TRAIN_EPOCHS = 5.0
WARMUP_PROPORTION = 0.1
MAX_SEQ_LENGTH = 512

# Model configs
SAVE_CHECKPOINTS_STEPS = 100
ITERATIONS_PER_LOOP = 100
NUM_TPU_CORES = 8
VOCAB_FILE = os.path.join(BERT_PRETRAINED_DIR, 'vocab.txt')
CONFIG_FILE = os.path.join(BERT_PRETRAINED_DIR, 'bert_config.json')
INIT_CHECKPOINT = os.path.join(BERT_PRETRAINED_DIR, 'bert_model.ckpt')
DO_LOWER_CASE = False





## Read Question-Answer Pairs

We will read the data from a JSON file and convert it to a list of `InputExample` objects. For the `InputExample` and `DataProcessor` class definitions, refer to the [run_classifier](https://github.com/google-research/bert/blob/master/run_classifier.py) file.
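
As a rough sketch of what reading a SQuAD-style file involves, the snippet below flattens a nested document into `[question, answer text, is_impossible]` rows. The sample document is invented for illustration and `flatten_squad` is a hypothetical name; only the field names (`data`, `paragraphs`, `qas`, `question`, `answers`, `text`, `is_impossible`) are taken from the processor code in this notebook.

```python
import json

# Hypothetical SQuAD-style sample, made up for this illustration only.
sample = json.loads("""
{"data": [{"paragraphs": [{"qas": [
  {"question": "Q1?", "answers": [{"text": "A1"}], "is_impossible": false},
  {"question": "Q2?", "answers": [], "is_impossible": true}
]}]}]}
""")


def flatten_squad(doc):
    """Flatten a SQuAD-style dict into [question, answer_text, is_impossible] rows."""
    rows = []
    for article in doc.get("data", []):
        for paragraph in article.get("paragraphs", []):
            for qas in paragraph.get("qas", []):
                answers = qas.get("answers")
                if answers:  # questions without any answer are skipped
                    rows.append([qas.get("question"),
                                 answers[0].get("text"),
                                 qas.get("is_impossible")])
    return rows


rows = flatten_squad(sample)
print(rows)
```

Only the first question survives here, because the second one has an empty `answers` list.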

In [0]:
import json
from run_classifier import DataProcessor, InputExample
class ZQAProcessor(DataProcessor):
  """Processor for the Zalo Question Answering data set."""

  def _read_json(self, input_file, mode="train"):
    """Reads a JSON file and returns a list of [question, text, label] rows."""
    print('read_json..........')
    try:
        # Assumption: the dataset files are UTF-8 encoded.
        with open(input_file, 'r', encoding='utf-8') as file:
            data = json.load(file)
            if mode == 'squad':
                data = data.get('data')
                res = []
                for d in data:
                    for par in d.get('paragraphs'):
                        for qas in par.get('qas'):
                            ques = qas.get('question')
                            answer = qas.get('answers')
                            if answer and len(answer) > 0:
                                answer = answer[0]
                                res.append([ques,
                                            answer.get('text'),
                                            qas.get('is_impossible')])
                return res

            else:
                return [[data_instance['question'],
                        data_instance['text'],
                        data_instance.get('label', False)]
                        for data_instance in tqdm(data)]
    except FileNotFoundError:
        return []

  def get_train_examples(self, data_dir):
    """See base class."""
    return self._create_examples(
        self._read_json(os.path.join(data_dir, "train.json")), "train")
    
  def get_squad_examples(self, data_dir):
    """Reads the SQuAD-format file `squad_vi.json` as training examples."""
    return self._create_examples(
        self._read_json(os.path.join(data_dir, "squad_vi.json"), mode="squad"), "train")

  def get_test_examples(self, data_dir):
    """See base class."""
    return self._create_examples(
        self._read_json(os.path.join(data_dir, "test.json")), "test")

  def get_labels(self):
    """See base class."""
    return ["False", "True"]

  def _create_examples(self, lines, set_type):
    """Creates examples for the training and dev sets."""
    examples = []
    for (i, line) in enumerate(lines):
      guid = "%s-%s" % (set_type, i)
      text_a = tokenization.convert_to_unicode(line[0])
      text_b = tokenization.convert_to_unicode(line[1])
      # Labels may be JSON booleans; str() maps them onto the "False"/"True" label list.
      label = tokenization.convert_to_unicode(str(line[2]))
      examples.append(
          InputExample(guid=guid, text_a=text_a, text_b=text_b, label=label))
    return examples

## Convert to Features

We will read the examples and tokenize them with WordPiece tokenization. Finally, we will convert them to `InputFeatures`.

BERT follows the tokenization procedure below:
1.   Instantiate a tokenizer: `tokenizer = tokenization.FullTokenizer(...)`.
2.   Tokenize the raw text with `tokens = tokenizer.tokenize(raw_text)`.
3.   Truncate to the maximum sequence length.
4.   Add the `[CLS]` and `[SEP]` tokens in the right place.

We need to create `segment_ids` and `input_mask` for `InputFeatures`. `segment_ids` will be `0` for the tokens of the first text (the question) and `1` for the tokens of the second text.
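
The procedure above can be sketched in plain Python. This is a simplified illustration, not the code in `run_classifier`: whitespace splitting stands in for WordPiece tokenization, `build_features` is a hypothetical name, and token-to-id conversion is left out.

```python
def build_features(tokens_a, tokens_b, max_seq_length):
    """Build tokens, segment_ids and input_mask for a text pair (sketch)."""
    tokens_a, tokens_b = list(tokens_a), list(tokens_b)
    # Truncate: trim the longer list until the pair fits, leaving room
    # for the three special tokens [CLS], [SEP], [SEP].
    while len(tokens_a) + len(tokens_b) > max_seq_length - 3:
        longer = tokens_a if len(tokens_a) > len(tokens_b) else tokens_b
        longer.pop()

    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Segment 0 covers [CLS], the first text and its [SEP];
    # segment 1 covers the second text and the final [SEP].
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    # The mask is 1 for real tokens and 0 for padding positions.
    input_mask = [1] * len(tokens)

    padding = max_seq_length - len(tokens)
    segment_ids += [0] * padding
    input_mask += [0] * padding
    return tokens, segment_ids, input_mask


# Whitespace splitting is only a stand-in for the WordPiece tokenizer.
tokens, segment_ids, input_mask = build_features(
    "who wrote it".split(), "she wrote it in 1990".split(), max_seq_length=12)
print(tokens)
print(segment_ids)
print(input_mask)
```

The padded `segment_ids` and `input_mask` always have length `max_seq_length`, which is why the logged `input_ids` in this notebook end in long runs of zeros.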

We will use the following functions from the [run_classifier](https://github.com/google-research/bert/blob/master/run_classifier.py) file to convert examples to features:


1.   `convert_single_example`: converts a single `InputExample` into a single `InputFeatures`.
2.   `file_based_convert_examples_to_features`: converts a set of `InputExample`s to a TFRecord file.

For more details, look at the outputs of the cells below.



In [0]:
# Instantiate an instance of ZQAProcessor and tokenizer
processor = ZQAProcessor()
label_list = processor.get_labels()
tokenizer = tokenization.FullTokenizer(vocab_file=VOCAB_FILE, do_lower_case=DO_LOWER_CASE)




In [0]:
# Converting training examples to features
print("################  Processing Training Data #####################")
TRAIN_TF_RECORD = os.path.join(OUTPUT_DIR, "train.tf_record")
# TRAIN_SQUAD = os.path.join(OUTPUT_DIR,  # incomplete statement, kept commented out; TRAIN_SQUAD is not used below
train_examples = processor.get_train_examples(TASK_DATA_DIR)
num_train_examples = len(train_examples)
num_train_steps = int(num_train_examples / TRAIN_BATCH_SIZE * NUM_TRAIN_EPOCHS)
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)
run_classifier.file_based_convert_examples_to_features(train_examples, label_list, MAX_SEQ_LENGTH, tokenizer, TRAIN_TF_RECORD)

################  Processing Training Data #####################
read_json..........


100%|██████████| 18106/18106 [00:00<00:00, 722562.76it/s]




INFO:tensorflow:Writing example 0 of 18106
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: train-0
INFO:tensorflow:tokens: [CLS] Ai đang là tổng bí thư Việt Nam hiện nay [SEP] Hiện nay ( 2018 . 1 ) , bà là Bí thư Ban cá ##n sự Đảng Cộng sản Việt Nam , Bí thư Đảng ủy Viện kiểm sát nhân dân thành phố Hải Phòng . [SEP]
INFO:tensorflow:input_ids: 101 19672 21080 10331 23258 57696 30355 14426 12645 13526 21537 102 77513 21537 113 10434 119 122 114 117 27083 10331 90349 30355 21631 23664 10115 12636 37489 26073 16913 14426 12645 117 90349 30355 37489 54931 56995 43229 25133 14694 12486 11629 16851 24964 82419 119 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

## Creating Classification Model

In [0]:
def create_model(bert_config, is_training, input_ids, input_mask, segment_ids,
                 labels, num_labels, use_one_hot_embeddings):
  """Creates a classification model."""
  model = modeling.BertModel(
      config=bert_config,
      is_training=is_training,
      input_ids=input_ids,
      input_mask=input_mask,
      token_type_ids=segment_ids,
      use_one_hot_embeddings=use_one_hot_embeddings)

  # In the demo, we are doing a simple classification task on the entire
  # segment.
  #
  # If you want to use the token-level output, use model.get_sequence_output()
  # instead.
  output_layer = model.get_pooled_output()

  hidden_size = output_layer.shape[-1].value

  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):
    if is_training:
      # I.e., 0.1 dropout
      output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    probabilities = tf.nn.softmax(logits, axis=-1)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    loss = tf.reduce_mean(per_example_loss)

    return (loss, per_example_loss, logits, probabilities)
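
To make the classification head concrete, here is a minimal pure-Python sketch of the same arithmetic for a single example: a linear layer on the pooled output, a log-softmax, and the negative log-probability of the true label. The input numbers are made up, and `classification_loss` is a hypothetical name; dropout and batching are left out.

```python
import math


def classification_loss(pooled, weights, bias, label):
    """Cross-entropy loss of a linear softmax classifier for one example."""
    # logits[k] = <pooled, weights[k]> + bias[k], i.e. matmul with transpose_b=True
    logits = [sum(p * w for p, w in zip(pooled, row)) + b
              for row, b in zip(weights, bias)]
    # Numerically stable log-softmax: subtract the maximum logit first.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    log_probs = [l - log_z for l in logits]
    probabilities = [math.exp(lp) for lp in log_probs]
    # With a one-hot label, the loss reduces to -log_probs[label].
    loss = -log_probs[label]
    return loss, logits, probabilities


# Toy values: hidden size 3, two labels.
loss, logits, probabilities = classification_loss(
    pooled=[1.0, -2.0, 0.5],
    weights=[[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]],
    bias=[0.0, 0.0],
    label=1)
print(loss, logits, probabilities)
```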

## Model Function Builder for Estimator

Depending on the mode, we create the optimizer for training, the evaluation metrics for evaluation, or the prediction outputs, and return the matching estimator spec.

In [0]:
def model_fn_builder(bert_config, num_labels, init_checkpoint, learning_rate,
                     num_train_steps, num_warmup_steps, use_tpu,
                     use_one_hot_embeddings):
  """Returns `model_fn` closure for TPUEstimator."""

  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for TPUEstimator."""

    tf.logging.info("*** Features ***")
    for name in sorted(features.keys()):
      tf.logging.info("  name = %s, shape = %s" % (name, features[name].shape))

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]
    is_real_example = None
    if "is_real_example" in features:
      is_real_example = tf.cast(features["is_real_example"], dtype=tf.float32)
    else:
      is_real_example = tf.ones(tf.shape(label_ids), dtype=tf.float32)

    is_training = (mode == tf.estimator.ModeKeys.TRAIN)

    (total_loss, per_example_loss, logits, probabilities) = create_model(
        bert_config, is_training, input_ids, input_mask, segment_ids, label_ids,
        num_labels, use_one_hot_embeddings)

    tvars = tf.trainable_variables()
    initialized_variable_names = {}
    scaffold_fn = None
    if init_checkpoint:
      (assignment_map, initialized_variable_names
      ) = modeling.get_assignment_map_from_checkpoint(tvars, init_checkpoint)
      if use_tpu:

        def tpu_scaffold():
          tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
          return tf.train.Scaffold()

        scaffold_fn = tpu_scaffold
      else:
        tf.train.init_from_checkpoint(init_checkpoint, assignment_map)

    tf.logging.info("**** Trainable Variables ****")
    for var in tvars:
      init_string = ""
      if var.name in initialized_variable_names:
        init_string = ", *INIT_FROM_CKPT*"
      tf.logging.info("  name = %s, shape = %s%s", var.name, var.shape,
                      init_string)

    output_spec = None
    if mode == tf.estimator.ModeKeys.TRAIN:

      train_op = optimization.create_optimizer(
          total_loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu)

      output_spec = tf.contrib.tpu.TPUEstimatorSpec(
          mode=mode,
          loss=total_loss,
          train_op=train_op,
          scaffold_fn=scaffold_fn)
    elif mode == tf.estimator.ModeKeys.EVAL:

      def metric_fn(per_example_loss, label_ids, logits, is_real_example):
        predictions = tf.argmax(logits, axis=-1, output_type=tf.int32)
        accuracy = tf.metrics.accuracy(
            labels=label_ids, predictions=predictions, weights=is_real_example)
        loss = tf.metrics.mean(values=per_example_loss, weights=is_real_example)
        f1_score = tf.contrib.metrics.f1_score(label_ids, predictions)
        recall = tf.compat.v1.metrics.recall(label_ids, predictions)
        precision = tf.compat.v1.metrics.precision(label_ids, predictions)
        return {
            "eval_accuracy": accuracy,
            "eval_loss": loss,
            "f1_score": f1_score,
            "recall": recall,
            "precision": precision
        }

      eval_metrics = (metric_fn,
                      [per_example_loss, label_ids, logits, is_real_example])
      output_spec = tf.contrib.tpu.TPUEstimatorSpec(
          mode=mode,
          loss=total_loss,
          eval_metrics=eval_metrics,
          scaffold_fn=scaffold_fn)
    else:
      output_spec = tf.contrib.tpu.TPUEstimatorSpec(
          mode=mode,
          predictions={"probabilities": probabilities},
          scaffold_fn=scaffold_fn)
    return output_spec

  return model_fn

## Creating TPUEstimator

In [0]:
# Define TPU configs
if USE_TPU:
  tpu_cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(TPU_ADDRESS)
else:
  tpu_cluster_resolver = None
run_config = tf.contrib.tpu.RunConfig(
    cluster=tpu_cluster_resolver,
    model_dir=OUTPUT_DIR,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS,
    tpu_config=tf.contrib.tpu.TPUConfig(
        iterations_per_loop=ITERATIONS_PER_LOOP,
        num_shards=NUM_TPU_CORES,
        per_host_input_for_training=tf.contrib.tpu.InputPipelineConfig.PER_HOST_V2))

In [0]:
# create model function for estimator using model function builder
model_fn = model_fn_builder(
    bert_config=modeling.BertConfig.from_json_file(CONFIG_FILE),
    num_labels=len(label_list),
    init_checkpoint=INIT_CHECKPOINT,
    learning_rate=LEARNING_RATE,
    num_train_steps=num_train_steps,
    num_warmup_steps=num_warmup_steps,
    use_tpu=USE_TPU,
    use_one_hot_embeddings=True)
    

In [0]:
# Defining TPU Estimator
estimator = tf.contrib.tpu.TPUEstimator(
    use_tpu=USE_TPU,
    model_fn=model_fn,
    config=run_config,
    train_batch_size=TRAIN_BATCH_SIZE,
    eval_batch_size=EVAL_BATCH_SIZE,
    predict_batch_size=PREDICT_BATCH_SIZE)  # needed for estimator.predict on TPU

INFO:tensorflow:Using config: {'_model_dir': 'gs://bertquora/zaloAI/bert_0106', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 100, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
cluster_def {
  job {
    name: "worker"
    tasks {
      key: 0
      value: "10.119.22.218:8470"
    }
  }
}
isolate_session_state: true
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f47e35e5080>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': 'grpc://10.119.22.218:8470', '_evaluation_master': 'grpc://10.119.22.218:8470', '_is_chief': True, '_num_ps_replic

## Finetune Training
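
As a quick check of the step counts used for training, the arithmetic from the feature conversion cell can be replayed on its own. The example count is copied from the training log of this notebook; the other values are the hyperparameters defined earlier.

```python
# Values copied from this notebook: 18106 examples (training log),
# batch size 128, 5 epochs, 10% warmup.
NUM_TRAIN_EXAMPLES = 18106
TRAIN_BATCH_SIZE = 128
NUM_TRAIN_EPOCHS = 5.0
WARMUP_PROPORTION = 0.1

# One step consumes one batch, so steps = examples / batch_size * epochs.
num_train_steps = int(NUM_TRAIN_EXAMPLES / TRAIN_BATCH_SIZE * NUM_TRAIN_EPOCHS)
# The learning rate is warmed up over the first 10% of the steps.
num_warmup_steps = int(num_train_steps * WARMUP_PROPORTION)

print(num_train_steps, num_warmup_steps)  # 707 70
```

The 707 steps match the `Num steps = 707` line in the training log.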



In [0]:
# Train the model.
print('QQP on BERT base model normally takes about 1 hour on TPU and 15-20 hours on GPU. Please wait...')
print('***** Started training at {} *****'.format(datetime.datetime.now()))
print('  Num examples = {}'.format(num_train_examples))
print('  Batch size = {}'.format(TRAIN_BATCH_SIZE))
tf.logging.info("  Num steps = %d", num_train_steps)
# we are using `file_based_input_fn_builder` for creating input function from TF_RECORD file
train_input_1 = run_classifier.file_based_input_fn_builder(TRAIN_TF_RECORD,
                                                            seq_length=MAX_SEQ_LENGTH,
                                                            is_training=True,
                                                            drop_remainder=True)
estimator.train(input_fn=train_input_1, max_steps=num_train_steps)
print('***** Finished training at {} *****'.format(datetime.datetime.now()))

QQP on BERT base model normally takes about 1 hour on TPU and 15-20 hours on GPU. Please wait...
***** Started training at 2020-06-01 07:26:53.721868 *****
  Num examples = 18106
  Batch size = 128
INFO:tensorflow:  Num steps = 707

INFO:tensorflow:Querying Tensorflow master (grpc://10.119.22.218:8470) for TPU system metadata.
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, -1, 5463785012886567022)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 7915997976428828056)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 11323418296666605543)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 171

## Evaluate the Fine-Tuned Model
First we will evaluate on the train set and then on the dev set.
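
Because the evaluation cells pass `steps=int(num_examples / EVAL_BATCH_SIZE)` together with `drop_remainder=True`, a few trailing examples are not evaluated. The sketch below works this out for the example counts printed in this notebook; `eval_coverage` is a hypothetical helper name.

```python
def eval_coverage(num_examples, batch_size):
    """Return (steps, evaluated, skipped) for full batches only."""
    steps = int(num_examples / batch_size)  # same formula as the eval cells
    evaluated = steps * batch_size
    return steps, evaluated, num_examples - evaluated


# Example counts taken from the logs in this notebook, batch size 8.
train_cov = eval_coverage(18106, 8)
dev_cov = eval_coverage(2061, 8)
print(train_cov)  # (2263, 18104, 2)
print(dev_cov)    # (257, 2056, 5)
```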

In [0]:
# eval the model on train set.
print('***** Started Train Set evaluation at {} *****'.format(datetime.datetime.now()))
print('  Num examples = {}'.format(num_train_examples))
print('  Batch size = {}'.format(EVAL_BATCH_SIZE))
# eval input function for train set
train_eval_input_fn = run_classifier.file_based_input_fn_builder(TRAIN_TF_RECORD,
                                                           seq_length=MAX_SEQ_LENGTH,
                                                           is_training=False,
                                                           drop_remainder=True)
# evaluate on train set
result = estimator.evaluate(input_fn=train_eval_input_fn, 
                            steps=int(num_train_examples/EVAL_BATCH_SIZE))
print('***** Finished evaluation at {} *****'.format(datetime.datetime.now()))
print("***** Eval results *****")
for key in sorted(result.keys()):
  print('  {} = {}'.format(key, str(result[key])))

***** Started Train Set evaluation at 2020-06-01 07:42:01.598523 *****
  Num examples = 18106
  Batch size = 8
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:*** Features ***
INFO:tensorflow:  name = input_ids, shape = (1, 512)
INFO:tensorflow:  name = input_mask, shape = (1, 512)
INFO:tensorflow:  name = is_real_example, shape = (1,)
INFO:tensorflow:  name = label_ids, shape = (1,)
INFO:tensorflow:  name = segment_ids, shape = (1, 512)
INFO:tensorflow:**** Trainable Variables ****
INFO:tensorflow:  name = bert/embeddings/word_embeddings:0, shape = (119547, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/token_type_embeddings:0, shape = (2, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/position_embeddings:0, shape = (512, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name 

In [0]:
# Converting eval examples to features
print("################  Processing Dev Data #####################")
EVAL_TF_RECORD = os.path.join(OUTPUT_DIR, "eval.tf_record")
eval_examples = processor.get_test_examples(TASK_DATA_DIR)
num_eval_examples = len(eval_examples)
run_classifier.file_based_convert_examples_to_features(eval_examples, label_list, MAX_SEQ_LENGTH, tokenizer, EVAL_TF_RECORD)

################  Processing Dev Data #####################
read_json..........


100%|██████████| 2061/2061 [00:00<00:00, 736325.43it/s]

INFO:tensorflow:Writing example 0 of 2061
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: test-0
INFO:tensorflow:tokens: [CLS] Con trai đầu của Ngô Qu ##yền tên gì [SEP] Ngô Qu ##yền hạ thành Đại La , giết Công Ti ##ễn rồi bày trận trên sông B ##ạch Đ ##ằng đó ##n quân Nam Hán . [SEP]
INFO:tensorflow:input_ids: 101 12845 34101 11201 10447 58709 71267 96097 15322 49309 102 58709 71267 96097 39446 11629 17404 10159 117 43011 21498 29033 39345 36827 80432 19344 12598 23546 139 29372 298 47279 12393 10115 12488 12645 38056 119 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0




In [0]:
# Eval the model on Dev set.
print('***** Started Dev Set evaluation at {} *****'.format(datetime.datetime.now()))
print('  Num examples = {}'.format(num_eval_examples))
print('  Batch size = {}'.format(EVAL_BATCH_SIZE))

# eval input function for dev set
eval_input_fn = run_classifier.file_based_input_fn_builder(EVAL_TF_RECORD,
                                                           seq_length=MAX_SEQ_LENGTH,
                                                           is_training=False,
                                                           drop_remainder=True)
# evaluate on dev set
result = estimator.evaluate(input_fn=eval_input_fn, steps=int(num_eval_examples/EVAL_BATCH_SIZE))
print('***** Finished evaluation at {} *****'.format(datetime.datetime.now()))
print("***** Eval results *****")
for key in sorted(result.keys()):
  print('  {} = {}'.format(key, str(result[key])))

***** Started Dev Set evaluation at 2020-06-01 07:43:44.057972 *****
  Num examples = 2061
  Batch size = 8
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:*** Features ***
INFO:tensorflow:  name = input_ids, shape = (1, 512)
INFO:tensorflow:  name = input_mask, shape = (1, 512)
INFO:tensorflow:  name = is_real_example, shape = (1,)
INFO:tensorflow:  name = label_ids, shape = (1,)
INFO:tensorflow:  name = segment_ids, shape = (1, 512)
INFO:tensorflow:**** Trainable Variables ****
INFO:tensorflow:  name = bert/embeddings/word_embeddings:0, shape = (119547, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/token_type_embeddings:0, shape = (2, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/position_embeddings:0, shape = (512, 768), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = bert/embeddings/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
INFO:tensorflow:  name = b


## Evaluation Results


---
Evaluation results are on BERT base uncased model. For reproducing similar results, train for 3 epochs.



|**Metrics** | **Train Set** | **Dev Set** |
|---|---|---|
|**Loss**|0.150|0.497|
|**Accuracy**|0.969|0.907|
|**F1**|0.959|0.875|
|**AUC**|0.969|0.902|
|**Precision**|0.949|0.864|
|**Recall**|0.969|0.886|
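
As a sanity check on the table, and assuming the F1 row is the usual harmonic mean of precision and recall (the `tf.contrib.metrics.f1_score` call in the model function may compute it differently), F1 can be recomputed from the precision and recall rows:

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values copied from the table above.
train_f1 = f1_from_precision_recall(0.949, 0.969)
dev_f1 = f1_from_precision_recall(0.864, 0.886)
print(round(train_f1, 3), round(dev_f1, 3))  # 0.959 0.875
```

Both values agree with the F1 row of the table to three decimals.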


## Predictions on Model

First we will predict on custom examples.

For the test set, we will get predictions and save them to a file.
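
On TPU every batch has to be full, so the custom examples are padded up to a multiple of the batch size before prediction. A small sketch of how many padding examples that takes (`num_padding_examples` is a hypothetical helper, not a function from `run_classifier`):

```python
def num_padding_examples(num_examples, batch_size):
    """Number of padding examples needed to reach a multiple of batch_size."""
    return (-num_examples) % batch_size


# Three custom sentence pairs with a batch size of 8 need 5 padding examples.
print(num_padding_examples(3, 8))   # 5
print(num_padding_examples(16, 8))  # 0
```

The predictions for the padding examples come last in the result list, so only the first `num_predict_examples` entries are read.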

In [0]:
# example sentence pairs; feel free to change them and try your own
sent_pairs = [("how can i improve my english?", "how can i become fluent in english?"),
              ("How can i recover old gmail account ?", "How can i delete my old gmail account ?"),
              ("How can i recover old gmail account ?", "How can i access my old gmail account ?")]

In [0]:
print("*******  Predictions on Custom Data ********")
# create `InputExample`s for the custom examples. `ZQAProcessor` defines no
# `get_predict_examples` method, so they are built directly; the label is a
# placeholder required by the feature conversion.
predict_examples = [
    InputExample(guid="predict-%d" % i,
                 text_a=tokenization.convert_to_unicode(q1),
                 text_b=tokenization.convert_to_unicode(q2),
                 label=label_list[0])
    for (i, (q1, q2)) in enumerate(sent_pairs)]
num_predict_examples = len(predict_examples)

# For TPU, we append `PaddingInputExample`s until the last batch is full
if USE_TPU:
  while len(predict_examples) % PREDICT_BATCH_SIZE != 0:
    predict_examples.append(run_classifier.PaddingInputExample())

# Converting to features 
predict_features = run_classifier.convert_examples_to_features(predict_examples, label_list, MAX_SEQ_LENGTH, tokenizer)

print('  Num examples = {}'.format(num_predict_examples))
print('  Batch size = {}'.format(PREDICT_BATCH_SIZE))

# Input function for prediction
predict_input_fn = run_classifier.input_fn_builder(predict_features,
                                                seq_length=MAX_SEQ_LENGTH,
                                                is_training=False,
                                                drop_remainder=False)
result = list(estimator.predict(input_fn=predict_input_fn))
print(result)
for ex_i in range(num_predict_examples):
  print("****** Example {} ******".format(ex_i))
  print("Question1 :", sent_pairs[ex_i][0])
  print("Question2 :", sent_pairs[ex_i][1])
  print("Prediction :", result[ex_i]['probabilities'][1])

*******  Predictions on Custom Data ********

In [0]:
# Converting test examples to features
print("################  Processing Test Data #####################")
TEST_TF_RECORD = os.path.join(OUTPUT_DIR, "test.tf_record")
test_examples = processor.get_test_examples(TASK_DATA_DIR)
num_test_examples = len(test_examples)
run_classifier.file_based_convert_examples_to_features(test_examples, label_list, MAX_SEQ_LENGTH, tokenizer, TEST_TF_RECORD)

In [0]:
# Predictions on test set.
print('***** Started Prediction at {} *****'.format(datetime.datetime.now()))
print('  Num examples = {}'.format(num_test_examples))
print('  Batch size = {}'.format(PREDICT_BATCH_SIZE))
# predict input function for test set
test_input_fn = run_classifier.file_based_input_fn_builder(TEST_TF_RECORD,
                                                           seq_length=MAX_SEQ_LENGTH,
                                                           is_training=False,
                                                           drop_remainder=True)
tf.logging.set_verbosity(tf.logging.ERROR)
# predict on test set
result = list(estimator.predict(input_fn=test_input_fn))
print('***** Finished Prediction at {} *****'.format(datetime.datetime.now()))

# saving test predictions
output_test_file = os.path.join(OUTPUT_DIR, "test_predictions.txt")
with tf.gfile.GFile(output_test_file, "w") as writer:
  for (example_i, predictions_i) in enumerate(result):
    writer.write("%s , %s\n" % (test_examples[example_i].guid, str(predictions_i['probabilities'][1])))