# BERT Fine-Tuned Notebook
## W266 Final Project
### Game of Thrones Text Classification
### T. P. Goter
### Fall 2019

This notebook performs the baseline supervised text classification with a fine-tuned BERT model. The original UDA process was run from a Python script wrapped in a bash shell script. This notebook was created to show and annotate that process more clearly.

## Import Data Libraries

In [3]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import json
import os
import tensorflow as tf

import uda
from bert import modeling
from utils import proc_data_utils
from utils import raw_data_utils

from absl import app
from absl import flags
from absl import logging

## Define Some Options
This section replaces the command-line arguments of the original script. The options set here control the entire run: which tasks are performed, where the data and checkpoints live, how UDA is applied, and which hardware is used. Each option is described below.

### Task Options:
- do_train: Boolean - whether to run training
- do_eval: Boolean - whether to run evaluation

### Training Options:
- sup_train_data_dir: String - input directory of the supervised data. Should contain `tf_examples.tfrecord*`
- eval_data_dir: String - input directory of the evaluation data. Should contain `tf_examples.tfrecord*`
- unsup_data_dir: String - input directory of the unsupervised data. Should contain `tf_examples.tfrecord*`
- bert_config_file: String - config json file corresponding to the pre-trained BERT model. This specifies the model architecture
- vocab_file: String - vocabulary file that the BERT model was trained on
- init_checkpoint: String - initial checkpoint (usually from a pre-trained BERT model)
- task_name: String - name of the task to train
- model_dir: String - output directory where the model checkpoints will be written

All of these options default to None in the original script.
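
The data directory options above each expect files matching `tf_examples.tfrecord*`. Checking for those files before training can catch a wrong path early. The following is a minimal sketch using only the standard library; `find_tfrecords` is a hypothetical helper introduced here for illustration and is not part of the UDA code.

```python
import glob
import os


def find_tfrecords(data_dir):
    """Return the sorted tf_examples.tfrecord* files found in data_dir.

    Hypothetical helper for illustration only. Raises FileNotFoundError
    when no file in the directory matches the expected pattern.
    """
    pattern = os.path.join(data_dir, "tf_examples.tfrecord*")
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError("No files matching {}".format(pattern))
    return matches
```

Calling `find_tfrecords` on each of the supervised, evaluation, and unsupervised directories before building the model would surface a missing or misnamed directory immediately rather than partway through input pipeline construction.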

### UDA Options:
- unsup_ratio: Integer - ratio between the unsupervised batch size and the supervised batch size. If zero, unsupervised data is not used
- aug_ops: String - which augmentation procedure to run
- aug_copy: Integer - how many augmentations to generate per example
- uda_coeff: Float, default 1 - coefficient on the UDA loss. It controls how heavily the UDA loss is weighted relative to the supervised loss during training. The UDA paper generally kept this at 1
- tsa: String - schedule to use for training signal annealing. Options are "" (none), linear_schedule, log_schedule, exp_schedule
- uda_softmax_temp: Float, default -1 - softmax temperature applied to predictions on unlabeled examples. A smaller temperature accentuates differences in probabilities. Low temperatures were used in the UDA paper for cases with few labeled examples, after masking out uncertain predictions
- uda_confidence_thresh: Float, default -1 - threshold on the predicted probability above which the consistency loss term from UDA is used. This ensures we are not using loss from predictions that are basically random guesses
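
The effect of `tsa`, `uda_softmax_temp`, and `uda_confidence_thresh` can be illustrated in plain Python. This is a sketch under stated assumptions, not the code from the UDA repository: the function names are hypothetical, the schedule formulas follow the description in the UDA paper (with its scale factor of 5), and the temperature is assumed to be positive, so the -1 sentinel would have to be handled before calling `softmax`.

```python
import math


def tsa_threshold(schedule, step, num_train_steps, num_classes):
    """Training signal annealing threshold at a given training step.

    The threshold rises from 1 / num_classes toward 1 as training
    progresses. Formulas follow the UDA paper's description.
    """
    progress = step / num_train_steps
    scale = 5.0
    if schedule == "linear_schedule":
        alpha = progress
    elif schedule == "exp_schedule":
        alpha = math.exp((progress - 1.0) * scale)
    elif schedule == "log_schedule":
        alpha = 1.0 - math.exp(-progress * scale)
    else:
        raise ValueError("Unknown schedule: {!r}".format(schedule))
    start = 1.0 / num_classes
    return alpha * (1.0 - start) + start


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits; temperature < 1 sharpens the output."""
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def keep_for_consistency_loss(probs, confidence_thresh):
    """True if the largest predicted probability exceeds the threshold."""
    return max(probs) > confidence_thresh


# Illustrative values only; the logits and class count are assumptions.
logits = [2.0, 1.0, 0.5]
plain = softmax(logits)
sharp = softmax(logits, temperature=0.4)
```

In this sketch `sharp` puts more mass on the top class than `plain`, and an example whose top probability falls below the confidence threshold would be dropped from the consistency loss.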

### TPU and GPU Options:
- use_tpu: Boolean - whether to run on a TPU, which affects how the model is run. This could matter if we run in Colab. False means use CPU or GPU. We default to False
- tpu_name: String - address of the TPU
- gcp_project: String - project name when using a TPU
- tpu_zone: String - can be set or detected
- master: String - address of the TPU master, if applicable
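
Because the notebook sets these options directly instead of parsing command-line flags, it can help to group them in one object and check them before building the model. The following is a sketch, assuming a `types.SimpleNamespace` container; `validate_options` and `TSA_SCHEDULES` are hypothetical names introduced for illustration, and the values mirror the settings used in this notebook.

```python
from types import SimpleNamespace

# Schedules listed under the tsa option above ("" means no annealing).
TSA_SCHEDULES = ("", "linear_schedule", "log_schedule", "exp_schedule")

options = SimpleNamespace(
    do_train=True,
    do_eval=False,
    unsup_ratio=0,
    aug_ops="",
    uda_coeff=1,
    tsa="",
    uda_softmax_temp=-1,
    uda_confidence_thresh=-1,
    use_tpu=False,
)


def validate_options(opts):
    """Hypothetical sanity check to run before building the model."""
    if opts.unsup_ratio < 0:
        raise ValueError("unsup_ratio must be zero or a positive integer")
    if opts.tsa not in TSA_SCHEDULES:
        raise ValueError("Unknown tsa schedule: {!r}".format(opts.tsa))
    if not (opts.do_train or opts.do_eval):
        raise ValueError("Set at least one of do_train or do_eval")
    return opts


validate_options(options)
```

A single container keeps the settings in one place and makes a typo in a schedule name fail early instead of deep inside the training loop.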



In [None]:

do_train = True
do_eval = False
unsup_ratio = 0
aug_ops = ""
uda_coeff = 1
tsa = ""
uda_softmax_temp = -1
uda_confidence_thresh = -1

# Flags passed by the original shell script:
#   --use_tpu=False \
#   --do_train=True \
#   --do_eval=True \
#   --sup_train_data_dir=../Data/proc_data/GoT/${sub_set}_${sup_size} \
#   --eval_data_dir=Data/proc_data/GoT/dev \
#   --bert_config_file=${bert_config} \
#   --vocab_file=${bert_vocab_file} \
#   --init_checkpoint=${bert_ckpt} \
#   --task_name=GoT \
#   --model_dir=ckpt/base \
#   --num_train_steps=3000 \
#   --learning_rate=3e-05 \
#   --num_warmup_steps=300 \