<a href="https://colab.research.google.com/github/aditya-malte/Colab-XLNet-FineTuning/blob/master/Colab_XLNet_FineTuning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##### Copyright 2018 The TensorFlow Hub Authors.

Licensed under the Apache License, Version 2.0 (the "License");


In [0]:
# Copyright 2018 The TensorFlow Hub Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

In [0]:
#install dependencies
!pip install emoji
!pip install sentencepiece

In [0]:
# import dependencies
import os
import csv
import tensorflow as tf
import pandas as pd  
import subprocess
import sys





### Pull XLNet Repo for given task

In [0]:
git_url = "https://github.com/aditya-malte/Colab-XLNet-FineTuning.git"  #@param {type:"string"}
os.system("git clone "+git_url)

In [0]:
!git pull origin master
#Use if you have updated git repo and want changes to reflect

In [0]:
repo_name = 'Colab-XLNet-FineTuning' #@param {type:"string"}
%ls
%cd {repo_name}
!ls
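
`os.system` returns an exit status but does not raise when `git clone` fails, so a failed clone may go unnoticed until the `%cd` step. Below is a minimal sketch of a stricter helper that uses only the standard library. `run_checked` is a hypothetical name introduced here for illustration; it is not part of the repository.

```python
import subprocess
import sys


def run_checked(args):
    """Run a command given as a list of arguments; raise if it exits non-zero."""
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    if result.returncode != 0:
        raise RuntimeError(
            "command failed ({}): {}\n{}".format(
                result.returncode, " ".join(args), result.stdout
            )
        )
    return result.stdout


# Harmless demonstration that needs no network access.
print(run_checked([sys.executable, "-c", "print('ok')"]))
```

With this helper, the clone step could be written as `run_checked(["git", "clone", git_url])`.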

# XLNet End to End (Fine-tuning + Evaluation) in 5 minutes with Cloud TPU

## Overview


This Colab demonstrates using a free Colab Cloud TPU to fine-tune sentence and sentence-pair classification tasks built on top of pretrained XLNet models and to
run predictions on the tuned model. The Colab loads the pretrained XLNet model from its released checkpoint.

**Note:** You will need a GCP (Google Cloud Platform) account and a GCS (Google Cloud
Storage) bucket for this Colab to run.

Please follow the [Google Cloud TPU quickstart](https://cloud.google.com/tpu/docs/quickstart) to create a GCP account and a GCS bucket. You have [$300 of free credit](https://cloud.google.com/free/) to get started with any GCP product. You can learn more about Cloud TPU at https://cloud.google.com/tpu/docs.

This notebook is hosted on GitHub. To view it in its original repository, after opening the notebook, select **File > View on GitHub**.

## Instructions

<h3><a href="https://cloud.google.com/tpu/"><img valign="middle" src="https://raw.githubusercontent.com/GoogleCloudPlatform/tensorflow-without-a-phd/master/tensorflow-rl-pong/images/tpu-hexagon.png" width="50"></a>  &nbsp;&nbsp;Train on TPU</h3>

   1. Create a Cloud Storage bucket for your TensorBoard logs at http://console.cloud.google.com/storage and fill in the BUCKET parameter in the "Parameters" section below.
 
   1. On the main menu, click Runtime and select **Change runtime type**. Set "TPU" as the hardware accelerator.
   1. Click Runtime again and select **Runtime > Run All** (Watch out: the "Colab-only auth for this notebook and the TPU" cell requires user input). You can also run the cells manually with Shift-ENTER.

### Set up your TPU environment

In this section, you perform the following tasks:

*   Set up a Colab TPU running environment
*   Verify that you are connected to a TPU device
*   Upload your credentials to the TPU so that it can access your GCS bucket
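
The connection check in the cell below relies on the `COLAB_TPU_ADDR` environment variable. The lookup can be sketched on its own as follows; `resolve_tpu_address` is a hypothetical helper name, and the address in the demonstration is a made-up placeholder rather than a real TPU endpoint.

```python
import os


def resolve_tpu_address(environ=None):
    """Return the gRPC address of the Colab TPU, or raise if none is attached."""
    environ = os.environ if environ is None else environ
    addr = environ.get("COLAB_TPU_ADDR")
    if not addr:
        raise RuntimeError(
            "Not connected to a TPU runtime; set the hardware accelerator to TPU."
        )
    return "grpc://" + addr


# Placeholder address, for illustration only.
print(resolve_tpu_address({"COLAB_TPU_ADDR": "10.0.0.2:8470"}))
# prints grpc://10.0.0.2:8470
```

Passing the environment as an argument keeps the check testable outside Colab; calling `resolve_tpu_address()` with no argument reads the real `os.environ`.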

In [0]:
import datetime
import json
import pprint
import random
import string
import sys
import tensorflow as tf

print(os.environ)

assert 'COLAB_TPU_ADDR' in os.environ, 'ERROR: Not connected to a TPU runtime; please see the first cell in this notebook for instructions!'
TPU_ADDRESS = 'grpc://' + os.environ['COLAB_TPU_ADDR']
print('TPU address is', TPU_ADDRESS)

from google.colab import auth
auth.authenticate_user()
with tf.Session(TPU_ADDRESS) as session:
  print('TPU devices:')
  pprint.pprint(session.list_devices())

  # Upload credentials to TPU.
  with open('/content/adc.json', 'r') as f:
    auth_info = json.load(f)
  tf.contrib.cloud.configure_gcs(session, credentials=auth_info)
  # Now credentials are set for all future sessions on this TPU.

### Prepare for training

This next section of code performs the following tasks:

*  Specify task and download training data.
*  Specify the pretrained XLNet model
*  Specify the GCS bucket and create output directories for model checkpoints and eval results.
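
The directory layout used in the next cell can be sketched as a small function. `gcs_dirs` is a hypothetical helper introduced for illustration, and the bucket and task names in the demonstration are placeholders. Treating the unfilled default `BUCKET_NAME` as an error is an added assumption, not something the notebook itself checks.

```python
def gcs_dirs(bucket, task):
    """Return (output_dir, model_dir) GCS paths laid out as in this notebook."""
    # Accept either "my-bucket" or "gs://my-bucket/".
    if bucket.startswith("gs://"):
        bucket = bucket[len("gs://"):]
    bucket = bucket.strip("/")
    # Assumption: treat the unfilled form default as a missing bucket.
    if not bucket or bucket == "BUCKET_NAME":
        raise ValueError("Must specify an existing GCS bucket name")
    output_dir = "gs://{}/xlnet/output/{}".format(bucket, task)
    model_dir = "gs://{}/xlnet/model/{}".format(bucket, task)
    return output_dir, model_dir


# "my-bucket" and "sst-2" are placeholder values for illustration only.
print(gcs_dirs("gs://my-bucket/", "sst-2"))
```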




In [0]:
TASK = 'TASK_NAME' #@param {type:"string"}

TASK_DATA_DIR = 'DIRECTORY' #@param {type:"string"}
print('***** Task data directory: {} *****'.format(TASK_DATA_DIR))
!ls $TASK_DATA_DIR

BUCKET = 'BUCKET_NAME' #@param {type:"string"}
assert BUCKET, 'Must specify an existing GCS bucket name'
OUTPUT_DIR = 'gs://{}/xlnet/output/{}'.format(BUCKET, TASK)
MODEL_DIR = 'gs://{}/xlnet/model/{}'.format(BUCKET, TASK)

tf.gfile.MakeDirs(OUTPUT_DIR)
tf.gfile.MakeDirs(MODEL_DIR)

print('***** Model output directory: {} *****'.format(OUTPUT_DIR))



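Now let's download the pretrained XLNet checkpoint, which includes the SentencePiece tokenizer model, and copy it to the GCS bucket.

We also initialize our hyperparameters and build the `run_classifier.py` command that trains and evaluates on the TPU.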

In [0]:
os.system("wget https://storage.googleapis.com/xlnet/released_models/cased_L-24_H-1024_A-16.zip")
os.system("unzip cased_L-24_H-1024_A-16.zip")
!ls

In [0]:
%cd xlnet_cased_L-24_H-1024_A-16
!ls

In [0]:
file_names = os.listdir(os.getcwd())
print(file_names)

In [0]:
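# Copy the pretrained files into MODEL_DIR: later cells read spiece.model,
# xlnet_config.json and xlnet_model.ckpt from there.
for file_name in file_names:
  print(file_name)
  os.system("gsutil cp " + file_name + " " + MODEL_DIR + "/")
os.system("gsutil ls " + MODEL_DIR)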
%cd ..

In [0]:
os.system("gsutil cp -r " + MODEL_DIR + "/spiece.model spiece.model")
!ls

In [0]:
TRAIN_BATCH_SIZE = 64
EVAL_BATCH_SIZE = 8
PREDICT_BATCH_SIZE = 8
LEARNING_RATE = 2e-5
NUM_TRAIN_STEPS = 1200
MAX_SEQ_LENGTH = 128 
# Warmup is a period of time where the learning rate 
# is small and gradually increases--usually helps training.
WARMUP_STEPS = 120
# Model configs
SAVE_CHECKPOINTS_STEPS = 1200
NUM_ITERATIONS = 1200
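
To make the warmup comment above concrete, here is a minimal sketch of a learning-rate schedule with linear warmup, using the values from this cell as defaults. The linear decay to zero after warmup is an assumption made for illustration; the schedule that `run_classifier.py` actually applies may differ. `learning_rate_at` is a hypothetical function name.

```python
def learning_rate_at(step, base_lr=2e-5, warmup_steps=120, train_steps=1200):
    """Learning rate at a given step: linear warmup, then (assumed) linear decay."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup period.
        return base_lr * step / warmup_steps
    # Assumed decay: fall linearly from base_lr to 0 by the final step.
    remaining = max(train_steps - step, 0)
    return base_lr * remaining / (train_steps - warmup_steps)


for step in (0, 60, 120, 660, 1200):
    print(step, learning_rate_at(step))
```

With these defaults the rate reaches its peak of `2e-5` at step 120, which is 10% of the 1200 training steps.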

In [0]:
command = "python run_classifier.py \
  --use_tpu=True \
  --do_train=True \
  --do_eval=True \
  --eval_all_ckpt=True \
  --task_name="+TASK.lower()+" \
  --data_dir=./"+TASK_DATA_DIR+" \
  --output_dir="+OUTPUT_DIR+" \
  --model_dir="+MODEL_DIR+" \
  --uncased=False \
  --tpu_address="+TPU_ADDRESS+"  \
  --spiece_model_file=./spiece.model \
  --model_config_path="+MODEL_DIR+"/xlnet_config.json \
  --init_checkpoint="+MODEL_DIR+"/xlnet_model.ckpt \
  --max_seq_length="+str(MAX_SEQ_LENGTH)+" \
  --train_batch_size="+str(TRAIN_BATCH_SIZE)+" \
  --eval_batch_size="+str(EVAL_BATCH_SIZE)+" \
  --num_hosts=1 \
  --num_core_per_host=8 \
  --learning_rate="+str(LEARNING_RATE)+" \
  --train_steps="+str(NUM_TRAIN_STEPS)+" \
  --warmup_steps="+str(WARMUP_STEPS)+" \
  --save_steps="+str(SAVE_CHECKPOINTS_STEPS)+" \
  --iterations="+ str(NUM_ITERATIONS)

print(command)


In [0]:
!{command}
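
Concatenating the command by hand works, but a stray space or an unquoted path is easy to miss. As an alternative, the command can be assembled from a list of flag/value pairs and shell-quoted with the standard library, as sketched below. `build_command` is a hypothetical helper, and only a few of the flags from the cell above are shown.

```python
import shlex


def build_command(script, flags):
    """Build a shell-safe command line from (flag_name, value) pairs."""
    args = ["python", script]
    for name, value in flags:
        args.append("--{}={}".format(name, value))
    # Quote each argument so spaces or special characters survive the shell.
    return " ".join(shlex.quote(arg) for arg in args)


# A subset of the flags used above; the values are placeholders.
flags = [
    ("use_tpu", True),
    ("do_train", True),
    ("max_seq_length", 128),
    ("train_batch_size", 64),
]
print(build_command("run_classifier.py", flags))
```

The resulting string can be passed to `!{command}` exactly as before.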