# Pre-training BERT from scratch with cloud TPU

In this experiment, we will pre-train a state-of-the-art Natural Language Understanding model, [BERT](https://arxiv.org/abs/1810.04805), on arbitrary text data using Google Cloud infrastructure.

This guide covers all stages of the procedure, including:

1. Setting up the training environment
2. Downloading raw text data
3. Preprocessing text data
4. Learning a new vocabulary
5. Creating sharded pre-training data
6. Setting up GCS storage for data and model
7. Training the model on a cloud TPU

For persistent storage of training data and model, you will require a Google Cloud Storage bucket. 
Please follow the [Google Cloud TPU quickstart](https://cloud.google.com/tpu/docs/quickstart) to create a GCP account and GCS bucket. New Google Cloud users have [$300 free credit](https://cloud.google.com/free/) to get started with any GCP product. 

Steps 1-5 of this tutorial can be run without a GCS bucket for demonstration purposes. In that case, however, you will not be able to train the model.

**Note** 
The only parameter you *really have to set* is BUCKET_NAME in steps 5 and 6. Everything else has default values which should work for most use-cases.

**Note** 
Pre-training a BERT-Base model on a TPUv2 will take about 54 hours. Google Colab is not designed for executing such long-running jobs and will interrupt the training process every 8 hours or so. For uninterrupted training, consider using a preemptible TPUv2 instance. 

That said, at the time of writing (09.05.2019), with a Colab TPU, pre-training a BERT model from scratch costs little more than storing the model and data in GCS (~1 USD).

Now, let's get to business.

MIT License

Copyright (c) [2019] [Antyukhov Denis Olegovich]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

## Step 1: setting up training environment
First and foremost, we get the packages required to train the model. 
The Jupyter environment allows executing bash commands directly from the notebook by using an exclamation mark ‘!’. I will be exploiting this approach to make use of several other bash commands throughout the experiment.

Now, let’s import the packages and authorize ourselves in Google Cloud.

In [0]:
from google.colab import drive
drive.mount('/content/drive/')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive/


In [0]:
%cd drive/'My Drive'/MalayalamNLP/'Contextual embedding: BERT'/
!ls

/content/drive/My Drive/MalayalamNLP/Contextual embedding: BERT
bert		dataset.txt	  proc_dataset.txt  tokenizer.vocab
bert_input.zip	ml.txt.gz	  shards	    vocab.txt
bert_model	pretraining_data  tokenizer.model


In [0]:
!unzip bert_input.zip
!mv bert_input.txt dataset.txt

Archive:  bert_input.zip
  inflating: bert_input.txt          


In [0]:
!ls

bert  bert_input.zip  dataset.txt  ml.txt.gz


In [0]:
!pip install tensorflow==1.14

Collecting tensorflow==1.14
  Downloading https://files.pythonhosted.org/packages/de/f0/96fb2e0412ae9692dbf400e5b04432885f677ad6241c088ccc5fe7724d69/tensorflow-1.14.0-cp36-cp36m-manylinux1_x86_64.whl (109.2MB)
     |████████████████████████████████| 109.2MB 50kB/s 


In [0]:
!pip install sentencepiece
!rm -rf bert
!git clone https://github.com/ashiks-qb/bert.git

import os
import sys
import json
import nltk
import random
import logging
import tensorflow as tf
import sentencepiece as spm

from glob import glob
from google.colab import auth, drive
from tensorflow.keras.utils import Progbar

sys.path.append("bert")

from bert import modeling, optimization, tokenization
from bert.run_pretraining import input_fn_builder, model_fn_builder

auth.authenticate_user()
  
# configure logging
log = logging.getLogger('tensorflow')
log.setLevel(logging.INFO)

# create formatter and add it to the handlers
formatter = logging.Formatter('%(asctime)s :  %(message)s')
sh = logging.StreamHandler()
sh.setLevel(logging.INFO)
sh.setFormatter(formatter)
log.handlers = [sh]

print(tf.__version__)
if 'COLAB_TPU_ADDR' in os.environ:
  log.info("Using TPU runtime")
  USE_TPU = True
  TPU_ADDRESS = 'grpc://' + os.environ['COLAB_TPU_ADDR']

  with tf.Session(TPU_ADDRESS) as session:
    log.info('TPU address is ' + TPU_ADDRESS)
    # Upload credentials to TPU.
    with open('/content/adc.json', 'r') as f:
      auth_info = json.load(f)
    tf.contrib.cloud.configure_gcs(session, credentials=auth_info)
    
else:
  log.warning('Not connected to TPU runtime')
  USE_TPU = False
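
The TPU check above can also be written as a small helper that makes the no-TPU case explicit. This is a minimal sketch and not part of the original notebook: `colab_tpu_address` is a hypothetical name, and it relies only on the `COLAB_TPU_ADDR` environment variable used above.

```python
import os

def colab_tpu_address(environ=None):
    """Return 'grpc://<addr>' when a Colab TPU is attached, else None (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    addr = environ.get("COLAB_TPU_ADDR")
    return "grpc://" + addr if addr else None

# Example with an explicit mapping instead of the real environment
print(colab_tpu_address({"COLAB_TPU_ADDR": "10.94.225.34:8470"}))  # grpc://10.94.225.34:8470
print(colab_tpu_address({}))                                       # None
```

Returning `None` rather than leaving the variable undefined matters because Step 7 reads `TPU_ADDRESS` whether or not a TPU runtime was detected.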

In [0]:
print(tf.__version__)

1.14.0


## Step 2: getting the data

We begin with obtaining a corpus of raw text data. For this experiment, we will be using the [OpenSubtitles](http://www.opensubtitles.org/) dataset, which is available for 65 languages [here](http://opus.nlpl.eu/OpenSubtitles-v2016.php). 

Unlike more common text datasets (like Wikipedia), it does not require any complex pre-processing. It also comes pre-formatted with one sentence per line.

Feel free to use the dataset for your language instead by changing the language code (`LANG_CODE`) below.

In [0]:
# AVAILABLE =  {'af','ar','bg','bn','br','bs','ca','cs',
#               'da','de','el','en','eo','es','et','eu',
#               'fa','fi','fr','gl','he','hi','hr','hu',
#               'hy','id','is','it','ja','ka','kk','ko',
#               'lt','lv','mk','ml','ms','nl','no','pl',
#               'pt','pt_br','ro','ru','si','sk','sl','sq',
#               'sr','sv','ta','te','th','tl','tr','uk',
#               'ur','vi','ze_en','ze_zh','zh','zh_cn',
#               'zh_en','zh_tw','zh_zh'}

# LANG_CODE = "ml" #@param {type:"string"}

# assert LANG_CODE in AVAILABLE, "Invalid language code selected"

# !wget http://opus.nlpl.eu/download.php?f=OpenSubtitles/v2016/mono/OpenSubtitles.raw.'$LANG_CODE'.gz -O dataset.txt.gz
# !gzip -d dataset.txt.gz
!tail dataset.txt

 20 ദിവസത്തിനുള്ളിൽ മക്കളെ വിളിച്ചുവരുത്തി ചർച്ചചെയ്ത് പ്രശ്നപരിഹാരം ഉണ്ടാക്കുമെന്നും ഇല്ലെങ്കിൽ നിയമനടപടി സ്വീകരിക്കുമെന്നും പോലീസ് അറിയിച്ചു.
      കൈക്കുഞ്ഞുമായി അഭയം തേടിയെത്തിയ കുടുംബത്തിന് പോലീസ് സ്റ്റേഷനിൽ മർദനം   കട്ടപ്പന വഴിയിലുണ്ടായ തർക്കത്തെ തുടർന്ന് ഒരു മാസം പ്രായമായ കുഞ്ഞുമായി സ്റ്റേഷനിൽ .. 



://..///-------------1.2719084------------------ടുറിന്: ചാമ്പ്യന്സ് ലീഗ് ഫുട്ബോൾക്വാര്ട്ടര് ഫൈനലിന്റെ ആദ്യപാദ മത്സരത്തില് റയല് മാഡ്രിഡിന് തകര്പ്പന് ജയം.
 യുവന്റസിനെ അവരുടെ തട്ടകത്തില് എതിരില്ലാത്ത മൂന്ന് ഗോളിനാണ് റയല് തോല്പിച്ചത്.
 ഇരട്ട ഗോള് നേടിയ ക്രിസ്റ്റിയാനോയുടെ പ്രകടനമാണ് റയലിന് കാര്യങ്ങള് എളുപ്പമാക്കിയത്.




For demonstration purposes, we will only use a small fraction of the whole corpus for this experiment. 

When training the real model, make sure to uncheck the DEMO_MODE checkbox to use a 100x larger dataset.

Rest assured, 100M lines are perfectly sufficient to train a reasonably good BERT-base model.

In [0]:
DEMO_MODE = False #@param {type:"boolean"}

if DEMO_MODE:
  CORPUS_SIZE = 1000000
else:
  CORPUS_SIZE = 100000000 #@param {type: "integer"}
  
!(head -n $CORPUS_SIZE dataset.txt) > subdataset.txt
!mv subdataset.txt dataset.txt

## Step 3: preprocessing text

The raw text data we have downloaded contains punctuation, uppercase letters and non-UTF symbols, which we will remove before proceeding. During inference, we will apply the same normalization procedure to new data.

If your use-case requires different preprocessing (e.g. if uppercase letters or punctuation are expected during inference), feel free to modify the function below to accommodate your needs.

In [0]:
regex_tokenizer = nltk.RegexpTokenizer('[ഀ-ൿ]+')#("\w+")

def normalize_text(text):
  # lowercase text
  text = str(text).lower()
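  # drop anything that cannot be encoded as UTF-8
  text = text.encode("utf-8", "ignore").decode()
  # keep only runs of Malayalam characters (U+0D00-U+0D7F);
  # this drops punctuation, digits and Latin letters
  text = " ".join(regex_tokenizer.tokenize(text))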
  return text

def count_lines(filename):
  count = 0
  with open(filename, encoding="utf-8") as fi:
    for line in fi:
      count += 1
  return count

Check how that works.

In [0]:
normalize_text('അറിവ് മാഞ്ഞുപോയേക്കാം. വിശ്വാസം , പ്രത്യാശ , സ്നേഹം.')

'അറിവ് മാഞ്ഞുപോയേക്കാം വിശ്വാസം പ്രത്യാശ സ്നേഹം'
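
If punctuation and case are expected at inference time, the normalizer can be made less aggressive. The following is a minimal sketch of one such variant, not the function used in this experiment: `normalize_text_keep_punct` is a hypothetical name, and the choice to apply NFC normalization and strip only control characters is an assumption.

```python
import re
import unicodedata

_WHITESPACE = re.compile(r"\s+")

def normalize_text_keep_punct(text):
    """Hypothetical variant: keeps case and punctuation, cleans Unicode and whitespace."""
    # canonical composition, so visually identical strings compare equal
    text = unicodedata.normalize("NFC", str(text))
    # collapse tabs, newlines and repeated spaces into single spaces
    text = _WHITESPACE.sub(" ", text)
    # drop remaining control characters (category Cc); format characters such as
    # zero-width joiners (category Cf) are kept, since some scripts rely on them
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cc")
    return text.strip()

print(normalize_text_keep_punct("  Hello,\tWorld!\n"))  # Hello, World!
```

Whichever variant is chosen, the same function has to be applied both before learning the vocabulary and at inference time.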

Apply normalization to the whole dataset.

In [0]:
RAW_DATA_FPATH = "dataset.txt" #@param {type: "string"}
PRC_DATA_FPATH = "proc_dataset.txt" #@param {type: "string"}

# apply normalization to the dataset
# this will take a minute or two

total_lines = count_lines(RAW_DATA_FPATH)
bar = Progbar(total_lines)

with open(RAW_DATA_FPATH,encoding="utf-8") as fi:
  with open(PRC_DATA_FPATH, "w",encoding="utf-8") as fo:
    for l in fi:
      fo.write(normalize_text(l)+"\n")
      bar.add(1)



## Step 4: building the vocabulary

For the next step, we will learn a new vocabulary that we will use to represent our dataset. 

The BERT paper uses a WordPiece tokenizer, which is not available as open source. Instead, we will use the SentencePiece tokenizer in unigram mode. While it is not directly compatible with BERT, with a small hack we can make it work.

SentencePiece requires quite a lot of RAM, so running it on the full dataset in Colab will crash the kernel. To avoid this, we will randomly subsample a fraction of the dataset for building the vocabulary. Another option would be to use a machine with more RAM for this step - that decision is up to you.

Also, SentencePiece adds BOS and EOS control symbols to the vocabulary by default. We disable them explicitly by setting their indices to -1.

The typical values for VOC_SIZE are somewhere in between 32000 and 128000. We reserve NUM_PLACEHOLDERS tokens in case one wants to update the vocabulary and fine-tune the model after the pre-training phase is finished.

In [0]:
MODEL_PREFIX = "tokenizer" #@param {type: "string"}
VOC_SIZE = 32000 #@param {type:"integer"}
SUBSAMPLE_SIZE = 2800000 #@param {type:"integer"}
NUM_PLACEHOLDERS = 256 #@param {type:"integer"}

SPM_COMMAND = ('--input={} --model_prefix={} '
               '--vocab_size={} --input_sentence_size={} '
               '--shuffle_input_sentence=true ' 
               '--bos_id=-1 --eos_id=-1').format(
               PRC_DATA_FPATH, MODEL_PREFIX, 
               VOC_SIZE - NUM_PLACEHOLDERS, SUBSAMPLE_SIZE)

spm.SentencePieceTrainer.Train(SPM_COMMAND)

Now let's see how we can make SentencePiece tokenizer work for the BERT model. 

Below is a sentence tokenized using the WordPiece vocabulary from a pretrained English [BERT-base](https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip) model from the official [repo](https://github.com/google-research/bert). 

In [0]:
testcase = "അറിവ് മാഞ്ഞുപോയേക്കാം. വിശ്വാസം , പ്രത്യാശ , സ്നേഹം."



```
>>> wordpiece.tokenize("Colorless geothermal substations are generating furiously")

['color',
 '##less',
 'geo',
 '##thermal',
 'sub',
 '##station',
 '##s',
 'are',
 'generating',
 'furiously']
```



As we can see, the WordPiece tokenizer prepends the subwords which occur in the middle of words with '##'. The subwords occurring at the beginning of words are unchanged. If the subword occurs both in the beginning and in the middle of words, both versions (with and without '##') are added to the vocabulary.
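
To make the role of the '##' prefix concrete, here is a minimal sketch of greedy longest-match-first lookup, the strategy the BERT repo's WordPiece tokenizer applies to each word. The function name and the toy vocabulary are assumptions made for illustration; the real tokenizer also handles details such as a maximum word length.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one word into subwords by repeatedly taking the longest vocabulary match."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # subwords that do not start the word carry the '##' prefix
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # no prefix of the remainder is in the vocabulary: the whole word is unknown
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces

# Toy vocabulary (hypothetical), built from the example above
toy_vocab = {"color", "##less", "geo", "##thermal", "sub", "##station", "##s"}
print(wordpiece_tokenize("colorless", toy_vocab))    # ['color', '##less']
print(wordpiece_tokenize("substations", toy_vocab))  # ['sub', '##station', '##s']
```

Note that `station` without the prefix is not in the toy vocabulary, so it can only be matched in the middle of a word.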

Now let's have a look at the vocabulary that the SentencePiece tokenizer has learned.

In [0]:
!ls

bert		dataset.txt  proc_dataset.txt  tokenizer.vocab
bert_input.zip	ml.txt.gz    tokenizer.model


SentencePiece has created two files: tokenizer.model and tokenizer.vocab. Let's have a look at the learned vocabulary:

In [0]:
!head -n 100 tokenizer.vocab

<unk>	0
്	-3.99722
ം	-4.73794
ും	-4.989
ു	-5.10135
▁ഒരു	-5.24594
യും	-5.45962
▁	-5.48165
▁ഈ	-5.53906
ത്	-5.59775
യുടെ	-5.64494
▁എന്ന	-5.76411
ാണ്	-5.778
ി	-5.8287
ിൽ	-5.8923
വും	-5.92505
▁പി	-6.03418
ൻ	-6.05712
ക്ക്	-6.06132
ില്	-6.07787
യിൽ	-6.18809
െ	-6.18821
ിന്റെ	-6.19667
യാണ്	-6.21885
▁സി	-6.23306
സ്	-6.24604
▁കെ	-6.25729
ാ	-6.26497
▁ആ	-6.29451
മാണ്	-6.34441
ന്	-6.34448
യ	-6.34761
ത്തിന്റെ	-6.36696
യില്	-6.42428
▁എ	-6.44004
യെ	-6.46724
ത്തിൽ	-6.46756
ാൻ	-6.48408
ുന്ന	-6.48421
ർ	-6.52026
▁എസ്	-6.52224
▁എം	-6.5374
ങ്ങൾ	-6.54335
ത	-6.55983
ിലെ	-6.56884
▁വി	-6.58389
മായ	-6.59422
▁തന്നെ	-6.59484
മ	-6.60833
▁പറഞ്ഞു	-6.61537
ിന്	-6.62775
മായി	-6.66802
യിലെ	-6.67333
ാന്	-6.68137
ത്തിന്	-6.68778
ൽ	-6.69409
ര്	-6.70071
കൾ	-6.70389
ോ	-6.72414
ായി	-6.7422
ത്തെ	-6.74319
ത്തില്	-6.75179
ുള്ള	-6.78941
വ	-6.79008
ന	-6.7911
▁ൽ	-6.81111
▁നിന്ന്	-6.82675
▁ചെയ്തു	-6.84138
ക്ക	-6.87696
ക	-6.88418
ിരുന്നു	-6.88745
ായ	-6.8948
▁അദ്ദേഹം	-6.8983
ുകൾ	-6.90068
▁ന്	-6.91241
ുന്നത്	-6.92008
ില്ല	-6.92759
▁ഇത്	

In [0]:
def read_sentencepiece_vocab(filepath):
  voc = []
  with open(filepath, encoding='utf-8') as fi:
    for line in fi:
      voc.append(line.split("\t")[0])
  # skip the first <unk> token
  voc = voc[1:]
  return voc

snt_vocab = read_sentencepiece_vocab("{}.vocab".format(MODEL_PREFIX))
print("Learnt vocab size: {}".format(len(snt_vocab)))
print("Sample tokens: {}".format(random.sample(snt_vocab, 10)))

Learnt vocab size: 31743
Sample tokens: ['▁പുരാതന', '▁പുതുക്കിയ', 'ിയായ', '▁അൻസ', 'ചിട്ടപ്പെടുത്തിയ', '▁ട്വീറ്റ', '▁വേണ്ടിവന്ന', '▁മറ്റൊരാള', '▁ആട', 'യിലുണ്ട്']


As we can see, SentencePiece does quite the opposite of WordPiece. From the [documentation](https://github.com/google/sentencepiece/blob/master/README.md):


SentencePiece first escapes the whitespace with a meta-symbol "▁" (U+2581) as follows:

`Hello▁World`.

Then, this text is segmented into small pieces, for example:

`[Hello] [▁Wor] [ld] [.]`

Subwords which occur after whitespace (which are also those that most words begin with) are prepended with '▁', while others are unchanged. This excludes subwords which only occur at the beginning of sentences and nowhere else. These cases should be quite rare, however. 

So, in order to obtain a vocabulary analogous to WordPiece, we need to perform a simple conversion, removing "▁" from the tokens that contain it and adding "##"  to the ones that don't.

In [0]:
def parse_sentencepiece_token(token):
    if token.startswith("▁"):
        return token[1:]
    else:
        return "##" + token

In [0]:
bert_vocab = list(map(parse_sentencepiece_token, snt_vocab))
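
As a quick check of the conversion rule, here it is applied to the pieces from the SentencePiece documentation example quoted above. The function is repeated so that the snippet runs on its own; the piece list is only an illustration, not output from our trained model.

```python
def parse_sentencepiece_token(token):
    # a leading "▁" marks a word start: drop it; everything else is a continuation
    if token.startswith("▁"):
        return token[1:]
    else:
        return "##" + token

pieces = ["Hello", "▁Wor", "ld", "."]
print([parse_sentencepiece_token(p) for p in pieces])
# ['##Hello', 'Wor', '##ld', '##.']
```

The first piece shows the caveat mentioned earlier: a piece that was only ever seen without a preceding "▁" is treated as a continuation and receives the '##' prefix.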

We also add some special control symbols which are required by the BERT architecture. By convention, we put those at the beginning of the vocabulary.

In [0]:
ctrl_symbols = ["[PAD]","[UNK]","[CLS]","[SEP]","[MASK]"]
bert_vocab = ctrl_symbols + bert_vocab

We also append some placeholder tokens to the vocabulary. Those are useful if one wishes to update the pre-trained model with new, task-specific tokens. 

In that case, the placeholder tokens are replaced with new real ones, the pre-training data is re-generated, and the model is fine-tuned on new data.

In [0]:
bert_vocab += ["[UNUSED_{}]".format(i) for i in range(VOC_SIZE - len(bert_vocab))]
print(len(bert_vocab))

32000
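
The sizes involved are easy to lose track of, so here is the vocabulary budget worked through with the values used in this notebook. The variable names other than `VOC_SIZE` and `NUM_PLACEHOLDERS` are introduced only for this illustration.

```python
VOC_SIZE = 32000
NUM_PLACEHOLDERS = 256
NUM_CTRL_SYMBOLS = 5  # [PAD], [UNK], [CLS], [SEP], [MASK]

requested_from_spm = VOC_SIZE - NUM_PLACEHOLDERS  # 31744 pieces requested from SentencePiece
learned_tokens = requested_from_spm - 1           # 31743 after dropping <unk>
with_ctrl = learned_tokens + NUM_CTRL_SYMBOLS     # 31748 after adding the control symbols
num_unused = VOC_SIZE - with_ctrl                 # 252 [UNUSED_i] placeholders fill the rest

print(learned_tokens, with_ctrl, num_unused)  # 31743 31748 252
```

So of the 256 reserved slots, 4 end up holding control symbols (the fifth takes the slot freed by `<unk>`), and 252 remain as `[UNUSED_i]` placeholders.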


Finally, we write the obtained vocabulary to file.

In [0]:
VOC_FNAME = "vocab.txt" #@param {type:"string"}

with open(VOC_FNAME, "w", encoding="utf-8") as fo:
  for token in bert_vocab:
    fo.write(token+"\n")

Now let's see how the new vocabulary works in practice:

In [0]:
bert_tokenizer = tokenization.FullTokenizer(VOC_FNAME)
bert_tokenizer.tokenize(testcase)

2020-04-22 08:15:33,597 :  From /content/drive/My Drive/MalayalamNLP/Contextual embedding: BERT/bert/tokenization.py:125: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.



['അറിവ്',
 'മാഞ്ഞ',
 '##ുപോയ',
 '##േക്കാം',
 '[UNK]',
 'വിശ്വാസം',
 '[UNK]',
 'പ്രത്യാശ',
 '[UNK]',
 'സ്നേഹം',
 '[UNK]']

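Looking good! The punctuation marks come out as `[UNK]` because `testcase` was not passed through `normalize_text`, and the vocabulary was learned from text with punctuation already stripped.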

## Step 5: generating pre-training data

With the vocabulary at hand, we are ready to generate pre-training data for the BERT model. Since our dataset might be quite large, we will split it into shards:

In [0]:
!mkdir -p ./shards
!split -a 4 -l 256000 -d $PRC_DATA_FPATH ./shards/shard_
!ls ./shards/

shard_0000  shard_0002	shard_0004  shard_0006	shard_0008  shard_0010
shard_0001  shard_0003	shard_0005  shard_0007	shard_0009  shard_0011
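
If the `split` utility is not available, the same sharding can be done in Python. This is a minimal sketch under that assumption: `shard_file` is a hypothetical helper that mimics the naming produced by `split -a 4 -d`, not something used elsewhere in this notebook.

```python
import os
from itertools import islice

def shard_file(src_path, out_dir, lines_per_shard=256000, prefix="shard_"):
    """Write consecutive chunks of src_path to out_dir/shard_0000, shard_0001, ..."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    with open(src_path, encoding="utf-8") as src:
        while True:
            # read at most lines_per_shard lines without loading the whole file
            chunk = list(islice(src, lines_per_shard))
            if not chunk:
                break
            path = os.path.join(out_dir, "{}{:04d}".format(prefix, len(paths)))
            with open(path, "w", encoding="utf-8") as dst:
                dst.writelines(chunk)
            paths.append(path)
    return paths
```

For example, `shard_file(PRC_DATA_FPATH, "./shards")` would produce the same file names as the `split` call above.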


Before we start generating, we need to set some model-specific parameters.  

In [0]:
MAX_SEQ_LENGTH = 128 #@param {type:"integer"}
MASKED_LM_PROB = 0.15 #@param
MAX_PREDICTIONS = 20 #@param {type:"integer"}
DO_LOWER_CASE = True #@param {type:"boolean"}
PROCESSES = 2 #@param {type:"integer"}
PRETRAINING_DIR = "pretraining_data" #@param {type:"string"}
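
These three numbers are linked. To the best of our knowledge, the reference `create_pretraining_data.py` masks a fraction `masked_lm_prob` of the tokens in each sequence, rounded, with at least one masked token and at most `max_predictions_per_seq`. The sketch below reproduces that rule; `num_masked_tokens` is a hypothetical name introduced for illustration.

```python
def num_masked_tokens(num_tokens, masked_lm_prob=0.15, max_predictions=20):
    """Number of positions masked in a sequence of num_tokens tokens."""
    return min(max_predictions, max(1, int(round(num_tokens * masked_lm_prob))))

print(num_masked_tokens(128))  # 19: a full-length sequence stays just under the cap of 20
print(num_masked_tokens(64))   # 10
print(num_masked_tokens(3))    # 1: at least one token is always masked
```

This is why `MAX_PREDICTIONS` is usually set close to `MAX_SEQ_LENGTH * MASKED_LM_PROB`: a much smaller value would cap the masking rate below 15%, and a much larger one would only add padding.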

Now, for each shard we need to call the *create_pretraining_data.py* script. To that end, we will employ the *xargs* command.

Running this might take quite some time depending on the size of your dataset.

In [0]:
XARGS_CMD = ("ls ./shards/ | "
             "xargs -n 1 -P {} -I{} "
             "python3 bert/create_pretraining_data.py "
             "--input_file=./shards/{} "
             "--output_file={}/{}.tfrecord "
             "--vocab_file={} "
             "--do_lower_case={} "
             "--max_predictions_per_seq={} "
             "--max_seq_length={} "
             "--masked_lm_prob={} "
             "--random_seed=34 "
             "--dupe_factor=5")

XARGS_CMD = XARGS_CMD.format(PROCESSES, '{}', '{}', PRETRAINING_DIR, '{}', 
                             VOC_FNAME, DO_LOWER_CASE, 
                             MAX_PREDICTIONS, MAX_SEQ_LENGTH, MASKED_LM_PROB)
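
The nested `'{}'` placeholders above are easy to misread, so here is a sketch that builds the same per-shard commands explicitly. `build_commands` is a hypothetical helper; the flags are exactly the ones passed through `xargs` above.

```python
def build_commands(shard_names, pretraining_dir, voc_fname, do_lower_case,
                   max_predictions, max_seq_length, masked_lm_prob):
    """Return one argument list per shard for create_pretraining_data.py."""
    commands = []
    for name in shard_names:
        commands.append([
            "python3", "bert/create_pretraining_data.py",
            "--input_file=./shards/{}".format(name),
            "--output_file={}/{}.tfrecord".format(pretraining_dir, name),
            "--vocab_file={}".format(voc_fname),
            "--do_lower_case={}".format(do_lower_case),
            "--max_predictions_per_seq={}".format(max_predictions),
            "--max_seq_length={}".format(max_seq_length),
            "--masked_lm_prob={}".format(masked_lm_prob),
            "--random_seed=34",
            "--dupe_factor=5",
        ])
    return commands

cmds = build_commands(["shard_0000", "shard_0001"], "pretraining_data",
                      "vocab.txt", True, 20, 128, 0.15)
print(" ".join(cmds[0]))
```

Each argument list can then be passed to `subprocess.run(cmd, check=True)`, for instance from a small worker pool, to get the same effect as `xargs -P`.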

In [0]:
tf.gfile.MkDir(PRETRAINING_DIR)
!$XARGS_CMD

Streaming output truncated to the last 5000 lines.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])




W0422 08:17:08.643953 139814707267456 deprecation_wrapper.py:119] From bert/create_pretraining_data.py:437: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
W0422 08:17:08.643962 140146372753280 deprecation_wrapper.py:119] From bert/create_pretraining_data.py:437: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.



W0422 08:17:08.644225 140146372753280 deprecation_wrapper.py:119] From bert/create_pretraining_data.py:437: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.


W0422 08:17:08.644225 139814707267456 deprecation_wrapper.py:119] From bert/create_pretraining_dat

## Step 6: setting up persistent storage

To preserve our hard-earned assets, we will persist them to Google Cloud Storage. Provided that you have created the GCS bucket, this should be simple.

We will create two directories in GCS, one for the data and one for the model.
In the model directory, we will put the model vocabulary and configuration file.

**Configure your BUCKET_NAME variable here before proceeding, otherwise the model and data will not be saved.**

In [0]:
BUCKET_NAME = "" #@param {type:"string"}
MODEL_DIR = "bert_model" #@param {type:"string"}
tf.gfile.MkDir(MODEL_DIR)

if not BUCKET_NAME:
  log.warning("WARNING: BUCKET_NAME is not set. "
              "You will not be able to train the model.")

Below is the sample hyperparameter configuration for BERT-base. Change at your own risk.

In [0]:
# use this for BERT-base

bert_base_config = {
  "attention_probs_dropout_prob": 0.1, 
  "directionality": "bidi", 
  "hidden_act": "gelu", 
  "hidden_dropout_prob": 0.1, 
  "hidden_size": 768, 
  "initializer_range": 0.02, 
  "intermediate_size": 3072, 
  "max_position_embeddings": 512, 
  "num_attention_heads": 12, 
  "num_hidden_layers": 12, 
  "pooler_fc_size": 768, 
  "pooler_num_attention_heads": 12, 
  "pooler_num_fc_layers": 3, 
  "pooler_size_per_head": 128, 
  "pooler_type": "first_token_transform", 
  "type_vocab_size": 2, 
  "vocab_size": VOC_SIZE
}

with open("{}/bert_config.json".format(MODEL_DIR), "w") as fo:
  json.dump(bert_base_config, fo, indent=2)
  
# with open('vocab.txt', 'r') as f:
#   bert_vocab = f.read().splitlines()

with open("{}/{}".format(MODEL_DIR, VOC_FNAME), "w", encoding="utf-8") as fo:
  for token in bert_vocab:
    fo.write(token+"\n")
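
To get a feel for what this configuration means in terms of model size, the sketch below counts the weights of the embeddings, the Transformer layers and the pooler. It is an estimate under stated assumptions: `count_bert_params` is a hypothetical helper, and the pre-training heads (masked LM and next sentence prediction) are not counted.

```python
def count_bert_params(vocab_size, hidden_size=768, intermediate_size=3072,
                      num_hidden_layers=12, max_position_embeddings=512,
                      type_vocab_size=2):
    """Weights and biases of embeddings + encoder layers + pooler."""
    h, i = hidden_size, intermediate_size
    layer_norm = 2 * h  # gain and bias
    embeddings = (vocab_size + max_position_embeddings + type_vocab_size) * h + layer_norm
    attention = 4 * (h * h + h)  # query, key, value and output projections
    feed_forward = (h * i + i) + (i * h + h)
    layer = attention + layer_norm + feed_forward + layer_norm
    pooler = h * h + h
    return embeddings + num_hidden_layers * layer + pooler

print(count_bert_params(vocab_size=32000))  # 110617344
```

With `VOC_SIZE = 32000`, roughly 24.6M of the ~110.6M parameters sit in the token embedding matrix, which is why the vocabulary size has a noticeable effect on model size.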

In [0]:
if BUCKET_NAME:
  !gsutil -m cp -r $MODEL_DIR gs://$BUCKET_NAME

Copying file://bert_model/bert_config.json [Content-Type=application/json]...
Copying file://bert_model/vocab.txt [Content-Type=text/plain]...
/ [2/2 files][730.9 KiB/730.9 KiB] 100% Done                                    
Operation completed over 2 objects/730.9 KiB.                                    


## Step 7: training the model

We are almost ready to begin training our model. If you wish to continue an interrupted training run, you may skip steps 2-6 and proceed from here.

**Make sure that you have set the BUCKET_NAME here as well.**

In [0]:
BUCKET_NAME = "" #@param {type:"string"}
MODEL_DIR = "bert_model" #@param {type:"string"}
PRETRAINING_DIR = "pretraining_data" #@param {type:"string"}
VOC_FNAME = "vocab.txt" #@param {type:"string"}

# Input data pipeline config
TRAIN_BATCH_SIZE = 128 #@param {type:"integer"}
MAX_PREDICTIONS = 20 #@param {type:"integer"}
MAX_SEQ_LENGTH = 128 #@param {type:"integer"}
MASKED_LM_PROB = 0.15 #@param

# Training procedure config
EVAL_BATCH_SIZE = 64
LEARNING_RATE = 2e-5
TRAIN_STEPS = 1000000 #@param {type:"integer"}
SAVE_CHECKPOINTS_STEPS = 2500 #@param {type:"integer"}
NUM_TPU_CORES = 8

if BUCKET_NAME:
  BUCKET_PATH = "gs://{}".format(BUCKET_NAME)
else:
  BUCKET_PATH = "."

BERT_GCS_DIR = "{}/{}".format(BUCKET_PATH, MODEL_DIR)
DATA_GCS_DIR = "{}/{}".format(BUCKET_PATH, PRETRAINING_DIR)

VOCAB_FILE = os.path.join(BERT_GCS_DIR, VOC_FNAME)
CONFIG_FILE = os.path.join(BERT_GCS_DIR, "bert_config.json")

INIT_CHECKPOINT = tf.train.latest_checkpoint(BERT_GCS_DIR)

bert_config = modeling.BertConfig.from_json_file(CONFIG_FILE)
input_files = tf.gfile.Glob(os.path.join(DATA_GCS_DIR,'*tfrecord'))

log.info("Using checkpoint: {}".format(INIT_CHECKPOINT))
log.info("Using {} data shards".format(len(input_files)))

2020-05-10 10:32:10,576 :  From /content/bert/modeling.py:93: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

2020-05-10 10:32:10,780 :  Using checkpoint: gs://malayalamnlp/bert_model/model.ckpt-712500
2020-05-10 10:32:10,783 :  Using 84 data shards
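
When resuming, it helps to know how much work is left. Since `estimator.train` is called with `max_steps=TRAIN_STEPS`, training stops once the global step reaches that value. The following sketch reads the step from a checkpoint path of the form shown in the log above; `checkpoint_step` is a hypothetical helper.

```python
import re

def checkpoint_step(checkpoint_path):
    """Extract the global step from a path like '.../model.ckpt-712500' (0 if absent)."""
    if not checkpoint_path:
        return 0
    match = re.search(r"ckpt-(\d+)$", checkpoint_path)
    return int(match.group(1)) if match else 0

TRAIN_STEPS = 1000000
done = checkpoint_step("gs://malayalamnlp/bert_model/model.ckpt-712500")
print(done, TRAIN_STEPS - done)  # 712500 287500
```

With `SAVE_CHECKPOINTS_STEPS = 2500`, an interrupted session loses at most 2500 steps of progress.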


In [0]:
!ls

adc.json  bert	sample_data


Prepare the training run configuration, build the estimator and input function, power up the bass cannon.

In [0]:
model_fn = model_fn_builder(
      bert_config=bert_config,
      init_checkpoint=INIT_CHECKPOINT,
      learning_rate=LEARNING_RATE,
      num_train_steps=TRAIN_STEPS,
      num_warmup_steps=10,
      use_tpu=USE_TPU,
      use_one_hot_embeddings=True)

tpu_cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(TPU_ADDRESS)

run_config = tf.contrib.tpu.RunConfig(
    cluster=tpu_cluster_resolver,
    model_dir=BERT_GCS_DIR,
    save_checkpoints_steps=SAVE_CHECKPOINTS_STEPS,
    tpu_config=tf.contrib.tpu.TPUConfig(
        iterations_per_loop=SAVE_CHECKPOINTS_STEPS,
        num_shards=NUM_TPU_CORES,
        per_host_input_for_training=tf.contrib.tpu.InputPipelineConfig.PER_HOST_V2))

estimator = tf.contrib.tpu.TPUEstimator(
    use_tpu=USE_TPU,
    model_fn=model_fn,
    config=run_config,
    train_batch_size=TRAIN_BATCH_SIZE,
    eval_batch_size=EVAL_BATCH_SIZE)
  
train_input_fn = input_fn_builder(
        input_files=input_files,
        max_seq_length=MAX_SEQ_LENGTH,
        max_predictions_per_seq=MAX_PREDICTIONS,
        is_training=True)

2020-05-10 10:32:20,544 :  Estimator's model_fn (<function model_fn_builder.<locals>.model_fn at 0x7f6960c43ea0>) includes params argument, but params are not passed to Estimator.
2020-05-10 10:32:20,546 :  Using config: {'_model_dir': 'gs://malayalamnlp/bert_model', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 2500, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
cluster_def {
  job {
    name: "worker"
    tasks {
      key: 0
      value: "10.94.225.34:8470"
    }
  }
}
isolate_session_state: true
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f6960be1f28>, '_task_type': 'worker', '_task_id': 0, '_glo

Fire!

In [0]:
estimator.train(input_fn=train_input_fn, max_steps=TRAIN_STEPS)

2020-05-10 10:32:25,693 :  Querying Tensorflow master (grpc://10.94.225.34:8470) for TPU system metadata.
2020-05-10 10:32:25,702 :  Found TPU system:
2020-05-10 10:32:25,702 :  *** Num TPU Cores: 8
2020-05-10 10:32:25,703 :  *** Num TPU Workers: 1
2020-05-10 10:32:25,707 :  *** Num TPU Cores Per Worker: 8
2020-05-10 10:32:25,709 :  *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, -1, 3136714441908519932)
2020-05-10 10:32:25,712 :  *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 17179869184, 13151635859550423673)
2020-05-10 10:32:25,714 :  *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 17179869184, 4613835520552954457)
2020-05-10 10:32:25,714 :  *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 17179869184, 5149093999699681015)
2020-05-10 10:32:25,715 :  *** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:T



2020-05-10 10:32:26,355 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fc17b38>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fc17b38>>: AssertionError: Bad argument number for Name: 3, expecting 4



2020-05-10 10:32:28,538 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f342128>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f342128>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:28,640 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f342128>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f342128>>: AssertionError: Bad argument number for



2020-05-10 10:32:28,743 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f342128>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f342128>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:28,886 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f6960be1f98>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f6960be1f98>>: AssertionError: Bad argument number for



2020-05-10 10:32:29,032 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:29,150 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f738b38>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f738b38>>: AssertionError: Bad argument number for



2020-05-10 10:32:29,294 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f009518>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f009518>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:29,397 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f009518>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f009518>>: AssertionError: Bad argument number for



2020-05-10 10:32:29,508 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f009518>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f009518>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:29,655 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f2173c8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f2173c8>>: AssertionError: Bad argument number for



2020-05-10 10:32:29,798 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:29,909 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f3955f8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f3955f8>>: AssertionError: Bad argument number for



2020-05-10 10:32:30,051 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695ed5cf60>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695ed5cf60>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:30,153 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695edd7e10>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695edd7e10>>: AssertionError: Bad argument number for



2020-05-10 10:32:30,255 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695edd7e10>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695edd7e10>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:30,409 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695ed46e48>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695ed46e48>>: AssertionError: Bad argument number for



2020-05-10 10:32:30,557 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>>: AssertionError: Bad argument number for Name: 3, expecting 4




2020-05-10 10:32:30,820 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695eecbd30>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695eecbd30>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:30,958 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695ea6c860>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695ea6c860>>: AssertionError: Bad argument number for



2020-05-10 10:32:31,063 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695edf33c8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695edf33c8>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:31,167 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695edf33c8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695edf33c8>>: AssertionError: Bad argument number for



2020-05-10 10:32:31,309 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695ea48710>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695ea48710>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:31,448 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>>: AssertionError: Bad argument number for



2020-05-10 10:32:31,569 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695ef634e0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695ef634e0>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:31,708 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e6a8710>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e6a8710>>: AssertionError: Bad argument number for



2020-05-10 10:32:31,814 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e6a8710>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e6a8710>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:31,918 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e6a8710>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e6a8710>>: AssertionError: Bad argument number for



2020-05-10 10:32:32,063 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e73d7f0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e73d7f0>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:32,205 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>>: AssertionError: Bad argument number for



2020-05-10 10:32:32,322 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695ed2e860>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695ed2e860>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:32,463 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e4e2dd8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e4e2dd8>>: AssertionError: Bad argument number for



2020-05-10 10:32:32,579 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e4e2dd8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e4e2dd8>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:32,683 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e4e2dd8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e4e2dd8>>: AssertionError: Bad argument number for



2020-05-10 10:32:32,823 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e436668>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e436668>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:32,963 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>>: AssertionError: Bad argument number for



2020-05-10 10:32:33,083 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f3f53c8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f3f53c8>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:33,223 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e159cc0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e159cc0>>: AssertionError: Bad argument number for



2020-05-10 10:32:33,328 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e159cc0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e159cc0>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:33,455 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e159cc0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e159cc0>>: AssertionError: Bad argument number for



2020-05-10 10:32:33,604 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e0f5080>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e0f5080>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:33,745 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fb50ef0>>: AssertionError: Bad argument number for



2020-05-10 10:32:33,859 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e48a7b8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e48a7b8>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:34,000 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695dd71ba8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695dd71ba8>>: AssertionError: Bad argument number for



2020-05-10 10:32:34,110 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695dd71ba8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695dd71ba8>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:34,214 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695dd71ba8>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695dd71ba8>>: AssertionError: Bad argument number for



2020-05-10 10:32:34,355 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e0fd160>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695e0fd160>>: AssertionError: Bad argument number for Name: 3, expecting 4




2020-05-10 10:32:34,673 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fa05c18>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fa05c18>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:34,788 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f77c940>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f77c940>>: AssertionError: Bad argument number for



2020-05-10 10:32:34,933 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695db11898>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695db11898>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:35,037 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695db11898>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695db11898>>: AssertionError: Bad argument number for



2020-05-10 10:32:35,138 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695db11898>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695db11898>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:35,279 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695db9b9b0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695db9b9b0>>: AssertionError: Bad argument number for



2020-05-10 10:32:35,426 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fa05c18>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695fa05c18>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:35,539 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f79ef28>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f79ef28>>: AssertionError: Bad argument number for



2020-05-10 10:32:35,708 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f77ca58>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695f77ca58>>: AssertionError: Bad argument number for Name: 3, expecting 4
2020-05-10 10:32:35,823 :  Entity <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695d6714e0>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: converting <bound method Dense.call of <tensorflow.python.layers.core.Dense object at 0x7f695d6714e0>>: AssertionError: Bad argument number for



2020-05-10 10:32:36,074 :  **** Trainable Variables ****
2020-05-10 10:32:36,075 :    name = bert/embeddings/word_embeddings:0, shape = (32000, 768), *INIT_FROM_CKPT*
2020-05-10 10:32:36,075 :    name = bert/embeddings/token_type_embeddings:0, shape = (2, 768), *INIT_FROM_CKPT*
2020-05-10 10:32:36,080 :    name = bert/embeddings/position_embeddings:0, shape = (512, 768), *INIT_FROM_CKPT*
2020-05-10 10:32:36,082 :    name = bert/embeddings/LayerNorm/beta:0, shape = (768,), *INIT_FROM_CKPT*
2020-05-10 10:32:36,083 :    name = bert/embeddings/LayerNorm/gamma:0, shape = (768,), *INIT_FROM_CKPT*
2020-05-10 10:32:36,085 :    name = bert/encoder/layer_0/attention/self/query/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
2020-05-10 10:32:36,086 :    name = bert/encoder/layer_0/attention/self/query/bias:0, shape = (768,), *INIT_FROM_CKPT*
2020-05-10 10:32:36,088 :    name = bert/encoder/layer_0/attention/self/key/kernel:0, shape = (768, 768), *INIT_FROM_CKPT*
2020-05-10 10:32:36,090 :    name =

With the default parameters, training the model for 1 million steps takes roughly 53 hours.
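
As a rough sanity check on that estimate, the short sketch below converts the figures into a throughput and a session count. The step count and duration come from the text, and the 8-hour session length is the approximate Colab limit mentioned in the introduction. The function names are made up for illustration, and real throughput will vary with batch size and TPU availability.

```python
import math

TRAIN_STEPS = 1_000_000   # total training steps, from the text
TOTAL_HOURS = 53          # approximate total training time, from the text
SESSION_HOURS = 8         # approximate Colab session length (assumption from the intro)


def steps_per_second(total_steps, total_hours):
    """Average training throughput implied by the estimate."""
    return total_steps / (total_hours * 3600)


def sessions_needed(total_hours, session_hours):
    """Number of sessions required if each one is cut off after session_hours."""
    return math.ceil(total_hours / session_hours)


rate = steps_per_second(TRAIN_STEPS, TOTAL_HOURS)
print(f"{rate:.2f} steps/s")                                   # 5.24 steps/s
print(sessions_needed(TOTAL_HOURS, SESSION_HOURS), "sessions")  # 7 sessions
```

In other words, under these assumptions a full run on Colab would need to be restarted about seven times.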

If the kernel restarts, you can resume training from the latest checkpoint.
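
In TensorFlow 1.x, an Estimator pointed at the same `model_dir` picks up the newest checkpoint on its own, and `tf.train.latest_checkpoint` returns its path. The sketch below only illustrates the underlying idea with the standard library: it takes a list of file names and works out how far training has progressed. The helper names and the file listing are hypothetical. The `model.ckpt-<step>` pattern is the default Estimator naming and is assumed here.

```python
import re

# Default Estimator checkpoints are named model.ckpt-<step>.{index,meta,data-*}.
# Matching only the .index file avoids counting the same step several times.
CKPT_PATTERN = re.compile(r"model\.ckpt-(\d+)\.index$")


def latest_checkpoint_step(filenames):
    """Return the highest checkpoint step found in filenames, or None if there is none."""
    steps = []
    for name in filenames:
        match = CKPT_PATTERN.search(name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


def remaining_steps(total_steps, completed_steps):
    """Steps left to run after resuming; never negative."""
    return max(total_steps - completed_steps, 0)


# Hypothetical listing of a model directory after an interrupted run.
listing = [
    "graph.pbtxt",
    "model.ckpt-0.index",
    "model.ckpt-1000.meta",
    "model.ckpt-2500.index",
]
step = latest_checkpoint_step(listing)
print(step)                              # 2500
print(remaining_steps(1_000_000, step))  # 997500
```

Because the Estimator treats `max_steps` as a global target, re-running the training cell after a restart continues from the saved step rather than starting over.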

This concludes the guide to pre-training BERT from scratch on a cloud TPU. However, the really fun stuff is still to come, so stay tuned.

Keep learning!

In [0]:
!gsutil cp gs://hermes_assets/russian_uncased_L-12_H-768_A-12.zip gs://bert_resourses/