This is a modified version of Bidirectional Encoder Representations from Transformers (BERT) from Google Research.
The main purpose is to allow periodic evaluation of the model during training while using a custom Estimator, something that was surprisingly lacking in the original code. This code evaluates the model's accuracy and loss, on both the training and evaluation sets, after a specified number of seconds (evaluate_secs) and records the results to TensorBoard. In the process, several other evaluation metrics have been added for the test dataset, including a confusion matrix, F1, precision and recall.
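The periodic evaluation is driven by TensorFlow's train-and-evaluate loop. Below is a minimal sketch of that wiring, assuming TensorFlow 1.x; `model_fn`, `train_input_fn`, `eval_input_fn`, `output_dir` and `num_train_steps` are stand-ins for objects built in run_bert.py, and evaluate_secs is assumed to map onto EvalSpec's throttle_secs argument.

```python
import tensorflow as tf

# Sketch only: model_fn and the input functions are assumed to exist elsewhere.
estimator = tf.estimator.Estimator(model_fn=model_fn, model_dir=output_dir)

train_spec = tf.estimator.TrainSpec(
    input_fn=train_input_fn,
    max_steps=num_train_steps)

eval_spec = tf.estimator.EvalSpec(
    input_fn=eval_input_fn,
    steps=None,                   # evaluate over the full evaluation set
    throttle_secs=evaluate_secs)  # wait at least this long between evaluations

# Alternates training and evaluation; both write summaries for TensorBoard.
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
```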
This code is also intended to make it clearer how the BERT model is created and trained, for those who are not familiar with TensorFlow Estimators.
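For readers new to Estimators, the core idea is a single model_fn that builds a different graph depending on the mode. The following is a simplified sketch of that pattern, not this repository's exact model_fn, assuming TensorFlow 1.x and a hypothetical `build_model()` helper:

```python
import tensorflow as tf

def model_fn(features, labels, mode, params):
    # Build the network; dropout should only be active while training.
    is_training = (mode == tf.estimator.ModeKeys.TRAIN)
    logits = build_model(features, is_training)  # hypothetical model builder

    if mode == tf.estimator.ModeKeys.PREDICT:
        predictions = {'probabilities': tf.nn.softmax(logits)}
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

    if mode == tf.estimator.ModeKeys.TRAIN:
        train_op = tf.train.AdamOptimizer(params['learning_rate']).minimize(
            loss, global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

    # EVAL mode: report accuracy alongside the loss.
    predicted = tf.argmax(logits, axis=-1)
    eval_metric_ops = {'accuracy': tf.metrics.accuracy(labels, predicted)}
    return tf.estimator.EstimatorSpec(mode, loss=loss,
                                      eval_metric_ops=eval_metric_ops)
```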
Currently only two datasets are included: the Switchboard Dialogue Act Corpus (SWDA) and the Meeting Recorder Dialogue Act Corpus (MRDA). If you want to use the MRDA corpus you need to specify which label types you want to use within the MrdaProcessor() function. For more information on the label types see the above MRDA link.
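As a purely hypothetical illustration (the actual attribute and label-type names should be checked against MrdaProcessor in the source), selecting one of MRDA's label granularities might look like:

```python
# Hypothetical sketch only: the real MrdaProcessor API may differ.
processor = MrdaProcessor()
processor.label_type = 'basic_labels'  # assumed name for one MRDA label set
```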
However, to run this code on any of the other datasets included with the original BERT model you can simply process and run them in the same way described in the original documentation.
You must include one of the original pre-trained BERT models in the root directory (BERT_Base or BERT_Large). Due to the model checkpoint size they have not been included in this repository.
Currently the model will save the single best checkpoint (lowest loss) to the best_checkpoint directory.
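Best-checkpoint tracking is handled by bluecamel's Best Checkpoint Copier (credited below), which plugs into the Estimator as an exporter. A sketch following bluecamel's published example; parameter values here are illustrative, and `eval_input_fn` is an assumed name, so confirm against the copy in this repository:

```python
import tensorflow as tf

# Sketch based on bluecamel's Best Checkpoint Copier example.
best_copier = BestCheckpointCopier(
    name='best_checkpoint',         # directory to copy 'best' checkpoints into
    checkpoints_to_keep=1,          # keep only the single best checkpoint
    score_metric='loss',            # metric used to rank checkpoints
    compare_fn=lambda x, y: x.score < y.score,  # lower loss is better
    sort_key_fn=lambda x: x.score,
    sort_reverse=False)

# Wired into evaluation via EvalSpec's exporters argument.
eval_spec = tf.estimator.EvalSpec(
    input_fn=eval_input_fn,
    throttle_secs=evaluate_secs,
    exporters=best_copier)
```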
Within the run_bert.py script you must specify the following parameters (an illustrative configuration is sketched after this list).
- task_name = Which task (DataProcessor) to use
- experiment_name = An (optional) parameter for running different experiments on the same dataset
- max_seq_length = Maximum sentence/sequence length (dependent on pre-trained model)
- batch_size = How large each training batch should be (default 32)
- learning_rate = Model learning rate (default 2e-5)
- num_epochs = Number of times to iterate over training data (default 3)
- save_checkpoint_steps = Number of global steps to run before saving a checkpoint
- evaluate_secs = How often (in seconds) to evaluate the model
- checkpoints_to_keep = Number of 'best' checkpoints to keep
- training = Boolean flag whether to train the model
- testing = Boolean flag whether to test the model after training
- bert_model_type = Directory of the pre-trained BERT model (i.e. BERT_Base or BERT_Large)
- do_lower_case = Whether to lowercase input text (True for uncased pre-trained models, False for cased)
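An illustrative configuration, assuming the parameters are plain Python variables near the top of run_bert.py (values without a stated default are examples only):

```python
# Illustrative values; defaults from the list above where stated.
task_name = 'swda'             # which DataProcessor to use
experiment_name = 'baseline'   # optional tag for separating runs
max_seq_length = 128           # must respect the pre-trained model's limit
batch_size = 32                # default
learning_rate = 2e-5           # default
num_epochs = 3                 # default
save_checkpoint_steps = 500    # illustrative
evaluate_secs = 60             # illustrative
checkpoints_to_keep = 1
training = True
testing = True
bert_model_type = 'BERT_Base'  # directory of the pre-trained model
do_lower_case = True           # True for uncased pre-trained models
```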
The run_bert_no_estimator.py script runs the BERT model entirely without Estimators; however, it is inconsistent and untested, because the original BERT code does not allow easy switching between using dropout for training and disabling it for evaluation. Use at your own risk!
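A minimal sketch of the underlying issue, assuming TensorFlow 1.x: BERT's modeling code fixes dropout at graph-construction time via an is_training flag, whereas outside Estimators one common workaround is to drive dropout from a placeholder and feed it per session.run call. The layer shapes below are illustrative only.

```python
import tensorflow as tf

# Input features; shape is illustrative.
inputs = tf.placeholder(tf.float32, shape=[None, 768], name='inputs')

# keep_prob defaults to 1.0 (no dropout) unless explicitly fed.
keep_prob = tf.placeholder_with_default(1.0, shape=(), name='keep_prob')

hidden = tf.layers.dense(inputs, 768, activation=tf.nn.relu)
dropped = tf.nn.dropout(hidden, keep_prob=keep_prob)

# Training step: feed keep_prob < 1.0 to enable dropout, e.g.
#   sess.run(train_op, feed_dict={inputs: batch, keep_prob: 0.9})
# Evaluation: omit keep_prob so it defaults to 1.0 (dropout off), e.g.
#   sess.run(metrics, feed_dict={inputs: batch})
```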
To do:

- Add command line support for scripts.
- Add option to extract features rather than make predictions (like BERT's extract_features.py).
To view the recorded training and evaluation metrics, point TensorBoard at the output directory:

```
tensorboard --logdir=./<output_dir>
```
Thanks to bluecamel for the Best Checkpoint Copier code.