
TensorFlow BERT Large Training

This section has instructions for running BERT Large Training with the SQuAD dataset.

Set OUTPUT_DIR to point to the directory where all logs will be stored, and set PRECISION to the precision to train with: fp32, bfloat16, or fp16.
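As a minimal sketch (the values below are hypothetical placeholders, not values required by the scripts):

```shell
# Hypothetical example values -- substitute your own locations
export OUTPUT_DIR=/tmp/bert_large_logs
export PRECISION=bfloat16

# Make sure the log directory exists before launching training
mkdir -p "$OUTPUT_DIR"
```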

Datasets

SQuAD data

Download and unzip the BERT Large uncased (whole word masking) model from the Google BERT repo. Set DATASET_DIR to point to this directory when running BERT Large.

mkdir -p $DATASET_DIR && cd $DATASET_DIR
wget https://storage.googleapis.com/bert_models/2019_05_30/wwm_uncased_L-24_H-1024_A-16.zip
unzip wwm_uncased_L-24_H-1024_A-16.zip

wget https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json -P wwm_uncased_L-24_H-1024_A-16
wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json -P wwm_uncased_L-24_H-1024_A-16
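Before launching training, it can help to sanity-check that the downloads above landed where the scripts expect them. A sketch, assuming the standard contents of the model zip (`check_squad_dir` is a hypothetical helper, demonstrated here against a scratch directory):

```shell
# Verify that a directory contains the model and SQuAD files downloaded above.
check_squad_dir() {
  local d="$1" f
  for f in bert_config.json vocab.txt train-v1.1.json dev-v1.1.json; do
    [ -f "$d/$f" ] || { echo "missing: $d/$f"; return 1; }
  done
  echo "dataset layout OK"
}

# Demonstrate against a scratch directory standing in for
# $DATASET_DIR/wwm_uncased_L-24_H-1024_A-16 (empty placeholder files only).
demo=$(mktemp -d)
touch "$demo"/bert_config.json "$demo"/vocab.txt \
      "$demo"/train-v1.1.json "$demo"/dev-v1.1.json
check_squad_dir "$demo"
```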

Quick Start Scripts

| Script name | Description |
|-------------|-------------|
| `training_squad.sh` | Uses mpirun to execute 1 process per socket for BERT Large training with the specified precision (fp32, bfloat16, or fp16). Logs for each instance are saved to the output directory. |

TensorFlow BERT Large Pretraining

This section has instructions for running BERT Large Pretraining using Intel-optimized TensorFlow.

Datasets

SQuAD data

Download and unzip the BERT Large uncased (whole word masking) model from the Google BERT repo. Set DATASET_DIR to point to this directory when running BERT Large.

mkdir -p $DATASET_DIR && cd $DATASET_DIR
wget https://storage.googleapis.com/bert_models/2019_05_30/wwm_uncased_L-24_H-1024_A-16.zip
unzip wwm_uncased_L-24_H-1024_A-16.zip

wget https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json -P wwm_uncased_L-24_H-1024_A-16

Follow the instructions to generate the BERT pre-training dataset in TensorFlow record file format. The output TensorFlow record files are expected to be located in ${DATASET_DIR}/tf_records; for example, a TF record file path would be ${DATASET_DIR}/tf_records/part-00430-of-00500.
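A quick way to confirm the shards are in place is to count files matching the part-XXXXX-of-XXXXX pattern from the example path above (`count_shards` is a hypothetical helper, demonstrated here against a scratch directory):

```shell
# Count TF record shards named like part-00430-of-00500.
count_shards() { ls "$1"/part-*-of-* 2>/dev/null | wc -l; }

# Demonstrate with a scratch directory standing in for $DATASET_DIR/tf_records
# (empty placeholder files only).
demo=$(mktemp -d)
touch "$demo"/part-00000-of-00500 "$demo"/part-00001-of-00500
count_shards "$demo"
```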

Quick Start Scripts

| Script name | Description |
|-------------|-------------|
| `pretraining.sh` | Uses mpirun to execute 1 process per socket for BERT Large pretraining with the specified precision (fp32, bfloat16, or fp16). Logs for each instance are saved to the output directory. |

Run the model

Setup on baremetal

Set up your environment using the instructions below, depending on whether you are using AI Tools:


To run using AI Tools you will need:

  • numactl
  • unzip
  • wget
  • openmpi-bin (only required for multi-instance)
  • openmpi-common (only required for multi-instance)
  • openssh-client (only required for multi-instance)
  • openssh-server (only required for multi-instance)
  • libopenmpi-dev (only required for multi-instance)
  • horovod==0.27.0 (only required for multi-instance)
  • Activate the `tensorflow` conda environment
    conda activate tensorflow

To run without AI Tools you will need:

  • Python 3
  • intel-tensorflow>=2.5.0
  • git
  • numactl
  • openmpi-bin (only required for multi-instance)
  • openmpi-common (only required for multi-instance)
  • openssh-client (only required for multi-instance)
  • openssh-server (only required for multi-instance)
  • libopenmpi-dev (only required for multi-instance)
  • horovod==0.27.0 (only required for multi-instance)
  • A clone of the AI Reference Models repo
    git clone https://github.com/IntelAI/models.git

Download checkpoints:

wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v1_8/bert_large_checkpoints.zip
unzip bert_large_checkpoints.zip
export CHECKPOINT_DIR=$(pwd)/bert_large_checkpoints
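A minimal sketch of checking the result (`verify_ckpt_dir` is a hypothetical helper; it only confirms the directory exists and is non-empty):

```shell
# Check that a checkpoint directory exists and contains at least one file.
verify_ckpt_dir() { [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]; }

# Demonstrate against a scratch directory standing in for $CHECKPOINT_DIR
# (the placeholder file is not a real checkpoint).
demo=$(mktemp -d)/bert_large_checkpoints
mkdir -p "$demo"
touch "$demo"/placeholder
verify_ckpt_dir "$demo" && echo "checkpoint dir OK"
```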

Run on Linux

Set environment variables to specify the dataset directory, precision to run, and an output directory.

# Navigate to the AI Reference Models directory
cd models

# Set the required environment vars
export PRECISION=<specify the precision to run: fp32, bfloat16 or fp16>
export DATASET_DIR=<path to the dataset>
export OUTPUT_DIR=<directory where log files will be written>
export CHECKPOINT_DIR=<path to the downloaded checkpoints folder>

# Run the pretraining.sh quickstart script
./quickstart/language_modeling/tensorflow/bert_large/training/cpu/pretraining.sh

Additional Resources