# Fine-Tune a BERT Model and Create a Text Classifier

We have already performed feature engineering to create BERT embeddings from the `review_body` text using the pre-trained BERT model, and split the dataset into train, validation, and test files. To optimize for TensorFlow training, we saved the files in TFRecord format.

Now, let’s fine-tune the BERT model on our Customer Reviews Dataset and add a new classification layer to predict the `star_rating` for a given `review_body`.
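
To make the idea of a "classification layer" concrete, here is a minimal pure-Python sketch. It is not the notebook's `train.py`; it only illustrates a single dense layer followed by softmax, which turns a sentence embedding into probabilities over the five star ratings. The 3-dimensional embedding, the weights, and all function names are hypothetical. A real DistilBERT embedding has 768 dimensions, and the weights are learned during fine-tuning.

```python
import math

NUM_CLASSES = 5  # star_rating values 1..5 map to class ids 0..4


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(embedding, weights, biases):
    """Dense layer + softmax: one weight row and one bias per class."""
    logits = [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


def predicted_star_rating(probabilities):
    """Class id with the highest probability, shifted back to 1..5."""
    best_class = max(range(len(probabilities)), key=probabilities.__getitem__)
    return best_class + 1


# Hypothetical toy values, for illustration only.
embedding = [0.2, -0.1, 0.7]
weights = [
    [0.1, 0.0, -0.5],
    [0.0, 0.2, -0.2],
    [0.1, 0.1, 0.1],
    [0.3, -0.2, 0.6],
    [0.5, -0.4, 1.2],
]
biases = [0.0] * NUM_CLASSES

probabilities = classify(embedding, weights, biases)
print(predicted_star_rating(probabilities))
```

Fine-tuning adjusts both these classification weights and the underlying BERT weights so that the highest-probability class matches the labeled `star_rating`.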

![BERT Training](img/bert_training.png)

As mentioned earlier, BERT is built on the Transformer, a neural network architecture based on attention. This is, not coincidentally, the name of the popular Python library for BERT, “Transformers,” maintained by a company called [Hugging Face](https://github.com/huggingface/transformers). We will use a variant of BERT called [DistilBERT](https://arxiv.org/pdf/1910.01108.pdf), which requires less memory and compute but maintains very good accuracy on our dataset.

# DEMO 2: 

# Run Model Training on Amazon Elastic Kubernetes Service (Amazon EKS)

Amazon EKS is a managed service that makes it easy for you to run Kubernetes on AWS without needing to install and operate your own Kubernetes control plane or worker nodes.

## Amazon FSx for Lustre

Amazon FSx for Lustre is a fully managed service that provides cost-effective, high-performance storage for compute workloads. Many workloads such as machine learning, high performance computing (HPC), video rendering, and financial simulations depend on compute instances accessing the same set of data through high-performance shared storage.

Powered by Lustre, the world's most popular high-performance file system, FSx for Lustre offers sub-millisecond latencies, up to hundreds of gigabytes per second of throughput, and millions of IOPS. It provides multiple deployment options and storage types to optimize cost and performance for your workload requirements.

FSx for Lustre file systems can also be linked to Amazon S3 buckets, allowing you to access and process data concurrently from both a high-performance file system and from the S3 API.
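
Because the file system is linked to a bucket, an S3 object key corresponds to a path under the file system's mount point. The sketch below illustrates that mapping. The mount point `/mnt/fsx` and the function name are hypothetical, and it assumes the file system is linked to the bucket root with no import prefix. The bucket name is the one used later in this notebook.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: hypothetical mount point inside the training container.
FSX_MOUNT_POINT = PurePosixPath("/mnt/fsx")


def s3_uri_to_fsx_path(s3_uri, mount_point=FSX_MOUNT_POINT):
    """Map s3://bucket/key to the path the same object would have on the mount."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    key = parsed.path.lstrip("/")
    return mount_point / key if key else mount_point


print(s3_uri_to_fsx_path("s3://fsx-antje/code/train.py"))
```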

## Using the Amazon FSx for Lustre Container Storage Interface (CSI) Driver

The Amazon FSx for Lustre Container Storage Interface (CSI) driver allows Amazon EKS clusters to manage the lifecycle of Amazon FSx for Lustre file systems.

* https://docs.aws.amazon.com/eks/latest/userguide/fsx-csi.html
* https://github.com/kubernetes-sigs/aws-fsx-csi-driver
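
As a rough illustration of how an existing file system can be exposed to pods, the sketch below builds a PersistentVolume and a matching PersistentVolumeClaim as Python dictionaries and prints them as JSON, which `kubectl` accepts as well as YAML. The field layout follows the driver's static provisioning example as best I recall it, so check it against the driver documentation linked above. The file system ID, region, mount name, capacity, and resource names are all placeholders, not values from this notebook.

```python
import json

FILE_SYSTEM_ID = "fs-0123456789abcdef0"  # placeholder, not a real file system
REGION = "us-east-1"                     # assumption
MOUNT_NAME = "fsx"                       # assumption; reported by the FSx service


def fsx_persistent_volume(name="fsx-pv", capacity="1200Gi"):
    """PersistentVolume that points at an existing FSx for Lustre file system."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": capacity},
            "volumeMode": "Filesystem",
            "accessModes": ["ReadWriteMany"],
            "csi": {
                "driver": "fsx.csi.aws.com",
                "volumeHandle": FILE_SYSTEM_ID,
                "volumeAttributes": {
                    "dnsname": f"{FILE_SYSTEM_ID}.fsx.{REGION}.amazonaws.com",
                    "mountname": MOUNT_NAME,
                },
            },
        },
    }


def fsx_persistent_volume_claim(name="fsx-claim", volume_name="fsx-pv",
                                capacity="1200Gi"):
    """Claim bound explicitly to the volume above (no storage class)."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "",
            "volumeName": volume_name,
            "resources": {"requests": {"storage": capacity}},
        },
    }


manifest = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [fsx_persistent_volume(), fsx_persistent_volume_claim()],
}
print(json.dumps(manifest, indent=2))
```

A pod then mounts the file system by referencing the claim name in its `volumes` section.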


```
code/
	train.py

input/
	data/
		test/
			*.tfrecord
		train/
			*.tfrecord
		validation/
			*.tfrecord

```
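
Given the layout above, a training script might discover its input files per split along the following lines. This is a sketch using only the standard library; the function name and the `/mnt/fsx` mount point are hypothetical, and the notebook's `train.py` may locate its data differently.

```python
from pathlib import Path

SPLITS = ("train", "validation", "test")


def list_tfrecords(root):
    """Return {split: sorted list of .tfrecord paths} under root/input/data."""
    data_dir = Path(root) / "input" / "data"
    return {
        split: sorted((data_dir / split).glob("*.tfrecord"))
        for split in SPLITS
    }


if __name__ == "__main__":
    # Hypothetical mount point; missing directories simply yield empty lists.
    for split, paths in list_tfrecords("/mnt/fsx").items():
        print(split, len(paths))
```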

## List FSx Files

In [None]:
!pip install -q awscli==1.18.183 boto3==1.16.23

In [None]:
!aws s3 ls --recursive s3://fsx-antje/

## Model Training Code `train.py`

In [None]:
!pygmentize code/train.py

## Review `train.yaml`

In [None]:
!pygmentize ./train.yaml
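
The contents of `train.yaml` are not reproduced here. As a guess at its general shape, the sketch below builds a pod manifest that runs `train.py` from a mounted FSx volume and prints it as JSON. Only the pod name `bert-model-training` and the `code/train.py` path come from this notebook. The image placeholder, container name, claim name, and mount path are assumptions.

```python
import json


def training_pod(name="bert-model-training", image="<your-training-image>",
                 claim_name="fsx-claim", mount_path="/mnt/fsx"):
    """Pod that runs train.py once from a volume backed by FSx for Lustre."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",  # a finished training run should not restart
            "containers": [{
                "name": "training",
                "image": image,
                "command": ["python", f"{mount_path}/code/train.py"],
                "volumeMounts": [
                    {"name": "fsx-volume", "mountPath": mount_path},
                ],
            }],
            "volumes": [{
                "name": "fsx-volume",
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


print(json.dumps(training_pod(), indent=2))
```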

In [None]:
!aws s3 cp code/train.py s3://fsx-antje/code/train.py

## Create Kubernetes Training Job

In [None]:
!kubectl get nodes

In [None]:
!kubectl delete -f train.yaml --ignore-not-found

In [None]:
!kubectl create -f train.yaml

## Describe Training Job

In [None]:
!kubectl get pods

In [None]:
!kubectl get pod bert-model-training

In [None]:
!kubectl describe pod bert-model-training
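
To check the pod's status programmatically rather than by reading the `describe` output, one option is to request JSON from `kubectl` and read `status.phase`. The helper names below are hypothetical, and `get_pod_phase` assumes `kubectl` is installed and configured for the cluster. The sample JSON is a hand-written stand-in, not real cluster output.

```python
import json
import subprocess


def pod_phase(pod_json):
    """Extract status.phase from `kubectl get pod <name> -o json` output."""
    return json.loads(pod_json).get("status", {}).get("phase", "Unknown")


def get_pod_phase(name="bert-model-training"):
    """Shell out to kubectl; requires kubectl and access to the cluster."""
    result = subprocess.run(
        ["kubectl", "get", "pod", name, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return pod_phase(result.stdout)


def is_finished(phase):
    """A pod is done once it reaches one of the two terminal phases."""
    return phase in ("Succeeded", "Failed")


# Hand-written sample, for illustration only.
sample = '{"status": {"phase": "Running"}}'
print(pod_phase(sample), is_finished(pod_phase(sample)))
```

Calling `get_pod_phase()` in a loop with a short sleep until `is_finished` returns `True` is one way to wait for training to complete.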

## Review Training Job Logs

In [None]:
%%time

!kubectl logs -f bert-model-training