### Implementing BERT for Text Classification

#### Install Hugging Face Transformers
```sh
pip install transformers torch datasets tf-keras
```

In [1]:
import torch
from transformers import BertTokenizer, BertForSequenceClassification
from transformers import Trainer, TrainingArguments
from datasets import load_dataset


#### Authenticate with Hugging Face
Run the following in a terminal:
```sh
huggingface-cli login
```
Paste your Hugging Face access token when prompted.

#### Load Dataset
We'll use the IMDb movie reviews dataset.

In [2]:
dataset = load_dataset("imdb")

# Tokenizer for BERT
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Tokenizing the dataset
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized_datasets = dataset.map(tokenize_function, batched=True)
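
To make the effect of `padding="max_length"` and `truncation=True` concrete, here is a minimal pure-Python sketch of the idea. It is a toy illustration, not the tokenizer's actual implementation: the helper name `pad_or_truncate` and the token IDs are made up for this example, and the real tokenizer also keeps the closing special token when it truncates, which this sketch does not.

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Toy version of fixed-length padding with truncation (hypothetical helper)."""
    # Truncation: keep at most max_length tokens.
    ids = token_ids[:max_length]
    # Attention mask: 1 marks a real token, 0 marks padding.
    mask = [1] * len(ids)
    # Padding: append pad_id until the sequence reaches max_length.
    n_pad = max_length - len(ids)
    return {
        "input_ids": ids + [pad_id] * n_pad,
        "attention_mask": mask + [0] * n_pad,
    }


# A short sequence is padded; a long one is cut down to max_length.
short = pad_or_truncate([101, 2023, 102], max_length=5)
long = pad_or_truncate([101, 2023, 3185, 2001, 2307, 102], max_length=5)
print(short)
print(long)
```

Every example ends up with the same length, so the reviews can be stacked into fixed-size batches; the attention mask tells the model which positions to ignore.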

#### Define BERT Model

In [3]:
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


#### Train the Model
```sh
pip install "accelerate>=0.26.0"
```

In [None]:
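training_args = TrainingArguments(
    output_dir="./results",          # where checkpoints are written
    evaluation_strategy="epoch",     # evaluate after each epoch; newer transformers releases call this eval_strategy
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=2,
    weight_decay=0.01,               # weight decay applied by the optimizer
)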

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
)

trainer.train()
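
With the setup above, the `Trainer` reports only the loss at each evaluation. One common way to also get accuracy is to pass a metrics function. The following is a minimal sketch, assuming NumPy is installed; the name `compute_metrics` and the example logits are illustrative and not part of the original notebook.

```python
import numpy as np


def compute_metrics(eval_pred):
    """Turn (logits, labels) from an evaluation pass into an accuracy score."""
    logits, labels = eval_pred
    # The predicted class is the index of the largest logit in each row.
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}


# Stand-alone check with made-up logits for two reviews, both labelled 1.
example = compute_metrics((np.array([[0.1, 0.9], [0.8, 0.2]]), np.array([1, 1])))
print(example)
```

To use it, pass `compute_metrics=compute_metrics` when constructing the `Trainer`; the returned values then appear next to the validation loss after each epoch.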




