# ```Hugging Face Trainers```


Hugging Face's Trainer class offers a simplified approach to training generative AI models, making it easier to set up and run complex machine learning tasks. It wraps up the hard parts, such as batching the data and running the training loop, so we can focus on the big picture: choosing the model, the dataset, and the training settings.



## ```Technical Terms Explained```
```Truncating```: Shortening longer pieces of text so they fit within a fixed size limit, such as the model's maximum input length.

```Padding```: Adding filler tokens to shorter texts so that every text reaches the same length for processing.
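Truncating and padding can be sketched in a few lines of plain Python. This is only an illustration: `truncate_and_pad` is a hypothetical helper, the token ids are made up, and the padding id of 0 is an assumption. A real tokenizer does this internally, also keeps special tokens in place, and returns an attention mask alongside the ids.

```python
PAD_ID = 0  # assumed padding token id, for illustration only


def truncate_and_pad(token_ids, max_length, pad_id=PAD_ID):
    """Return exactly max_length ids: cut long inputs, pad short ones."""
    truncated = token_ids[:max_length]  # truncating
    padding = [pad_id] * (max_length - len(truncated))  # padding
    return truncated + padding


# Made-up token ids standing in for a short and a long review.
short_review = [101, 2307, 3185, 102]
long_review = [101, 1045, 2245, 2023, 3185, 2001, 2205, 2146, 102]

print(truncate_and_pad(short_review, max_length=6))  # [101, 2307, 3185, 102, 0, 0]
print(truncate_and_pad(long_review, max_length=6))   # [101, 1045, 2245, 2023, 3185, 2001]
```

Both results have length 6, which is what lets them be stacked into one rectangular batch.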

```Batches```: Small groups of examples that the dataset is divided into. The model looks at and learns from one batch at each training step.

```Batch Size```: The number of data samples the model processes in one training step.

```Epochs```: An epoch is one complete pass through the entire training dataset. The more epochs, the more times the model goes over the same material to learn.
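The relationship between batches, batch size, and epochs comes down to simple arithmetic. The sketch below uses two hypothetical helpers, `make_batches` and `total_training_steps`, and assumes a single device, no gradient accumulation, and that a smaller final batch is kept rather than dropped. The last call plugs in the settings used in this notebook (batch size 64, 3 epochs) together with the 25,000 reviews in the IMDB training split.

```python
import math


def make_batches(samples, batch_size):
    """Split samples into consecutive batches; the last one may be smaller."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


def total_training_steps(num_examples, batch_size, num_epochs):
    """One step per batch, repeated for every epoch."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * num_epochs


toy_samples = list(range(10))
print(make_batches(toy_samples, batch_size=4))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# 25,000 examples / 64 per batch -> 391 steps per epoch, times 3 epochs.
print(total_training_steps(25_000, batch_size=64, num_epochs=3))  # 1173
```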

```Dataset Splits```: Dividing the dataset into parts for different uses, such as training the model and testing how well it works.
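As a rough illustration of a dataset split, the sketch below shuffles a list of samples and sets a fraction aside for testing. The helper name `split_dataset`, the 20% test fraction, and the seed are all assumptions made for the example. The IMDB dataset used in this notebook already comes with ready-made `train` and `test` splits, so no manual splitting is needed there.

```python
import random


def split_dataset(samples, test_fraction=0.2, seed=42):
    """Shuffle a copy of the samples and hold out a fraction for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    test_size = int(len(shuffled) * test_fraction)
    return {"train": shuffled[test_size:], "test": shuffled[:test_size]}


splits = split_dataset(range(10), test_fraction=0.2)
print(len(splits["train"]), len(splits["test"]))  # 8 2
```

Keeping the test samples out of training is what makes the test score an honest estimate of how the model handles unseen data.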



---

In [6]:
from datasets import load_dataset
from transformers import (DistilBertForSequenceClassification,
                            DistilBertTokenizer,
                            TrainingArguments,
                            Trainer)

In [None]:
model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=2)

tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')

In [None]:
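def tokenize_function(examples):
    # Pad short reviews and truncate long ones so every input has the same length
    return tokenizer(examples['text'], padding='max_length', truncation=True)


dataset = load_dataset('imdb')
# batched=True passes groups of examples to tokenize_function at once
tokenized_datasets = dataset.map(tokenize_function, batched=True)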

In [None]:
training_args = TrainingArguments(
    per_device_train_batch_size=64,
    output_dir='./results',
    learning_rate=2e-5,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['test'],
)

In [None]:
trainer.train()

---

## Code Example


In [8]:
from transformers import (DistilBertForSequenceClassification,
    DistilBertTokenizer,
    TrainingArguments,
    Trainer
)
from datasets import load_dataset

model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)


dataset = load_dataset("imdb")
tokenized_datasets = dataset.map(tokenize_function, batched=True)

training_args = TrainingArguments(
    per_device_train_batch_size=64,
    output_dir="./results",
    learning_rate=2e-5,
    num_train_epochs=3,
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
)
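The `Trainer` above receives an `eval_dataset` but no metric function, so evaluation would report the loss rather than a score such as accuracy. A minimal sketch of one way to add that is shown below, assuming accuracy is the metric of interest (the choice of metric is not from the original notebook). `Trainer` accepts a `compute_metrics` callable that receives the model's predictions (logits) and the true labels. The function is written in plain Python so it can be tried on a tiny hand-made example without training anything.

```python
def compute_metrics(eval_pred):
    """Return accuracy given (logits, labels) for a classification model."""
    logits, labels = eval_pred
    correct = 0
    total = 0
    for row, label in zip(logits, labels):
        scores = list(row)
        predicted = scores.index(max(scores))  # class with the highest score
        correct += int(predicted == label)
        total += 1
    return {"accuracy": correct / total if total else 0.0}


# Tiny hand-made example: three reviews, two classes.
example_logits = [[2.0, -1.0], [0.1, 0.9], [1.5, 0.3]]
example_labels = [0, 1, 1]
print(compute_metrics((example_logits, example_labels)))

# Usage sketch (not run here): pass the function when building the Trainer,
# then train and evaluate.
# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=tokenized_datasets["train"],
#                   eval_dataset=tokenized_datasets["test"],
#                   compute_metrics=compute_metrics)
# trainer.train()
# print(trainer.evaluate())
```

In the toy example two of the three predictions match their labels, so the reported accuracy is two thirds.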



---

## ```Resources```

[Hugging Face Trainers documentation index](https://huggingface.co/docs/transformers/main_classes/trainer)

[Hugging Face DistilBertForSequenceClassification documentation](https://huggingface.co/docs/transformers/model_doc/distilbert#transformers.DistilBertForSequenceClassification)

[Hugging Face DistilBertTokenizer documentation](https://huggingface.co/docs/transformers/model_doc/distilbert#transformers.DistilBertTokenizer)

[distilbert-base-uncased Model documentation on Hugging Face](https://huggingface.co/distilbert-base-uncased)

[Hugging Face transformers.TrainingArguments documentation](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.TrainingArguments)

[Hugging Face transformers.Trainer documentation](https://huggingface.co/docs/transformers/main/en/main_classes/trainer#transformers.Trainer)