
In this lecture, we will learn how to load and evaluate our LLM before pruning it. We will use the Transformers and Hugging Face Datasets libraries to load the model and the dataset.

Transformers is a library that provides pretrained models and tools for natural language processing. We will use it to load the LLM that we fine-tuned for sentiment analysis, the task of identifying the opinion or emotion expressed in a text, such as positive, negative, or neutral.

Hugging Face Datasets is a library that provides easy access to many datasets for natural language processing. We will use it to load the dataset that we used for fine-tuning and evaluating our LLM: SST-2 (Stanford Sentiment Treebank 2), which consists of movie review sentences labeled with binary sentiment polarity (positive or negative).

We will use these libraries to establish a baseline for our LLM before pruning. We will measure its performance on the validation split of SST-2, using accuracy as the metric. The test split of SST-2 that GLUE distributes has its labels hidden, so accuracy cannot be computed on it. Accuracy is the ratio of correctly predicted labels to the total number of labels.
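
To make the definition of accuracy concrete, here is a minimal sketch in plain Python. The helper `accuracy` and the toy predictions and labels are made up for illustration; they are not SST-2 data and not part of the Transformers API.

```python
def accuracy(predictions, labels):
    """Return the fraction of predictions that match the reference labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy example: 1 = positive, 0 = negative (hypothetical values)
toy_predictions = [1, 0, 1, 1, 0]
toy_labels = [1, 0, 0, 1, 0]
toy_accuracy = accuracy(toy_predictions, toy_labels)
print(toy_accuracy)  # 4 of the 5 predictions match the labels
```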

We will also measure the size of our LLM before pruning by counting its parameters. Parameters are the learned numerical values, such as weights and biases, that define the behavior of the model. The number of parameters indicates the size and capacity of the model.
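
As a small worked example of how weights and biases add up, the sketch below counts the parameters of a single fully connected layer. The helper `linear_layer_params` is hypothetical, and the layer shape (768 inputs, 2 outputs) is an assumed classification head chosen only for illustration.

```python
def linear_layer_params(in_features, out_features, bias=True):
    """Count the parameters of a fully connected layer."""
    weights = in_features * out_features   # one weight per input/output pair
    biases = out_features if bias else 0   # one bias per output unit
    return weights + biases


# Hypothetical head: 768 hidden units mapped to 2 sentiment classes
head_params = linear_layer_params(768, 2)
print(head_params)
```

A full model is a stack of many such layers, so its total parameter count is the sum of these per-layer counts.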

To load and evaluate our LLM before pruning, we follow these steps:

Import the Transformers library and the Hugging Face Datasets library. We have already done this in the previous lecture by running the following code:

In [None]:
# Import Transformers library
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer, TrainingArguments

# Import Hugging Face Datasets library
from datasets import load_dataset

Load our fine-tuned LLM and its tokenizer from our output directory. We have already done this in the previous lecture by running the following code:

In [None]:
# Load the fine-tuned model for sequence classification
model = AutoModelForSequenceClassification.from_pretrained('output')

# Load fine-tuned tokenizer
tokenizer = AutoTokenizer.from_pretrained('output')

Load our SST-2 dataset from the Hugging Face Datasets library. We have already done this in the previous lecture by running the following code:

In [None]:
# Load SST-2 dataset
dataset = load_dataset('glue', 'sst2')

Preprocess our dataset by using our tokenizer to encode each sentence into fixed-length token IDs and attention masks that can be fed into our model. We have already done this in the previous lecture by running the following code:

In [None]:
# Define a function to preprocess the dataset
def preprocess(batch):
  # Tokenize the sentences, padding or truncating each one to 128 tokens.
  # Plain lists are returned; the Trainer converts them to tensors when batching.
  encoded = tokenizer(batch['sentence'], max_length=128, padding='max_length', truncation=True)
  # Copy the labels under the 'labels' key that the model expects
  encoded['labels'] = batch['label']
  # Return the encoded dictionary
  return encoded

# Apply the function to the dataset
dataset = dataset.map(preprocess, batched=True)
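
To illustrate what `padding='max_length'` and `truncation=True` do to each example, here is a simplified sketch in plain Python. The helper `pad_or_truncate`, the pad ID of 0, and the toy token IDs are assumptions for illustration. A real tokenizer also handles special tokens when truncating, which this sketch ignores.

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Return (input_ids, attention_mask), each exactly max_length long."""
    ids = list(token_ids[:max_length])   # cut off tokens beyond max_length
    mask = [1] * len(ids)                # 1 marks a real token
    padding = max_length - len(ids)
    ids += [pad_id] * padding            # fill the remainder with pad tokens
    mask += [0] * padding                # 0 marks padding the model should ignore
    return ids, mask


# Hypothetical token IDs for a short and a long sentence
short_ids, short_mask = pad_or_truncate([5, 6, 7], max_length=5)
long_ids, long_mask = pad_or_truncate([1, 2, 3, 4, 5, 6], max_length=4)
print(short_ids, short_mask)
print(long_ids, long_mask)
```

Because every example ends up with the same length, the examples can be stacked into rectangular batches.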

Define our training arguments by using the TrainingArguments class from the Transformers library. The training arguments specify various parameters and settings for training and evaluation, such as batch size, learning rate, number of epochs, logging steps, output directory, etc. We have already done this in the previous lecture by running the following code:

In [None]:
# Define training arguments
training_args = TrainingArguments(
  output_dir='output', # Output directory
  num_train_epochs=3, # Number of training epochs
  per_device_train_batch_size=32, # Batch size per device during training
  per_device_eval_batch_size=32, # Batch size for evaluation
  warmup_steps=500, # Number of warmup steps for learning rate scheduler
  weight_decay=0.01, # Strength of weight decay
  logging_dir='logs', # Directory for storing logs
  logging_steps=10, # Log every 10 update steps
  evaluation_strategy='steps', # Evaluate at regular step intervals (renamed eval_strategy in recent Transformers releases)
  eval_steps=50, # Evaluate every 50 update steps
)
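
As a rough illustration of what `warmup_steps=500` does, the sketch below re-implements a linear schedule with warmup in plain Python. It rests on several assumptions: the Trainer defaults of a linear schedule and a base learning rate of 5e-5, a single device with no gradient accumulation, and an SST-2 training split of 67,349 examples. The helper `linear_warmup_lr` is hypothetical and is not a Transformers function.

```python
import math


def linear_warmup_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate at a given step: linear ramp up, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


train_examples = 67_349   # assumed size of the SST-2 training split
batch_size = 32           # per_device_train_batch_size, assuming one device
epochs = 3                # num_train_epochs
steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs

for step in (0, 250, 500, total_steps // 2, total_steps):
    lr = linear_warmup_lr(step, base_lr=5e-5, warmup_steps=500, total_steps=total_steps)
    print(f"step {step:>5}: lr = {lr:.2e}")
```

Under these assumptions the learning rate climbs from zero to its peak over the first 500 updates and then decays linearly to zero at the final step.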

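Define our trainer by using the Trainer class from the Transformers library. The trainer provides a simple way to train and evaluate models. We pass it our model, our training arguments, our datasets, and a compute_metrics function that turns the model's predictions into an accuracy score; without such a function, the trainer reports only the loss and timing statistics. For evaluation we use the validation split, because the SST-2 test split from GLUE has no public labels. We can define our trainer by running the following code:

In [None]:
import numpy as np

# Compute accuracy from the logits and labels collected during evaluation
def compute_metrics(eval_pred):
  logits, labels = eval_pred
  return {'accuracy': float((np.argmax(logits, axis=-1) == labels).mean())}

# Define trainer
trainer = Trainer(
  model=model, # The model to train
  args=training_args, # Training arguments
  train_dataset=dataset['train'], # Training dataset
  eval_dataset=dataset['validation'], # Evaluation dataset (labeled validation split)
  compute_metrics=compute_metrics, # Adds eval_accuracy to the results
)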

Evaluate our model by calling the evaluate method of our trainer. This will evaluate our model on the evaluation dataset passed to the trainer and return a dictionary containing the evaluation loss, runtime statistics, and any metrics returned by the trainer's compute_metrics function, such as accuracy. We can evaluate our model by running the following code:

In [None]:
# Evaluate the model
results = trainer.evaluate()

Print the results with the print function, which displays the dictionary of metrics. We can print the results by running the following code:

In [None]:
# Print the results
print(results)
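
If the raw dictionary is hard to scan, a small formatting helper can print one metric per line. This is only a sketch: `format_metrics` is a hypothetical helper, and `example_results` holds illustrative values rather than real evaluation output.

```python
def format_metrics(metrics):
    """Format a metrics dictionary as one 'name: value' line per metric."""
    lines = []
    for name, value in sorted(metrics.items()):
        if name.endswith("accuracy"):
            lines.append(f"{name}: {value:.2%}")   # show accuracy as a percentage
        elif isinstance(value, float):
            lines.append(f"{name}: {value:.4f}")   # round other floats to 4 places
        else:
            lines.append(f"{name}: {value}")
    return "\n".join(lines)


# Illustrative values only, not real evaluation output
example_results = {"eval_loss": 0.2436, "eval_accuracy": 0.9211}
formatted = format_metrics(example_results)
print(formatted)
```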

Measure the size of our model by calling the numel method on each of its parameter tensors and summing the results. The numel method returns the number of elements in a tensor, so the sum over all parameter tensors is the total number of parameters in our model. We can measure the size of our model by running the following code:

In [None]:
# Measure the size of the model
size = sum(p.numel() for p in model.parameters())

Print the size with the print function, which displays the total parameter count. We can print the size by running the following code:

In [None]:
# Print the size
print(size)

If we run this code, we may get an output like this:

In [None]:
{'epoch': 3.0,
 'eval_accuracy': 0.9210526315789473,
 'eval_loss': 0.24363629519939423,
 'eval_runtime': 1.6719,
 'eval_samples_per_second': 358.726,
 'eval_steps_per_second': 29.894}

109483778

As you can see, our fine-tuned LLM achieved an accuracy of about 92% on the SST-2 validation set. However, it also has about 109 million parameters (109,483,778), which makes it costly to store and run.
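
To put the parameter count in perspective, the sketch below converts it into an approximate memory footprint for the weights alone. It assumes 4 bytes per parameter (float32) or 2 bytes (float16); the helper `model_size_mib` is hypothetical, and the estimate ignores activations, optimizer state, and file-format overhead.

```python
def model_size_mib(num_parameters, bytes_per_parameter=4):
    """Approximate size of the weights in MiB (1 MiB = 2**20 bytes)."""
    return num_parameters * bytes_per_parameter / 2**20


num_parameters = 109_483_778  # parameter count from the output above
size_fp32 = model_size_mib(num_parameters)                          # assumes float32 weights
size_fp16 = model_size_mib(num_parameters, bytes_per_parameter=2)   # assumes float16 weights
print(f"float32: {size_fp32:.1f} MiB")
print(f"float16: {size_fp16:.1f} MiB")
```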

In this lecture, we learned how to load and evaluate our LLM before pruning. We used the Transformers and Hugging Face Datasets libraries to load the model and the dataset, and we measured the model's accuracy and parameter count to establish a baseline.

In the next lecture, we will learn how to prune our LLM with PyTorch's pruning module (torch.nn.utils.prune). We will apply magnitude pruning, which removes the weights with the smallest absolute values. See you in the next lecture.