<a href="https://colab.research.google.com/github/FahadEbrahim/AdaptIRC/blob/main/NLBSE2024_AdaptIRC_withSave.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# How to run

AdaptIRC:


* This notebook implements the adapter approach (AdaptIRC) on the NLBSE 2024 Issue Report Classification task.

* To run the notebook in Colab, just change the environment to GPU through: Runtime >> Change runtime type >> Hardware Accelerator >> GPU.

* You may require WANDB token if using newer versions of transformers lib

# Install Dependencies

The code uses the adapters library, Huggingface datasets and accelerate. The adapters would install the transformers libraries
For reproducibility, the versions have been specified.


In [1]:
!pip install adapters==0.0.0.dev20231116
!pip install datasets==2.14.7
!pip install accelerate==0.24.1

Collecting adapters==0.0.0.dev20231116
  Downloading adapters-0.0.0.dev20231116-py3-none-any.whl (243 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m243.7/243.7 kB[0m [31m5.0 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting transformers==4.33.3 (from adapters==0.0.0.dev20231116)
  Downloading transformers-4.33.3-py3-none-any.whl (7.6 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m7.6/7.6 MB[0m [31m68.4 MB/s[0m eta [36m0:00:00[0m
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers==4.33.3->adapters==0.0.0.dev20231116)
  Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m7.8/7.8 MB[0m [31m115.8 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: tokenizers, transformers, adapters
  Attempting uninstall: tokenizers
    Found existing installation: tokenizers 0.15.0
    Uninstalling tokenizers-0.15.0:
      Succ

# Get the Dataset from Github

The code clones the Github repository provided by NLBSE'2024 (https://github.com/nlbse2024/issue-report-classification):



In [2]:
!git clone https://github.com/nlbse2024/issue-report-classification.git

Cloning into 'issue-report-classification'...
remote: Enumerating objects: 76, done.[K
remote: Counting objects: 100% (76/76), done.[K
remote: Compressing objects: 100% (59/59), done.[K
remote: Total 76 (delta 37), reused 39 (delta 14), pack-reused 0[K
Receiving objects: 100% (76/76), 1.59 MiB | 5.02 MiB/s, done.
Resolving deltas: 100% (37/37), done.


# Import Libraires

Here, we are importing libraries that would be used throughout the notebook. (Pandas, Json, OS, Sklearn, numpy, collections, transformers, adapters, random, torch, re [regular expression] ).

In [3]:
import pandas as pd
import json
import os
from collections import defaultdict
import numpy as np
import torch
import random
import re

from transformers import TextClassificationPipeline,AutoConfig
from transformers import TrainingArguments, EvalPrediction
from transformers import AutoTokenizer, DataCollatorWithPadding
from transformers import AutoConfig
from transformers import TrainingArguments, EvalPrediction
from transformers import set_seed
from transformers import EarlyStoppingCallback, IntervalStrategy,TrainerCallback

from adapters import AdapterTrainer,AutoAdapterModel
from datasets import Dataset
from sklearn.metrics import classification_report, accuracy_score,recall_score,f1_score,precision_score

# Setting Seed

These lines set the seed for reproducability for several libraries ( torch, random, numpy, transformers)

In [4]:
RANDOM_SEED = 42
set_seed(RANDOM_SEED)
torch.manual_seed(RANDOM_SEED)
random.seed(RANDOM_SEED)
np.random.seed(RANDOM_SEED)

# Dataset

Reading the dataset cloned from NLBSE Github repository:

In [5]:
train_set = pd.read_csv("/content/issue-report-classification/data/issues_train.csv")
test_set = pd.read_csv("/content/issue-report-classification/data/issues_test.csv")

# Dataset Processing

In [6]:
# There were some Nan values that causes some issues, so, they are replaced with a single space
train_set=train_set.fillna(' ')
test_set=test_set.fillna(' ')

This function is used to pre-process the issues with various steps (
  
  * removing strings between triple Quotes
  * Remove new lines
  * Remove Links
  * Remove digits
  * Remove special characters except the question mark
  * Remove multiple spaces


In [7]:
def preprocess(issues):
    processed_issues = []

    for issue in issues:

        # Remove strings between triple quotes
        issue = re.sub(r'```.*?```', ' ', issue, flags=re.DOTALL)

        # Remove new lines
        issue = re.sub(r'\n', ' ', issue)

        # Remove links
        issue = re.sub(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', ' ', issue)

        # Remove digits
        issue = re.sub(r'\d+', ' ', issue)

        # Remove special characters except the question marks
        issue = re.sub(r'[^a-zA-Z0-9?\s]', ' ', issue)
        issue = re.sub(r'\s+', ' ', issue)

        processed_issues.append(issue)

    return processed_issues

In [8]:
# Apply the pre-process function for both train and testing sets on both the title and body.
train_set['title'] = preprocess(train_set['title'])
train_set['body'] = preprocess(train_set['body'])

test_set['title'] = preprocess(train_set['title'])
test_set['body'] = preprocess(train_set['body'])

In [9]:
# This code is taken from NLBSE
# creating the dataset with grouping it via repositry (repo)

repos = list(set(train_set["repo"].unique()))

train_set.groupby(["repo", "label"]).size().unstack(fill_value=0)

# Combining the title and body for a new field called text.
def process_dataset(dataset):
    dataset['text'] = dataset['title'] + " " + str(dataset['body'])
    dataset = dataset[['text', 'label', 'repo']]
    return dataset

train_set = process_dataset(train_set)
test_set = process_dataset(test_set)

group_by_repo = lambda dataset: {
    repo: Dataset.from_pandas(dataset[dataset["repo"] == repo]).class_encode_column("label")
    for repo in dataset["repo"].unique()
}

train_sets = group_by_repo(train_set)
test_sets = group_by_repo(test_set)

datasets = {
    repo: {'train': train_sets[repo], 'test': test_sets[repo]} for repo in train_sets.keys()
}

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

# Model Configuration

Here is the new important code: Setting the configurations of the adapters and transformer model.

In [10]:
# Model name to be used is: Roberta-Base
model_name = "roberta-base"

# The tokenizer is based on Roberta. The configurations are: Max_length = 256, truncation = true, padding = max_length.
tokenizer = AutoTokenizer.from_pretrained(model_name,max_length=256,truncation=True, padding="max_length")

# Configuration: We have 3 labels: Bug, Enhancment, Question.
config = AutoConfig.from_pretrained(
    model_name,
    num_labels=3,
)

# Configuration of the Adapter model.
model = AutoAdapterModel.from_pretrained(
    model_name,
    config=config,
)

# Metrics used for evaluation (accuracy, precision, recall and F1)
def compute_accuracy(p: EvalPrediction):
  labels = p.label_ids
  preds = np.argmax(p.predictions, axis=1)
  accuracy = accuracy_score(y_true=labels, y_pred=preds)
  recall = recall_score(y_true=labels, y_pred=preds,average="weighted")
  precision = precision_score(y_true=labels, y_pred=preds,average="weighted")
  f1 = f1_score(y_true=labels, y_pred=preds,average="weighted")
  return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# The function used to tokenize the issues. The same settings
# Max_length = 256, truncation = true, padding = max_length.

def encode_batch(batch):
  return tokenizer(batch["text"], max_length=256, truncation=True, padding="max_length")

# Having a data Collator
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)


config.json:   0%|          | 0.00/481 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/899k [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/499M [00:00<?, ?B/s]

Some weights of RobertaAdapterModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias', 'heads.default.3.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


# Creating Training and Infering Adapters

In [11]:
# Clearing the GPU just in case
import torch
torch.cuda.empty_cache()

The training occurs at every repository:
* The train set is divided into: train and validate with the percentage of 30%.
* A classfication head is attached to the model defining the number of labels to be 3 and defining the labels.
* Initilaising the training of the Adapter
* Using Adapter Droput Trainer as the Callback.
* Configuring the adapter configuarion.
* Configure the trainer
* Adding the callback.
* Start training the adapter
* Evalauting the adapter

In [12]:
!mkdir Adapters

In [13]:
results = defaultdict(dict)

for repo in datasets.keys():

    # Extracting the training and testing sets from the dataset per repo
    train_set = datasets[repo]['train']
    test_set = datasets[repo]['test']

    # Tokenizing the training set
    train_set = train_set.map(encode_batch, batched=True)
    train_set.set_format(type="torch", columns=["input_ids", "attention_mask", "label"])

    # Creating a Validation set:
    train_val_set = train_set.train_test_split(test_size=0.3,seed = RANDOM_SEED)
    train_set = train_val_set['train']
    val_set = train_val_set['test']

    # Adapter Name and Saving Directory
    adapter_name = "Report_" + repo
    adapter_name = adapter_name.replace('/','_')
    saved_directoy = "./training_output/" + adapter_name + ""
    adapter_save_directoy = "Adapters/" + adapter_name + ""
    !mkdir {adapter_save_directoy}

    # Add an Adapter
    model.add_adapter(adapter_name,overwrite_ok=True)
    # Add a matching classification head

    # Add a Classification Head
    model.add_classification_head(
    adapter_name,
    num_labels=3,
    id2label={ 0: "bug", 1: "feature",2:"question"},overwrite_ok=True)

    # Initilaize the adapter training
    model.train_adapter(adapter_name)

    # Create an Adapter Callback
    class AdapterDropTrainerCallback(TrainerCallback):
        def on_step_begin(self, args, state, control, **kwargs):
          skip_layers = list(range(np.random.randint(0, 11)))
          kwargs['model'].set_active_adapters(adapter_name, skip_layers=skip_layers)

        def on_evaluate(self, args, state, control, **kwargs):
          kwargs['model'].set_active_adapters(adapter_name, skip_layers=None)

    # Configure Training arguements
    training_args = TrainingArguments(
    learning_rate=1e-4,
    num_train_epochs=200,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    logging_steps=200,
    output_dir=saved_directoy,
    overwrite_output_dir=True,
    remove_unused_columns=False,
    seed= RANDOM_SEED
    )

    # Configure the Adapter Trainer
    trainer = AdapterTrainer(
    model=model,
    args=training_args,
    train_dataset=train_set,
    eval_dataset=val_set,
    compute_metrics=compute_accuracy,
    data_collator=data_collator,)

    # Add the callback to the trainer
    trainer.add_callback(AdapterDropTrainerCallback())

    # Start training the adapter
    trainer.train()

    # Evaluating the Adapter and printing the results.
    eval_results = trainer.evaluate()
    print(eval_results)

    # Merging the Repo
    model.merge_adapter(adapter_name)

    # This part is for inferencing
    # A TextClassificationPipeline is being created with the same settings
    # max_length = 256, padding = 'max_length',truncation =True

    classifier = TextClassificationPipeline(model=model, tokenizer=tokenizer,
                                            device=training_args.device.index,
                                            max_length = 256, padding = 'max_length',truncation =True)
    y_pred = []

    # Looping through the testing set and getting the prediction labels.
    for i in range(len(test_set['text'])):
        text = test_set['text'][i]
        y = classifier(text)
        y = y[0]['label']
        label_id = model.config.label2id[y]
        y_pred.append(label_id)

    # Calcualting and Adding the metrics
    results[repo]['metrics'] = classification_report(test_set['label'], y_pred, digits=4, output_dict=True)
    results[repo]['predictions'] = y_pred
    results['label_mapping'] = {train_set.features["label"].int2str(x): x for x in range(train_set.features["label"].num_classes)}


    # Saving the Adapter
    model.save_adapter(adapter_save_directoy,adapter_name)

    # Clearing the adapter and CH for every repo.
    model.delete_adapter(adapter_name)
    model.delete_head(adapter_name)

Map:   0%|          | 0/300 [00:00<?, ? examples/s]

You're using a RobertaTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


Step,Training Loss
200,0.7805
400,0.3005
600,0.13
800,0.0735
1000,0.0535
1200,0.0496
1400,0.0465


The model 'RobertaAdapterModel' is not supported for . Supported models are ['AlbertForSequenceClassification', 'BartForSequenceClassification', 'BertForSequenceClassification', 'BigBirdForSequenceClassification', 'BigBirdPegasusForSequenceClassification', 'BioGptForSequenceClassification', 'BloomForSequenceClassification', 'CamembertForSequenceClassification', 'CanineForSequenceClassification', 'LlamaForSequenceClassification', 'ConvBertForSequenceClassification', 'CTRLForSequenceClassification', 'Data2VecTextForSequenceClassification', 'DebertaForSequenceClassification', 'DebertaV2ForSequenceClassification', 'DistilBertForSequenceClassification', 'ElectraForSequenceClassification', 'ErnieForSequenceClassification', 'ErnieMForSequenceClassification', 'EsmForSequenceClassification', 'FalconForSequenceClassification', 'FlaubertForSequenceClassification', 'FNetForSequenceClassification', 'FunnelForSequenceClassification', 'GPT2ForSequenceClassification', 'GPT2ForSequenceClassification', 

{'eval_loss': 2.1851906776428223, 'eval_accuracy': 0.7555555555555555, 'eval_precision': 0.7673099415204679, 'eval_recall': 0.7555555555555555, 'eval_f1': 0.7499905276120111, 'eval_runtime': 1.388, 'eval_samples_per_second': 64.841, 'eval_steps_per_second': 2.161, 'epoch': 200.0}




Map:   0%|          | 0/300 [00:00<?, ? examples/s]



Step,Training Loss
200,1.0498
400,0.546
600,0.2434
800,0.1496
1000,0.1266
1200,0.0944
1400,0.0825


The model 'RobertaAdapterModel' is not supported for . Supported models are ['AlbertForSequenceClassification', 'BartForSequenceClassification', 'BertForSequenceClassification', 'BigBirdForSequenceClassification', 'BigBirdPegasusForSequenceClassification', 'BioGptForSequenceClassification', 'BloomForSequenceClassification', 'CamembertForSequenceClassification', 'CanineForSequenceClassification', 'LlamaForSequenceClassification', 'ConvBertForSequenceClassification', 'CTRLForSequenceClassification', 'Data2VecTextForSequenceClassification', 'DebertaForSequenceClassification', 'DebertaV2ForSequenceClassification', 'DistilBertForSequenceClassification', 'ElectraForSequenceClassification', 'ErnieForSequenceClassification', 'ErnieMForSequenceClassification', 'EsmForSequenceClassification', 'FalconForSequenceClassification', 'FlaubertForSequenceClassification', 'FNetForSequenceClassification', 'FunnelForSequenceClassification', 'GPT2ForSequenceClassification', 'GPT2ForSequenceClassification', 

{'eval_loss': 4.313364028930664, 'eval_accuracy': 0.4666666666666667, 'eval_precision': 0.5063881368229194, 'eval_recall': 0.4666666666666667, 'eval_f1': 0.47416866658245965, 'eval_runtime': 1.3852, 'eval_samples_per_second': 64.974, 'eval_steps_per_second': 2.166, 'epoch': 200.0}




Map:   0%|          | 0/300 [00:00<?, ? examples/s]



Step,Training Loss
200,1.0845
400,0.4938
600,0.1508
800,0.0835
1000,0.0653
1200,0.0553
1400,0.0538


The model 'RobertaAdapterModel' is not supported for . Supported models are ['AlbertForSequenceClassification', 'BartForSequenceClassification', 'BertForSequenceClassification', 'BigBirdForSequenceClassification', 'BigBirdPegasusForSequenceClassification', 'BioGptForSequenceClassification', 'BloomForSequenceClassification', 'CamembertForSequenceClassification', 'CanineForSequenceClassification', 'LlamaForSequenceClassification', 'ConvBertForSequenceClassification', 'CTRLForSequenceClassification', 'Data2VecTextForSequenceClassification', 'DebertaForSequenceClassification', 'DebertaV2ForSequenceClassification', 'DistilBertForSequenceClassification', 'ElectraForSequenceClassification', 'ErnieForSequenceClassification', 'ErnieMForSequenceClassification', 'EsmForSequenceClassification', 'FalconForSequenceClassification', 'FlaubertForSequenceClassification', 'FNetForSequenceClassification', 'FunnelForSequenceClassification', 'GPT2ForSequenceClassification', 'GPT2ForSequenceClassification', 

{'eval_loss': 2.293001413345337, 'eval_accuracy': 0.7222222222222222, 'eval_precision': 0.7277925084175084, 'eval_recall': 0.7222222222222222, 'eval_f1': 0.7245942821573075, 'eval_runtime': 1.3868, 'eval_samples_per_second': 64.898, 'eval_steps_per_second': 2.163, 'epoch': 200.0}




Map:   0%|          | 0/300 [00:00<?, ? examples/s]



Step,Training Loss
200,1.0949
400,0.6861
600,0.2649
800,0.1417
1000,0.1123
1200,0.0932
1400,0.0801


The model 'RobertaAdapterModel' is not supported for . Supported models are ['AlbertForSequenceClassification', 'BartForSequenceClassification', 'BertForSequenceClassification', 'BigBirdForSequenceClassification', 'BigBirdPegasusForSequenceClassification', 'BioGptForSequenceClassification', 'BloomForSequenceClassification', 'CamembertForSequenceClassification', 'CanineForSequenceClassification', 'LlamaForSequenceClassification', 'ConvBertForSequenceClassification', 'CTRLForSequenceClassification', 'Data2VecTextForSequenceClassification', 'DebertaForSequenceClassification', 'DebertaV2ForSequenceClassification', 'DistilBertForSequenceClassification', 'ElectraForSequenceClassification', 'ErnieForSequenceClassification', 'ErnieMForSequenceClassification', 'EsmForSequenceClassification', 'FalconForSequenceClassification', 'FlaubertForSequenceClassification', 'FNetForSequenceClassification', 'FunnelForSequenceClassification', 'GPT2ForSequenceClassification', 'GPT2ForSequenceClassification', 

{'eval_loss': 2.821831226348877, 'eval_accuracy': 0.6333333333333333, 'eval_precision': 0.6575112264656203, 'eval_recall': 0.6333333333333333, 'eval_f1': 0.6410139302864712, 'eval_runtime': 1.3797, 'eval_samples_per_second': 65.231, 'eval_steps_per_second': 2.174, 'epoch': 200.0}




Map:   0%|          | 0/300 [00:00<?, ? examples/s]



Step,Training Loss
200,1.0902
400,0.4742
600,0.1435
800,0.0865
1000,0.0746
1200,0.0623
1400,0.06


The model 'RobertaAdapterModel' is not supported for . Supported models are ['AlbertForSequenceClassification', 'BartForSequenceClassification', 'BertForSequenceClassification', 'BigBirdForSequenceClassification', 'BigBirdPegasusForSequenceClassification', 'BioGptForSequenceClassification', 'BloomForSequenceClassification', 'CamembertForSequenceClassification', 'CanineForSequenceClassification', 'LlamaForSequenceClassification', 'ConvBertForSequenceClassification', 'CTRLForSequenceClassification', 'Data2VecTextForSequenceClassification', 'DebertaForSequenceClassification', 'DebertaV2ForSequenceClassification', 'DistilBertForSequenceClassification', 'ElectraForSequenceClassification', 'ErnieForSequenceClassification', 'ErnieMForSequenceClassification', 'EsmForSequenceClassification', 'FalconForSequenceClassification', 'FlaubertForSequenceClassification', 'FNetForSequenceClassification', 'FunnelForSequenceClassification', 'GPT2ForSequenceClassification', 'GPT2ForSequenceClassification', 

{'eval_loss': 3.3290584087371826, 'eval_accuracy': 0.6666666666666666, 'eval_precision': 0.6721200980392157, 'eval_recall': 0.6666666666666666, 'eval_f1': 0.6683352468427095, 'eval_runtime': 1.3842, 'eval_samples_per_second': 65.018, 'eval_steps_per_second': 2.167, 'epoch': 200.0}




# Metrics

This has been taken from the NLBSE repo, so, they are not commented.

In [14]:
print(results['label_mapping'])
for repo in repos:
    print(repo)
    print(json.dumps(results[repo]['metrics'], indent=4))

{'bug': 0, 'feature': 1, 'question': 2}
tensorflow/tensorflow
{
    "0": {
        "precision": 0.9204545454545454,
        "recall": 0.81,
        "f1-score": 0.8617021276595745,
        "support": 100
    },
    "1": {
        "precision": 0.8173076923076923,
        "recall": 0.85,
        "f1-score": 0.8333333333333334,
        "support": 100
    },
    "2": {
        "precision": 0.7962962962962963,
        "recall": 0.86,
        "f1-score": 0.826923076923077,
        "support": 100
    },
    "accuracy": 0.84,
    "macro avg": {
        "precision": 0.8446861780195114,
        "recall": 0.84,
        "f1-score": 0.840652845971995,
        "support": 300
    },
    "weighted avg": {
        "precision": 0.8446861780195114,
        "recall": 0.84,
        "f1-score": 0.840652845971995,
        "support": 300
    }
}
bitcoin/bitcoin
{
    "0": {
        "precision": 0.8958333333333334,
        "recall": 0.86,
        "f1-score": 0.8775510204081632,
        "support": 100
    },
   

In [15]:
class_metrics_sum = defaultdict(defaultdict)
labels = [key for key in results[repos[0]]['metrics'].keys() if key.isnumeric()]

for repo in repos:
    for label in labels:
        for metric in results[repo]['metrics'][label]:
            class_metrics_sum[label][metric] = class_metrics_sum[label].get(metric, 0) + results[repo]['metrics'][label][metric]

class_metrics_avg = {
    label: {
        metric: class_metrics_sum[label][metric] / len(repos)
        for metric in class_metrics_sum[label]
    }
    for label in labels
}

# add the average of the metric over all classes
class_metrics_avg['average'] = {
    metric: sum(class_metrics_avg[label][metric] for label in labels)
    / len(labels)
    for metric in class_metrics_avg[labels[0]]
}

# add to the results
results['overall'] = {
    'metrics': class_metrics_avg
}

In [16]:
results['overall']

{'metrics': {'0': {'precision': 0.897729890710135,
   'recall': 0.89,
   'f1-score': 0.8927976761371106,
   'support': 100.0},
  '1': {'precision': 0.8974590826334314,
   'recall': 0.9019999999999999,
   'f1-score': 0.8995401450835004,
   'support': 100.0},
  '2': {'precision': 0.8897309145880575,
   'recall': 0.8879999999999999,
   'f1-score': 0.8879947734479187,
   'support': 100.0},
  'average': {'precision': 0.894973295977208,
   'recall': 0.8933333333333332,
   'f1-score': 0.8934441982228432,
   'support': 100.0}}}

In [17]:
#The output json file would be created containing the results.
!mkdir Final_Results
OUTPUT_PATH = "Final_Results"
output_file_name = 'results.json'
with open(os.path.join(OUTPUT_PATH, output_file_name), 'w') as fp:
    json.dump(results, fp)

# References & Ack

This notebook uses codes from:
* https://github.com/adapter-hub/adapters/blob/main/notebooks/01_Adapter_Training.ipynb
* https://github.com/adapter-hub/adapters/blob/main/notebooks/05_Adapter_Drop_Training.ipynb
* https://huggingface.co/docs/transformers/tasks/sequence_classification
* https://github.com/nlbse2024/issue-report-classification/blob/main/2-Template-SetFit.ipynb
