# How to run

* This notebook implements the adapter approach (AdaptIRC) without Splitting the training data for evaluation and testing

* To run the notebook in Colab, just change the environment to GPU through: Runtime >> Change runtime type >> Hardware Accelerator >> GPU.

* You may require WANDB token if using newer versions of transformers lib

# Install Dependencies

In [1]:
!pip install -Uq adapters
!pip install -q datasets
!pip install -Uq accelerate
!pip install scikit-learn wandb pynvml

[0mCollecting scikit-learn
  Downloading scikit_learn-1.4.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Collecting wandb
  Downloading wandb-0.17.0-py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (10 kB)
Collecting pynvml
  Downloading pynvml-11.5.0-py3-none-any.whl.metadata (7.8 kB)
Collecting scipy>=1.6.0 (from scikit-learn)
  Downloading scipy-1.13.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m60.6/60.6 kB[0m [31m855.8 kB/s[0m eta [36m0:00:00[0m [36m0:00:01[0m
[?25hCollecting joblib>=1.2.0 (from scikit-learn)
  Downloading joblib-1.4.2-py3-none-any.whl.metadata (5.4 kB)
Collecting threadpoolctl>=2.0.0 (from scikit-learn)
  Downloading threadpoolctl-3.5.0-py3-none-any.whl.metadata (13 kB)
Collecting click!=8.0.0,>=7.1 (from wandb)
  Downloading click-8.1.7-py3-none-any.whl.metadata (3.0 kB)
Coll

# Import Libraires

Here, we are importing libraries that would be used throughout the notebook. (Pandas, Json, OS, Sklearn, numpy, collections, transformers, adapters, random, torch, re [regular expression] ).

In [2]:
from collections import defaultdict

from transformers import TrainingArguments, EvalPrediction, TrainerCallback, DataCollatorWithPadding

from sklearn.metrics import classification_report, recall_score, f1_score, precision_score

# Setting Seed

These lines set the seed for reproducability for several libraries ( torch, random, numpy, transformers)

In [3]:
import torch
import random
from transformers import set_seed
import numpy as np
from pynvml import *

# Selecting a random seed
RANDOM_SEED = 42

# Setting seed for reproducability
set_seed(RANDOM_SEED)
torch.manual_seed(RANDOM_SEED)
random.seed(RANDOM_SEED)
np.random.seed(RANDOM_SEED)

# Dataset

Reading the dataset cloned from NLBSE Github repository:

In [4]:
import pandas as pd

# Reading the CSV files from the NLBSE github
train_set = pd.read_csv("https://raw.githubusercontent.com/nlbse2024/issue-report-classification/main/data/issues_train.csv")
test_set = pd.read_csv("https://raw.githubusercontent.com/nlbse2024/issue-report-classification/main/data/issues_test.csv")

# Dataset Processing

In [5]:
# There were some Nan values that causes some issues, so, they are replaced with a single space
train_set=train_set.fillna(' ')
test_set=test_set.fillna(' ')

This function is used to pre-process the issues with various steps (
  
  * removing strings between triple Quotes
  * Remove new lines
  * Remove Links
  * Remove digits
  * Remove special characters except the question mark
  * Remove multiple spaces


In [6]:
import re

def preprocess(issues):
    processed_issues = []

    for issue in issues:

        # Remove strings between triple quotes
        issue = re.sub(r'```.*?```', ' ', issue, flags=re.DOTALL)

        # Remove new lines
        issue = re.sub(r'\n', ' ', issue)

        # Remove links
        issue = re.sub(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', ' ', issue)

        # Remove digits
        issue = re.sub(r'\d+', ' ', issue)

        # Remove special characters except the question marks
        issue = re.sub(r'[^a-zA-Z0-9?\s]', ' ', issue)
        issue = re.sub(r'\s+', ' ', issue)

        processed_issues.append(issue)

    return processed_issues

In [7]:
# Apply the pre-process function for both train and testing sets on both the title and body.
train_set['title'] = preprocess(train_set['title'])
train_set['body'] = preprocess(train_set['body'])

test_set['title'] = preprocess(train_set['title'])
test_set['body'] = preprocess(train_set['body'])

In [8]:
# This code is taken from NLBSE
# creating the dataset with grouping it via repositry (repo)

from datasets import Dataset

repos = list(set(train_set["repo"].unique()))

train_set.groupby(["repo", "label"]).size().unstack(fill_value=0)

# Combining the title and body for a new field called text.
def process_dataset(dataset):
    dataset['text'] = dataset['title'] + " " + str(dataset['body'])
    dataset = dataset[['text', 'label', 'repo']]
    return dataset

train_set = process_dataset(train_set)
test_set = process_dataset(test_set)

group_by_repo = lambda dataset: {
    repo: Dataset.from_pandas(dataset[dataset["repo"] == repo]).class_encode_column("label")
    for repo in dataset["repo"].unique()
}

train_sets = group_by_repo(train_set)
test_sets = group_by_repo(test_set)

datasets = {
    repo: {'train': train_sets[repo], 'test': test_sets[repo]} for repo in train_sets.keys()
}

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

Casting to class labels:   0%|          | 0/300 [00:00<?, ? examples/s]

# Model Configuration

Here is the new important code: Setting the configurations of the adapters and transformer model.

In [9]:
from transformers import RobertaTokenizer, RobertaConfig, TextClassificationPipeline
from adapters import RobertaAdapterModel

def create_model(model_name="roberta-base", max_length=256, truncation=True, padding="max_length", device="cuda"):
  # The tokenizer is based on Roberta. The configurations are: Max_length = 256, truncation = true, padding = max_length.
  tokenizer = RobertaTokenizer.from_pretrained(model_name, device=device, max_length=max_length, truncation=truncation, padding=padding)

  # Configuration: We have 3 labels: Bug, Enhancment, Question.
  config = RobertaConfig.from_pretrained(model_name, device=device, num_labels=3)

  # Configuration of the Adapter model.
  model = RobertaAdapterModel.from_pretrained(model_name, config=config)

  # This part is for inferencing
  classifier = TextClassificationPipeline(model=model, tokenizer=tokenizer, device=device, max_length=max_length, padding=padding, truncation=truncation)

  return tokenizer, model, classifier

# Creating Training and Infering Adapters

The training occurs at every repository:
* A classfication head is attached to the model defining the number of labels to be 3 and defining the labels.
* Initilaising the training of the Adapter
* Using Adapter Droput Trainer as the Callback.
* Configuring the adapter configuarion.
* Configure the trainer
* Adding the callback.
* Start training the adapter
* Evalauting the adapter

In [10]:
def show_gpu():
    nvmlInit()
    device_count = nvmlDeviceGetCount()
    total_memory = 0
    total_used = 0

    for i in range(device_count):
        handle = nvmlDeviceGetHandleByIndex(i)
        info = nvmlDeviceGetMemoryInfo(handle)

        total_memory += (info.total // 1048576)
        total_used += (info.used // 1048576)

    gpu_benchmark = {
    "name": nvmlDeviceGetName(nvmlDeviceGetHandleByIndex(0)),
    "num:": device_count ,
    "total_memory": total_memory,
    "total_used": total_used}

    nvmlShutdown()

    return gpu_benchmark

In [11]:
from adapters import AdapterTrainer
import wandb
from datetime import datetime
import time


group = datetime.utcnow().replace(microsecond=0).isoformat()

references = {}
predictions = {}

# Parameter used for Training
learning_rate=1e-4
epochs=200
batch_size=32

for repo in datasets.keys():
  wandb.init(
    project="IRC RoBERTa Adapters",
    group=group,
    name=repo,   )

  dataset = datasets[repo]
  tokenizer, model, classifier = create_model()

  # Training Data
  train_set = dataset['train']

  id2label = {x: train_set.features["label"].int2str(x) for x in range(train_set.features["label"].num_classes)}

  # Tokenizing the training set
  train_set = train_set.shuffle(seed=RANDOM_SEED)
  train_set = train_set.map(lambda batch: tokenizer(batch["text"]), batched=True)
  train_set.set_format(type="torch", columns=["input_ids", "attention_mask", "label"])

  # test_set = dataset['test']
  # test_set = test_set.shuffle(seed=RANDOM_SEED)
  # test_set = test_set.map(lambda batch: tokenizer(batch["text"]), batched=True)
  # test_set.set_format(type="torch", columns=["input_ids", "attention_mask", "label"])

  # Adapter Name and Saving Direc
  adapter_name = f"irc-{repo.replace('/','-')}"

  # Add a new adapter
  model.add_adapter(adapter_name, overwrite_ok=True)

  # Add a matching classification head
  model.add_classification_head(
    adapter_name,
    num_labels=3,
    id2label=id2label,
    overwrite_ok=True
  )

  # Initilaize the adapter training
  model.train_adapter(adapter_name)

  # Metrics used for evaluation (accuracy, precision, recall and F1)
  def compute_metrics(p: EvalPrediction):
    labels = p.label_ids
    preds = np.argmax(p.predictions, axis=1)
    recall = recall_score(y_true=labels, y_pred=preds,average="weighted")
    precision = precision_score(y_true=labels, y_pred=preds,average="weighted")
    f1 = f1_score(y_true=labels, y_pred=preds,average="weighted")
    return {"precision": precision, "recall": recall, "f1": f1}

  # Configure Training arguements
  training_args = TrainingArguments(
    report_to="wandb",
    run_name=repo,
    learning_rate=learning_rate,
    num_train_epochs=epochs,
    per_device_train_batch_size=batch_size,
    logging_steps=100,
    output_dir=f"training_output/{adapter_name}",
    overwrite_output_dir=True,
    remove_unused_columns=False,
    save_strategy="no",
    evaluation_strategy="no",
    seed=RANDOM_SEED
  )

  # Having a data Collator
  data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

  # Configure the Adapter Trainer
  trainer = AdapterTrainer(
    model=model,
    args=training_args,
    #eval_dataset=test_set,
    train_dataset=train_set,
    compute_metrics=compute_metrics,
    data_collator=data_collator
  )

  # Create an Adapter Callback
  class AdapterDropTrainerCallback(TrainerCallback):
      def on_step_begin(self, args, state, control, **kwargs):
        skip_layers = list(range(np.random.randint(0, 11)))
        kwargs['model'].set_active_adapters(adapter_name, skip_layers=skip_layers)

      def on_evaluate(self, args, state, control, **kwargs):
        kwargs['model'].set_active_adapters(adapter_name, skip_layers=None)

  # Add the callback to the trainer
  trainer.add_callback(AdapterDropTrainerCallback())

  is_show_train_gpu = True;

  # Start training the adapter
  train_output = trainer.train()
  print(train_output)

  #evaluation = trainer.evaluate()
  #display(evaluation)

  gpu_benchmark = show_gpu()
  is_show_train_gpu = False
  print(gpu_benchmark)

  # Save the adapter
  model.save_adapter(f"training_output/{adapter_name}", adapter_name)

  # Merging the Repo
  model.merge_adapter(adapter_name)

  test_set = dataset['test']

  # Calcualting and Adding the metrics
  references[repo] = [model.config.id2label[id] for id in test_set['label']]

  start = time.time()
  predictions[repo] = [prediction['label'] for prediction in classifier(test_set['text'])]
  end = time.time()
  print ("Prediction Time")
  print(end-start)

  print(model.adapter_summary())

  # If you want to save the adapters to HF.
  # model.push_adapter_to_hub(
  #   adapter_name,
  #   adapter_name,
  #   adapterhub_tag="IRC_Adapters",
  #   datasets_tag="IRC")

  wandb.finish()


[34m[1mwandb[0m: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
[34m[1mwandb[0m: You can find your API key in your browser here: https://wandb.ai/authorize
[34m[1mwandb[0m: Paste an API key from your profile and hit enter, or press ctrl+c to quit:

  ········


[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc




tokenizer_config.json:   0%|          | 0.00/25.0 [00:00<?, ?B/s]

vocab.json:   0%|          | 0.00/899k [00:00<?, ?B/s]

merges.txt:   0%|          | 0.00/456k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.36M [00:00<?, ?B/s]

config.json:   0%|          | 0.00/481 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/499M [00:00<?, ?B/s]

Some weights of RobertaAdapterModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['heads.default.3.bias', 'roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
The model 'RobertaAdapterModel' is not supported for . Supported models are ['AlbertForSequenceClassification', 'BartForSequenceClassification', 'BertForSequenceClassification', 'BigBirdForSequenceClassification', 'BigBirdPegasusForSequenceClassification', 'BioGptForSequenceClassification', 'BloomForSequenceClassification', 'CamembertForSequenceClassification', 'CanineForSequenceClassification', 'LlamaForSequenceClassification', 'ConvBertForSequenceClassification', 'CTRLForSequenceClassification', 'Data2VecTextForSequenceClassification', 'DebertaForSequenceClassification', 'DebertaV2ForSequenceClassification', 'DistilBertForSequenceClassification', 'ElectraForSequenceCl

Map:   0%|          | 0/300 [00:00<?, ? examples/s]

dataloader_config = DataLoaderConfiguration(dispatch_batches=None, split_batches=False, even_batches=True, use_seedable_sampler=True)


Step,Training Loss
100,1.0894
200,0.6463
300,0.4607
400,0.3229
500,0.2337
600,0.1744
700,0.1443
800,0.1035
900,0.1021
1000,0.0726


TrainOutput(global_step=2000, training_loss=0.19330977535247804, metrics={'train_runtime': 220.1753, 'train_samples_per_second': 272.51, 'train_steps_per_second': 9.084, 'total_flos': 8965355621435424.0, 'train_loss': 0.19330977535247804, 'epoch': 200.0})
{'name': 'NVIDIA GeForce RTX 4090', 'num:': 1, 'total_memory': 24564, 'total_used': 10304}
Prediction Time
2.0355935096740723
Name                     Architecture         #Param      %Param  Active   Train
--------------------------------------------------------------------------------
irc-facebook-react       bottleneck          894,528       0.718       1       1
--------------------------------------------------------------------------------
Full model                               124,645,632     100.000               0


VBox(children=(Label(value='0.004 MB of 0.004 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
train/epoch,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇███
train/global_step,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇███
train/grad_norm,▂▃▃▅▃█▄▅▁▆▁▁▁▁▂▁▂▁▁▁
train/learning_rate,██▇▇▇▆▆▅▅▅▄▄▄▃▃▂▂▂▁▁
train/loss,█▅▄▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁

0,1
total_flos,8965355621435424.0
train/epoch,200.0
train/global_step,2000.0
train/grad_norm,0.00108
train/learning_rate,0.0
train/loss,0.0331
train_loss,0.19331
train_runtime,220.1753
train_samples_per_second,272.51
train_steps_per_second,9.084


[34m[1mwandb[0m: Currently logged in as: [33mfahad-ebrahim[0m. Use [1m`wandb login --relogin`[0m to force relogin


VBox(children=(Label(value='Waiting for wandb.init()...\r'), FloatProgress(value=0.011112186602420276, max=1.0…

Some weights of RobertaAdapterModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['heads.default.3.bias', 'roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
The model 'RobertaAdapterModel' is not supported for . Supported models are ['AlbertForSequenceClassification', 'BartForSequenceClassification', 'BertForSequenceClassification', 'BigBirdForSequenceClassification', 'BigBirdPegasusForSequenceClassification', 'BioGptForSequenceClassification', 'BloomForSequenceClassification', 'CamembertForSequenceClassification', 'CanineForSequenceClassification', 'LlamaForSequenceClassification', 'ConvBertForSequenceClassification', 'CTRLForSequenceClassification', 'Data2VecTextForSequenceClassification', 'DebertaForSequenceClassification', 'DebertaV2ForSequenceClassification', 'DistilBertForSequenceClassification', 'ElectraForSequenceCl

Map:   0%|          | 0/300 [00:00<?, ? examples/s]

dataloader_config = DataLoaderConfiguration(dispatch_batches=None, split_batches=False, even_batches=True, use_seedable_sampler=True)


Step,Training Loss
100,1.1017
200,1.0761
300,0.8372
400,0.7086
500,0.5493
600,0.3963
700,0.3153
800,0.2439
900,0.2265
1000,0.1703


TrainOutput(global_step=2000, training_loss=0.34420750665664673, metrics={'train_runtime': 231.0775, 'train_samples_per_second': 259.653, 'train_steps_per_second': 8.655, 'total_flos': 9324783175504320.0, 'train_loss': 0.34420750665664673, 'epoch': 200.0})
{'name': 'NVIDIA GeForce RTX 4090', 'num:': 1, 'total_memory': 24564, 'total_used': 10248}
Prediction Time
2.0244033336639404
Name                     Architecture         #Param      %Param  Active   Train
--------------------------------------------------------------------------------
irc-tensorflow-tensorflowbottleneck          894,528       0.718       1       1
--------------------------------------------------------------------------------
Full model                               124,645,632     100.000               0


VBox(children=(Label(value='0.004 MB of 0.004 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
train/epoch,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇███
train/global_step,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇███
train/grad_norm,▁▂▃▃▃▂▃▂▂▅▄█▇▁▁▁▁▁▁▁
train/learning_rate,██▇▇▇▆▆▅▅▅▄▄▄▃▃▂▂▂▁▁
train/loss,██▆▅▄▃▃▂▂▂▂▁▁▁▁▁▁▁▁▁

0,1
total_flos,9324783175504320.0
train/epoch,200.0
train/global_step,2000.0
train/grad_norm,1.65932
train/learning_rate,0.0
train/loss,0.1069
train_loss,0.34421
train_runtime,231.0775
train_samples_per_second,259.653
train_steps_per_second,8.655


VBox(children=(Label(value='Waiting for wandb.init()...\r'), FloatProgress(value=0.011112090034617318, max=1.0…

Some weights of RobertaAdapterModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['heads.default.3.bias', 'roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
The model 'RobertaAdapterModel' is not supported for . Supported models are ['AlbertForSequenceClassification', 'BartForSequenceClassification', 'BertForSequenceClassification', 'BigBirdForSequenceClassification', 'BigBirdPegasusForSequenceClassification', 'BioGptForSequenceClassification', 'BloomForSequenceClassification', 'CamembertForSequenceClassification', 'CanineForSequenceClassification', 'LlamaForSequenceClassification', 'ConvBertForSequenceClassification', 'CTRLForSequenceClassification', 'Data2VecTextForSequenceClassification', 'DebertaForSequenceClassification', 'DebertaV2ForSequenceClassification', 'DistilBertForSequenceClassification', 'ElectraForSequenceCl

Map:   0%|          | 0/300 [00:00<?, ? examples/s]

dataloader_config = DataLoaderConfiguration(dispatch_batches=None, split_batches=False, even_batches=True, use_seedable_sampler=True)


Step,Training Loss
100,1.1027
200,1.0806
300,0.7103
400,0.4917
500,0.3082
600,0.212
700,0.1729
800,0.1122
900,0.1279
1000,0.1054


TrainOutput(global_step=2000, training_loss=0.25574445617198943, metrics={'train_runtime': 219.1549, 'train_samples_per_second': 273.779, 'train_steps_per_second': 9.126, 'total_flos': 8929947319045920.0, 'train_loss': 0.25574445617198943, 'epoch': 200.0})
{'name': 'NVIDIA GeForce RTX 4090', 'num:': 1, 'total_memory': 24564, 'total_used': 9542}
Prediction Time
2.0362226963043213
Name                     Architecture         #Param      %Param  Active   Train
--------------------------------------------------------------------------------
irc-microsoft-vscode     bottleneck          894,528       0.718       1       1
--------------------------------------------------------------------------------
Full model                               124,645,632     100.000               0


VBox(children=(Label(value='0.004 MB of 0.004 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
train/epoch,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇███
train/global_step,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇███
train/grad_norm,▂▃▅█▃▁▅▁▂▃▁▁▂▁▅▁▁▁▁▁
train/learning_rate,██▇▇▇▆▆▅▅▅▄▄▄▃▃▂▂▂▁▁
train/loss,██▅▄▃▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁

0,1
total_flos,8929947319045920.0
train/epoch,200.0
train/global_step,2000.0
train/grad_norm,0.3269
train/learning_rate,0.0
train/loss,0.0571
train_loss,0.25574
train_runtime,219.1549
train_samples_per_second,273.779
train_steps_per_second,9.126


VBox(children=(Label(value='Waiting for wandb.init()...\r'), FloatProgress(value=0.011112057748768065, max=1.0…

Some weights of RobertaAdapterModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['heads.default.3.bias', 'roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
The model 'RobertaAdapterModel' is not supported for . Supported models are ['AlbertForSequenceClassification', 'BartForSequenceClassification', 'BertForSequenceClassification', 'BigBirdForSequenceClassification', 'BigBirdPegasusForSequenceClassification', 'BioGptForSequenceClassification', 'BloomForSequenceClassification', 'CamembertForSequenceClassification', 'CanineForSequenceClassification', 'LlamaForSequenceClassification', 'ConvBertForSequenceClassification', 'CTRLForSequenceClassification', 'Data2VecTextForSequenceClassification', 'DebertaForSequenceClassification', 'DebertaV2ForSequenceClassification', 'DistilBertForSequenceClassification', 'ElectraForSequenceCl

Map:   0%|          | 0/300 [00:00<?, ? examples/s]

dataloader_config = DataLoaderConfiguration(dispatch_batches=None, split_batches=False, even_batches=True, use_seedable_sampler=True)


Step,Training Loss
100,1.1028
200,1.0947
300,0.8981
400,0.6845
500,0.4768
600,0.3379
700,0.261
800,0.1931
900,0.1788
1000,0.1459


TrainOutput(global_step=2000, training_loss=0.3211271402835846, metrics={'train_runtime': 218.7149, 'train_samples_per_second': 274.33, 'train_steps_per_second': 9.144, 'total_flos': 8936296713584352.0, 'train_loss': 0.3211271402835846, 'epoch': 200.0})
{'name': 'NVIDIA GeForce RTX 4090', 'num:': 1, 'total_memory': 24564, 'total_used': 10466}
Prediction Time
2.0090291500091553
Name                     Architecture         #Param      %Param  Active   Train
--------------------------------------------------------------------------------
irc-bitcoin-bitcoin      bottleneck          894,528       0.718       1       1
--------------------------------------------------------------------------------
Full model                               124,645,632     100.000               0


VBox(children=(Label(value='0.004 MB of 0.004 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
train/epoch,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇███
train/global_step,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇███
train/grad_norm,▂▁▂▃▃▃▄▂▅▁▂█▂▁▆▁▁▂▁▁
train/learning_rate,██▇▇▇▆▆▅▅▅▄▄▄▃▃▂▂▂▁▁
train/loss,██▇▅▄▃▂▂▂▂▂▁▁▁▁▂▁▁▁▁

0,1
total_flos,8936296713584352.0
train/epoch,200.0
train/global_step,2000.0
train/grad_norm,0.04039
train/learning_rate,0.0
train/loss,0.0957
train_loss,0.32113
train_runtime,218.7149
train_samples_per_second,274.33
train_steps_per_second,9.144


VBox(children=(Label(value='Waiting for wandb.init()...\r'), FloatProgress(value=0.011112079603804483, max=1.0…

Some weights of RobertaAdapterModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['heads.default.3.bias', 'roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
The model 'RobertaAdapterModel' is not supported for . Supported models are ['AlbertForSequenceClassification', 'BartForSequenceClassification', 'BertForSequenceClassification', 'BigBirdForSequenceClassification', 'BigBirdPegasusForSequenceClassification', 'BioGptForSequenceClassification', 'BloomForSequenceClassification', 'CamembertForSequenceClassification', 'CanineForSequenceClassification', 'LlamaForSequenceClassification', 'ConvBertForSequenceClassification', 'CTRLForSequenceClassification', 'Data2VecTextForSequenceClassification', 'DebertaForSequenceClassification', 'DebertaV2ForSequenceClassification', 'DistilBertForSequenceClassification', 'ElectraForSequenceCl

Map:   0%|          | 0/300 [00:00<?, ? examples/s]

dataloader_config = DataLoaderConfiguration(dispatch_batches=None, split_batches=False, even_batches=True, use_seedable_sampler=True)


Step,Training Loss
100,1.1035
200,1.0894
300,0.7504
400,0.4709
500,0.2966
600,0.186
700,0.1491
800,0.1493
900,0.1178
1000,0.1032


TrainOutput(global_step=2000, training_loss=0.2561299571990967, metrics={'train_runtime': 229.3964, 'train_samples_per_second': 261.556, 'train_steps_per_second': 8.719, 'total_flos': 9252955386459072.0, 'train_loss': 0.2561299571990967, 'epoch': 200.0})
{'name': 'NVIDIA GeForce RTX 4090', 'num:': 1, 'total_memory': 24564, 'total_used': 13254}
Prediction Time
1.996664047241211
Name                     Architecture         #Param      %Param  Active   Train
--------------------------------------------------------------------------------
irc-opencv-opencv        bottleneck          894,528       0.718       1       1
--------------------------------------------------------------------------------
Full model                               124,645,632     100.000               0


VBox(children=(Label(value='0.004 MB of 0.004 MB uploaded\r'), FloatProgress(value=1.0, max=1.0)))

0,1
train/epoch,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇███
train/global_step,▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇███
train/grad_norm,▁▂▃▆▄▃██▃▃▁▁▁▁▁▂▁▂▁▁
train/learning_rate,██▇▇▇▆▆▅▅▅▄▄▄▃▃▂▂▂▁▁
train/loss,██▆▄▃▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁

0,1
total_flos,9252955386459072.0
train/epoch,200.0
train/global_step,2000.0
train/grad_norm,0.094
train/learning_rate,0.0
train/loss,0.0636
train_loss,0.25613
train_runtime,229.3964
train_samples_per_second,261.556
train_steps_per_second,8.719


# Metrics

In [12]:
# Setting the metrics and labels
metrics = ['precision', 'recall', 'f1-score']
labels = ['bug', 'feature', 'question']

In [13]:
# A function to get the metric results
def get_results (repos):
  results = defaultdict(dict)

  for repo in repos:
    results[repo] = classification_report(references[repo], predictions[repo], output_dict=True)
    results[repo]['average'] = results[repo]['weighted avg']
    results[repo] = {label: {metric: results[repo][label][metric] for metric in metrics} for label in labels + ['average']}

  results['overall'] = {label: {metric: np.mean([results[repo][label][metric] for repo in repos]) for metric in metrics} for label in labels + ['average']}

  return results

In [14]:
# A function to write to a json file
import json

def write_json_file (results):
  #The output json file would be created containing the results.
  output_file_name = 'results.json'
  with open(output_file_name, 'w') as fp:
    json.dump(results, fp, indent=2)

In [15]:
# A function to print the results

def print_results (results):

  print(f"Repository{' '*15}Label     Precision  Recall     F1")
  for repo in repos + ['overall']:
    print("-"*63)
    for label in labels + ['average']:
      out = f"{repo:<25}{label:<10}"
      for metric in metrics:
        out += f"{results[repo][label][metric]:<10.4f} "
      print(out)

In [16]:

# Call the function to get the results
results = get_results (repos)

# This function call create the Json file with the results
write_json_file(results)

# This function prints the results
print_results(results)

Repository               Label     Precision  Recall     F1
---------------------------------------------------------------
bitcoin/bitcoin          bug       1.0000     1.0000     1.0000     
bitcoin/bitcoin          feature   1.0000     1.0000     1.0000     
bitcoin/bitcoin          question  1.0000     1.0000     1.0000     
bitcoin/bitcoin          average   1.0000     1.0000     1.0000     
---------------------------------------------------------------
tensorflow/tensorflow    bug       1.0000     0.9800     0.9899     
tensorflow/tensorflow    feature   0.9901     1.0000     0.9950     
tensorflow/tensorflow    question  0.9901     1.0000     0.9950     
tensorflow/tensorflow    average   0.9934     0.9933     0.9933     
---------------------------------------------------------------
microsoft/vscode         bug       1.0000     1.0000     1.0000     
microsoft/vscode         feature   1.0000     1.0000     1.0000     
microsoft/vscode         question  1.0000     1.0000     1

In [None]:
print(references["tensorflow/tensorflow"])

In [None]:
print(predictions["tensorflow/tensorflow"])

# References & Ack

This notebook uses codes from:
* https://github.com/adapter-hub/adapters/blob/main/notebooks/01_Adapter_Training.ipynb
* https://github.com/adapter-hub/adapters/blob/main/notebooks/05_Adapter_Drop_Training.ipynb
* https://huggingface.co/docs/transformers/tasks/sequence_classification
* https://github.com/nlbse2024/issue-report-classification/blob/main/2-Template-SetFit.ipynb
