# Fine-Tune a BERT model for classifying code quality

## Setup

### Install Dependencies

- Weights & Biases
- Hugging Face Transformers and Datasets libraries

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
!pip install wandb -qU

In [None]:
# install datasets
!pip install datasets

# Make sure that we have a recent version of pyarrow in the session before we continue - otherwise reboot Colab to activate it
import pyarrow
if int(pyarrow.__version__.split('.')[1]) < 16 and int(pyarrow.__version__.split('.')[0]) == 0:
    import os
    os.kill(os.getpid(), 9)

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
!pip install transformers

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
from huggingface_hub import notebook_login

notebook_login()


In [None]:
import wandb
wandb.login()

[34m[1mwandb[0m: Currently logged in as: [33mjohannesha[0m ([33mmrubis[0m). Use [1m`wandb login --relogin`[0m to force relogin


True

## Prepare Data

### Load "Good" Code Quality data

Load the [StackOverflow Question Code Dataset](https://github.com/LittleYUYU/StackOverflow-Question-Code-Dataset), which contains about 148K Python question-code pairs.

Download the dataset and load it with the Hugging Face Datasets library.
I re-uploaded the following files to my personal Google Drive:
- Code snippets Download: [https://github.com/LittleYUYU/StackOverflow-Question-Code-Dataset/blob/master/annotation_tool/data/code_solution_labeled_data/source/python_how_to_do_it_qid_by_classifier_unlabeled_single_code_answer_qid_to_code.pickle](https://github.com/LittleYUYU/StackOverflow-Question-Code-Dataset/blob/master/annotation_tool/data/code_solution_labeled_data/source/python_how_to_do_it_qid_by_classifier_unlabeled_single_code_answer_qid_to_code.pickle)
- Question titles: [https://github.com/LittleYUYU/StackOverflow-Question-Code-Dataset/blob/master/annotation_tool/data/code_solution_labeled_data/source/python_how_to_do_it_qid_by_classifier_unlabeled_single_code_answer_qid_to_title.pickle](https://github.com/LittleYUYU/StackOverflow-Question-Code-Dataset/blob/master/annotation_tool/data/code_solution_labeled_data/source/python_how_to_do_it_qid_by_classifier_unlabeled_single_code_answer_qid_to_title.pickle)

In [None]:
import pickle
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
!ls /content/drive/MyDrive/Deep\ Learning\ Project

python_how_to_do_it_qid_by_classifier_unlabeled_single_code_answer_qid_to_code.pickle.txt
python_how_to_do_it_qid_by_classifier_unlabeled_single_code_answer_qid_to_title.pickle.txt


### Artificially create "bad"/"buggy" code

Bad/buggy code is created artificially by randomly applying, for example:
- Breaking/swapping the question/answer relationship
- Introducing Python anti-patterns, i.e. "non-pythonic" ways of writing code, e.g.:
  - replace explicit variable names with non-descriptive ones, e.g. one-letter variable names
  - remove all comments
  - turn function names into camel case:
    ```
    # bad practice
    def computeNetValue(price, tax):
    ```
  - Not iterating directly over the elements of an iterator:
    ```
    # bad practice

    for i in range(len(list_of_fruits)):
        fruit = list_of_fruits[i]
        process_fruit(fruit)
        
    # good practice

    for fruit in list_of_fruits:
        process_fruit(fruit)
    ```
  - Passing mutable default arguments to functions (e.g. an empty list):
    ```
    # bad practice
    def append_to(element, to=[]):
        to.append(element)
        return to

    >>> my_list = append_to("a")
    >>> print(my_list)
    ['a']

    >>> my_second_list = append_to("b")
    >>> print(my_second_list)
    ['a', 'b']

    # good practice
    def append_to(element, to=None):
        if to is None:
            to = []
        to.append(element)
        return to
    ```

More Python anti-patterns: https://towardsdatascience.com/18-common-python-anti-patterns-i-wish-i-had-known-before-44d983805f0f

In [None]:
# Load the pickle files (pd.read_pickle returns the pickled objects, here dict-like mappings keyed by question id)
import pandas as pd
from datasets import Dataset

data_path = "/content/drive/MyDrive/Deep Learning Project"

# Load the question titles and the code snippets; they are combined into a Hugging Face dataset further below:
questions_dict = pd.read_pickle(data_path+'/python_how_to_do_it_qid_by_classifier_unlabeled_single_code_answer_qid_to_title.pickle.txt')
code_snippets_dict = pd.read_pickle(data_path+'/python_how_to_do_it_qid_by_classifier_unlabeled_single_code_answer_qid_to_code.pickle.txt')

In [None]:
good_code_snippets = []

for idx, ((id_question, question), (id_code_snippet, code_snippet)) in enumerate(zip(questions_dict.items(), code_snippets_dict.items())):
    if id_question == id_code_snippet:
        question_code_snippet_pair = f"# {question}\n\n{code_snippet}"
        good_code_snippets.append({"question_code_snippet_pair": question_code_snippet_pair, "idx": idx, "label": 1})
    else:
        print("IDs are not the same!")
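The loop above relies on both dictionaries iterating in the same order. As a minimal alternative sketch, assuming both objects are plain dicts keyed by question id, the pairs can be joined on the key itself. The helper `pair_by_id` and the toy data are hypothetical and only illustrate the idea.

```python
def pair_by_id(questions, snippets):
    """Join question titles and code snippets on their shared question id."""
    shared_ids = questions.keys() & snippets.keys()
    # Sort the ids so the resulting order is deterministic.
    return [
        {
            "question_code_snippet_pair": f"# {questions[qid]}\n\n{snippets[qid]}",
            "idx": idx,
            "label": 1,
        }
        for idx, qid in enumerate(sorted(shared_ids))
    ]


# Toy data (hypothetical), deliberately stored in a different insertion order.
toy_questions = {2: "Reverse a list", 1: "Sum a list"}
toy_snippets = {1: "sum(xs)", 2: "xs[::-1]", 3: "unmatched()"}

pairs = pair_by_id(toy_questions, toy_snippets)
for pair in pairs:
    print(pair)
```

Ids that appear in only one of the two dictionaries are silently dropped here; whether that is preferable to printing a warning depends on how the data is expected to look.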


In [None]:
# Helper functions for creating "artificially" bad code
import re
import random
import string

# Regex for getting the names of all defined functions
def get_function_names(code_snippet):
  function_names = re.findall(r'def\s(\w+)\(', code_snippet)
  return function_names

# Regex for getting the names of all assigned variables
def get_varible_names(code_snippet):
  variable_names = re.findall(r'\s(\w+) =', code_snippet)
  return variable_names

def replace_funtion_names_with_one_letter_names(code_snippet):
  all_function_names = get_function_names(code_snippet)

  if all_function_names:
    for function_name in all_function_names:
      # replace all occurrences of this function name in the code snippet
      random_letter = random.choice(string.ascii_letters)
      code_snippet = code_snippet.replace(function_name, random_letter)
  return code_snippet

def replace_funtion_names_with_camel_case(code_snippet):
  all_function_names = get_function_names(code_snippet)

  if all_function_names:
    for function_name in all_function_names:
      # replace all occurrences of this function name in the code snippet
      code_snippet = code_snippet.replace(function_name, into_camel_case(function_name))
  return code_snippet

def replace_variables_with_one_letter_names(code_snippet):
  variable_names = get_varible_names(code_snippet) # Get all variables
  
  for variable_name in variable_names:
    random_letter = random.choice(string.ascii_letters)
    code_snippet = code_snippet.replace(variable_name, random_letter)
  return code_snippet

def replace_variables_with_camel_case(code_snippet):  
  variable_names = get_varible_names(code_snippet) # Get all variables

  for variable_name in variable_names:
    code_snippet =  code_snippet.replace(variable_name, into_camel_case(variable_name))
  return code_snippet

def into_camel_case(name):
    output = ''.join(x for x in name.title() if x.isalnum())
    if output:
      return output[0].lower() + (output[1:] if len(output) > 1 else "")
    else:
      return name

def remove_all_comments_from_a_code_snippet(code_snippet):
  # Drops full-line comments that start in the first column; indented and inline comments are kept
  lines = code_snippet.splitlines()
  lines_without_comments = []
  for line in lines:
    if line.startswith('#'):
      continue
    lines_without_comments.append(line)
  return "\n".join(lines_without_comments)
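The renaming helpers above use `str.replace`, which also rewrites a name wherever it occurs inside a longer identifier. A minimal sketch of a stricter variant, assuming whole-word matching is what is wanted, uses a regex word boundary. The helper `rename_identifier` and the toy snippet are hypothetical and not part of the pipeline above.

```python
import re


def rename_identifier(code_snippet, old_name, new_name):
    """Replace old_name only where it appears as a whole identifier."""
    pattern = rf"\b{re.escape(old_name)}\b"
    # A function is used as the replacement so new_name is inserted literally.
    return re.sub(pattern, lambda _match: new_name, code_snippet)


toy_snippet = "total = 1\nsubtotal = total + 1"

# Plain substring replacement also rewrites part of "subtotal".
print(toy_snippet.replace("total", "t"))

# The word-boundary version leaves "subtotal" intact.
print(rename_identifier(toy_snippet, "total", "t"))
```

This still matches names inside string literals and comments, so it is only a partial improvement; a tokenizer-based approach would be needed to rename identifiers exclusively.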
  

In [None]:
# Examples of how we are "artificially" making code worse

example_code_snippet = """
from bs4 import BeautifulSoup
import json

# Function to grab json in selenium
def grab_json_in_selenium(driver) -> dict:
  soup = BeautifulSoup(driver.page_source)
  dict_from_json = json.loads(soup.find("body").text)
  return dict_from_json
"""
print("Original \"Good\" Code Snippet: ", example_code_snippet)

print("Replacing Function names with one letter names:")
print(replace_funtion_names_with_one_letter_names(example_code_snippet))

print("Turn Function names into camel case:")
print(replace_funtion_names_with_camel_case(example_code_snippet))

print("Replace variables with one letter names:")
print(replace_variables_with_one_letter_names(example_code_snippet))

print("Turn variables into camel case:")
print(replace_variables_with_camel_case(example_code_snippet))

print("Remove all comments:")
print(remove_all_comments_from_a_code_snippet(example_code_snippet))

Original "Good" Code Snippet:  
from bs4 import BeautifulSoup
import json

# Function to grab json in selenium
def grab_json_in_selenium(driver) -> dict:
  soup = BeautifulSoup(driver.page_source)
  dict_from_json = json.loads(soup.find("body").text)
  return dict_from_json

Replacing Function names with one letter names:

from bs4 import BeautifulSoup
import json

# Function to grab json in selenium
def J(driver) -> dict:
  soup = BeautifulSoup(driver.page_source)
  dict_from_json = json.loads(soup.find("body").text)
  return dict_from_json

Turn Function names into camel case:

from bs4 import BeautifulSoup
import json

# Function to grab json in selenium
def grabJsonInSelenium(driver) -> dict:
  soup = BeautifulSoup(driver.page_source)
  dict_from_json = json.loads(soup.find("body").text)
  return dict_from_json

Replace variables with one letter names:

from bs4 import BeautifulSoup
import json

# Function to grab json in selenium
def grab_json_in_selenium(driver) -> dict:
  l = Be

In [None]:
# Create "Artificially" buggy code

bad_code_snippets = []

code_snippets_as_list = list(code_snippets_dict.items())

# For 25% of the code snippets, just break the question/answer relationship by swapping them
# Add those as bad code snippets
first_quarter_of_questions = list(questions_dict.items())[:int(len(questions_dict)/4)]


for idx, (id_question, question) in enumerate(first_quarter_of_questions):
  # Loop through the questions and randomly select a code snippet that does not have the same ID
  id_code_snippet, code_snippet = random.choice(code_snippets_as_list)

  if id_question != id_code_snippet:
      question_code_snippet_pair = f"# {question}\n\n{code_snippet}"
      # Add to list of bad code snippets
      bad_code_snippets.append({"question_code_snippet_pair": question_code_snippet_pair, "idx": idx, "label": 0})

# For the other 75% randomly apply some of the techniques to "artificially" create buggy code
last_three_quarter_of_questions = list(questions_dict.values())[int(len(questions_dict)/4):]
last_three_quarter_of_code_snippets = list(code_snippets_dict.values())[int(len(code_snippets_dict)/4):]

# Continue the running index after the first quarter so every bad snippet gets its own idx
idx_offset = len(first_quarter_of_questions)

for idx, (question, code_snippet) in enumerate(zip(last_three_quarter_of_questions, last_three_quarter_of_code_snippets), start=idx_offset):
    # For 50% of the cases replace function and variable names with one-letter names, for the other half turn them into camel case
    apply_one_letter_name_augmentation = random.choice([True, False])
    if apply_one_letter_name_augmentation:
      augmented_code_snippet = replace_funtion_names_with_one_letter_names(code_snippet)
      # Pass the already augmented snippet on so the function renaming is not discarded
      augmented_code_snippet = replace_variables_with_one_letter_names(augmented_code_snippet)
    else:
      augmented_code_snippet = replace_funtion_names_with_camel_case(code_snippet)
      augmented_code_snippet = replace_variables_with_camel_case(augmented_code_snippet)

    # Do not add to list if the code snippet did not change
    if code_snippet == augmented_code_snippet:
      continue
    code_snippet = augmented_code_snippet

    # With 50% chance also apply the "remove comments" augmentation
    if random.choice([True, False]):
      code_snippet = remove_all_comments_from_a_code_snippet(code_snippet)

    question_code_snippet_pair = f"# {question}\n\n{code_snippet}"
    # Add to list of bad code snippets
    bad_code_snippets.append({"question_code_snippet_pair": question_code_snippet_pair, "idx": idx, "label": 0})


In [None]:
print("Number of good code snippets: ", len(good_code_snippets))
print("Number of bad code snippets: ", len(bad_code_snippets))

Number of good code snippets:  85294
Number of bad code snippets:  50169
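The two classes are not balanced, which matters when reading accuracy numbers later on. A small sketch of the arithmetic, using only the two counts printed above, gives the accuracy a classifier would reach by always predicting the majority class:

```python
n_good = 85294  # label 1, from the output above
n_bad = 50169   # label 0, from the output above

total = n_good + n_bad
# Accuracy of a trivial classifier that always predicts the larger class.
majority_baseline = max(n_good, n_bad) / total

print(f"total examples: {total}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
```

Any fine-tuned model should be compared against this baseline of roughly 0.63 rather than against 0.5.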


In [None]:
# Merge good and bad code snippets into a dataset
merged_code_snippets = good_code_snippets + bad_code_snippets
ds = Dataset.from_pandas(pd.DataFrame(merged_code_snippets))

In [None]:
# Split the dataset into train (90%) and a held-out part (10%) for test and validation
from datasets import Dataset, DatasetDict

train_testvalid = ds.train_test_split(test_size=0.1)
# Split the 10% test + valid in half test, half valid
test_valid = train_testvalid['test'].train_test_split(test_size=0.5)
# Gather the three splits in a single DatasetDict
train_test_valid_dataset = DatasetDict({
    'train': train_testvalid['train'],
    'test': test_valid['test'],
    'valid': test_valid['train']})


In [None]:
train_test_valid_dataset

DatasetDict({
    train: Dataset({
        features: ['question_code_snippet_pair', 'idx', 'label'],
        num_rows: 121916
    })
    test: Dataset({
        features: ['question_code_snippet_pair', 'idx', 'label'],
        num_rows: 6774
    })
    valid: Dataset({
        features: ['question_code_snippet_pair', 'idx', 'label'],
        num_rows: 6773
    })
})
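The row counts above follow from the two-stage 90/10 and 50/50 split. The sketch below reproduces them with plain arithmetic; the helper `split_sizes` is hypothetical, and rounding the held-out size up is an assumption that happens to match the counts reported above.

```python
import math


def split_sizes(n_rows, holdout_fraction=0.1, test_fraction_of_holdout=0.5):
    """Return (train, test, valid) sizes for a two-stage split."""
    # Assumption: the held-out size is rounded up to the next integer.
    n_holdout = math.ceil(n_rows * holdout_fraction)
    n_train = n_rows - n_holdout
    n_test = math.ceil(n_holdout * test_fraction_of_holdout)
    n_valid = n_holdout - n_test
    return n_train, n_test, n_valid


# 85294 good + 50169 bad snippets, as counted earlier.
n_train, n_test, n_valid = split_sizes(85294 + 50169)
print(n_train, n_test, n_valid)
```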

In [None]:
train_test_valid_dataset["train"][0]

{'question_code_snippet_pair': '# Python reading and writing to tty\n\nclass VISA:\n    def __init__(self, tty_name):\n        self.ser = serial.Serial()\n        self.ser.port = tty_name\n        # If it breaks try the below\n        #self.serConf() # Uncomment lines here till it works\n\n        self.ser.open()\n        self.ser.flushInput()\n        self.ser.flushOutput()\n\n        self.addr = None\n        self.setAddress(0)\n\n    def cmd(self, cmd_str):\n        self.ser.write(cmd_str + "\\n")\n        sleep(0.5)\n        return self.ser.readline()\n\n    def serConf(self):\n        self.ser.baudrate = 9600\n        self.ser.bytesize = serial.EIGHTBITS\n        self.ser.parity = serial.PARITY_NONE\n        self.ser.stopbits = serial.STOPBITS_ONE\n        self.ser.timeout = 0 # Non-Block reading\n        self.ser.xonxoff = False # Disable Software Flow Control\n        self.ser.rtscts = False # Disable (RTS/CTS) flow Control\n        self.ser.dsrdtr = False # Disable (DSR/DTR) fl

In [None]:
# Randomly picked examples of the dataset:
import datasets
import random
import pandas as pd
from IPython.display import display, HTML

def pretty_print(df):
    # Render the dataframe as HTML and show escaped newlines in the snippets as line breaks
    return display( HTML( df.to_html().replace("\\n","<br>") ) )


def show_random_elements(dataset, num_examples=10):
    assert num_examples <= len(dataset), "Can't pick more elements than there are in the dataset."
    picks = []
    for _ in range(num_examples):
        pick = random.randint(0, len(dataset)-1)
        while pick in picks:
            pick = random.randint(0, len(dataset)-1)
        picks.append(pick)
    
    df = pd.DataFrame(dataset[picks])
    for column, typ in dataset.features.items():
        if isinstance(typ, datasets.ClassLabel):
            df[column] = df[column].transform(lambda i: typ.names[i])
    pretty_print(df)

In [None]:
show_random_elements(train_test_valid_dataset["train"])

Unnamed: 0,question_code_snippet_pair,idx,label
0,# Python manager.dict() is very slow compared to regular dict jobs = {} job = Job() jobs[job.name] = job # insert other jobs in the normal dictionary mgr = multiprocessing.Manager() mgr_jobs = mgr.dict() mgr_jobs.update(jobs),65602,1
1,"# Python match a string with regex >>> line = 'This,is,a,sample,string' >>> ""sample"" in line  True",22642,1
2,"# Slicing string values in each dict from list of dicts? def find_between( s, first, last ):  try:  start = s.index( first ) + len( first )  end = s.index( last, start )  return s[start:end]  except ValueError:  return """" original = [{1: ""xxx [pear] yyy"", 2: ""xxx [apple] zzz""}, {0: ""aaa [cat] yyy"", 1: ""bbb [dog] zzz""}] for dct in original:  for key in dct:  dct[key] = find_between(dct[key], ""["", ""]"")",67878,1
3,"# Reading and writing a tuple to file (numpy.random.RandomState.get_state) import numpy import pickle randomStateFile = 'random.bin' def save_random_state():  with open(randomStateFile, 'wb') as f:  pickle.dump(numpy.random.RandomState.get_state(), f) def load_random_state():  with open(randomStateFile, 'rb') as f:  numpy.random.RandomState.set_state(pickle.load(f))",21322,0
4,"# filtering a pytables table on pandas import elif event.type == MOUSEBUTTONUP and event.button == 1 and isinstance(page, MainPage):  x, y = pygame.mouse.get_pos()  control = originalScroll_y - scroll_y  control2 = 0 # set to 0 here  if control != control2: # is 0 here",19709,0
5,"# Extract Options From Dropdown List Extracted From Website gnames1 = list(A.objects.values_list('gname',flat=True).distinct()) gnames2 = list(B.objects.values_list('gname',flat=True).distinct()) gnames = list(set(gnames1+gnames2)) render(request, 'sampletemplate.html', {'gnames':gnames})",19752,0
6,# Slicing a Pandas Dataframe Using Two Strings # data # ================================== df  Region MachineNumber 0 EU Machine1 1 EU Machine1 2 EU Machine1 3 EU Machine1 4 EU Machine1 5 EU Machine1 6 EU Machine1 7 EU Machine1 .. ... ... 17 NA Machine1 18 NA Machine1 19 NA Machine1 20 NA Machine1 21 NA Machine1 22 NA Machine1 23 NA Machine1 24 NA Machine1 [25 rows x 2 columns] # processing # =============================== df[(df['Region']=='NA') & (df['MachineNumber']=='Machine1')]  Region MachineNumber 16 NA Machine1 17 NA Machine1 18 NA Machine1 19 NA Machine1 20 NA Machine1 21 NA Machine1 22 NA Machine1 23 NA Machine1 24 NA Machine1,44756,1
7,"# Permalinks with Russian/Cyrillic news articles py> urllib.quote(u""articles/2009/Заглавная_страница"".encode(""utf-8"")) 'articles/2009/%D0%97%D0%B0%D0%B3%D0%BB%D0%B0%D0%B2%D0%BD%D0%B0%D1%8F_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0'",16437,1
8,"# Overwrite create method of res.users to add users upon creation to a group @api.model def create(self, vals):  f = super(ResUsers, self).create(vals)  f.add_to_group()  return f",21322,0
9,"# In python - Find the maximum date in a nested dictionary max(my_dict.items(), key=lambda x: x[1]['last_event'])[0]",46573,1


### Load Tokenizer

In [None]:
model_checkpoint = "distilbert-base-uncased"

In [None]:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint, use_fast=True)

In [None]:
tokenizer("print('Hello World')", "import pandas as pd")

{'input_ids': [101, 6140, 1006, 1005, 7592, 2088, 1005, 1007, 102, 12324, 25462, 2015, 2004, 22851, 102], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
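When two strings are passed, the tokenizer encodes them as a single sequence with special tokens around each part. The sketch below splits the ids printed above back into the two segments. It assumes the usual special-token ids of the uncased BERT vocabulary (101 for `[CLS]`, 102 for `[SEP]`); the helper `split_segments` is hypothetical.

```python
CLS_ID, SEP_ID = 101, 102  # assumed special-token ids of the uncased BERT vocabulary

# input_ids copied from the tokenizer output above.
input_ids = [101, 6140, 1006, 1005, 7592, 2088, 1005, 1007, 102,
             12324, 25462, 2015, 2004, 22851, 102]


def split_segments(ids):
    """Split a [CLS] a [SEP] b [SEP] sequence into its token-id segments."""
    if ids[0] != CLS_ID:
        raise ValueError("expected the sequence to start with [CLS]")
    segments, current = [], []
    for token_id in ids[1:]:
        if token_id == SEP_ID:
            # A separator closes the current segment.
            segments.append(current)
            current = []
        else:
            current.append(token_id)
    return segments


first, second = split_segments(input_ids)
print(len(first), len(second))
```

In this notebook the question and the code snippet are later joined into one string before tokenization, so the training inputs contain only a single segment.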

In [None]:
def preprocess_function(examples):
    return tokenizer(examples["question_code_snippet_pair"], truncation=True)

In [None]:
preprocess_function(train_test_valid_dataset['train'][:5])

{'input_ids': [[101, 1001, 18750, 3752, 1998, 3015, 2000, 23746, 2100, 2465, 9425, 1024, 13366, 1035, 1035, 1999, 4183, 1035, 1035, 1006, 2969, 1010, 23746, 2100, 1035, 2171, 1007, 1024, 2969, 1012, 14262, 1027, 7642, 1012, 7642, 1006, 1007, 2969, 1012, 14262, 1012, 3417, 1027, 23746, 2100, 1035, 2171, 1001, 2065, 2009, 7807, 3046, 1996, 2917, 1001, 2969, 1012, 14262, 8663, 2546, 1006, 1007, 1001, 4895, 9006, 3672, 3210, 2182, 6229, 2009, 2573, 2969, 1012, 14262, 1012, 2330, 1006, 1007, 2969, 1012, 14262, 1012, 13862, 2378, 18780, 1006, 1007, 2969, 1012, 14262, 1012, 13862, 5833, 18780, 1006, 1007, 2969, 1012, 5587, 2099, 1027, 3904, 2969, 1012, 2275, 4215, 16200, 4757, 1006, 1014, 1007, 13366, 4642, 2094, 1006, 2969, 1010, 4642, 2094, 1035, 2358, 2099, 1007, 1024, 2969, 1012, 14262, 1012, 4339, 1006, 4642, 2094, 1035, 2358, 2099, 1009, 1000, 1032, 1050, 1000, 1007, 3637, 1006, 1014, 1012, 1019, 1007, 2709, 2969, 1012, 14262, 1012, 3191, 4179, 1006, 1007, 13366, 14262, 8663, 2546, 1006

In [None]:
# Tokenize complete dataset
encoded_dataset = train_test_valid_dataset.map(preprocess_function, batched=True)


## Fine-tune model

Fine-tune a DistilBERT model (`distilbert-base-uncased`, a distilled variant of BERT) on the dataset we just prepared, following this tutorial: https://huggingface.co/docs/transformers/training

In [None]:
# Load the pre-trained DistilBERT model with a 2-class classification head
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(model_checkpoint, num_labels=2)

Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertForSequenceClassification: ['vocab_transform.bias', 'vocab_layer_norm.weight', 'vocab_projector.bias', 'vocab_projector.weight', 'vocab_layer_norm.bias', 'vocab_transform.weight']
- This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['pre_classifier.weight', 'classifier.weight', 'classifi

In [None]:
# Set hyperparameters
from transformers import TrainingArguments
model_name = model_checkpoint.split("/")[-1]
batch_size = 16

args = TrainingArguments(
    f"{model_name}-finetuned-code-snippet-quality-scoring",
    evaluation_strategy="steps",
    save_strategy="epoch",
    eval_steps=1000,
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    num_train_epochs=4,
    weight_decay=0.01,
    load_best_model_at_end=False,
    metric_for_best_model="accuracy",
    push_to_hub=True,
    report_to="wandb",
    run_name="003"
)
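These settings determine how long training runs and how often the model is evaluated. A small sketch of the arithmetic, assuming a single device and no gradient accumulation, and using the 121,916 training rows reported for the train split:

```python
import math

num_train_rows = 121916  # size of the train split reported earlier
batch_size = 16
num_train_epochs = 4
eval_steps = 1000

# The last batch of each epoch may be smaller, so round up.
steps_per_epoch = math.ceil(num_train_rows / batch_size)
total_steps = steps_per_epoch * num_train_epochs
# One evaluation on the validation set every eval_steps optimization steps.
num_evaluations = total_steps // eval_steps

print(steps_per_epoch, total_steps, num_evaluations)
```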

In [None]:
# Load the metric used to evaluate model performance during training.
import numpy as np
from datasets import load_metric

metric = load_metric("accuracy")

In [None]:
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)
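To make the metric concrete, here is a plain-Python sketch of the same computation: take the index of the largest logit per row as the predicted label and compare it with the reference label. The helper `accuracy_from_logits` and the toy values are hypothetical; the notebook itself uses NumPy and the loaded accuracy metric.

```python
def accuracy_from_logits(logits, labels):
    """Compute accuracy from per-class logits and integer reference labels."""
    # Index of the largest logit in each row is the predicted class.
    predictions = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


# Toy logits for four examples and two classes (0 = bad, 1 = good).
toy_logits = [[2.0, -1.0], [0.1, 0.3], [-0.5, 0.2], [1.5, 1.0]]
toy_labels = [0, 1, 0, 0]

result = accuracy_from_logits(toy_logits, toy_labels)
print(result)
```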

In [None]:
from transformers import Trainer

trainer = Trainer(
    model,
    args,
    train_dataset=encoded_dataset["train"],
    eval_dataset=encoded_dataset["valid"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)

/content/distilbert-base-uncased-finetuned-code-snippet-quality-scoring is already a clone of https://huggingface.co/Johannes/distilbert-base-uncased-finetuned-code-snippet-quality-scoring. Make sure you pull the latest changes with `repo.git_pull()`.


In [None]:
# Actual fine-tuning step:
trainer.train()

The following columns in the training set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: question_code_snippet_pair, idx. If question_code_snippet_pair, idx are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running training *****
  Num examples = 121916
  Num Epochs = 4
  Instantaneous batch size per device = 16
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 1
  Total optimization steps = 30480
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"


Step,Training Loss,Validation Loss,Accuracy
1000,0.5353,0.511034,0.757419
2000,0.4686,0.433912,0.785915
3000,0.4517,0.424023,0.800236
4000,0.4263,0.390631,0.81692
5000,0.4053,0.393399,0.819135
6000,0.3867,0.385884,0.825336
7000,0.3906,0.393579,0.833456
8000,0.3418,0.361515,0.838033
9000,0.3418,0.358502,0.839953
10000,0.3307,0.352031,0.843201


The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: question_code_snippet_pair, idx. If question_code_snippet_pair, idx are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 6773
  Batch size = 16

TrainOutput(global_step=30480, training_loss=0.311988140466645, metrics={'train_runtime': 22980.7689, 'train_samples_per_second': 21.221, 'train_steps_per_second': 1.326, 'total_flos': 5.577215896225104e+16, 'train_loss': 0.311988140466645, 'epoch': 4.0})

In [None]:
# Save the model to the Hugging Face Hub
trainer.push_to_hub()