# Tweet Turing Test: Detecting Disinformation on Twitter  

|          | Group #2 - Disinformation Detectors                     |
|---------:|---------------------------------------------------------|
| Members  | John Johnson, Katy Matulay, Justin Minnion, Jared Rubin |
| Notebook | `xx_fine_tuner.ipynb`                                   |
| Purpose  | A notebook to fine-tune BERT models.                    |

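This notebook fine-tunes a pre-trained BERT-family model (currently `distilbert-base-uncased`) to classify tweets as authentic or troll, using the NLP-preprocessed ten-percent sample produced by `04_nlp_preprocess.ipynb`.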

Based on tutorial from: https://huggingface.co/docs/transformers/training

# 1 - Setup

In [1]:
# imports from Python standard library
import os

# imports requiring installation
#   connection to Google Cloud Storage
from google.cloud import storage            # pip install google-cloud-storage
from google.oauth2 import service_account   # pip install google-auth

#  data science packages
import numpy as np
import pandas as pd
import tensorflow as tf

import evaluate
from datasets import Dataset, ClassLabel
from transformers import pipeline
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import TrainingArguments, Trainer

In [2]:
# imports from tweet_turing.py
import tweet_turing as tur      # note - different import approach from prior notebooks

# imports from tweet_turing_paths.py
from tweet_turing_paths import local_data_paths, local_snapshot_paths, gcp_data_paths, \
    gcp_snapshot_paths, gcp_project_name, gcp_bucket_name, gcp_key_file

In [3]:
# pandas options
pd.set_option('display.max_colwidth', None)

## Local or Cloud?

Decide here whether to run the notebook with local data or with data from the GCP bucket:
 - If the working directory of this notebook has a "../data/" folder with data loaded (e.g. working on a local computer, or data files have been copied to a cloud VM), use the "local files" option and comment out the "gcp bucket files" option.
 - If this notebook is being run from a GCP VM (preferably in the `us-central1` location), use the "gcp bucket files" option and comment out the "local files" option.

In [4]:
# option: local files
local_or_cloud: str = "local"   # comment/uncomment this line or next

# option: gcp bucket files
#local_or_cloud: str = "cloud"   # comment/uncomment this line or previous

# don't comment/uncomment for remainder of cell
if (local_or_cloud == "local"):
    data_paths = local_data_paths
    snapshot_paths = local_snapshot_paths
elif (local_or_cloud == "cloud"):
    data_paths = gcp_data_paths
    snapshot_paths = gcp_snapshot_paths
else:
    raise ValueError("Variable 'local_or_cloud' can only take on one of two values, 'local' or 'cloud'.")
    # subsequent cells will not do this final "else" check

In [5]:
# this cell only needs to run its code if local_or_cloud=="cloud"
#   (though it is harmless if run when local_or_cloud=="local")
gcp_storage_client: storage.Client = None
gcp_bucket: storage.Bucket = None

if (local_or_cloud == "cloud"):
    #gcp_storage_client = tur.get_gcp_storage_client(project_name=gcp_project_name, key_file=gcp_key_file)
    gcp_storage_client = tur.get_gcp_storage_client(project_name=gcp_project_name)
    gcp_bucket = tur.get_gcp_bucket(storage_client=gcp_storage_client, bucket_name=gcp_bucket_name)

# 2 - Load Dataset

We start with the ten-percent sample, for which NLP preprocessing was already completed in notebook **`04_nlp_preprocess.ipynb`**.

In [6]:
# note this cell requires package `pyarrow` to be installed in environment
parq_filename: str = "data_sample_ten_percent_NLP_preprocessed.parquet.gz"
parq_path: str = f"{snapshot_paths['parq_snapshot']}{parq_filename}"

if (local_or_cloud == "local"):
    df = pd.read_parquet(parq_path, engine='pyarrow')
elif (local_or_cloud == "cloud"):
    df = tur.get_gcp_object_from_parq_as_df(bucket=gcp_bucket, object_name=parq_path)

In [7]:
df.head(3)

|   | external_author_id | author | content | region | language | following | followers | updates | post_type | is_retweet | ... | has_url | emoji_text | emoji_count | publish_date | class | following_ratio | class_numeric | RUS_lett_count | content_demoji | content_no_emoji |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 23785050 | radiowoody | To live dangerously on Friday the 13th, we're doing the radio show from the UNLUCKIEST place on earth! The @TennesseeTitans Locker Room! | Nashville Tennessee | en | 2585 | 5710 | 2 |  | 0.0 | ... | 0 | [] | 0 | 2013-12-13 10:03:43+00:00 | Verified | 0.452635 | 0 | 0 | To live dangerously on Friday the 13th, we're doing the radio show from the UNLUCKIEST place on earth! The @TennesseeTitans Locker Room! | To live dangerously on Friday the 13th, we're doing the radio show from the UNLUCKIEST place on earth! The @TennesseeTitans Locker Room! |
| 1 | 59020162 | matthewpouliot | @legsanity I like it. Almost like a free Gio. Pujols is still about as good of a bet as Gonzalez the rest of the way. | Florida | en | 999 | 12637 | 0 | replied_to | 0.0 | ... | 0 | [] | 0 | 2015-04-26 20:13:58+00:00 | Verified | 0.079047 | 0 | 0 | @legsanity I like it. Almost like a free Gio. Pujols is still about as good of a bet as Gonzalez the rest of the way. | @legsanity I like it. Almost like a free Gio. Pujols is still about as good of a bet as Gonzalez the rest of the way. |
| 2 | 1656024374 | IMISSOBAMA | Man servants can have a good purpose as long as they come with cash and don't touch me ever. | United States | en | 473 | 760 | 4122 | RETWEET | 1.0 | ... | 0 | [] | 0 | 2016-12-24 13:12:00+00:00 | Troll | 0.621551 | 1 | 0 | Man servants can have a good purpose as long as they come with cash and don't touch me ever. | Man servants can have a good purpose as long as they come with cash and don't touch me ever. |


In [8]:
df.info(memory_usage='deep')

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 362314 entries, 0 to 362313
Data columns (total 24 columns):
 #   Column              Non-Null Count   Dtype              
---  ------              --------------   -----              
 0   external_author_id  362314 non-null  string             
 1   author              362314 non-null  string             
 2   content             362314 non-null  string             
 3   region              344249 non-null  string             
 4   language            362314 non-null  category           
 5   following           362314 non-null  uint64             
 6   followers           362314 non-null  uint64             
 7   updates             362314 non-null  uint64             
 8   post_type           154729 non-null  category           
 9   is_retweet          362314 non-null  float64            
 10  account_category    362314 non-null  category           
 11  tweet_id            362314 non-null  string             
 12  tco1_step1      

In [9]:
df['class'].unique()

['Verified', 'Troll']
Categories (2, object): ['Troll', 'Verified']

In [10]:
df.loc[df['emoji_count'] > 0, ['content', 'content_demoji', 'content_no_emoji']].sample(5)

|        | content | content_demoji | content_no_emoji |
|-------:|---------|----------------|------------------|
| 362027 | 5 things everyone will be talking about today: http://t.co/YGDhdQ90cP Top of the list is 🍻 https://t.co/xLFTRscFeZ | 5 things everyone will be talking about today: http://t.co/YGDhdQ90cP \n\nTop of the list is :clinking beer mugs: https://t.co/xLFTRscFeZ | 5 things everyone will be talking about today: http://t.co/YGDhdQ90cP \n\nTop of the list is https://t.co/xLFTRscFeZ |
| 175143 | RT ZakZale: 😂OMG is this real? Canadian PM💃Justin Trudeau started stuttering "Umm Umm" and paused for a long time … https://t.co/UjpYiuUiVd | RT ZakZale: :face with tears of joy:OMG is this real? Canadian PM:woman dancing:Justin Trudeau started stuttering "Umm Umm" and paused for a long time … https://t.co/UjpYiuUiVd | RT ZakZale: OMG is this real? Canadian PMJustin Trudeau started stuttering "Umm Umm" and paused for a long time … https://t.co/UjpYiuUiVd |
| 58733 | #top RT American1765: #MondayMotivation Dear #Fatboy Kim Jong-Un, Didja see our THAAD in action? Hmmmm 🤔 ... Wh… https://t.co/iaan8hp6GO | #top RT American1765: #MondayMotivation Dear #Fatboy Kim Jong-Un, Didja see our THAAD in action? Hmmmm :thinking face: ... Wh… https://t.co/iaan8hp6GO | #top RT American1765: #MondayMotivation Dear #Fatboy Kim Jong-Un, Didja see our THAAD in action? Hmmmm ... Wh… https://t.co/iaan8hp6GO |
| 281261 | It's such a huge compliment when people copy me.. thanks guys! 👍💗😘💕😉 | It's such a huge compliment when people copy me.. thanks guys! :thumbs up::growing heart::face blowing a kiss::two hearts::winking face: | It's such a huge compliment when people copy me.. thanks guys! |
| 318472 | The future depends on what you do today. 🙏🏻❤️ https://t.co/y0CpN3yAZX | The future depends on what you do today. :folded hands: light skin tone::red heart: https://t.co/y0CpN3yAZX | The future depends on what you do today. https://t.co/y0CpN3yAZX |


# 3 - Choose Dataset Fields and Model

## 3.1 - Set Args

To make the subsequent encoding/training code more modular, set as many args as we can within this cell.

In [11]:
# where to store the output of the model (as a subfolder of ../data/models/)
output_dir_name = 'distilbert-base-uncased-200k'
output_description = ''

# which columns from dataframe will be used
content_column = 'content_no_emoji'
class_column = 'class_numeric'        # Assumes: 0=authentic, 1=troll

# select pre-trained model
#pretrained_model_name = 'Twitter/twhin-bert-base'    # https://huggingface.co/Twitter/twhin-bert-base
#pretrained_model_name = 'bert-base-uncased'            # https://huggingface.co/bert-base-uncased
pretrained_model_name = 'distilbert-base-uncased'    # https://huggingface.co/distilbert-base-uncased

# these are passed on to tokenizer object as keyword args
extra_tokenizer_args = {
    'padding': 'max_length', 
    'truncation': True, 
    'return_tensors': 'pt', 
    'max_length': 256,
}

# these are passed on to model object as keyword args
extra_model_args = {
    'num_labels': 2
}

# these are passed on to trainer object as keyword args
extra_train_args = {
    'output_dir': f'../data/models/{output_dir_name}',
    'num_train_epochs': 3,
}

# maximum tweets (per class) used for fine tuning (set to None for no limit)
#  e.g. if this value is 5000, a maximum of 5000 troll and 5000 authentic tweets will be used
#       for a total of 10,000 tweets used for fine tuning
max_tweets_per_class = 100000
sampling_random_seed = 42

# for train/test split
train_test_random_seed = 3    # for reproducibility, and "the number of the counting shall be three"
test_fraction = 0.20          # within range (0.0, 1.0)

In [12]:
# for model summary we can track how long it took to encode and train
time_encoding = None
time_training = None

In [8]:
# # testing for twhin-bert
# os.environ['PYTORCH_CUDA_ALLOC_CONF'] = "garbage_collection_threshold:0.6,max_split_size_mb:128"

## 3.2 - Convert Pandas Dataframe to 🤗 Dataset

In [13]:
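# select the text and label columns (optionally down-sampling each class);
#   note that both branches produce a new dataframe, not a view of `df`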
if (max_tweets_per_class is None):
    df_view = df[[content_column, class_column]]
else:
    df_view = pd.concat(
        [
            df.loc[df[class_column] == 1, [content_column, class_column]].sample(n=max_tweets_per_class, random_state=sampling_random_seed),
            df.loc[df[class_column] == 0, [content_column, class_column]].sample(n=max_tweets_per_class, random_state=sampling_random_seed)
        ], 
        ignore_index=True
    )

# convert to 🤗 Dataset object
dataset = Dataset.from_pandas(df_view) \
            .rename_columns({content_column: "text", class_column: "label"}) \
            .cast_column("label", ClassLabel(names=['authentic', 'troll']))

# check results
assert (dataset.features['label'].str2int('authentic') == 0) and (dataset.features['label'].str2int('troll') == 1), 'class labels mismatched'
dataset[0]

Casting the dataset:   0%|          | 0/200 [00:00<?, ?ba/s]

{'text': 'On #MuslimWomensDay, these women are empowering themselves and fighting back against Islamophobia. https://t.co/Y5NXaTHjZi',
 'label': 1}
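The args cell describes `max_tweets_per_class` as a maximum, but `DataFrame.sample(n=...)` raises a `ValueError` when a class holds fewer than `n` rows (unless sampling with replacement). A minimal sketch of a capped per-class sample on a toy dataframe follows. The helper name `sample_per_class` and the toy data are hypothetical and not part of this notebook; only the column names are borrowed from it.

```python
import pandas as pd

def sample_per_class(df, class_column, max_per_class, seed):
    """Return at most `max_per_class` rows for each value of `class_column`."""
    parts = []
    for _, group in df.groupby(class_column, sort=True):
        # cap the request at the number of rows actually available
        n = len(group) if max_per_class is None else min(max_per_class, len(group))
        parts.append(group.sample(n=n, random_state=seed))
    return pd.concat(parts, ignore_index=True)

# toy data: 7 rows of class 0 and 3 rows of class 1 (made up for illustration)
toy = pd.DataFrame({
    "content_no_emoji": [f"tweet {i}" for i in range(10)],
    "class_numeric": [0] * 7 + [1] * 3,
})

sampled = sample_per_class(toy, "class_numeric", max_per_class=5, seed=42)
counts = sampled["class_numeric"].value_counts().to_dict()
print(counts)
```

With a cap of 5, the larger class is reduced to 5 rows while the smaller class keeps all 3, so the result is no longer perfectly balanced when a class falls short of the cap.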

## 3.3 - Train/Test Split

In [14]:
dataset_split = dataset.train_test_split(
    test_size=test_fraction,
    seed=train_test_random_seed,
)

# check output
dataset_split

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 160000
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 40000
    })
})
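The row counts above follow from the values set in the args cell: 100,000 tweets per class across two classes, with 20% held out for testing. A quick check of that arithmetic is sketched below. The simple rounding used here is an assumption about how the split size is derived, not a description of the library's internals, but it matches the sizes reported above.

```python
# values copied from the args cell of this notebook
max_tweets_per_class = 100_000
num_classes = 2
test_fraction = 0.20

total_rows = max_tweets_per_class * num_classes
test_rows = round(total_rows * test_fraction)   # assumed rounding rule
train_rows = total_rows - test_rows

print(total_rows, train_rows, test_rows)
```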

## 3.4 - Tokenize / Encode

In [15]:
# create the tokenizer to prepare text for model
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name)

In [16]:
# create a tokenizer function
def tokenize_function(examples):
    return tokenizer(examples['text'], **extra_tokenizer_args)

In [17]:
time_encoding_start = pd.Timestamp.now()

# encode the training and test sets
#tokenized_datasets = dataset_split.map(tokenize_function, batched=True, fn_kwargs={'tokenizer': tokenizer})
tokenized_datasets = dataset_split.map(tokenize_function, batched=True)

time_encoding_stop = pd.Timestamp.now()
time_encoding = time_encoding_stop - time_encoding_start

  0%|          | 0/160 [00:00<?, ?ba/s]

  0%|          | 0/40 [00:00<?, ?ba/s]

In [18]:
tokenized_datasets

DatasetDict({
    train: Dataset({
        features: ['text', 'label', 'input_ids', 'attention_mask'],
        num_rows: 160000
    })
    test: Dataset({
        features: ['text', 'label', 'input_ids', 'attention_mask'],
        num_rows: 40000
    })
})
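The two new features, `input_ids` and `attention_mask`, come from the tokenizer args set earlier: with `padding='max_length'`, `truncation=True`, and `max_length=256`, every tweet is encoded to the same length, and the attention mask marks real tokens (1) versus padding (0). The pure-Python sketch below illustrates the idea with made-up token ids. The helper `pad_or_truncate` and the pad id of 0 are assumptions for illustration; the real tokenizer also handles special tokens and is not implemented this way.

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Cut or right-pad a list of token ids to exactly `max_length` entries."""
    kept = token_ids[:max_length]                     # truncation
    n_pad = max_length - len(kept)                    # padding still needed
    input_ids = kept + [pad_id] * n_pad
    attention_mask = [1] * len(kept) + [0] * n_pad    # 1 = real token, 0 = padding
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# a short sequence is padded, a long one is truncated (ids are made up)
short = pad_or_truncate([101, 7592, 2088, 102], max_length=6)
long = pad_or_truncate(list(range(1, 11)), max_length=6)

print(short)
print(long)
```

Padding everything to a fixed length keeps batches uniform at the cost of extra computation on padding positions, which matters when most tweets are far shorter than the chosen maximum.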

## 3.5 - Model

In [19]:
# create the model
model = AutoModelForSequenceClassification.from_pretrained(
    pretrained_model_name,
    **extra_model_args,
)

Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertForSequenceClassification: ['vocab_layer_norm.weight', 'vocab_transform.bias', 'vocab_projector.bias', 'vocab_projector.weight', 'vocab_transform.weight', 'vocab_layer_norm.bias']
- This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['pre_classifier.bias', 'pre_classifier.weight', 'classi

In [20]:
# setup the training arguments
training_args = TrainingArguments(
    **extra_train_args
)
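Only `output_dir` and `num_train_epochs` are set explicitly here, so every other training setting falls back to the `TrainingArguments` defaults. The training log in this notebook reports a per-device batch size of 8 and 60000 optimization steps, and its logged learning rates are consistent with a linear decay from 5e-5 with no warmup. A sketch of that arithmetic follows, assuming a single device and no gradient accumulation. The helper `linear_lr` is hypothetical, and the initial rate of 5e-5 is inferred from the logged values rather than set anywhere in this notebook.

```python
import math

train_rows = 160_000    # from the train/test split
batch_size = 8          # as reported in the training log
num_epochs = 3          # from extra_train_args
initial_lr = 5e-5       # assumption: inferred from the logged learning rates

steps_per_epoch = math.ceil(train_rows / batch_size)
total_steps = steps_per_epoch * num_epochs

def linear_lr(step, total_steps, initial_lr):
    """Learning rate after `step` updates under linear decay to zero, no warmup."""
    return initial_lr * (1 - step / total_steps)

for step in (500, 1000, 20000):
    print(step, linear_lr(step, total_steps, initial_lr))
```

Under these assumptions, one epoch is 20000 steps, which lines up with the log reaching `'epoch': 1.0` at checkpoint 20000.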

In [21]:
# setup training metric
metric = evaluate.load('accuracy')   # fraction of predictions that match the reference labels

def compute_metrics(eval_pred):    # TODO -> convert to pure function
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)    # index of the largest logit = predicted class
    return metric.compute(predictions=predictions, references=labels)
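To make the metric step concrete, the sketch below reproduces by hand what `compute_metrics` does: take the argmax over the class axis of the logits, then compare against the labels. The logits and labels are made-up values for illustration, and accuracy is computed directly with NumPy rather than through the `evaluate` package.

```python
import numpy as np

# made-up logits for four tweets; columns are [authentic, troll]
logits = np.array([
    [ 2.1, -0.3],
    [-1.0,  0.8],
    [ 0.2,  0.1],
    [-0.5,  1.5],
])
labels = np.array([0, 1, 1, 1])

# predicted class = column index of the largest logit in each row
predictions = np.argmax(logits, axis=-1)

# accuracy = share of predictions equal to the true label
accuracy = float((predictions == labels).mean())

print(predictions, accuracy)
```

Because argmax only compares the two logits within a row, no softmax is needed to pick the predicted class.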

In [22]:
time_training_start = pd.Timestamp.now()

# setup the trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets['train'],
    eval_dataset=tokenized_datasets['test'],
    compute_metrics=compute_metrics,
)

# execute the training
trainer.train()

time_training_stop = pd.Timestamp.now()
time_training = time_training_stop - time_training_start

print(time_training)

The following columns in the training set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running training *****
  Num examples = 160000
  Num Epochs = 3
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 1
  Total optimization steps = 60000
  Number of trainable parameters = 66955010


  0%|          | 0/60000 [00:00<?, ?it/s]

Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-500\config.json


{'loss': 0.4475, 'learning_rate': 4.958333333333334e-05, 'epoch': 0.03}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-1000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-1000\config.json


{'loss': 0.3716, 'learning_rate': 4.9166666666666665e-05, 'epoch': 0.05}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-1000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-1500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-1500\config.json


{'loss': 0.3549, 'learning_rate': 4.875e-05, 'epoch': 0.07}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-1500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-2000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-2000\config.json


{'loss': 0.3585, 'learning_rate': 4.8333333333333334e-05, 'epoch': 0.1}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-2000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-2500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-2500\config.json


{'loss': 0.3405, 'learning_rate': 4.791666666666667e-05, 'epoch': 0.12}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-2500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-3000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-3000\config.json


{'loss': 0.3687, 'learning_rate': 4.75e-05, 'epoch': 0.15}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-3000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-3500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-3500\config.json


{'loss': 0.3363, 'learning_rate': 4.708333333333334e-05, 'epoch': 0.17}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-3500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-4000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-4000\config.json


{'loss': 0.3447, 'learning_rate': 4.666666666666667e-05, 'epoch': 0.2}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-4000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-4500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-4500\config.json


{'loss': 0.319, 'learning_rate': 4.6250000000000006e-05, 'epoch': 0.23}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-4500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-5000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-5000\config.json


{'loss': 0.3134, 'learning_rate': 4.5833333333333334e-05, 'epoch': 0.25}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-5000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-5500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-5500\config.json


{'loss': 0.3321, 'learning_rate': 4.541666666666667e-05, 'epoch': 0.28}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-5500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-6000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-6000\config.json


{'loss': 0.3074, 'learning_rate': 4.5e-05, 'epoch': 0.3}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-6000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-6500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-6500\config.json


{'loss': 0.296, 'learning_rate': 4.458333333333334e-05, 'epoch': 0.33}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-6500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-7000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-7000\config.json


{'loss': 0.3206, 'learning_rate': 4.4166666666666665e-05, 'epoch': 0.35}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-7000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-7500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-7500\config.json


{'loss': 0.3114, 'learning_rate': 4.375e-05, 'epoch': 0.38}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-7500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-8000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-8000\config.json


{'loss': 0.3002, 'learning_rate': 4.3333333333333334e-05, 'epoch': 0.4}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-8000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-8500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-8500\config.json


{'loss': 0.3322, 'learning_rate': 4.291666666666667e-05, 'epoch': 0.42}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-8500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-9000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-9000\config.json


{'loss': 0.3237, 'learning_rate': 4.25e-05, 'epoch': 0.45}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-9000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-9500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-9500\config.json


{'loss': 0.3142, 'learning_rate': 4.208333333333334e-05, 'epoch': 0.47}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-9500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-10000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-10000\config.json


{'loss': 0.3309, 'learning_rate': 4.166666666666667e-05, 'epoch': 0.5}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-10000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-10500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-10500\config.json


{'loss': 0.3287, 'learning_rate': 4.125e-05, 'epoch': 0.53}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-10500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-11000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-11000\config.json


{'loss': 0.3115, 'learning_rate': 4.0833333333333334e-05, 'epoch': 0.55}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-11000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-11500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-11500\config.json


{'loss': 0.3224, 'learning_rate': 4.041666666666667e-05, 'epoch': 0.57}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-11500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-12000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-12000\config.json


{'loss': 0.3035, 'learning_rate': 4e-05, 'epoch': 0.6}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-12000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-12500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-12500\config.json


{'loss': 0.336, 'learning_rate': 3.958333333333333e-05, 'epoch': 0.62}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-12500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-13000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-13000\config.json


{'loss': 0.3316, 'learning_rate': 3.9166666666666665e-05, 'epoch': 0.65}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-13000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-13500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-13500\config.json


{'loss': 0.3009, 'learning_rate': 3.875e-05, 'epoch': 0.68}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-13500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-14000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-14000\config.json


{'loss': 0.311, 'learning_rate': 3.8333333333333334e-05, 'epoch': 0.7}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-14000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-14500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-14500\config.json


{'loss': 0.3187, 'learning_rate': 3.791666666666667e-05, 'epoch': 0.72}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-14500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-15000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-15000\config.json


{'loss': 0.311, 'learning_rate': 3.7500000000000003e-05, 'epoch': 0.75}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-15000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-15500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-15500\config.json


{'loss': 0.2929, 'learning_rate': 3.708333333333334e-05, 'epoch': 0.78}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-15500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-16000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-16000\config.json


{'loss': 0.3285, 'learning_rate': 3.6666666666666666e-05, 'epoch': 0.8}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-16000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-16500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-16500\config.json


{'loss': 0.3202, 'learning_rate': 3.625e-05, 'epoch': 0.82}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-16500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-17000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-17000\config.json


{'loss': 0.3261, 'learning_rate': 3.5833333333333335e-05, 'epoch': 0.85}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-17000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-17500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-17500\config.json


{'loss': 0.3117, 'learning_rate': 3.541666666666667e-05, 'epoch': 0.88}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-17500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-18000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-18000\config.json


{'loss': 0.31, 'learning_rate': 3.5e-05, 'epoch': 0.9}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-18000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-18500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-18500\config.json


{'loss': 0.3003, 'learning_rate': 3.458333333333333e-05, 'epoch': 0.93}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-18500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-19000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-19000\config.json


{'loss': 0.3106, 'learning_rate': 3.4166666666666666e-05, 'epoch': 0.95}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-19000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-19500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-19500\config.json


{'loss': 0.3068, 'learning_rate': 3.375000000000001e-05, 'epoch': 0.97}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-19500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-20000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-20000\config.json


{'loss': 0.2854, 'learning_rate': 3.3333333333333335e-05, 'epoch': 1.0}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-20000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-20500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-20500\config.json


{'loss': 0.2947, 'learning_rate': 3.291666666666667e-05, 'epoch': 1.02}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-20500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-21000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-21000\config.json


{'loss': 0.2728, 'learning_rate': 3.2500000000000004e-05, 'epoch': 1.05}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-21000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-21500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-21500\config.json


{'loss': 0.2754, 'learning_rate': 3.208333333333334e-05, 'epoch': 1.07}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-21500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-22000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-22000\config.json


{'loss': 0.2622, 'learning_rate': 3.1666666666666666e-05, 'epoch': 1.1}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-22000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-22500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-22500\config.json


{'loss': 0.2652, 'learning_rate': 3.125e-05, 'epoch': 1.12}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-22500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-23000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-23000\config.json


{'loss': 0.2504, 'learning_rate': 3.0833333333333335e-05, 'epoch': 1.15}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-23000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-23500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-23500\config.json


{'loss': 0.2698, 'learning_rate': 3.0416666666666666e-05, 'epoch': 1.18}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-23500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-24000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-24000\config.json


{'loss': 0.2397, 'learning_rate': 3e-05, 'epoch': 1.2}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-24000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-24500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-24500\config.json


{'loss': 0.2442, 'learning_rate': 2.9583333333333335e-05, 'epoch': 1.23}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-24500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-25000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-25000\config.json


{'loss': 0.2722, 'learning_rate': 2.916666666666667e-05, 'epoch': 1.25}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-25000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-25500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-25500\config.json


{'loss': 0.2584, 'learning_rate': 2.8749999999999997e-05, 'epoch': 1.27}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-25500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-26000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-26000\config.json


{'loss': 0.2438, 'learning_rate': 2.8333333333333335e-05, 'epoch': 1.3}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-26000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-26500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-26500\config.json


{'loss': 0.2724, 'learning_rate': 2.791666666666667e-05, 'epoch': 1.32}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-26500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-27000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-27000\config.json


{'loss': 0.2512, 'learning_rate': 2.7500000000000004e-05, 'epoch': 1.35}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-27000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-27500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-27500\config.json


{'loss': 0.2507, 'learning_rate': 2.7083333333333332e-05, 'epoch': 1.38}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-27500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-28000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-28000\config.json


{'loss': 0.2715, 'learning_rate': 2.6666666666666667e-05, 'epoch': 1.4}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-28000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-28500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-28500\config.json


{'loss': 0.2733, 'learning_rate': 2.625e-05, 'epoch': 1.43}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-28500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-29000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-29000\config.json


{'loss': 0.2513, 'learning_rate': 2.5833333333333336e-05, 'epoch': 1.45}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-29000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-29500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-29500\config.json


{'loss': 0.2443, 'learning_rate': 2.5416666666666667e-05, 'epoch': 1.48}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-29500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-30000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-30000\config.json


{'loss': 0.2558, 'learning_rate': 2.5e-05, 'epoch': 1.5}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-30000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-30500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-30500\config.json


{'loss': 0.2608, 'learning_rate': 2.4583333333333332e-05, 'epoch': 1.52}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-30500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-31000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-31000\config.json


{'loss': 0.2505, 'learning_rate': 2.4166666666666667e-05, 'epoch': 1.55}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-31000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-31500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-31500\config.json


{'loss': 0.2412, 'learning_rate': 2.375e-05, 'epoch': 1.57}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-31500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-32000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-32000\config.json


{'loss': 0.2362, 'learning_rate': 2.3333333333333336e-05, 'epoch': 1.6}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-32000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-32500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-32500\config.json


{'loss': 0.2535, 'learning_rate': 2.2916666666666667e-05, 'epoch': 1.62}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-32500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-33000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-33000\config.json


{'loss': 0.2451, 'learning_rate': 2.25e-05, 'epoch': 1.65}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-33000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-33500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-33500\config.json


{'loss': 0.2431, 'learning_rate': 2.2083333333333333e-05, 'epoch': 1.68}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-33500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-34000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-34000\config.json


{'loss': 0.2581, 'learning_rate': 2.1666666666666667e-05, 'epoch': 1.7}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-34000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-34500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-34500\config.json


{'loss': 0.2569, 'learning_rate': 2.125e-05, 'epoch': 1.73}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-34500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-35000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-35000\config.json


{'loss': 0.2395, 'learning_rate': 2.0833333333333336e-05, 'epoch': 1.75}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-35000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-35500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-35500\config.json


{'loss': 0.2395, 'learning_rate': 2.0416666666666667e-05, 'epoch': 1.77}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-35500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-36000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-36000\config.json


{'loss': 0.2557, 'learning_rate': 2e-05, 'epoch': 1.8}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-36000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-36500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-36500\config.json


{'loss': 0.2365, 'learning_rate': 1.9583333333333333e-05, 'epoch': 1.82}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-36500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-37000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-37000\config.json


{'loss': 0.2442, 'learning_rate': 1.9166666666666667e-05, 'epoch': 1.85}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-37000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-37500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-37500\config.json


{'loss': 0.2333, 'learning_rate': 1.8750000000000002e-05, 'epoch': 1.88}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-37500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-38000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-38000\config.json


{'loss': 0.2391, 'learning_rate': 1.8333333333333333e-05, 'epoch': 1.9}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-38000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-38500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-38500\config.json


{'loss': 0.2367, 'learning_rate': 1.7916666666666667e-05, 'epoch': 1.93}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-38500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-39000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-39000\config.json


{'loss': 0.2392, 'learning_rate': 1.75e-05, 'epoch': 1.95}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-39000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-39500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-39500\config.json


{'loss': 0.2298, 'learning_rate': 1.7083333333333333e-05, 'epoch': 1.98}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-39500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-40000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-40000\config.json


{'loss': 0.2505, 'learning_rate': 1.6666666666666667e-05, 'epoch': 2.0}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-40000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-40500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-40500\config.json


{'loss': 0.181, 'learning_rate': 1.6250000000000002e-05, 'epoch': 2.02}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-40500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-41000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-41000\config.json


{'loss': 0.1817, 'learning_rate': 1.5833333333333333e-05, 'epoch': 2.05}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-41000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-41500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-41500\config.json


{'loss': 0.1691, 'learning_rate': 1.5416666666666668e-05, 'epoch': 2.08}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-41500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-42000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-42000\config.json


{'loss': 0.1984, 'learning_rate': 1.5e-05, 'epoch': 2.1}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-42000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-42500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-42500\config.json


{'loss': 0.1834, 'learning_rate': 1.4583333333333335e-05, 'epoch': 2.12}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-42500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-43000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-43000\config.json


{'loss': 0.174, 'learning_rate': 1.4166666666666668e-05, 'epoch': 2.15}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-43000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-43500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-43500\config.json


{'loss': 0.1968, 'learning_rate': 1.3750000000000002e-05, 'epoch': 2.17}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-43500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-44000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-44000\config.json


{'loss': 0.1966, 'learning_rate': 1.3333333333333333e-05, 'epoch': 2.2}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-44000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-44500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-44500\config.json


{'loss': 0.172, 'learning_rate': 1.2916666666666668e-05, 'epoch': 2.23}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-44500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-45000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-45000\config.json


{'loss': 0.1755, 'learning_rate': 1.25e-05, 'epoch': 2.25}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-45000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-45500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-45500\config.json


{'loss': 0.1879, 'learning_rate': 1.2083333333333333e-05, 'epoch': 2.27}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-45500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-46000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-46000\config.json


{'loss': 0.182, 'learning_rate': 1.1666666666666668e-05, 'epoch': 2.3}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-46000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-46500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-46500\config.json


{'loss': 0.1668, 'learning_rate': 1.125e-05, 'epoch': 2.33}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-46500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-47000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-47000\config.json


{'loss': 0.1828, 'learning_rate': 1.0833333333333334e-05, 'epoch': 2.35}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-47000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-47500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-47500\config.json


{'loss': 0.1877, 'learning_rate': 1.0416666666666668e-05, 'epoch': 2.38}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-47500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-48000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-48000\config.json


{'loss': 0.183, 'learning_rate': 1e-05, 'epoch': 2.4}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-48000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-48500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-48500\config.json


{'loss': 0.1876, 'learning_rate': 9.583333333333334e-06, 'epoch': 2.42}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-48500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-49000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-49000\config.json


{'loss': 0.1801, 'learning_rate': 9.166666666666666e-06, 'epoch': 2.45}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-49000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-49500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-49500\config.json


{'loss': 0.1842, 'learning_rate': 8.75e-06, 'epoch': 2.48}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-49500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-50000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-50000\config.json


{'loss': 0.1814, 'learning_rate': 8.333333333333334e-06, 'epoch': 2.5}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-50000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-50500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-50500\config.json


{'loss': 0.1869, 'learning_rate': 7.916666666666667e-06, 'epoch': 2.52}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-50500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-51000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-51000\config.json


{'loss': 0.1704, 'learning_rate': 7.5e-06, 'epoch': 2.55}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-51000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-51500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-51500\config.json


{'loss': 0.1879, 'learning_rate': 7.083333333333334e-06, 'epoch': 2.58}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-51500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-52000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-52000\config.json


{'loss': 0.1774, 'learning_rate': 6.666666666666667e-06, 'epoch': 2.6}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-52000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-52500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-52500\config.json


{'loss': 0.1799, 'learning_rate': 6.25e-06, 'epoch': 2.62}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-52500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-53000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-53000\config.json


{'loss': 0.1862, 'learning_rate': 5.833333333333334e-06, 'epoch': 2.65}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-53000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-53500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-53500\config.json


{'loss': 0.1863, 'learning_rate': 5.416666666666667e-06, 'epoch': 2.67}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-53500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-54000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-54000\config.json


{'loss': 0.1715, 'learning_rate': 5e-06, 'epoch': 2.7}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-54000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-54500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-54500\config.json


{'loss': 0.1994, 'learning_rate': 4.583333333333333e-06, 'epoch': 2.73}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-54500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-55000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-55000\config.json


{'loss': 0.1629, 'learning_rate': 4.166666666666667e-06, 'epoch': 2.75}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-55000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-55500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-55500\config.json


{'loss': 0.18, 'learning_rate': 3.75e-06, 'epoch': 2.77}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-55500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-56000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-56000\config.json


{'loss': 0.1769, 'learning_rate': 3.3333333333333333e-06, 'epoch': 2.8}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-56000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-56500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-56500\config.json


{'loss': 0.1617, 'learning_rate': 2.916666666666667e-06, 'epoch': 2.83}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-56500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-57000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-57000\config.json


{'loss': 0.1837, 'learning_rate': 2.5e-06, 'epoch': 2.85}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-57000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-57500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-57500\config.json


{'loss': 0.1768, 'learning_rate': 2.0833333333333334e-06, 'epoch': 2.88}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-57500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-58000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-58000\config.json


{'loss': 0.1813, 'learning_rate': 1.6666666666666667e-06, 'epoch': 2.9}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-58000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-58500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-58500\config.json


{'loss': 0.1674, 'learning_rate': 1.25e-06, 'epoch': 2.92}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-58500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-59000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-59000\config.json


{'loss': 0.1728, 'learning_rate': 8.333333333333333e-07, 'epoch': 2.95}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-59000\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-59500
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-59500\config.json


{'loss': 0.1736, 'learning_rate': 4.1666666666666667e-07, 'epoch': 2.98}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-59500\pytorch_model.bin
Saving model checkpoint to ../data/models/distilbert-base-uncased-200k\checkpoint-60000
Configuration saved in ../data/models/distilbert-base-uncased-200k\checkpoint-60000\config.json


{'loss': 0.1884, 'learning_rate': 0.0, 'epoch': 3.0}


Model weights saved in ../data/models/distilbert-base-uncased-200k\checkpoint-60000\pytorch_model.bin


Training completed. Do not forget to share your model on huggingface.co/models =)




{'train_runtime': 7812.8922, 'train_samples_per_second': 61.437, 'train_steps_per_second': 7.68, 'train_loss': 0.2526954105377197, 'epoch': 3.0}
0 days 02:10:14.608768
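
The `learning_rate` values in the log above are consistent with a linear decay to zero with no warmup. The sketch below checks a few logged values against that schedule. It assumes an initial rate of 5e-5 and 60,000 total steps; both are inferred from the log itself (the logged rates and the final checkpoint number), not read from the `TrainingArguments` cell. `linear_decay_lr` is a hypothetical helper written for this check, not a library function.

```python
# Assumed values, inferred from the training log rather than from TrainingArguments
INITIAL_LR = 5e-5       # assumption: consistent with every logged learning rate
TOTAL_STEPS = 60_000    # number of the final checkpoint in the log


def linear_decay_lr(step, initial_lr=INITIAL_LR, total_steps=TOTAL_STEPS):
    """Learning rate after `step` optimizer steps under linear decay to zero (no warmup)."""
    remaining = max(0, total_steps - step)
    return initial_lr * remaining / total_steps


# (step, learning rate reported in the log at that step)
logged = [
    (20_500, 3.291666666666667e-05),
    (24_000, 3e-05),
    (30_000, 2.5e-05),
    (60_000, 0.0),
]

for step, logged_lr in logged:
    # print the computed and logged values side by side for comparison
    print(step, linear_decay_lr(step), logged_lr)
```

If the computed and logged columns agree, the falling learning rate late in training is the schedule at work rather than a sign of a problem with the run.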


## 3.6 - Save fine-tuned model

In [23]:
trainer.save_model()    # defaults to self.args.output_dir

Saving model checkpoint to ../data/models/distilbert-base-uncased-200k
Configuration saved in ../data/models/distilbert-base-uncased-200k\config.json
Model weights saved in ../data/models/distilbert-base-uncased-200k\pytorch_model.bin


## 3.7 - Evaluate fine-tuned model

In [24]:
# if evaluating immediately after fine-tuning
trainer.evaluate()

The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 40000
  Batch size = 8


  0%|          | 0/5000 [00:00<?, ?it/s]

{'eval_loss': 0.3871361315250397,
 'eval_accuracy': 0.8995,
 'eval_runtime': 204.6593,
 'eval_samples_per_second': 195.447,
 'eval_steps_per_second': 24.431,
 'epoch': 3.0}

In [77]:
predictions = trainer.predict(tokenized_datasets['test'])

The following columns in the test set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: text. If text are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Prediction *****
  Num examples = 2000
  Batch size = 8


In [81]:
predictions.metrics

{'test_loss': 0.4639328718185425,
 'test_accuracy': 0.8565,
 'test_runtime': 30.4455,
 'test_samples_per_second': 65.691,
 'test_steps_per_second': 8.211}
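
Besides `metrics`, the object returned by `trainer.predict` also carries the raw logits (`predictions.predictions`) and the true labels (`predictions.label_ids`). The sketch below shows how an accuracy figure can be recomputed from those two arrays. It runs on small made-up arrays, not on the notebook's data, and `accuracy_from_logits` is a hypothetical helper name.

```python
import numpy as np


def accuracy_from_logits(logits, label_ids):
    """Fraction of rows whose highest-scoring class matches the true label."""
    predicted = np.argmax(logits, axis=-1)   # index of the largest logit per row
    return float(np.mean(predicted == label_ids))


# toy data for illustration only: 4 examples, 2 classes
toy_logits = np.array([[2.0, -1.0],
                       [0.3,  0.9],
                       [1.5,  0.2],
                       [-0.4, 0.1]])
toy_labels = np.array([0, 1, 1, 1])

print(accuracy_from_logits(toy_logits, toy_labels))   # 3 of 4 rows match
```

In the notebook, the same function could be called as `accuracy_from_logits(predictions.predictions, predictions.label_ids)`, assuming `predictions` is the output of the `trainer.predict` cell above.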

In [74]:
# reload model, if evaluation is being performed separately from training
model_dir = "../data/models/dist-test1"

model = AutoModelForSequenceClassification.from_pretrained(model_dir)

loading configuration file ../data/models/dist-test1/config.json
Model config DistilBertConfig {
  "_name_or_path": "../data/models/dist-test1",
  "activation": "gelu",
  "architectures": [
    "DistilBertForSequenceClassification"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "problem_type": "single_label_classification",
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "torch_dtype": "float32",
  "transformers_version": "4.26.1",
  "vocab_size": 30522
}

loading weights file ../data/models/dist-test1/pytorch_model.bin
All model checkpoint weights were used when initializing DistilBertForSequenceClassification.

All the weights of DistilBertForSequenceClassification were initialized from the model checkpoint at ../data/models/dist-test1.

In [None]:
# TODO - set up evaluation for the re-loaded model

TODO: set up a way to archive the saved model files.

For now, archive the model directory manually, excluding the intermediate checkpoints:
`tar -czvf dist-test1.tar.gz --exclude='*checkpoint*' dist-test1`
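
One possible way to do the same archiving from inside the notebook is Python's standard `tarfile` module. This is a minimal sketch, not the project's chosen approach: `archive_model_dir` is a hypothetical helper, and the substring test on `"checkpoint"` is an assumption meant to mirror the `--exclude='*checkpoint*'` pattern of the command above.

```python
import tarfile
from pathlib import Path


def archive_model_dir(model_dir, archive_path):
    """Write a gzip-compressed tar of `model_dir`, skipping checkpoint folders."""
    model_dir = Path(model_dir)

    def skip_checkpoints(tarinfo):
        # Returning None drops the member; for a directory, its contents are skipped too.
        return None if "checkpoint" in tarinfo.name else tarinfo

    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the folder name in the archive, not the full relative path
        tar.add(model_dir, arcname=model_dir.name, filter=skip_checkpoints)
    return archive_path
```

A call such as `archive_model_dir("../data/models/dist-test1", "../data/models/dist-test1.tar.gz")` would then write the archive next to the model folder. The archive should be written outside the folder being archived so that it does not try to include itself.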