# Large language models for AI detection: distilBERT

Over two notebooks we'll use the data we explored in the previous notebook to fine-tune two pre-trained language models available in the huggingface transformers library. This notebook looks at distilBERT, which is the smaller model; deBERTa v3 is covered in a separate notebook.

I'll first use a train-validation split on the training data to tune the learning rate and number of epochs. I'll use a scheduled learning rate that starts at 10$^{-5}$ and is reduced after each epoch until validation performance stops improving. After that, we can train the model on the full training data using the number of epochs and learning rate schedule determined from the validation set.

In [1]:
#pip install --upgrade transformers

In [6]:
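#!pip install tf-keras
import os
# Set this before TensorFlow or transformers is imported, so that tf.keras
# resolves to the legacy Keras 2 package (tf-keras).
os.environ['TF_USE_LEGACY_KERAS'] = '1'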

In [2]:
import pandas as pd

samples = pd.read_csv('https://raw.githubusercontent.com/tommyliphysics/tommyli-ml/main/ai_detector/notebooks/samples.csv')
samples

Unnamed: 0,text,source,topic,TTV split,label
0,I can't honestly believe that this is a sequel...,imdb,movie review,2.0,0
1,LL Cool J performed much better in this movie ...,imdb,movie review,0.0,0
2,It would be unwise to judge that that either n...,imdb,movie review,-1.0,0
3,20th Century Fox's ROAD HOUSE 1948) is not onl...,imdb,movie review,3.0,0
4,"I am a fan of Jess Franco's bizarre style, and...",imdb,movie review,-1.0,0
...,...,...,...,...,...
24175,The Louisville Cardinals men's soccer team is ...,wikipedia by GPT,Louisville Cardinals men's soccer,4.0,1
24176,"KFC Yum! Center, also known as the Yum! Center...",wikipedia by GPT,KFC Yum! Center,4.0,1
24177,The 2020–21 Louisville Cardinals men's basketb...,wikipedia by GPT,2020–21 Louisville Cardinals men's basketball ...,4.0,1
24178,Conte Forum is a multi-purpose indoor arena lo...,wikipedia by GPT,Conte Forum,4.0,1


In [4]:
train = samples[samples['TTV split'] > 0]
val = samples[samples['TTV split'] == 0]
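
As a quick sanity check on the two masks above, the sketch below applies them to a tiny hand-made frame. The rows are hypothetical, and reading negative `TTV split` values as held-out test data is an assumption, not something stated in samples.csv.

```python
import pandas as pd

# Hypothetical toy rows; only the 'TTV split' column matters here.
toy = pd.DataFrame({
    'text': ['a', 'b', 'c', 'd', 'e'],
    'TTV split': [2.0, 0.0, -1.0, 3.0, -1.0],
    'label': [0, 0, 0, 1, 1],
})

toy_train = toy[toy['TTV split'] > 0]    # rows used for fitting
toy_val = toy[toy['TTV split'] == 0]     # rows used for validation
toy_rest = toy[toy['TTV split'] < 0]     # rows in neither subset (assumed to be test data)

# The three masks partition the toy frame: every row lands in exactly one subset.
assert len(toy_train) + len(toy_val) + len(toy_rest) == len(toy)
print(len(toy_train), len(toy_val), len(toy_rest))
```

Running the same length check on `train` and `val` is a cheap way to confirm that no validation rows leak into the training subset.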

We will now define functions to import the pre-trained model from the huggingface transformers library.

In [3]:
from transformers import TFDistilBertForSequenceClassification, DistilBertConfig
from transformers import DistilBertTokenizerFast

def get_tokenizer_model():
    tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-cased')
    model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-cased', num_labels=2)
    return tokenizer,model



In [4]:
tokenizer,model = get_tokenizer_model()


Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFDistilBertForSequenceClassification: ['vocab_layer_norm.bias', 'vocab_transform.bias', 'vocab_transform.weight', 'vocab_layer_norm.weight', 'vocab_projector.bias']
- This IS expected if you are initializing TFDistilBertForSequenceClassification from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
Some weights or buffers of the TF 2.0 model TFDistilBertForSequenceClassification were not initialized from the PyTorch model and are newly initialized: ['pre_classifier.weight', 'pre_classifier.bias', 'classifier.weight', 'classifier.bias']
You should 

We will use keras to fine-tune the pre-trained models on our human and AI-generated texts. Let's define a function that prepares text samples for training, and then prepare the training and validation data.

In [5]:
import tensorflow as tf

def tokenize_encode(text_list):
    return tokenizer(text_list, truncation=True, padding=True)

def create_dataset(samples):
    encodings = tokenize_encode(samples['text'].tolist())
    return tf.data.Dataset.from_tensor_slices((
      dict(encodings),
      samples['label'].tolist()
    )).shuffle(len(samples)).batch(16)
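
With `truncation=True` and `padding=True`, the tokenizer cuts over-long sequences at the model's maximum length and pads the rest to the longest sequence in the call, returning an attention mask that marks the real tokens. The toy function below only illustrates that behaviour; it is not the tokenizer's implementation, and `pad_and_truncate`, `max_length=6` and the token ids are hypothetical.

```python
def pad_and_truncate(token_id_lists, max_length=6, pad_id=0):
    """Toy version of truncation=True, padding=True on lists of token ids."""
    # Truncate: keep at most max_length ids per sequence.
    # (The real tokenizer also preserves special tokens such as [SEP]; this toy just slices.)
    truncated = [ids[:max_length] for ids in token_id_lists]
    # Pad: extend every sequence to the length of the longest one.
    longest = max(len(ids) for ids in truncated)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in truncated]
    # Attention mask: 1 for real tokens, 0 for padding.
    attention_mask = [[1] * len(ids) + [0] * (longest - len(ids)) for ids in truncated]
    return {'input_ids': input_ids, 'attention_mask': attention_mask}

batch = pad_and_truncate([[101, 7, 8, 102], [101, 5, 6, 7, 8, 9, 10, 102]])
print(batch['input_ids'])
print(batch['attention_mask'])
```

Because `create_dataset` tokenizes the whole list in one call, every sample is padded to the longest sequence in the set rather than the longest in its batch of 16.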

In [8]:
train_dataset = create_dataset(train)
val_dataset = create_dataset(val)

Next we can define our functions to compile and train the model. The learning rate for each epoch is set through a LearningRateScheduler callback.

In [7]:
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.callbacks import LearningRateScheduler

def compile_model(model):
    # No loss is passed, so the transformers model uses its built-in loss computation.
    # learning_rate is a global that must be set before this function is called.
    model.compile(optimizer=RMSprop(learning_rate=learning_rate),
                  metrics=['accuracy'])
    model.config.id2label = {0: 'human', 1: 'AI'}

def fit_model(model):
    # epochs, batch_size and lr_scheduler are globals defined in later cells.
    # train_dataset is already batched by create_dataset (16 samples per batch);
    # the batching is set there, not by batch_size.
    history = model.fit(train_dataset,
        epochs=epochs,
        batch_size=batch_size,
        callbacks=[LearningRateScheduler(lr_scheduler)],
        validation_data=val_dataset,
        verbose=1)
    return history



We can now compile and fine-tune the Keras model. I'll start training at a learning rate of 10$^{-5}$, then reduce it by a factor of 10 before each of 3 further epochs.

In [10]:
learning_rate = 1e-5
compile_model(model)

In [None]:
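def lr_scheduler(epoch, lr):
    # Ignore the epoch index and return the current global learning_rate, so that
    # reassigning learning_rate between fit_model calls takes effect without recompiling.
    return learning_rate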
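
The manual schedule used here (start at 10$^{-5}$, divide by 10 after each epoch) can also be written as a function of the epoch index. This is a minimal sketch; `step_decay`, `initial_lr` and `factor` are hypothetical names that do not appear in the notebook.

```python
def step_decay(epoch, initial_lr=1e-5, factor=10.0):
    """Learning rate for a 0-based epoch index: initial_lr divided by factor once per epoch."""
    return initial_lr / factor ** epoch

def step_decay_scheduler(epoch, lr):
    # Keras' LearningRateScheduler calls the schedule with (epoch, current_lr);
    # the current rate is not needed for a pure function of the epoch index.
    return step_decay(epoch)

schedule = [step_decay_scheduler(e, None) for e in range(4)]
print(schedule)
```

A schedule like this only takes effect when several epochs run inside one `fit` call. With single-epoch calls, Keras passes epoch index 0 every time, so the rate would stay at its initial value.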

In [11]:
epochs = 1
batch_size=128

In [12]:
fit_model(model)





In [13]:
learning_rate = 1e-6

In [14]:
fit_model(model)



In [15]:
learning_rate = 1e-7

In [16]:
fit_model(model)



In [17]:
learning_rate = 1e-8

In [18]:
fit_model(model)



In the last epoch, the training loss slightly decreased, but there was no decrease in the validation loss. I'll continue training for an additional epoch at the same learning rate.

In [19]:
fit_model(model)



We can see that the training loss has only improved by 1% and the validation loss has not improved at all. I'll slightly increase the learning rate in the next epoch and see if there is an improvement.

In [20]:
learning_rate = 2e-8
fit_model(model)



We again see that there is no improvement, so I'll increase the learning rate again:

In [21]:
learning_rate = 5e-8
fit_model(model)



We see that the training loss has decreased slightly while the validation loss has increased. It appears we are now beginning to overfit. To confirm this, I'll train for another epoch with the same learning rate:

In [22]:
fit_model(model)



It seems that we have found the optimal model, so we will use the learning rate scheduling we found to train a new model on the full training set.
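
The stopping rule followed above (keep training while the validation loss improves, stop once it stalls or rises) can be sketched as a small helper. The loss values below are made up for illustration and are not this notebook's results; `best_epoch` and `min_delta` are hypothetical names.

```python
def best_epoch(val_losses, min_delta=0.0):
    """Return the 0-based index of the last epoch that improved the validation loss
    by more than min_delta over the best value seen so far."""
    best_idx = 0
    for idx, loss in enumerate(val_losses):
        if loss < val_losses[best_idx] - min_delta:
            best_idx = idx
    return best_idx

# Made-up validation losses: improvement stalls after the third epoch, then reverses.
made_up_val_losses = [0.30, 0.12, 0.10, 0.10, 0.11]
print(best_epoch(made_up_val_losses))
```

Collecting `history.history['val_loss']` from each `model.fit` call would give the real input for a helper like this.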

In [8]:
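# The full training set now includes the validation rows (TTV split == 0),
# so validation metrics in this run are no longer an independent check.
train = samples[samples['TTV split'] >= 0]
val = samples[samples['TTV split'] == 0]

train_dataset = create_dataset(train)
val_dataset = create_dataset(val)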

In [9]:
learning_rates = [1e-5, 1e-6, 1e-7]

def lr_scheduler(epoch, lr):
    return learning_rates[epoch]
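
Indexing `learning_rates` directly by `epoch` raises an IndexError if `epochs` is ever set larger than the list. A hedged variant that holds the last rate for any extra epochs is sketched below; `lr_scheduler_clamped` is a hypothetical name, not part of the notebook.

```python
learning_rates = [1e-5, 1e-6, 1e-7]

def lr_scheduler_clamped(epoch, lr):
    # epoch is the 0-based index passed by LearningRateScheduler; lr (the current rate) is unused.
    # Past the end of the list, keep returning the final rate instead of raising IndexError.
    return learning_rates[min(epoch, len(learning_rates) - 1)]

print([lr_scheduler_clamped(e, None) for e in range(5)])
```

With `epochs = 3` the two versions behave identically; the clamp only matters if training is extended.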

In [10]:
learning_rate = 1e-5
compile_model(model)

In [11]:
epochs = 3
batch_size=128
fit_model(model)

Epoch 1/3


Epoch 2/3
Epoch 3/3


We can now push our trained model to the huggingface hub:

In [22]:
# Uncomment to log in and upload the fine-tuned model:
#from huggingface_hub import notebook_login
#notebook_login()
#model.push_to_hub('ai-detector-distilbert')
