# Experiments 3, 4 & 5 - Dataset 1

This notebook runs experiments 3, 4 and 5 on the dataset [GPT vs. Human: A Corpus of Research Abstracts](https://www.kaggle.com/datasets/heleneeriksen/gpt-vs-human-a-corpus-of-research-abstracts).

All experiments consist of training four ML models:
- [Random Forest](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html)
- [Logistic Regression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html)
- [Naive Bayes](https://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.MultinomialNB.html)
- [Support Vector Machine](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC)

Each model is tuned with a [cross-validation grid search](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html) to find the best hyperparameters within its parameter grid.
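
As a minimal sketch of how a grid search wraps a vectorizer-plus-classifier pipeline, the example below tunes a single Naive Bayes parameter. The toy corpus, its labels and the two-value grid are assumptions made up for illustration; they are not the dataset or the grids used in these experiments.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus (0 = human, 1 = AI), only for illustration
texts = [
    "we prove a new bound", "this paper investigates the bound",
    "we give an algorithm", "this paper presents an algorithm",
    "we introduce a model", "this research paper investigates a model",
]
labels = [0, 1, 0, 1, 0, 1]

pipeline = Pipeline([
    ("vectorizer", CountVectorizer()),
    ("classifier", MultinomialNB()),
])

# Keys follow the "<step name>__<parameter>" convention to reach into the pipeline
param_grid = {"classifier__alpha": [0.1, 1.0]}

# Every candidate is scored on each fold; the best one is refit on all the data
search = GridSearchCV(pipeline, param_grid, cv=KFold(n_splits=3), scoring="accuracy")
search.fit(texts, labels)

print(search.best_params_)
print(search.predict(["we prove a model"]))
```

Because `refit=True` is the default, calling `predict` on the fitted search object uses the best pipeline retrained on the full training set.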

Each experiment varies the `ngram_range` parameter of the Bag of Words and TF-IDF vectorizers.

The experiments differ as follows:
- Experiment 3 uses Bag of Words features ([CountVectorizer](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html)), once with bigrams only and once with trigrams only
- Experiment 4 uses TF-IDF features ([TfidfVectorizer](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html)), once with bigrams only and once with trigrams only
- Experiment 5 uses both BOW and TF-IDF, each with unigrams, bigrams and trigrams combined (`ngram_range=(1, 3)`)

## Notebook setup

In [None]:
# Core libraries
import os
import sys
import pickle

import pandas as pd
import kagglehub

import numpy as np
import torch

# ML libraries
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split, KFold, GridSearchCV
from sklearn.metrics import confusion_matrix, classification_report

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Plotting libraries
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.colors
plotly_colors = plotly.colors.qualitative.Plotly

## Data prepping

### Dataset download

In [None]:
path = kagglehub.dataset_download("heleneeriksen/gpt-vs-human-a-corpus-of-research-abstracts")
dataset_path = os.path.join(path, "data_set.csv")

data = pd.read_csv(dataset_path)
data.drop(columns=['title', 'ai_generated'], inplace=True)

# Upper bound for the y-axes: size of the larger class plus 10% headroom
data_size = 1.1 * max(len(data[data['is_ai_generated'] == 0]), len(data[data['is_ai_generated'] == 1]))

# Peek into the data
print("\nPeek into the dataset: heleneeriksen/gpt-vs-human-a-corpus-of-research-abstracts\n")
display(data)


fig = make_subplots(rows=1, cols=2, subplot_titles=('Original Dataset', 'Balanced Dataset'),
                    horizontal_spacing=0.3)

fig.add_trace(go.Histogram(x=data['is_ai_generated'], name='Original Dataset', marker_color=[plotly_colors[2], plotly_colors[1]]), row=1, col=1)

# Remove data to balance the dataset and speed up training
data = data.drop(data[data['is_ai_generated'] == 1].sample(53).index)
data = data.drop(data[data['is_ai_generated'] == 0].sample(200).index)


fig.add_trace(go.Histogram(x=data['is_ai_generated'], name='Balanced Dataset', marker_color=[plotly_colors[2], plotly_colors[1]]), row=1, col=2)

fig.update_layout(showlegend=False, width=700, bargap=0.4,
                  plot_bgcolor="rgba(0,0,0,0)", paper_bgcolor="rgba(0,0,0,0)", font_color="white",
                  xaxis=dict(tickmode='array', tickvals=[0, 1], ticktext=['human', 'ai']), xaxis2=dict(tickmode='array', tickvals=[0, 1], ticktext=['human', 'ai']),
                  yaxis=dict(title='Count'), yaxis2=dict(title='Count'), yaxis_range=[0, data_size], yaxis2_range=[0, data_size])
fig.show()

Peek into the dataset: heleneeriksen/gpt-vs-human-a-corpus-of-research-abstracts

| | abstract | is_ai_generated |
|---|---|---|
| 0 | Advanced electromagnetic potentials are indi... | 0 |
| 1 | This research paper investigates the question ... | 1 |
| 2 | We give an algorithm for finding network enc... | 0 |
| 3 | The paper presents an efficient centralized bi... | 1 |
| 4 | We introduce an exponential random graph mod... | 0 |
| ... | ... | ... |
| 4048 | This research paper investigates the vortex dy... | 1 |
| 4049 | Given a remarkable representation of the gen... | 0 |
| 4050 | The Veldkamp space of two-qubits is a mathemat... | 1 |
| 4051 | The equilibration of macroscopic degrees of ... | 0 |
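
The balancing step above drops a fixed number of rows per class. As an alternative sketch, the helper below downsamples every class to the size of the smallest one without hard-coded counts. The function name `balance_by_downsampling` and the toy frame are assumptions for illustration; unlike the cell above, this does not shrink the dataset any further for training speed.

```python
import pandas as pd

def balance_by_downsampling(df, label_col, random_state=0):
    """Downsample each class to the size of the smallest class."""
    smallest = df[label_col].value_counts().min()
    parts = [
        group.sample(n=smallest, random_state=random_state)
        for _, group in df.groupby(label_col)
    ]
    # Restore the original row order after concatenating the per-class samples
    return pd.concat(parts).sort_index()

# Hypothetical toy frame with 5 human (0) and 3 AI (1) rows
toy = pd.DataFrame({
    "abstract": [f"text {i}" for i in range(8)],
    "is_ai_generated": [0, 0, 0, 0, 0, 1, 1, 1],
})

balanced = balance_by_downsampling(toy, "is_ai_generated")
print(balanced["is_ai_generated"].value_counts().to_dict())
```

Passing a fixed `random_state` keeps the sampled subset the same across runs.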


In [None]:
# Split the data into training (80%) and testing (20%) sets
X_train, X_test, y_train, y_test = train_test_split(data['abstract'], data['is_ai_generated'], test_size=0.2, random_state=0)

# Plot the distribution of the split
fig = go.Figure()

fig.add_trace(go.Histogram(x=y_train, name="training", marker_color='lightslategray'))
fig.add_trace(go.Histogram(x=y_test, name="testing", marker_color='crimson'))

fig.update_layout(title_text='Split Dataset', xaxis_title_text='Labels', yaxis_title_text='Count',
                  barmode='overlay', bargap=0.4,
                  plot_bgcolor="rgba(0,0,0,0)", paper_bgcolor="rgba(0,0,0,0)", font_color="white",
                  xaxis=dict(tickmode='array', tickvals=[0, 1], ticktext=['human', 'ai']),
                  width=500, height=500)
fig.show()
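
The split above relies on random sampling alone to keep both labels evenly represented. If exact class proportions in both subsets matter, `train_test_split` accepts a `stratify` argument; the sketch below shows it on made-up labels. Using stratification here is a suggestion, not something the notebook does.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Hypothetical balanced toy data: 10 human (0) and 10 AI (1) samples
texts = [f"abstract {i}" for i in range(20)]
labels = [0] * 10 + [1] * 10

# stratify=labels keeps the class ratio identical in the train and test subsets
X_tr, X_te, y_tr, y_te = train_test_split(
    texts, labels, test_size=0.2, random_state=0, stratify=labels
)

print(Counter(y_tr), Counter(y_te))
```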

## Methodology

### Libraries and experiments setup

In [None]:
# Cross-validation folds
cv = KFold(n_splits=5)

# Parameter grid for GridSearchCV
rf_param = {'classifier__n_estimators': [10, 50, 100, 200], 'classifier__max_depth': [None, 10, 50, 100]}
nb_param = {'classifier__alpha': [0.1, 0.5, 1.0, 2.0]}
lr_param = {'classifier__C': [0.1, 0.5, 1.0, 2.0], 'classifier__max_iter': [100, 200, 300, 400]}
svc_param = {'classifier__C': [0.1, 0.5, 1.0, 2.0], 'classifier__kernel': ['linear', 'poly', 'rbf', 'sigmoid']}

# Models to use
models = {'RandomForest': RandomForestClassifier(), 'NaiveBayes': MultinomialNB(), 'LogisticRegression': LogisticRegression(), 'SVC': SVC()}
parameters = {'RandomForest': rf_param, 'NaiveBayes': nb_param, 'LogisticRegression': lr_param, 'SVC': svc_param}
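
To get a feel for the cost of these grids, the sketch below counts the candidates in each one with `ParameterGrid` and multiplies by the five folds. The grids are copied from the cell above; the variable names `grids` and `fits` are illustrative.

```python
from sklearn.model_selection import ParameterGrid

# Same grids as in the setup cell, keyed by model name
grids = {
    "RandomForest": {"classifier__n_estimators": [10, 50, 100, 200],
                     "classifier__max_depth": [None, 10, 50, 100]},
    "NaiveBayes": {"classifier__alpha": [0.1, 0.5, 1.0, 2.0]},
    "LogisticRegression": {"classifier__C": [0.1, 0.5, 1.0, 2.0],
                           "classifier__max_iter": [100, 200, 300, 400]},
    "SVC": {"classifier__C": [0.1, 0.5, 1.0, 2.0],
            "classifier__kernel": ["linear", "poly", "rbf", "sigmoid"]},
}
n_splits = 5

# Cross-validation fits per model = number of candidates x number of folds
fits = {name: len(ParameterGrid(grid)) * n_splits for name, grid in grids.items()}

print(fits)
print(sum(fits.values()))
```

Each grid search runs this many cross-validation fits for a single vectorizer setting, plus one final refit of the best candidate.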

### Experiment 3

In [None]:
experiment_3_bigrams = dict()
experiment_3_trigrams = dict()
predictions_3_bigrams = dict()
predictions_3_trigrams = dict()

for model in models:
  # Create the model pipeline for bigrams
  experiment_3_bigrams[model+'_pip'] = Pipeline([('vectorizer', CountVectorizer(ngram_range=(2, 2))), ('classifier', models[model])])
  # Create the model pipeline for trigrams
  experiment_3_trigrams[model+'_pip'] = Pipeline([('vectorizer', CountVectorizer(ngram_range=(3, 3))), ('classifier', models[model])])

  # Create the grid search model for each pipeline and parameters
  experiment_3_bigrams[model] = GridSearchCV(experiment_3_bigrams[model+'_pip'], parameters[model], cv=cv, scoring='accuracy', verbose=2)
  experiment_3_trigrams[model] = GridSearchCV(experiment_3_trigrams[model+'_pip'], parameters[model], cv=cv, scoring='accuracy', verbose=2)

  # Train & predict
  print(f"\n\tTraining the '{model}' model for bigrams...")
  experiment_3_bigrams[model].fit(X_train, y_train)
  predictions_3_bigrams[model] = experiment_3_bigrams[model].predict(X_test)

  print(f"\n\tTraining the '{model}' model for trigrams...")
  experiment_3_trigrams[model].fit(X_train, y_train)
  predictions_3_trigrams[model] = experiment_3_trigrams[model].predict(X_test)


	Training the 'RandomForest' model for bigrams...
Fitting 5 folds for each of 16 candidates, totalling 80 fits
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   2.5s
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   1.5s
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   0.6s
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   0.7s
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   0.6s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   1.1s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   1.1s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   1.2s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   1.2s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   1.5s
[CV] END classifier__max_depth=None, c

### Experiment 4

In [None]:
experiment_4_bigrams = dict()
experiment_4_trigrams = dict()
predictions_4_bigrams = dict()
predictions_4_trigrams = dict()

for model in models:
  # Create the model pipeline for bigrams
  experiment_4_bigrams[model+'_pip'] = Pipeline([('vectorizer', TfidfVectorizer(ngram_range=(2, 2))), ('classifier', models[model])])
  # Create the model pipeline for trigrams
  experiment_4_trigrams[model+'_pip'] = Pipeline([('vectorizer', TfidfVectorizer(ngram_range=(3, 3))), ('classifier', models[model])])

  # Create the grid search model for each pipeline and parameters
  experiment_4_bigrams[model] = GridSearchCV(experiment_4_bigrams[model+'_pip'], parameters[model], cv=cv, scoring='accuracy', verbose=2)
  experiment_4_trigrams[model] = GridSearchCV(experiment_4_trigrams[model+'_pip'], parameters[model], cv=cv, scoring='accuracy', verbose=2)

  # Train & predict
  print(f"\n\tTraining the '{model}' model for bigrams...")
  experiment_4_bigrams[model].fit(X_train, y_train)
  predictions_4_bigrams[model] = experiment_4_bigrams[model].predict(X_test)

  print(f"\n\tTraining the '{model}' model for trigrams...")
  experiment_4_trigrams[model].fit(X_train, y_train)
  predictions_4_trigrams[model] = experiment_4_trigrams[model].predict(X_test)


	Training the 'RandomForest' model for bigrams...
Fitting 5 folds for each of 16 candidates, totalling 80 fits
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   0.7s
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   0.7s
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   0.6s
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   0.6s
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   0.6s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   1.1s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   1.1s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   1.1s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   1.1s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   1.5s
[CV] END classifier__max_depth=None, c

### Experiment 5

In [None]:
experiment_5_bow = dict()
experiment_5_tfidf = dict()
predictions_5_bow = dict()
predictions_5_tfidf = dict()

for model in models:
  # Create the model pipeline for BOW with unigrams, bigrams and trigrams
  experiment_5_bow[model+'_pip'] = Pipeline([('vectorizer', CountVectorizer(ngram_range=(1, 3))), ('classifier', models[model])])
  # Create the model pipeline for TF-IDF with unigrams, bigrams and trigrams
  experiment_5_tfidf[model+'_pip'] = Pipeline([('vectorizer', TfidfVectorizer(ngram_range=(1, 3))), ('classifier', models[model])])

  # Create the grid search model for each pipeline and parameters
  experiment_5_bow[model] = GridSearchCV(experiment_5_bow[model+'_pip'], parameters[model], cv=cv, scoring='accuracy', verbose=2)
  experiment_5_tfidf[model] = GridSearchCV(experiment_5_tfidf[model+'_pip'], parameters[model], cv=cv, scoring='accuracy', verbose=2)

  # Train & predict
  print(f"\n\tTraining the '{model}' model for BOW...")
  experiment_5_bow[model].fit(X_train, y_train)
  predictions_5_bow[model] = experiment_5_bow[model].predict(X_test)

  print(f"\n\tTraining the '{model}' model for TF-IDF...")
  experiment_5_tfidf[model].fit(X_train, y_train)
  predictions_5_tfidf[model] = experiment_5_tfidf[model].predict(X_test)


	Training the 'RandomForest' model for BOW...
Fitting 5 folds for each of 16 candidates, totalling 80 fits
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   2.3s
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   1.6s
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   1.7s
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   1.8s
[CV] END classifier__max_depth=None, classifier__n_estimators=10; total time=   1.7s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   2.7s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   2.8s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   2.6s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   2.4s
[CV] END classifier__max_depth=None, classifier__n_estimators=50; total time=   2.5s
[CV] END classifier__max_depth=None, class

## Metrics and reports

### Classification Reports

In [None]:
cm_3_bigrams = dict()
cm_3_trigrams = dict()

cm_4_bigrams = dict()
cm_4_trigrams = dict()

cm_5_bow = dict()
cm_5_tfidf = dict()

for model in models:
  # Print the classification report
  print(f"\tExperiment 3 - '{model}' BOW Bigrams:")
  print(classification_report(y_test, predictions_3_bigrams[model]))
  print(f"\tExperiment 3 - '{model}' BOW Trigrams:")
  print(classification_report(y_test, predictions_3_trigrams[model]))

  print(f"\tExperiment 4 - '{model}' TF-IDF Bigrams:")
  print(classification_report(y_test, predictions_4_bigrams[model]))
  print(f"\tExperiment 4 - '{model}' TF-IDF Trigrams:")
  print(classification_report(y_test, predictions_4_trigrams[model]))

  print(f"\tExperiment 5 - '{model}' BOW:")
  print(classification_report(y_test, predictions_5_bow[model]))
  print(f"\tExperiment 5 - '{model}' TF-IDF:")
  print(classification_report(y_test, predictions_5_tfidf[model]), "\n\n\n")

  # Compute the confusion matrices (plotted in the sections below)
  cm_3_bigrams[model] = confusion_matrix(y_test,  predictions_3_bigrams[model])
  cm_3_trigrams[model] = confusion_matrix(y_test, predictions_3_trigrams[model])

  cm_4_bigrams[model] = confusion_matrix(y_test,  predictions_4_bigrams[model])
  cm_4_trigrams[model] = confusion_matrix(y_test, predictions_4_trigrams[model])

  cm_5_bow[model] = confusion_matrix(y_test, predictions_5_bow[model])
  cm_5_tfidf[model] = confusion_matrix(y_test, predictions_5_tfidf[model])

	Experiment 3 - 'RandomForest' BOW Bigrams:
              precision    recall  f1-score   support

           0       1.00      0.99      0.99       382
           1       0.99      1.00      0.99       378

    accuracy                           0.99       760
   macro avg       0.99      0.99      0.99       760
weighted avg       0.99      0.99      0.99       760

	Experiment 3 - 'RandomForest' BOW Trigrams:
              precision    recall  f1-score   support

           0       0.99      0.99      0.99       382
           1       0.99      0.99      0.99       378

    accuracy                           0.99       760
   macro avg       0.99      0.99      0.99       760
weighted avg       0.99      0.99      0.99       760

	Experiment 4 - 'RandomForest' TF-IDF Bigrams:
              precision    recall  f1-score   support

           0       1.00      0.99      0.99       382
           1       0.99      1.00      0.99       378

    accuracy                           0.99   
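
For reading the confusion matrices, it helps to recall that `confusion_matrix` puts true labels on the rows and predicted labels on the columns, ordered by sorted label value. With this notebook's encoding (0 = human, 1 = AI), the first row and column correspond to human. The sketch below uses made-up labels to show the layout and how precision and recall for the AI class follow from it.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels (0 = human, 1 = AI), only for illustration
y_true = [0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
# Layout for binary labels [0, 1]:
#   [[true human -> predicted human, true human -> predicted AI],
#    [true AI    -> predicted human, true AI    -> predicted AI]]
tn, fp, fn, tp = cm.ravel()

precision_ai = tp / (tp + fp)  # share of AI predictions that are correct
recall_ai = tp / (tp + fn)     # share of AI abstracts that are detected

print(cm)
print(precision_ai, recall_ai)
```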

### Experiment 3 - Confusion matrices

#### BOW Bigrams

In [None]:
fig = make_subplots(rows=2, cols=2, subplot_titles=('Random Forest', 'Naive Bayes', 'Logistic Regression', 'Support Vector'),
                    horizontal_spacing=0.3, vertical_spacing=0.25)

for model in models:
  match model:
    case 'RandomForest':
      pos = [1, 1]
    case 'NaiveBayes':
      pos = [1, 2]
    case 'LogisticRegression':
      pos = [2, 1]
    case 'SVC':
      pos = [2, 2]

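  # confusion_matrix sorts labels (0 = human, 1 = AI): rows (y) are true labels, columns (x) are predictions
  fig.add_trace(go.Heatmap(z=cm_3_bigrams[model], x=['Human', 'AI'], y=['Human', 'AI'], coloraxis='coloraxis', text=cm_3_bigrams[model], texttemplate="%{text}"),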
                row=pos[0], col=pos[1])

fig.update_layout(title="Experiment 3 - Bag of words Bigrams", coloraxis=dict(colorscale='Burgyl'), showlegend=False, plot_bgcolor="rgba(0,0,0,0)", paper_bgcolor="rgba(0,0,0,0)", font_color="white", width=700)
fig.show()

#### BOW Trigrams

In [None]:
fig = make_subplots(rows=2, cols=2, subplot_titles=('Random Forest', 'Naive Bayes', 'Logistic Regression', 'Support Vector'),
                    horizontal_spacing=0.3, vertical_spacing=0.25)

for model in models:
  match model:
    case 'RandomForest':
      pos = [1, 1]
    case 'NaiveBayes':
      pos = [1, 2]
    case 'LogisticRegression':
      pos = [2, 1]
    case 'SVC':
      pos = [2, 2]

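  # confusion_matrix sorts labels (0 = human, 1 = AI): rows (y) are true labels, columns (x) are predictions
  fig.add_trace(go.Heatmap(z=cm_3_trigrams[model], x=['Human', 'AI'], y=['Human', 'AI'], coloraxis='coloraxis', text=cm_3_trigrams[model], texttemplate="%{text}"),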
                row=pos[0], col=pos[1])

fig.update_layout(title="Experiment 3 - Bag of words Trigrams", coloraxis=dict(colorscale='Burgyl'), showlegend=False, plot_bgcolor="rgba(0,0,0,0)", paper_bgcolor="rgba(0,0,0,0)", font_color="white", width=700)
fig.show()

### Experiment 4 - Confusion matrices

#### TF-IDF Bigrams

In [None]:
fig = make_subplots(rows=2, cols=2, subplot_titles=('Random Forest', 'Naive Bayes', 'Logistic Regression', 'Support Vector'),
                    horizontal_spacing=0.3, vertical_spacing=0.25)

for model in models:
  match model:
    case 'RandomForest':
      pos = [1, 1]
    case 'NaiveBayes':
      pos = [1, 2]
    case 'LogisticRegression':
      pos = [2, 1]
    case 'SVC':
      pos = [2, 2]

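  # confusion_matrix sorts labels (0 = human, 1 = AI): rows (y) are true labels, columns (x) are predictions
  fig.add_trace(go.Heatmap(z=cm_4_bigrams[model], x=['Human', 'AI'], y=['Human', 'AI'], coloraxis='coloraxis', text=cm_4_bigrams[model], texttemplate="%{text}"),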
                row=pos[0], col=pos[1])

fig.update_layout(title="Experiment 4 - TF-IDF Bigrams", coloraxis=dict(colorscale='Burgyl'), showlegend=False, plot_bgcolor="rgba(0,0,0,0)", paper_bgcolor="rgba(0,0,0,0)", font_color="white", width=700)
fig.show()

#### TF-IDF Trigrams

In [None]:
fig = make_subplots(rows=2, cols=2, subplot_titles=('Random Forest', 'Naive Bayes', 'Logistic Regression', 'Support Vector'),
                    horizontal_spacing=0.3, vertical_spacing=0.25)

for model in models:
  match model:
    case 'RandomForest':
      pos = [1, 1]
    case 'NaiveBayes':
      pos = [1, 2]
    case 'LogisticRegression':
      pos = [2, 1]
    case 'SVC':
      pos = [2, 2]

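  # confusion_matrix sorts labels (0 = human, 1 = AI): rows (y) are true labels, columns (x) are predictions
  fig.add_trace(go.Heatmap(z=cm_4_trigrams[model], x=['Human', 'AI'], y=['Human', 'AI'], coloraxis='coloraxis', text=cm_4_trigrams[model], texttemplate="%{text}"),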
                row=pos[0], col=pos[1])

fig.update_layout(title="Experiment 4 - TF-IDF Trigrams", coloraxis=dict(colorscale='Burgyl'), showlegend=False, plot_bgcolor="rgba(0,0,0,0)", paper_bgcolor="rgba(0,0,0,0)", font_color="white", width=700)
fig.show()

### Experiment 5 - Confusion matrices

#### Bag Of Words

In [None]:
fig = make_subplots(rows=2, cols=2, subplot_titles=('Random Forest', 'Naive Bayes', 'Logistic Regression', 'Support Vector'),
                    horizontal_spacing=0.3, vertical_spacing=0.25)

for model in models:
  match model:
    case 'RandomForest':
      pos = [1, 1]
    case 'NaiveBayes':
      pos = [1, 2]
    case 'LogisticRegression':
      pos = [2, 1]
    case 'SVC':
      pos = [2, 2]

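  # confusion_matrix sorts labels (0 = human, 1 = AI): rows (y) are true labels, columns (x) are predictions
  fig.add_trace(go.Heatmap(z=cm_5_bow[model], x=['Human', 'AI'], y=['Human', 'AI'], coloraxis='coloraxis', text=cm_5_bow[model], texttemplate="%{text}"),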
                row=pos[0], col=pos[1])

fig.update_layout(title="Experiment 5 - Bag of words 1-gram to 3-grams", coloraxis=dict(colorscale='Burgyl'), showlegend=False, plot_bgcolor="rgba(0,0,0,0)", paper_bgcolor="rgba(0,0,0,0)", font_color="white", width=700)
fig.show()

#### TF-IDF

In [None]:
fig = make_subplots(rows=2, cols=2, subplot_titles=('Random Forest', 'Naive Bayes', 'Logistic Regression', 'Support Vector'),
                    horizontal_spacing=0.3, vertical_spacing=0.25)

for model in models:
  match model:
    case 'RandomForest':
      pos = [1, 1]
    case 'NaiveBayes':
      pos = [1, 2]
    case 'LogisticRegression':
      pos = [2, 1]
    case 'SVC':
      pos = [2, 2]

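  # confusion_matrix sorts labels (0 = human, 1 = AI): rows (y) are true labels, columns (x) are predictions
  fig.add_trace(go.Heatmap(z=cm_5_tfidf[model], x=['Human', 'AI'], y=['Human', 'AI'], coloraxis='coloraxis', text=cm_5_tfidf[model], texttemplate="%{text}"),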
                row=pos[0], col=pos[1])

fig.update_layout(title="Experiment 5 - TF-IDF 1-gram to 3-grams", coloraxis=dict(colorscale='Burgyl'), showlegend=False, plot_bgcolor="rgba(0,0,0,0)", paper_bgcolor="rgba(0,0,0,0)", font_color="white", width=700)
fig.show()

## Custom input test

In [None]:
test_abstract = ['This study investigates the effects of increased atmospheric fluoride emissions from an aluminium smelter, on the reproductive processes of three native species, Banksia aemula, Bossiaea heterophylla and Actinotus helianthi. Elaboration of purpose Attention has also been paid to the soil seed reserve as an important resource for the replacement of adult plants within the community.']
results = []

for model in models:
  prediction_3_1 = experiment_3_bigrams[model].predict(test_abstract)[0]
  prediction_3_2 = experiment_3_trigrams[model].predict(test_abstract)[0]
  prediction_4_1 = experiment_4_bigrams[model].predict(test_abstract)[0]
  prediction_4_2 = experiment_4_trigrams[model].predict(test_abstract)[0]
  prediction_5_1 = experiment_5_bow[model].predict(test_abstract)[0]
  prediction_5_2 = experiment_5_tfidf[model].predict(test_abstract)[0]
  results.append([model, prediction_3_1, prediction_3_2, prediction_4_1, prediction_4_2, prediction_5_1, prediction_5_2])

df = pd.DataFrame(results, columns=['Model', 'Experiment 3 - Bigrams', 'Experiment 3 - Trigrams', 'Experiment 4 - Bigrams', 'Experiment 4 - Trigrams', 'Experiment 5 - BOW', 'Experiment 5 - TF-IDF'])
df.replace({0: 'Human', 1: 'AI'}, inplace=True)
display(df)

| | Model | Experiment 3 - Bigrams | Experiment 3 - Trigrams | Experiment 4 - Bigrams | Experiment 4 - Trigrams | Experiment 5 - BOW | Experiment 5 - TF-IDF |
|---|---|---|---|---|---|---|---|
| 0 | RandomForest | Human | Human | Human | Human | Human | Human |
| 1 | NaiveBayes | AI | AI | AI | AI | AI | AI |
| 2 | LogisticRegression | Human | Human | Human | Human | Human | Human |
| 3 | SVC | Human | Human | Human | Human | Human | Human |


In [None]:
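# Save the best fitted pipeline (vectorizer + classifier) for each model.
# Assumption: the Experiment 5 TF-IDF pipelines are the ones to persist;
# substitute another experiment's dictionary if a different variant is wanted.
filenames = {'NaiveBayes': 'naive_bayes_model.pkl', 'RandomForest': 'random_forest_model.pkl',
             'LogisticRegression': 'logistic_regression_model.pkl', 'SVC': 'support_vector_model.pkl'}

for model, filename in filenames.items():
  with open(filename, 'wb') as f:
    pickle.dump(experiment_5_tfidf[model].best_estimator_, f)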
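
Pickling a fitted pipeline stores the vectorizer together with the classifier, so a restored object can classify raw text directly. The sketch below shows the round trip in memory on a made-up toy corpus; the texts, labels and pipeline settings are illustrative assumptions, not the trained models from this notebook.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus (0 = human, 1 = AI), only for illustration
texts = [
    "we prove a new bound", "this paper investigates the bound",
    "we give an algorithm", "this paper presents an algorithm",
]
labels = [0, 1, 0, 1]

pipeline = Pipeline([
    ("vectorizer", TfidfVectorizer(ngram_range=(1, 3))),
    ("classifier", LogisticRegression()),
])
pipeline.fit(texts, labels)

# Serialize and restore in memory; pickle.dump/pickle.load do the same with files
payload = pickle.dumps(pipeline)
restored = pickle.loads(payload)

sample = ["this paper investigates a new model"]
print(restored.predict(sample))
```

Only unpickle files from a trusted source, and load them with the same scikit-learn version that produced them where possible.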