# Heuristics

This notebook tries a few simple rule-based heuristics for classifying an essay as student-written or LLM-written.

In [2]:
# Importing libraries
import pandas as pd
import numpy as np
from sklearn.metrics import roc_auc_score
from tqdm import tqdm

tqdm.pandas()

In [3]:
# Getting the data
training = pd.read_csv('../../data/train.csv')
valid = pd.read_csv('../../data/validation.csv')

## Grammar Errors

Papers state that human evaluators can identify human-written essays by looking at grammatical errors. In my EDA, I found a large discrepancy in the number of grammatical errors between student essays and LLM essays. Thus, I am going to apply a simple rule: an essay with at least 7 grammatical errors is classified as a student essay (label 0), and as an LLM essay (label 1) otherwise.

In [4]:
def grammar_heuristic(count: int) -> int:
    """Return 0 (student) for at least 7 grammar errors, else 1 (LLM)."""
    return 0 if count >= 7 else 1

training['prediction'] = training['grammar_errors'].progress_apply(grammar_heuristic)
valid['prediction'] = valid['grammar_errors'].progress_apply(grammar_heuristic)

100%|██████████| 44733/44733 [00:00<00:00, 136769.19it/s]
100%|██████████| 5195/5195 [00:00<00:00, 126009.34it/s]
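
The row-wise `progress_apply` above can also be written as a single vectorized comparison. The following is a minimal sketch on a small made-up frame, not on the real data; `threshold_heuristic` is a hypothetical helper name, and the toy values are assumptions chosen only to show both sides of the cut-off.

```python
import numpy as np
import pandas as pd

def threshold_heuristic(counts: pd.Series, threshold: int) -> pd.Series:
    """Return 0 (student) where counts >= threshold, else 1 (LLM)."""
    return pd.Series(np.where(counts >= threshold, 0, 1), index=counts.index)

# Toy data for illustration only; these values are not from train.csv.
toy = pd.DataFrame({'grammar_errors': [0, 3, 7, 12]})
toy['prediction'] = threshold_heuristic(toy['grammar_errors'], threshold=7)
print(toy['prediction'].tolist())
```

Passing the threshold as an argument lets the same helper serve any count-based rule of this shape.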


In [5]:
# Checking the scores
print('Predictions for Grammar Counts Heuristic')
train_score = roc_auc_score(training['LLM_written'].values, training['prediction'].values)
valid_score = roc_auc_score(valid['LLM_written'].values, valid['prediction'].values)
print(f'Training ROC AUC: {train_score}')
print(f'Validation ROC AUC: {valid_score}')

Predictions for Grammar Counts Heuristic
Training ROC AUC: 0.8584432769906309
Validation ROC AUC: 0.9286537943641512
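
Because the heuristic outputs hard 0/1 labels rather than scores, `roc_auc_score` sees a ROC curve with a single operating point, and the result equals the mean of the true positive and true negative rates (balanced accuracy). The sketch below checks this on toy data and also shows one possible alternative, assuming fewer errors should rank an essay as more LLM-like: passing the negated raw count as a score. The arrays are made up for illustration and are not taken from the dataset.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score, roc_auc_score

# Toy labels and error counts for illustration only (1 = LLM, 0 = student).
y_true = np.array([0, 0, 0, 1, 1, 1])
grammar_errors = np.array([12, 8, 3, 5, 1, 0])

# Hard labels from the rule: at least 7 errors -> student (0), else LLM (1).
y_pred = np.where(grammar_errors >= 7, 0, 1)

# With hard labels, ROC AUC and balanced accuracy coincide.
auc_hard = roc_auc_score(y_true, y_pred)
bal_acc = balanced_accuracy_score(y_true, y_pred)

# Negate the count so that a higher score means "more likely LLM".
auc_score = roc_auc_score(y_true, -grammar_errors)

print(f'ROC AUC (hard labels): {auc_hard:.4f}')
print(f'Balanced accuracy:     {bal_acc:.4f}')
print(f'ROC AUC (raw counts):  {auc_score:.4f}')
```

The score-based variant does not depend on a particular cut-off, so it measures how well the feature ranks essays rather than how well one threshold separates them.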


## Word Counts

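I found in EDA that students tend to write longer essays than LLMs. I want to see whether a word-count rule is a suitable heuristic: an essay with at least 500 words is classified as a student essay (label 0), and as an LLM essay (label 1) otherwise.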

In [6]:
def word_count_heuristic(count: int) -> int:
    """Return 0 (student) for at least 500 words, else 1 (LLM)."""
    return 0 if count >= 500 else 1

training['prediction'] = training['word_count'].progress_apply(word_count_heuristic)
valid['prediction'] = valid['word_count'].progress_apply(word_count_heuristic)

100%|██████████| 44733/44733 [00:00<00:00, 57401.01it/s]
100%|██████████| 5195/5195 [00:00<00:00, 458714.75it/s]


In [7]:
# Checking the scores
print('Predictions for Word Counts Heuristic')
train_score = roc_auc_score(training['LLM_written'].values, training['prediction'].values)
valid_score = roc_auc_score(valid['LLM_written'].values, valid['prediction'].values)
print(f'Training ROC AUC: {train_score}')
print(f'Validation ROC AUC: {valid_score}')

Predictions for Word Counts Heuristic
Training ROC AUC: 0.583361635243965
Validation ROC AUC: 0.806776728661547
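
One way to check whether 500 is a reasonable cut-off would be to score several candidate thresholds and compare them. The following is a minimal sketch under that assumption; `sweep_thresholds` is a hypothetical helper, and the labels, word counts, and candidate thresholds are toy values rather than results from this dataset.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy data for illustration only (1 = LLM, 0 = student).
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
word_count = np.array([650, 520, 480, 410, 450, 380, 350, 300])

def sweep_thresholds(y_true, counts, thresholds):
    """Score the rule 'counts >= t -> student (0), else LLM (1)' for each t."""
    scores = {}
    for t in thresholds:
        y_pred = np.where(counts >= t, 0, 1)
        scores[t] = roc_auc_score(y_true, y_pred)
    return scores

scores = sweep_thresholds(y_true, word_count, thresholds=[400, 450, 500, 550])
best = max(scores, key=scores.get)

for t, s in scores.items():
    print(f'threshold={t}: ROC AUC={s:.3f}')
print(f'Best threshold on toy data: {best}')
```

On real data, the threshold would be picked on the training set only and then evaluated once on the validation set, so the validation score stays an honest estimate.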
