kevinmeng2001/Detect-AI-Generated-Text

Detect-AI-Generated-Text

As AI-generated content becomes more realistic and widespread, there is a growing need to distinguish it from human-written content. This project addresses that challenge with an encoder-only transformer that performs binary classification of AI-generated versus human-written responses to a given prompt. Text was tokenized with a byte-pair encoding tokenizer with a vocabulary size of 8000. Learned and sinusoidal positional embeddings were tested and compared, as were pooling strategies such as a [CLS] token and mean pooling. Various hyperparameters (number of attention heads, number of layers, layer size, dropout rate, activation function, batch size, learning rate, number of epochs, optimizer, betas, weight decay) were tuned to maximize validation accuracy.
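
To make two of the compared design choices concrete, here is a minimal sketch in plain Python. It is not taken from this repository's code: the function names (`sinusoidal_positions`, `cls_pool`, `mean_pool`) are hypothetical, the sinusoidal table follows the standard formulation from the original transformer paper, and the `[CLS]` token is assumed to sit at position 0.

```python
import math


def sinusoidal_positions(seq_len, d_model):
    """Fixed (non-learned) position table: sine on even dims, cosine on odd dims."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Each pair of dimensions (2k, 2k+1) shares one frequency.
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def cls_pool(hidden):
    """Summarize the sequence by the vector at position 0 (assumed to be [CLS])."""
    return list(hidden[0])


def mean_pool(hidden, mask):
    """Average the hidden vectors over non-padding positions (mask value 1)."""
    kept = [vec for vec, m in zip(hidden, mask) if m]
    if not kept:
        raise ValueError("mask selects no tokens")
    return [sum(col) / len(kept) for col in zip(*kept)]


# Toy encoder output: 3 positions, hidden size 2; the last position is padding.
hidden = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled_cls = cls_pool(hidden)
pooled_mean = mean_pool(hidden, mask)
```

A learned positional embedding would replace the fixed table with a trainable lookup of the same shape. In either pooling variant, the pooled vector is what a classification head would consume to predict AI-generated versus human-written.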
