COLX_585_Trends_in_compuational_linguistics_project

Abstract

In Natural Language Processing (NLP), there has been a dramatic shift towards pre-trained deep language models. For text classification, this study uses two state-of-the-art neural network models, Bidirectional Encoder Representations from Transformers (BERT) and a convolutional neural network (CNN), with a non-neural classifier, Logistic Regression (LR), as the baseline. Specifically, the task is to distinguish human-written text from fake text generated by the GPT-2 language model.

Results show that BERT beat the baseline, achieving an F-score of 90.34% compared to LR's 88.23%. BERT and LR both outperformed the CNN, which attained an F-score of 80.41%. The error analysis further confirms the finding of previous research that sequence length affects the performance of neural network models. For example, any truncation of the input documents has a detrimental effect on BERT's effectiveness. The primary contribution of this study is a simple but effective model for fake text detection.
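For readers unfamiliar with the metric, the sketch below shows how an F-score can be computed from gold labels and predictions. It is only an illustration: the abstract does not say which F-score variant the study reports (per-class, macro, or micro), so the binary F1 on a single positive class is an assumption here, as are the label encoding (`1` for GPT-2 text, `0` for human text), the function name `f1_score`, and the toy labels.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the class `positive` (assumed: 1 = GPT-2 generated text)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined,
        # so F1 is reported as 0.0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only; these are not data from the study.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.4f}")
```

On the toy labels there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and so is F1.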

Final Report and Presentation Link

Presentation Link

Report Link

Jupyter Notebooks

Logistic Regression

Convolutional Neural Network

BERT


MistryWoman/Classifying-human-text-from-GPT2-generated-text
