JulianGriggs/Authorship_Attribution_On_Twitter

Authorship Attribution on Twitter: A Comparative Methods Study

Author: Julian Griggs (jgriggs@princeton.edu)

Advisor: Andrea LaPaugh (aslp@cs.princeton.edu)

#### 5/6/2014

Stylometric analysis is becoming an increasingly powerful tool for de-anonymizing written text on the web. Despite the rapid growth of social-media text, authorship attribution studies focusing on this domain remain relatively scarce. In this paper, I evaluate some of the most commonly used linguistic features and machine learning algorithms to determine quantitatively which combination works best for authorship attribution on Twitter. The empirical results suggest that, of the feature and method combinations tested, pairing character 2-grams with linear Support Vector Machines yields the best overall performance.
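
To make the "character 2-gram" feature concrete, here is a minimal sketch of how such features might be extracted from a single tweet. It is an illustration only, not the code used in this project: the function names `char_ngrams` and `relative_frequencies` and the sample text are hypothetical, and the actual preprocessing and normalization used in the study are not described here.

```python
from collections import Counter


def char_ngrams(text, n=2):
    """Count overlapping character n-grams in `text` (n=2 gives 2-grams).

    Hypothetical helper for illustration; spaces and punctuation are kept
    as ordinary characters, which is an assumption, not the study's setup.
    """
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def relative_frequencies(counts):
    """Turn raw n-gram counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


# Hypothetical sample text standing in for a tweet.
sample = "so so good"
counts = char_ngrams(sample)
features = relative_frequencies(counts)

# Print the 2-grams from most to least frequent.
for gram, freq in sorted(features.items(), key=lambda kv: -kv[1]):
    print(repr(gram), round(freq, 3))
```

Each tweet (or group of tweets by one author) would then be represented as a vector of such frequencies over a shared 2-gram vocabulary, and a linear classifier such as a linear SVM would be trained on those vectors to predict the author.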

About

Research project on the application of machine learning to the authorship attribution task.
