gmikros/Author_Multilevel_Ngram_Profiles

Multilevel N-grams for authorship attribution and profiling

This is an R script for calculating sequential word and character n-gram vectors from a corpus of texts. These vectors are used to train SVM and Random Forests MC algorithms for authorship attribution or author profiling classification. The AMNP (Author's Multilevel N-gram Profiles) method is described in the following papers:

  • Mikros, G. K., & Perifanos, K. (2011). Authorship identification in large email collections: Experiments using features that belong to different linguistic levels. In Proceedings of PAN 2011 Lab, Uncovering Plagiarism, Authorship, and Social Software Misuse, held in conjunction with the CLEF 2011 Conference on Multilingual and Multimodal Information Access Evaluation, 19-22 September 2011, Amsterdam.

  • Mikros, G. K., & Perifanos, K. (2013). Authorship attribution in Greek tweets using multilevel author’s n-gram profiles. In E. Hovy, V. Markman, C. H. Martell & D. Uthus (Eds.), Papers from the 2013 AAAI Spring Symposium "Analyzing Microtext", 25-27 March 2013, Stanford, California (pp. 17-23). Palo Alto, California: AAAI Press.
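
As a rough illustration of the kind of feature extraction described above, the following base R sketch builds character and word n-gram frequency vectors for a toy corpus and joins them into a single multilevel feature matrix. It is not the repository's script: the function names (`char_ngrams`, `word_ngrams`, `ngram_matrix`), the whitespace tokenization, the lowercasing of word tokens, and the use of relative frequencies are all assumptions made for the example.

```r
# Hypothetical helpers for illustration only; names are not from the repository.

# Character n-grams: every run of n consecutive characters in the text.
char_ngrams <- function(text, n) {
  len <- nchar(text)
  if (len < n) return(character(0))
  starts <- seq_len(len - n + 1)
  substring(text, starts, starts + n - 1)
}

# Word n-grams: every run of n consecutive whitespace-separated tokens
# (assumption: lowercase the text and split on whitespace).
word_ngrams <- function(text, n) {
  tokens <- strsplit(tolower(text), "\\s+")[[1]]
  tokens <- tokens[nzchar(tokens)]
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Documents x features matrix of relative n-gram frequencies
# (assumption: each row is normalized by that document's n-gram count).
ngram_matrix <- function(texts, extractor, n) {
  grams <- lapply(texts, extractor, n = n)
  vocab <- sort(unique(unlist(grams)))
  m <- matrix(0, nrow = length(texts), ncol = length(vocab),
              dimnames = list(NULL, vocab))
  for (i in seq_along(grams)) {
    counts <- table(factor(grams[[i]], levels = vocab))
    m[i, ] <- as.numeric(counts) / max(length(grams[[i]]), 1)
  }
  m
}

# Toy corpus: combine character bigrams and word bigrams into one profile.
texts <- c("the cat sat on the mat", "the dog sat on the log")
features <- cbind(
  ngram_matrix(texts, char_ngrams, 2),
  ngram_matrix(texts, word_ngrams, 2)
)
```

A matrix like `features`, paired with a factor of author labels, could then be passed to a classifier such as `e1071::svm()` or `randomForest::randomForest()`. Those package choices are assumptions for the example; the text above only says that SVM and Random Forests are used.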
