Authorship attribution with syntactic fragments
README.md
commonfragments.sh
devtestsplit.py
evaluate.py
ngrams.py
parsefiles.sh
parseworks.sh
runexpbooks.sh
runexpfederalist.sh
splitfed.py

authident

Authorship attribution with syntactic fragments. Cf. http://staff.science.uva.nl/~acranenb/clfl2012.pdf

Requirements:

Usage:

  • Run runexpfederalist.sh to download the Federalist Papers and evaluate on the disputed and co-authored papers.

Alternatively, evaluate on a larger set of texts with cross-validation:

  • Prepare a directory books/ with one subdirectory per author, each containing one UTF-8 encoded .txt file per work.
  • Edit parseworks.sh and parsefiles.sh to set the correct paths for Java and the Stanford parser.
  • Edit and run runexpbooks.sh.
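
Before parsing, it can help to confirm that books/ follows the layout described above. The following is a minimal sketch and not part of this repository: the function name check_books_layout and the wording of the problem messages are hypothetical, and it only checks the directory structure and UTF-8 validity, nothing about the parser setup.

```python
from pathlib import Path


def check_books_layout(root):
    """Inspect a books/ style directory (hypothetical helper, not in the repo).

    Returns a pair (works, problems):
      works    maps each author directory name to its sorted .txt file names
      problems lists human-readable descriptions of layout violations
    """
    works, problems = {}, []
    for author_dir in sorted(Path(root).iterdir()):
        # Every top-level entry is expected to be one author's directory.
        if not author_dir.is_dir():
            problems.append(f"{author_dir.name}: expected a directory per author")
            continue
        texts = sorted(author_dir.glob("*.txt"))
        if not texts:
            problems.append(f"{author_dir.name}: no .txt files found")
        for text in texts:
            # Decoding the raw bytes is enough to detect invalid UTF-8.
            try:
                text.read_bytes().decode("utf-8")
            except UnicodeDecodeError:
                problems.append(f"{author_dir.name}/{text.name}: not valid UTF-8")
        works[author_dir.name] = [text.name for text in texts]
    return works, problems
```

Calling check_books_layout("books") from the repository root returns the works found per author together with a list of problems; an empty problem list suggests the directory matches the expected layout.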