This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Failed to load latest commit information.
INTRO: Thanks to Dr. Grigori Sidorov et al.  for sharing their evaluation corpus of 7 authors. This repository provides a parsed version of that corpus. As the process is computationally expensive, I found sharing it useful. For inquiries email m [ta] khonji [tod] org.  http://www.cic.ipn.mx/~sidorov/ PARSER: stanford-parser-full-2014-10-31 JAVA: oracle-jre-bin-18.104.22.168 COMMANDS: 'java' was executd with -xm14000m as some books contained very long sentences. To be specific, below is the modified version of Stanford's lexparser.sh script that we used to parse the books. #!/usr/bin/env bash # # Runs the English PCFG parser on one or more files, printing trees only if [ ! $# -ge 1 ]; then echo Usage: `basename $0` 'file(s)' echo exit fi scriptdir=`dirname $0` java -mx14000m -cp "$scriptdir/*:" edu.stanford.nlp.parser.lexparser.LexicalizedParser \ -outputFormat "penn,typedDependencies" edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz $*