INTRO:
Thanks to Dr. Grigori Sidorov et al. for sharing their evaluation corpus of 7 authors. This repository provides a parsed version of that corpus. Because parsing is computationally expensive, I found sharing the parsed output useful. For inquiries, email m [ta] khonji [tod] org.

http://www.cic.ipn.mx/~sidorov/

PARSER: stanford-parser-full-2014-10-31

JAVA: oracle-jre-bin-188.8.131.52

COMMANDS:
'java' was executed with -mx14000m, as some books contained very long sentences. To be specific, below is the modified version of Stanford's lexparser.sh script that we used to parse the books.

#!/usr/bin/env bash
#
# Runs the English PCFG parser on one or more files, printing trees only

if [ ! $# -ge 1 ]; then
  echo Usage: `basename $0` 'file(s)'
  echo
  exit
fi

scriptdir=`dirname $0`

java -mx14000m -cp "$scriptdir/*:" edu.stanford.nlp.parser.lexparser.LexicalizedParser \
 -outputFormat "penn,typedDependencies" edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz $*
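EXAMPLE:
The script above takes one or more files and, as far as we can tell, writes its parses to standard output. A minimal sketch of how it could be driven over a whole directory of books follows. The function name parse_corpus, the .txt and .parsed extensions, and the directory layout are all hypothetical and are not part of the original setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original repository):
#   parse_corpus PARSER_SCRIPT INPUT_DIR OUTPUT_DIR
# Runs PARSER_SCRIPT once per .txt file in INPUT_DIR and saves whatever it
# prints on standard output to OUTPUT_DIR/<name>.parsed.
parse_corpus() {
  local parser="$1" in_dir="$2" out_dir="$3"
  local book name

  mkdir -p "$out_dir" || return 1

  for book in "$in_dir"/*.txt; do
    # If no .txt file exists, the glob stays unexpanded; skip it.
    [ -e "$book" ] || continue

    name=$(basename "$book" .txt)

    # One book per invocation, so a failure is easy to trace to a single file.
    "$parser" "$book" > "$out_dir/$name.parsed" || return 1
  done
}
```

Assuming the modified script is saved as ./lexparser.sh and the raw books sit in a directory named books, a call might look like: parse_corpus ./lexparser.sh books parsed. Running one book per invocation reloads the model each time, which is slower than passing all files at once but keeps each book's output in its own file.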