Extract the comments/annotations from a Word DOC or DOCX document or a PDF file, and dump them to the console (for now).
I grade student papers by putting the grades in Word and PDF comments, and I wanted to extract them from the command line without running Word or Acrobat themselves or resorting to a hack like AppleScript.
You need to have Apache Maven installed. On Mac OS X, this is just
brew install maven, and on Ubuntu you're looking for
sudo apt-get install maven. To compile the JAR file, run:
git clone (this repository)
cd (the cloned directory)
mvn install
java -jar target/get_comments-(VERSION)-jar-with-dependencies.jar
For everyday use, you might want to drop the JAR somewhere memorable and write a little shell script:
#!/bin/sh
java -jar (PATH_TO)/get_comments.jar "$@"
Run java -jar PATH_TO_JAR_FILE [OPTIONS] FILENAME, and the comments from the file will be printed to standard output. There are two command-line options:
-q, --quiet: Only print the comments themselves. By default, each comment is prefixed by "Comment #N: "; this option disables the prefix.
-l N, --limit N: Only print the first N comments from the document. By default, all comments in the document are printed.
For example, to print only the value of the document's first comment, you can call
java -jar (PATH_TO_JAR_FILE) --quiet --limit 1 (FILENAME).
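If the grade is always the first comment in each paper, the same call can be looped over a whole batch of files. The following is only a sketch under some assumptions: it supposes the wrapper script described above is installed on your PATH under the hypothetical name get_comments, and the GET_COMMENTS variable and first_comments function are illustrative names, not part of this tool.

```shell
#!/bin/sh
# Hypothetical batch helper: print "FILENAME<TAB>first comment" for each file.
# GET_COMMENTS is an assumed variable naming the wrapper command (not part of the tool).
GET_COMMENTS="${GET_COMMENTS:-get_comments}"

first_comments() {
    for f in "$@"; do
        # --quiet drops the "Comment #N: " prefix; --limit 1 keeps only the first comment.
        # Files the command fails on are skipped.
        comment=$("$GET_COMMENTS" --quiet --limit 1 "$f") || continue
        printf '%s\t%s\n' "$f" "$comment"
    done
}
```

With this in a file you source from your shell, something like first_comments papers/*.docx papers/*.pdf would print one tab-separated line per paper (more than one line if the first comment itself spans several lines).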
Copyright (C) 2012 Charles Pence, and released under the MIT license.