HTML Content / Article Extractor in Java, open sourced by Gravity Labs - http://gravity.com

README

Try it out online!
http://jimplush.com/blog/goose


Please view the wiki pages for all the details on the project :)

The wiki can be found by clicking the Wiki link or by going here: https://github.com/jiminoc/goose/wiki

If you find Goose useful or have issues, please drop me a line. I'd love to hear how you're using it or what features should be improved.

Goose is licensed by Gravity.com under the Apache 2.0 license; see the LICENSE file for more details.

To use goose from the command line:

cd into the goose directory
mvn compile
MAVEN_OPTS="-Xms256m -Xmx2000m" mvn exec:java -Dexec.mainClass=com.jimplush.goose.TalkToMeGoose -Dexec.args="http://techcrunch.com/2011/05/13/native-apps-or-web-apps-particle-code-wants-you-to-do-both/" -e -q > ~/Desktop/gooseresult.txt
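
If you run the extractor against many URLs, the Maven command above can be wrapped in a small shell function. This is only a sketch: goose_extract, the GOOSE_DRY_RUN variable, and the default output file name are made up for illustration and are not part of Goose. It assumes you are in the goose directory and have already run mvn compile.

```shell
# goose_extract URL [OUTPUT_FILE]
# Hypothetical helper (not shipped with Goose) wrapping the Maven command above.
# Set GOOSE_DRY_RUN=1 to print the command instead of running it.
goose_extract() {
    url="$1"
    out="${2:-gooseresult.txt}"   # assumed default output file name

    if [ -z "$url" ]; then
        echo "usage: goose_extract URL [OUTPUT_FILE]" >&2
        return 2
    fi

    if [ "${GOOSE_DRY_RUN:-0}" = "1" ]; then
        # Show what would be run without invoking Maven.
        echo "mvn exec:java -Dexec.mainClass=com.jimplush.goose.TalkToMeGoose -Dexec.args=$url -e -q > $out"
        return 0
    fi

    # Same heap settings and flags as the command in this README.
    MAVEN_OPTS="-Xms256m -Xmx2000m" mvn exec:java \
        -Dexec.mainClass=com.jimplush.goose.TalkToMeGoose \
        -Dexec.args="$url" -e -q > "$out"
}
```

For example, after sourcing the function, goose_extract "http://techcrunch.com/2011/05/13/native-apps-or-web-apps-particle-code-wants-you-to-do-both/" ~/Desktop/gooseresult.txt should behave like the command above.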