The boilerpipe library extracts the main textual content of a web page. The original project can be found at https://code.google.com/p/boilerpipe/. The primary purpose of this project is to provide a command-line front end for the library.
./boilerpipe.sh -u $(URL)
./boilerpipe.sh -f $(PATH)
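
The two invocations above select the input source with a flag: -u for a URL, -f for a local file path. As an illustration only, here is a minimal sketch of how a shell front end might dispatch on these two flags using getopts. The function name parse_args and its output format are hypothetical and are not taken from boilerpipe.sh itself.

```shell
#!/bin/sh
# Hypothetical sketch; this is NOT the actual contents of boilerpipe.sh.
# parse_args prints "url <value>" for -u and "file <value>" for -f,
# and returns 2 with a usage message on stderr for anything else.
parse_args() {
    OPTIND=1  # reset so the function can be called more than once
    while getopts "u:f:" opt "$@"; do
        case "$opt" in
            u) echo "url $OPTARG"; return 0 ;;
            f) echo "file $OPTARG"; return 0 ;;
            *) echo "usage: boilerpipe.sh -u URL | -f PATH" >&2; return 2 ;;
        esac
    done
    # No recognized option was given.
    echo "usage: boilerpipe.sh -u URL | -f PATH" >&2
    return 2
}
```

Under these assumptions, parse_args -u http://example.com prints "url http://example.com"; a real front end would pass the value on to the extractor rather than echo it.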