This code is under the Apache License 2.0.
This is a Ruby port of arc90's Readability project.
Given an HTML document, it pulls out the main body text and cleans it up.
Ruby port by starrhorne and iterationlabs.
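
To give a rough feel for what "pulling out the main body text" means, here is a toy sketch that uses only the Ruby standard library. It is not the algorithm this port uses, and `extract_main_text` and its `min_length` parameter are hypothetical names made up for this illustration. The sketch simply assumes that long paragraphs are body text and short ones are navigation or other clutter.

```ruby
# Toy illustration only: NOT the algorithm or API of this port.
# extract_main_text and min_length are hypothetical names for this sketch.
def extract_main_text(html, min_length: 25)
  # Drop script and style blocks, whose contents are never readable text.
  cleaned = html.gsub(%r{<(script|style)\b.*?</\1>}mi, "")

  # Collect the inner HTML of every <p> element.
  paragraphs = cleaned.scan(%r{<p\b[^>]*>(.*?)</p>}mi).flatten

  paragraphs
    .map { |p| p.gsub(/<[^>]+>/, "").gsub(/\s+/, " ").strip } # strip tags, collapse whitespace
    .select { |text| text.length >= min_length }              # discard short, nav-like fragments
    .join("\n\n")
end

html = <<~HTML
  <html><body>
    <div id="nav"><p>Home</p><p>About</p></div>
    <div id="post">
      <p>Readability tries to find the part of a page that a person would actually read.</p>
      <p>It then removes <b>clutter</b> such as navigation links, ads, and scripts.</p>
    </div>
    <script>var tracking = "<p>this is not a real paragraph of body text at all</p>";</script>
  </body></html>
HTML

puts extract_main_text(html)
```

Regular expressions are not a robust way to parse HTML. A real extractor would work on a parsed document tree and score candidate elements instead of applying a single length cutoff.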