Updated README

commit 462f9e8f221c8a2fe7ddaeeb98533495d5526ad7 1 parent 25bee65
@skid authored
Showing with 6 additions and 1 deletion.
  1. +6 −1 README.md
7 README.md
@@ -1,7 +1,12 @@
# Picksy
Picksy is a scraper that will extract the relevant text from an HTML page like a blog post, a news article or anything that has a considerable chunk of text.
-I developed it to help me scrape articles from the web that will be further used for data mining where absolutely precise extraction is not essential. I wouldn't suggest using it for projects like [Readability](http://www.readability.com/) since it will often show an extra link or gobble up an occasional table of contents.
+
+I developed it to help me scrape articles from the web that will be further used for data mining where absolutely precise extraction is not essential.
+
+I wouldn't suggest using it for projects like [Readability](http://www.readability.com/) since it will often show an extra link or gobble up an occasional table of contents.
+
+You should expect nothing useful from homepages, navigation/category pages, forums and discussion thread web applications.
Picksy depends on [node-htmlparser](https://github.com/tautologistics/node-htmlparser) to provide its input and works directly on the DOM tree constructed by htmlparser.
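
To illustrate what "works directly on the DOM tree" can mean in practice, here is a minimal sketch that walks a tree shaped like the output of htmlparser's `DefaultHandler` (nodes with `type`, `name`, `children`, and `data`). The `textLength` helper and the sample tree are hypothetical and are not taken from Picksy; the scoring is an assumed stand-in, not Picksy's actual algorithm.

```javascript
// Hypothetical helper (not part of Picksy): total length of the text
// found under a node of an htmlparser-style DOM tree.
function textLength(node) {
  if (node.type === 'text') {
    return node.data.trim().length;
  }
  // Only descend into tags; script, style, comment and directive nodes count as 0.
  if (node.type !== 'tag' || !node.children) {
    return 0;
  }
  return node.children.reduce(function (sum, child) {
    return sum + textLength(child);
  }, 0);
}

// Hand-built sample tree, assumed to mirror what htmlparser would
// produce for: <div><a>Home</a></div><p>A considerable chunk of article text.</p>
var dom = [
  { type: 'tag', name: 'div', children: [
    { type: 'tag', name: 'a', children: [{ type: 'text', data: 'Home' }] }
  ] },
  { type: 'tag', name: 'p', children: [
    { type: 'text', data: 'A considerable chunk of article text.' }
  ] }
];

// One text-length score per top-level node.
var lengths = dom.map(textLength);
console.log(lengths);
```

In this sample the paragraph scores higher than the navigation link, which is the kind of signal a text extractor could use to prefer article bodies over menus.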