
update README

1 parent cf19551 commit afd19fedfa6ed11dab12ac97ae8332f5b45cbcd9 @gurgeous committed Jun 19, 2012
Showing with 8 additions and 0 deletions.
  1. +8 −0 README.md
8 README.md
@@ -33,6 +33,14 @@ end
If you paste this into a file called `bestsellers.sinew` and run `sinew bestsellers.sinew`, it will create a `bestsellers.csv` file containing the url, title and img for each bestseller.
+## How does Sinew differ from Mechanize?
+
+I'm not an expert on Mechanize, but this question has come up repeatedly and I'll try to address it. Mechanize is a great toolkit and it's better for some situations. Briefly:
+
+* Sinew caches all HTTP requests on disk. That makes it possible to iterate quickly. Crawl once and then continue to work on your recipe. Run the recipe over and over while you tune your CSS selectors and regular expressions.
+* Sinew runs responses through [HTML Tidy](http://tidy.sourceforge.net). This cleans up dirty HTML and makes it easier to parse in many cases, especially if you have to fall back to regular expressions instead of Nokogiri. Unfortunately, this is a common use case in my experience.
+* Sinew outputs CSV files. It does exactly one thing and does it well: it crawls a site and outputs a CSV file. Mechanize is a more general toolkit.
+
## Full Documentation
Full docs are in the wiki:

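To illustrate the first bullet in the added section (caching HTTP requests on disk so a recipe can be re-run without re-crawling), here is a minimal Ruby sketch. It is not Sinew's actual implementation: the `DiskCache` class, its `fetch` method, and the choice of an MD5 hash of the URL as the file name are all assumptions made for this example.

```ruby
require "digest/md5"
require "fileutils"

# Hypothetical helper, NOT part of Sinew: stores response bodies on disk,
# one file per URL, named by the MD5 hex digest of the URL.
class DiskCache
  def initialize(dir)
    @dir = dir
    FileUtils.mkdir_p(@dir)
  end

  # Returns the cached body for `url` if one exists. Otherwise calls the
  # given block to download it, writes the result to disk, and returns it.
  def fetch(url)
    path = File.join(@dir, Digest::MD5.hexdigest(url))
    return File.read(path) if File.exist?(path)

    body = yield(url)
    File.write(path, body)
    body
  end
end
```

In a real crawl the block would perform the download, for example `cache.fetch(url) { |u| Net::HTTP.get(URI(u)) }` after `require "net/http"`. On every later run the body comes from disk, which is what makes it cheap to keep tuning CSS selectors and regular expressions.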