Document that Mechanize::Page#search accepts an XPath or CSS expression. Fixes #199
commit 30eb16188f0d7535c872009b32e65139fe1dab64 1 parent 05be267
drbrain authored
Showing with 55 additions and 26 deletions.
  1. +3 −1 CHANGELOG.rdoc
  2. +23 −18 EXAMPLES.rdoc
  3. +10 −5 GUIDE.rdoc
  4. +19 −2 lib/mechanize/page.rb
4 CHANGELOG.rdoc
@@ -22,7 +22,9 @@
filename.
* Added Mechanize::DirectorySaver which saves responses in a single
directory. Issue #187 by yoshie902a.
- * Added Link#noreferrer?.
+ * Added Mechanize::Page::Link#noreferrer?
+ * The documentation for Mechanize::Page#search and #at now show that both
+ XPath and CSS expressions are allowed. Issue #199 by Shane Becker.
* Bug fixes
* Fixed handling of a HEAD request with Accept-Encoding: gzip. Issue #198
41 EXAMPLES.rdoc
@@ -1,8 +1,9 @@
= Mechanize examples
Note: Several examples show methods chained to the end of do/end blocks.
-Do...end is the same as curly braces ({...}). For example, do ... end.submit
-is the same as { ... }.submit.
+<code>do...end</code> is the same as curly braces (<code>{...}</code>). For
+example, <code>do ... end.submit</code> is the same as <code>{ ...
+}.submit</code>.
== Google
@@ -81,7 +82,8 @@ Upload a file to flickr.
end
== Pluggable Parsers
-Lets say you want html pages to automatically be parsed with Rubyful Soup.
+
+Lets say you want HTML pages to automatically be parsed with Rubyful Soup.
This example shows you how:
require 'rubygems'
@@ -115,10 +117,10 @@ Beautiful Soup for that page.
== The transact method
-transact runs the given block and then resets the page history. I.e. after the
-block has been executed, you're back at the original page; no need count how
-many times to call the back method at the end of a loop (while accounting for
-possible exceptions).
+Mechanize#transact runs the given block and then resets the page history. I.e.
+after the block has been executed, you're back at the original page; no need
+count how many times to call the back method at the end of a loop (while
+accounting for possible exceptions).
This example also demonstrates subclassing Mechanize.
@@ -154,17 +156,12 @@ This example also demonstrates subclassing Mechanize.
== Client Certificate Authentication (Mutual Auth)
-In most cases a client certificate is created as an additional layer of security
-for certain websites. The specific case that this was initially tested on was
-for automating the download of archived images from a banks (Wachovia) lockbox
-system. Once the certificate is installed into your browser you will have to
-export it and split the certificate and private key into separate files.
-Exported files are usually in .p12 format (IE 7 & Firefox 2.0) which stands for
-PKCS #12. You can convert them from p12 to pem format by using the following
-commands:
-
-openssl.exe pkcs12 -in input_file.p12 -clcerts -out example.key -nocerts -nodes
-openssl.exe pkcs12 -in input_file.p12 -clcerts -out example.cer -nokeys
+In most cases a client certificate is created as an additional layer of
+security for certain websites. The specific case that this was initially
+tested on was for automating the download of archived images from a banks
+(Wachovia) lockbox system. Once the certificate is installed into your
+browser you will have to export it and split the certificate and private key
+into separate files.
require 'rubygems'
require 'mechanize'
@@ -185,3 +182,11 @@ openssl.exe pkcs12 -in input_file.p12 -clcerts -out example.cer -nokeys
# submit login form
agent.submit(login_form, login_form.buttons.first)
+
+Exported files are usually in .p12 format (IE 7 & Firefox 2.0) which stands
+for PKCS #12. You can convert them from p12 to pem format by using the
+following commands:
+
+ openssl pkcs12 -in input_file.p12 -clcerts -out example.key -nocerts -nodes
+ openssl pkcs12 -in input_file.p12 -clcerts -out example.cer -nokeys
+
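Note on the relocated openssl commands: the same split of a .p12 file into a private key and a certificate can be sketched in Ruby with the standard OpenSSL library. This is only an illustration, not part of the commit. The helper name split_p12 and the password argument are hypothetical, and the PEM text it returns lacks the "Bag Attributes" headers that the openssl command line writes.

```ruby
require 'openssl'

# Hypothetical helper (not part of Mechanize): takes the raw bytes of a
# PKCS #12 file plus its export password and returns the private key and
# the client certificate as PEM strings.
def split_p12(p12_bytes, password)
  p12 = OpenSSL::PKCS12.new(p12_bytes, password)
  {
    key:  p12.key.to_pem,         # unencrypted key, like -nocerts -nodes
    cert: p12.certificate.to_pem  # client certificate, like -clcerts -nokeys
  }
end

# Possible usage, assuming input_file.p12 exists and 'secret' is its password:
#
#   pems = split_p12(File.binread('input_file.p12'), 'secret')
#   File.write('example.key', pems[:key])
#   File.write('example.cer', pems[:cert])
```

The resulting example.key and example.cer play the same role as the files produced by the two openssl commands in the diff.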
15 GUIDE.rdoc
@@ -130,7 +130,7 @@ In this section, I want to touch on using the different types in input fields
possible with a form. Password and textarea fields can be treated just like
text input fields. Select fields are very similar to text fields, but they
have many options associated with them. If you select one option, mechanize
-will deselect the other options (unless it is a multi select!).
+will de-select the other options (unless it is a multi select!).
For example, lets select an option on a list:
@@ -154,10 +154,15 @@ tell it what file name you want to upload:
== Scraping Data
-Mechanize uses nokogiri[http://nokogiri.org/] to parse
-html. What does this mean for you? You can treat a mechanize page like
-an nokogiri object. After you have used Mechanize to navigate to the page
-that you need to scrape, then scrape it using nokogiri methods:
+Mechanize uses nokogiri[http://nokogiri.org/] to parse HTML. What does this
+mean for you? You can treat a mechanize page like an nokogiri object. After
+you have used Mechanize to navigate to the page that you need to scrape, then
+scrape it using nokogiri methods:
+
+ agent.get('http://someurl.com/').search("p.posted")
+
+The expression given to Mechanize::Page#search may be a CSS expression or an
+XPath expression:
agent.get('http://someurl.com/').search(".//p[@class='posted']")
21 lib/mechanize/page.rb
@@ -186,9 +186,26 @@ def content_type
@meta_content_type || response['content-type']
end
- # Search through the page like HPricot
+ ##
+ # :method: search
+ #
+ # Search for +paths+ in the page using Nokogiri's #search. The +paths+ can
+ # be XPath or CSS and an optional Hash of namespaces may be appended.
+ #
+ # See Nokogiri::XML::Node#search for further details.
+
def_delegator :parser, :search, :search
- def_delegator :parser, :/, :/
+
+ alias / search
+
+ ##
+ # :method: at
+ #
+ # Search through the page for +path+ under +namespace+ using Nokogiri's #at.
+ # The +path+ may be either a CSS or XPath expression.
+ #
+ # See also Nokogiri::XML::Node#at
+
def_delegator :parser, :at, :at
##
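Note on the lib/mechanize/page.rb change: the commit replaces the second delegator for / with an alias of the delegated search method. A minimal sketch of that pattern, using only Ruby's standard Forwardable module, is shown here as an illustration. FakeParser and FakePage are hypothetical stand-ins, not Mechanize or Nokogiri classes.

```ruby
require 'forwardable'

# Hypothetical stand-in for the Nokogiri document behind a page.
# It only reports which expressions it was asked to search for.
class FakeParser
  def search(*paths)
    "searched: #{paths.join(', ')}"
  end
end

# Hypothetical stand-in for Mechanize::Page.
class FakePage
  extend Forwardable

  attr_reader :parser

  def initialize(parser)
    @parser = parser
  end

  # Forward #search to the parser, as in the diff.
  def_delegator :parser, :search, :search

  # Reuse the delegated method under the operator name.
  alias / search
end

page = FakePage.new(FakeParser.new)
puts page.search('p.posted')            # prints "searched: p.posted"
puts page / ".//p[@class='posted']"     # prints "searched: .//p[@class='posted']"
```

Because the alias points at the delegated method, both spellings reach the parser's search through a single definition.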