Browse files

updating documentation to remove any examples that use WWW::Mechanize…

  • Loading branch information...
1 parent 5cb4ceb commit 973b5a55229972fef1f21c36cfcb9c2c578a2ee8 @tenderlove tenderlove committed Oct 20, 2008
Showing with 14 additions and 333 deletions.
  1. +1 −1 EXAMPLES.txt
  2. +10 −11 GUIDE.txt
  3. +0 −318 NOTES.txt
  4. +2 −2 lib/www/mechanize.rb
  5. +1 −1 lib/www/mechanize/page.rb
@@ -117,7 +117,7 @@ This example also demonstrates subclassing Mechanize.
search_form.words = 'WWW'
submit search_form
- page.links.with.href( %r{/projects/} ).each do |link|
+ page.links_with(:href => %r{/projects/} ).each do |link|
next if link.href =~ %r{/projects/support/}
puts 'Loading %-30s %s' % [link.href, link.text]
21 GUIDE.txt
@@ -36,17 +36,16 @@ link to click on. Lets say we wanted to click the link whose text is 'News'.
Normally, we would have to do this:
page = page.links.find { |l| l.text == 'News' }
But Mechanize gives us a shortcut. Instead we can say this:
- page = page.links.text('News')
+ page = page.link_with(:text => 'News')
That shortcut says "find all links with the name 'News'". You're probably
thinking "there could be multiple links with that text!", and you would be
-correct! If you pass a list of links to the "click" method, Mechanize will
-click on the first one. If you wanted to click on the second news link, you
-could do this:
- page.links.text('News')[1]
+correct! If you use the plural form, you can access the list.
+If you wanted to click on the second news link, you could do this:
+ page.links_with(:text => 'News')[1]
We can even find a link with a certain href like so:
- page.links.href('/something')
+ page.links_with(:href => '/something')
Or chain them together to find a link with certain text and certain href:
- page.links.text('News').href('/something')
+ page.links_with(:text => 'News', :href => '/something')
These shortcuts that mechanize provides are available on any list that you
can fetch like frames, iframes, or forms. Now that we know how to find and
@@ -102,18 +101,18 @@ have many options associated with them. If you select one option, mechanize
will deselect the other options (unless it is a multi select!).
For example, lets select an option on a list:
+ form.field_with(:name => 'list').options[0].select
Now lets take a look at checkboxes and radio buttons. To select a checkbox,
just check it like this:
+ form.checkbox_with(:name => 'box').check
Radio buttons are very similar to checkboxes, but they know how to uncheck
other radio buttons of the same name. Just check a radio button like you
would a checkbox:
+ form.radiobuttons_with(:name => 'box')[1].check
Mechanize also makes file uploads easy! Just find the file upload field, and
tell it what file name you want to upload:
- form.file_uploads.file_name = "somefile.jpg"
+ form.file_uploads.first.file_name = "somefile.jpg"
== Scraping Data
Mechanize uses hpricot[] to parse
318 NOTES.txt
@@ -1,318 +0,0 @@
-= Mechanize Release Notes
-== 0.6.4 (Gwendolyn)
-Custom request headers can now be added to Mechanize by subclassing mechanize
-and defining the Mechanize#set_headers method. For example:
- class A < WWW::Mechanize
- def set_headers(u, r, c)
- super(uri, request, cur_page)
- request.add_field('Cookie', 'name=Aaron')
- request
- end
- end
-The Mechanize#redirect_ok method has been added to that you can keep mechanize
-from following redirects.
-== 0.6.3 (Big Man)
-Mechanize 0.6.3 (Big Man) has a few bug fixes and some new features added to
-the Form class. The Form class is now more hash like. I've added an
-Form#add_field! method that will add a field to your form. Form#[]= will now
-add a field if the key doesn't exist. For example, your form doesn't have
-an input field named 'foo', the following 2 lines of code are equivalent, and
-will create a field named 'foo':
- form['foo'] = 'bar'
- form.add_field!('foo', 'bar')
-To make forms more hashlike, has_value?, and has_key? methods.
-== 0.6.2 (Bridget)
-Mechanize 0.6.2 (Bridget) is a fairly small bug fix release. You can now
-access the parsed page when a ResponseCodeError is thrown. For example, this
-loads a page that doesn't exist, but gives you access to the parsed 404 page:
- begin
- rescue WWW::Mechanize::ResponseCodeError => ex
- puts
- end
-Accessing forms is now more DSL like. When manipulating a form, for example,
-you can use the following syntax:
- page.form('formname') { |form|
- form.first_name = "Aaron"
- }.submit
-Documentation has also been updated thanks to Paul Smith.
-== 0.6.1 (Chuck)
-Mechanize version 0.6.1 (Chuck) is done, and is ready for you to use. This
-post "my trip to europe" release includes many bug fixes and a handful of
-new features.
-New features include, a submit method on forms, a click method on links, and an
-REXML pluggable parser. Now you can submit a form just by calling a method on
-the form, rather than passing the form to the submit method on the mech object.
-The click method on links lets you click the link by calling a method on the
-link rather than passing the link to the click method on the mech object.
-Lastly, the REXML pluggable parser lets you use your pre-0.6.0 code with
-0.6.1. See the CHANGELOG for more details.
-== 0.6.0 (Rufus)
-WWW::Mechanize 0.6.0 aka Rufus is ready! This hpricot flavored pie has
-finished cooling on the window sill and is ready for you to eat. But if you
-don't want to eat it, you can just download it and use it. I would
-understand that.
-The best new feature in this release in my opinion is the hpricot flavoring
-packed inside. Mechanize now uses hpricot as its html parser. This means
-mechanize gets a huge speed boost, and you can use the power of hpricot for
-scraping data. Page objects returned from mechanize will allow you to use
-hpricot search methods:
- agent.get('').search("//strong")
- agent.get('')/"strong"
-The click method on mechanize has been updated so that you can click on links
-you find using hpricot methods:
- (page/"a").first
-Or click on frames:
- (page/"frame").first
-The cookie parser has been overhauled to be more RFC 2109 compliant and to
-use WEBrick cookies. Dependencies on ruby-web and mime-types have been
-removed in favor of using hpricot and WEBrick respectively.
-attr_finder and REXML helper methods have been removed.
-== 0.5.4 (Sylvester)
-WWW::Mechanize 0.5.4 aka Sylvester is fresh out the the frying pan and in to
-the fire! It is also ready for you to download and use.
-New features include WWW::Mechanize#transact (thanks to Johan Kiviniemi) which
-lets you maintain your history state between transactions. Forms can now be
-accessed as a hash. For example, to set the value of an input field, you can
-do the following:
- form['name'] = "Aaron"
-Doing this assumes that you are setting the first field. If there are multiple
-fields with the same name, you must use a different method to set the value.
-Form file uploads will now read the file specified by FileUpload#file_name.
-The mime type will also be automatically determined for you! Take a look
-at the EXAMPLES file for a new flickr upload script.
-Lastly, gzip encoding is now supported! WWW::Mechanize now supports pages
-being sent gzip encoded. This means less network bandwidth. Yay!
-== 0.5.3 (Twan)
-Here it is. Mechanize 0.5.3 also named the "Twan" release. There are a few
-new features, a few fixed bugs, and some other stuff too!
-First, new features. WWW::Mechanize#click has been updated to operate on the
-first link if an array is passed in. How is this helpful? It allows
-syntax like this:
- page.links.first
-to be like this:
- page.links
-This trick was actually implemented in WWW::Mechanize::List. If you send a
-method to WWW::Mechanize::List, and it doesn't know how to respond, it will
-try calling that method on the first element of the list. But it only does
-that for methods with no arguments.
-Radio buttons, check boxes, and select lists can now be ticked, unticked, and
-clicked. Now to select the second radio button from a list, you can do this:
-Mechanize will handle unchecking all of the other radio buttons with the same
-Pretty printing has been added so that inspecting mechanize objects is very
-pretty. Go ahead and try it out!
- pp page
-Or even
- pp page.forms.first
-Now, bugfixes. A bug was fixed when spaces are passed in as part of the URL
-to WWW::Mechanize#get. Thanks to Eric Kolve, a bug was fixed with methods
-that conflict with rails. Thanks to Yinon Bentor for sending in a patch to
-improve Log4r support and a slight speed increase.
-== 0.5.2
-This release comes with a few cool new features. First, support for select
-lists which are "multi" has been added. This means that you can select
-multiple values for a select list that has the "multiple" attribute. See
-WWW::Mechanize::MultiSelectList for more information.
-New methods for select lists have been added. You can use the select_all
-method to select all options and select_none to select none to select no
-options. Options can now be "selected" which selects an option, "unselected",
-which unselects an option, and "clicked" which toggles the current status of
-the option. What this means is that instead of having to select the first
-option like this:
- select_list.value = select_list.options.first.value
-You can select the first option by just saying this:
-Of course you can still set the select list to an arbitrary value by just
-setting the value of the select list.
-A new method has been added to Form so that multiple fields can be set at the
-same time. To set 'foo', and 'name' at the same time on the form, you can do
-the following:
- form.set_fields( :foo => 'bar', :name => 'Aaron' )
-Or to set the second fields named 'name' you can do the following:
- form.set_fields( :name => ['Aaron', 1] )
-Finally, attr_finder has been deprecated, and all syntax like this:
- @agent.links(:text => 'foo')
-needs to be changed to:
- @agent.links.text('foo')
-With this release you will just get a warning, and the code will be removed in
-== 0.5.1
-This release is a small bugfix release. The main bug fixed in this release is
-a problem with file uploading. I have also made some performance improvements
-to cookie parsing.
-== 0.5.0
-Good News first:
-This release has many new great features! Mechanize has been updated to
-handle any content type a web server returns using a system called "Pluggable
-Parsers". Mechanize has always been able to handle any content type
-(sort of), but the pluggable parser system lets us cleanly handle any
-content type by instantiating a class for the content type returned from the
-server. For example, a web server returns type 'text/html', mechanize asks
-the pluggable parser for a class to instantiate for 'text/html'. Mechanize
-then instantiates that class and returns it. Users can define their own
-parsers, and register them with the Pluggable Parser so that mechanize will
-instantiate your class when the content type you specify is returned. This
-allows you to easily preprocess your HTML, or even use other HTML parsers.
-Content types that the pluggable parser doesn't know how to handle will
-return WWW::Mechanize::File which has basic functionality like a 'save_as'
-method. For more information, see the RDoc for
-WWW::Mechanize::PluggableParser also see the EXAMPLES file.
-A 'save_as' method has been added so that any page downloaded can be easily
-saved to a file.
-The cookie jar for mechanize can now be saved to disk and loaded back up at
-another time. If your script needs to save cookie state between executions,
-you can now use the 'save_as' and 'load' methods on WWW::Mechanize::CookieJar.
-Form fields can now be treated as accessors. This means that if you have a
-form with the fields 'username' and 'password', you could manipulate them like
- form.username = 'test'
- form.password = 'testing'
- puts "username: #{form.username}"
- puts "password: #{form.password}"
-Form fields can still be accessed in the usual way in case there are multiple
-input fields with the same name.
-Bad news second:
-In this release, the name space has been altered to be more consistent. Many
-classes used to be under WWW directly, they are now all under WWW::Mechanize.
-For example, in 0.4.7 Page was WWW::Page, in this release it is now
-WWW::Mechanize::Page. This may break your code, but if you aren't using
-class names directly, everything should be fine.
-Body filters have been removed in favor of Pluggable Parsers.
-== 0.4.7
-This release of mechanize comes with a few bug fixes including fixing a
-bug when there is no action specified in a form.
-In this version, a default user agent string is now set for mechanize. Also
-a convenience method WWW::Mechanize#get_file has been added for fetching
-non text/html files.
-== 0.4.6
-The 0.4.6 release comes with proxy support which can be enabled by calling
-the set_proxy method on your WWW::Mechanize object. Once you have set your
-proxy settings, all mechanize requests will go through the proxy.
-A new "visited?" method has been added to WWW::Mechanize so that you can see
-if any particular URL is in your history.
-Image alt text support has been added to links. If a link contains an image
-with no text, the alt text of the image will be used. For example:
- <a href="foo.html><img src="foo.gif" alt="Foo Image"></a>
-This link will contain the text "Foo Image", and can be found like this:
- link = page.links.text('Foo Image')
-Lists of things have been updated so that you can set a value without
-specifying the position in the array. It will just assume that you want to
-set the value on the first element. For example, the following two statements
-are equivalent:
-'q').first.value = 'xyz' # Old syntax
-'q').value = 'xyz' # New syntax
-This new syntax comes with a note of caution; make sure you know you want to
-set only the first value. There could be multiple fields with the name 'q'.
-== 0.4.5
-This release comes with a new filtering system. You can now manipulate the
-response body before mechanize parses it. This can be useful if you know that
-the HTML you need to parse is broken, or if you want to speed up the parsing.
-This filter can be done on a global basis, or on a per page basis. Check out
-the new examples in the EXAMPLES file for usage.
-This release is also starting to phase out the misspelled method
-WWW::Mechanize#basic_authetication. If you are using that method, please
-switch to WWW::Mechanize#basic_auth.
-The 0.4.5 release has many bug fixes, most noteably better cookie parsing and
-better form support.
-== 0.4.4
-This release of mechanize comes with a new "Option" object that can be
-accessed from select fields on forms. That means that you can figure out
-what option to set based on the text in the select field. For example:
-selectlist ='selectlist').first
-selectlist.value = selectlist.options.find { |o| o.text == 'foo'}.value
-== 0.4.3
-The new syntax for finding things like forms, fields, frames, etcetera looks
-like this:
-page.links.with.text 'Some Text'
-The preceding statement will find all links in a page with the text
-'Some Text'. This can be applied to form fields as well:
- 'email'
-These can be chained as well like this:
-'with' and 'and' can be omitted, and the old way is still supported. The
-following statements all do the same thing:
-form.fields.find_all { |f| == 'email' }'email')'email')
-form.fields(:name => 'email')
-Regular expressions are also supported:
4 lib/www/mechanize.rb
@@ -40,8 +40,8 @@ module WWW
# agent = { |a| a.log ="mech.log") }
# agent.user_agent_alias = 'Mac Safari'
# page = agent.get("")
- # search_form ="f").first
- #"q").value = "Hello"
+ # search_form = page.form_with(:name => "f")
+ # search_form.field_with(:name => "q").value = "Hello"
# search_results = agent.submit(search_form)
# puts search_results.body
class Mechanize
2 lib/www/mechanize/page.rb
@@ -55,7 +55,7 @@ def content_type
# Find a form matching +criteria+.
# Example:
- # page.form(:action => '/post/login.php') do |f|
+ # page.form_with(:action => '/post/login.php') do |f|
# ...
# end
[:form, :link, :base, :frame, :iframe].each do |type|

0 comments on commit 973b5a5

Please sign in to comment.