Mechanize is a Ruby library that makes automated web interaction easy.

= WWW::Mechanize examples
 
== Google
  require 'rubygems'
  require 'mechanize'
  
  agent = WWW::Mechanize.new
  agent.user_agent_alias = 'Mac Safari'
  page = agent.get("http://www.google.com/")
  search_form = page.forms.with.name("f").first
  search_form.q = "Hello"
  search_results = agent.submit(search_form)
  puts search_results.body
 
== Rubyforge
  require 'rubygems'
  require 'mechanize'
  
  agent = WWW::Mechanize.new
  page = agent.get('http://rubyforge.org/')
  link = page.links.text(/Log In/)
  page = agent.click(link)
  form = page.forms[1]
  form.form_loginname = ARGV[0]
  form.form_pw = ARGV[1]
  page = agent.submit(form, form.buttons.first)
  
  puts page.body
 
== File Upload
This example logs in to Flickr and uploads a single image file.
 
  require 'rubygems'
  require 'mechanize'

  agent = WWW::Mechanize.new

  # Get the Flickr sign in page
  page = agent.get('http://flickr.com/signin/flickr/')

  # Fill out the login form (email and password come from the command line)
  form = page.forms.name('flickrloginform').first
  form.email = ARGV[0]
  form.password = ARGV[1]
  page = agent.submit(form)

  # Go to the upload page
  page = agent.click page.links.text('Upload')

  # Fill out the upload form with the file path given as the third argument
  form = page.forms.action('/photos_upload_process.gne').first
  form.file_uploads.name('file1').first.file_name = ARGV[2]
  agent.submit(form)
  
== Pluggable Parsers
Let's say you want HTML pages to be parsed automatically with Rubyful Soup.
This example shows how:
 
  require 'rubygems'
  require 'mechanize'
  require 'rubyful_soup'
 
  class SoupParser < WWW::Mechanize::Page
    attr_reader :soup
    def initialize(uri = nil, response = nil, body = nil, code = nil)
      @soup = BeautifulSoup.new(body)
      super(uri, response, body, code)
    end
  end
 
  agent = WWW::Mechanize.new
  agent.pluggable_parser.html = SoupParser
 
Now every HTML page is parsed with the SoupParser class, and each page object
has a method called 'soup' that returns the Beautiful Soup object for that
page.
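
The idea behind a pluggable parser can be sketched in plain Ruby without
Mechanize installed. This is only an illustration under assumptions: the
ParserRegistry, PlainFile and HtmlPage names are hypothetical and are not part
of the Mechanize API. The sketch maps a content type to the class that wraps
the response body and falls back to a default class for anything unregistered.

```ruby
# Hypothetical page classes for illustration; not Mechanize classes.
class PlainFile
  attr_reader :body

  def initialize(body)
    @body = body
  end
end

class HtmlPage < PlainFile
  # Very rough title extraction, just enough for the illustration.
  def title
    body[%r{<title>(.*?)</title>}m, 1]
  end
end

# Hypothetical registry: maps a content type to the class that wraps a body.
class ParserRegistry
  def initialize(default)
    @default = default
    @parsers = {}
  end

  # Register a class for a content type such as 'text/html'.
  def register(content_type, klass)
    @parsers[content_type] = klass
  end

  # Wrap the body with the registered class, or with the default class
  # when nothing is registered for this content type.
  def parse(content_type, body)
    (@parsers[content_type] || @default).new(body)
  end
end

registry = ParserRegistry.new(PlainFile)
registry.register('text/html', HtmlPage)

page = registry.parse('text/html', '<html><title>Hi</title></html>')
puts page.class
puts page.title

file = registry.parse('application/pdf', '%PDF-1.4')
puts file.class
```

Swapping in a different class for 'text/html' is then a single register call,
which is the same shape as assigning SoupParser to pluggable_parser.html above.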
 
== Using a proxy
 
  require 'rubygems'
  require 'mechanize'
  
  agent = WWW::Mechanize.new
  agent.set_proxy('localhost', '8000')
  page = agent.get(ARGV[0])
  puts page.body
 
== The transact method
 
transact runs the given block and then restores the page history to the state
it had before the block ran. In other words, once the block has finished you
are back at the original page; there is no need to count how many times to
call the back method at the end of a loop (while accounting for possible
exceptions).
 
This example also demonstrates subclassing Mechanize.
 
  require 'rubygems'
  require 'mechanize'
 
  class TestMech < WWW::Mechanize
    def process
      get 'http://rubyforge.org/'
      search_form = page.forms.first
      search_form.words = 'WWW'
      submit search_form
 
      page.links.with.href( %r{/projects/} ).each do |link|
        next if link.href =~ %r{/projects/support/}
 
        puts 'Loading %-30s %s' % [link.href, link.text]
        begin
          transact do
            click link
            # Do stuff, maybe click more links.
          end
          # Now we're back at the original page.
 
        rescue => e
          $stderr.puts "#{e.class}: #{e.message}"
        end
      end
    end
  end
 
  TestMech.new.process
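
The history reset described at the start of this section can be sketched in
plain Ruby, with no network access. This is a minimal sketch under
assumptions, not Mechanize's actual implementation: TinyAgent and its visit
method are hypothetical stand-ins that only record URLs. It shows why an
ensure clause gets you back to the original page even when the block raises.

```ruby
# Hypothetical stand-in for a browsing agent; not part of Mechanize.
class TinyAgent
  attr_reader :history

  def initialize
    @history = []
  end

  # Pretend to fetch a page: only record the URL in the history.
  def visit(url)
    @history.push(url)
    url
  end

  # The most recently visited page, or nil if nothing was visited yet.
  def current_page
    @history.last
  end

  # Run the block, then restore the history to what it was before the
  # block started. The ensure clause runs even if the block raises.
  def transact
    saved = @history.dup
    yield self
  ensure
    @history = saved
  end
end

agent = TinyAgent.new
agent.visit('http://example.com/index')

# Pages visited inside the block are forgotten afterwards.
agent.transact do |a|
  a.visit('http://example.com/a')
  a.visit('http://example.com/b')
end
puts agent.current_page

# The history is also restored when the block fails part-way through.
begin
  agent.transact do |a|
    a.visit('http://example.com/broken')
    raise 'simulated failure'
  end
rescue => e
  $stderr.puts "#{e.class}: #{e.message}"
end
puts agent.history.length
```

Copying the history before the block and reassigning it afterwards keeps the
caller free of any bookkeeping about how many pages were visited inside the
block.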