tenderlove / nokogiri

Nokogiri (鋸) is an HTML, XML, SAX, and Reader parser with XPath and CSS selector support.

nokogiri / lib / nokogiri.rb
a913af57 » flavorjones 2009-04-23 added Nokogiri::VERSION_INF... 1 # -*- coding: utf-8 -*-
7f2ae51a » tenderlove 2009-02-13 loading the native portions... 2 # Modify the PATH on windows so that the external DLLs will get loaded.
3 ENV['PATH'] = [File.expand_path(
4 File.join(File.dirname(__FILE__), "..", "ext", "nokogiri")
5 ), ENV['PATH']].compact.join(';') if RUBY_PLATFORM =~ /mswin/i
6
836e526a » flavorjones 2009-04-30 FFI branch squash-merged in... 7 if ENV['NOKOGIRI_FFI'] || RUBY_PLATFORM =~ /java/
8 gem 'ffi', '>=0.3.2' unless RUBY_PLATFORM =~ /java/
9 require 'ffi'
10 require 'nokogiri/ffi/libxml'
11 else
12 require 'nokogiri/nokogiri'
13 end
7f2ae51a » tenderlove 2009-02-13 loading the native portions... 14
08450f39 » tenderlove 2008-08-19 auto generating the IDL int... 15 require 'nokogiri/version'
597861f0 » jmhodges 2009-02-03 providing a central Nokogir... 16 require 'nokogiri/syntax_error'
34102644 » tenderlove 2008-07-18 adding the html and xml parser 17 require 'nokogiri/xml'
5ff03ff3 » tenderlove 2008-08-24 adding xslt support 18 require 'nokogiri/xslt'
34102644 » tenderlove 2008-07-18 adding the html and xml parser 19 require 'nokogiri/html'
221a44e7 » tenderlove 2008-09-15 adding decoration to nokogi... 20 require 'nokogiri/decorators'
38d3bfbd » tenderlove 2008-09-17 starting the css selector ast 21 require 'nokogiri/css'
221a44e7 » tenderlove 2008-09-15 adding decoration to nokogi... 22 require 'nokogiri/html/builder'
8a45571f » flavorjones 2008-09-15 added Nokogiri(), Nokogiri.... 23 require 'nokogiri/hpricot'
35cace6b » tenderlove 2008-10-29 fixing up win32 build, addi... 24
9d4a78eb » tenderlove 2009-03-13 adding some documentation, ... 25 # Nokogiri parses and searches XML/HTML very quickly, and also has
26 # correctly implemented CSS3 selector support as well as XPath support.
27 #
28 # Parsing a document returns either a Nokogiri::XML::Document, or a
29 # Nokogiri::HTML::Document depending on the kind of document you parse.
30 #
31 # Here is an example:
32 #
33 # require 'nokogiri'
34 # require 'open-uri'
35 #
36 # # Get a Nokogiri::HTML::Document for the page we’re interested in...
37 #
38 # doc = Nokogiri::HTML(open('http://www.google.com/search?q=tenderlove'))
39 #
40 # # Do funky things with it using Nokogiri::XML::Node methods...
41 #
42 # ####
43 # # Search for nodes by css
44 # doc.css('h3.r a.l').each do |link|
45 # puts link.content
46 # end
47 #
48 # See Nokogiri::XML::Node#css for more information about CSS searching.
49 # See Nokogiri::XML::Node#xpath for more information about XPath searching.
9eebaa08 » tenderlove 2008-07-14 breaking up nokogiri 50 module Nokogiri
e7f98b6c » tenderlove 2008-07-14 initial checkin 51 class << self
8b9daefb » tenderlove 2008-11-30 making sure that sloppy css... 52 ###
53 # Parse an HTML or XML document. +string+ contains the document.
93761f99 » tenderlove 2008-10-15 adding parser constants and... 54 def parse string, url = nil, encoding = nil, options = nil
449e7c5a » tenderlove 2008-07-18 working on new api 55 doc =
56 if string =~ /^\s*<[^Hh>]*html/i # Probably html
93761f99 » tenderlove 2008-10-15 adding parser constants and... 57 Nokogiri::HTML.parse(string, url, encoding, options || 2145)
449e7c5a » tenderlove 2008-07-18 working on new api 58 else
93761f99 » tenderlove 2008-10-15 adding parser constants and... 59 Nokogiri::XML.parse(string, url, encoding, options || 2159)
449e7c5a » tenderlove 2008-07-18 working on new api 60 end
61 yield doc if block_given?
62 doc
e7f98b6c » tenderlove 2008-07-14 initial checkin 63 end
8a45571f » flavorjones 2008-09-15 added Nokogiri(), Nokogiri.... 64
acddc4a8 » tenderlove 2009-04-25 adding an rdoc test and add... 65 ###
66 # Create a new Nokogiri::XML::DocumentFragment
93761f99 » tenderlove 2008-10-15 adding parser constants and... 67 def make input = nil, opts = {}, &blk
1c4e553b » flavorjones 2008-09-16 implemented NodeSet.wrap() ... 68 if input
0dfe0255 » tenderlove 2009-02-05 HTML.fragment now returns a... 69 Nokogiri::HTML.fragment(input).children.first
1c4e553b » flavorjones 2008-09-16 implemented NodeSet.wrap() ... 70 else
903a28d3 » tenderlove 2008-09-25 fixing warning 71 Nokogiri(&blk)
1c4e553b » flavorjones 2008-09-16 implemented NodeSet.wrap() ... 72 end
73 end
acddc4a8 » tenderlove 2009-04-25 adding an rdoc test and add... 74
8b9daefb » tenderlove 2008-11-30 making sure that sloppy css... 75 ###
76 # Parse a document and add the Slop decorator. The Slop decorator
77 # implements method_missing such that methods may be used instead of CSS
78 # or XPath. For example:
79 #
80 # doc = Nokogiri::Slop(<<-eohtml)
81 # <html>
82 # <body>
83 # <p>first</p>
84 # <p>second</p>
85 # </body>
86 # </html>
87 # eohtml
88 # assert_equal('second', doc.html.body.p[1].text)
89 #
46be2582 » jbarnette 2008-11-26 Nokogiri::Slop(xml) provide... 90 def Slop(*args, &block)
91 Nokogiri(*args, &block).slop!
92 end
8a45571f » flavorjones 2008-09-15 added Nokogiri(), Nokogiri.... 93 end
94 end
95
acddc4a8 » tenderlove 2009-04-25 adding an rdoc test and add... 96 ###
97 # Parse a document contained in +args+. Nokogiri will try to guess what
98 # type of document you are attempting to parse. For more information, see
99 # Nokogiri.parse
100 #
101 # To specify the type of document, use Nokogiri.XML or Nokogiri.HTML.
8a45571f » flavorjones 2008-09-15 added Nokogiri(), Nokogiri.... 102 def Nokogiri(*args, &block)
103 if block_given?
1c4e553b » flavorjones 2008-09-16 implemented NodeSet.wrap() ... 104 builder = Nokogiri::HTML::Builder.new(&block)
a02964e4 » tenderlove 2009-02-13 adding to_xhtml using xmlsa... 105 return builder.doc.root
8a45571f » flavorjones 2008-09-15 added Nokogiri(), Nokogiri.... 106 else
33ab946e » tenderlove 2008-11-22 delegating Nokogiri() to th... 107 Nokogiri.parse(*args)
e7f98b6c » tenderlove 2008-07-14 initial checkin 108 end
109 end
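
The type guess in `Nokogiri.parse` above comes down to one regular expression plus two integer option defaults (2145 for HTML, 2159 for XML). The following is a minimal sketch of that logic in plain Ruby, without loading Nokogiri. The helper names `probably_html?` and `flag_names` are hypothetical, and the flag table is an assumption: it maps bits to the names used by libxml2's `xmlParserOption` enum, whereas the listing itself only contains the raw integers.

```ruby
# Regex copied from Nokogiri.parse in the listing above.
# Note that ^ in Ruby anchors at the start of any line, not only the string.
HTML_SNIFF = /^\s*<[^Hh>]*html/i

# ASSUMPTION: bit values taken from libxml2's xmlParserOption enum;
# the listing only shows the integers 2145 and 2159.
PARSE_FLAGS = {
  RECOVER:   1,
  NOENT:     2,
  DTDLOAD:   4,
  DTDATTR:   8,
  DTDVALID:  16,
  NOERROR:   32,
  NOWARNING: 64,
  NONET:     2048
}.freeze

# Hypothetical helper: true when the string would take the HTML branch.
def probably_html?(string)
  !!(string =~ HTML_SNIFF)
end

# Hypothetical helper: names of the flags set in an options integer.
def flag_names(options)
  PARSE_FLAGS.select { |_name, bit| (options & bit) != 0 }.keys
end

puts probably_html?("<!DOCTYPE html><html><body/></html>")
puts probably_html?("<?xml version=\"1.0\"?><root/>")
puts flag_names(2145).inspect
puts flag_names(2159).inspect
```

Under this reading, the two defaults differ only in the entity and DTD flags that the XML branch adds. Because `^` matches at each line start, a document with an XML declaration on its first line and `<html>` opening a later line would also take the HTML branch.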