# Ox gem
A fast XML parser and Object marshaller as a Ruby gem.

## Installation
    gem install ox

## Documentation

*Documentation*: http://www.ohler.com/ox

## Source

*GitHub* *repo*: https://github.com/ohler55/ox

*RubyGems* *repo*: https://rubygems.org/gems/ox

## Follow @oxgem on Twitter

[Follow @peterohler on Twitter](http://twitter.com/#!/peterohler) for announcements and news about the Ox gem.

## Build Status

[![Build Status](https://secure.travis-ci.org/ohler55/ox.png?branch=master)](http://travis-ci.org/ohler55/ox)

## Links of Interest

[Ruby XML Gem Comparison](http://www.ohler.com/dev/xml_with_ruby/xml_with_ruby.html) for a performance comparison between Ox, Nokogiri, and LibXML.

[Fast Ruby XML Serialization](http://www.ohler.com/dev/ruby_object_xml_serialization/ruby_object_xml_serialization.html) to see how Ox can be used as a faster replacement for Marshal.

*Fast JSON parser and marshaller on RubyGems*: https://rubygems.org/gems/oj

*Fast JSON parser and marshaller on GitHub*: https://github.com/ohler55/oj

## Release Notes

### Current Release 2.2.0

- Added the SAX convert_special option to the default options.

- Added the SAX smart option to the default options.

- Other SAX options are now taken from the defaults if not specified.

### Release 2.1.8

- Fixed a bug that caused all input to be read before parsing with the SAX
  parser and an IO.pipe.

### Release 2.1.7

- Empty elements such as `<foo></foo>` are now called back with empty text.

- Fixed a GC problem that occurs with the new GC in Ruby 2.2, which garbage
  collects Symbols.

## Description

Optimized XML (Ox), as the name implies, was written to provide speed-optimized
XML and now HTML handling. It was designed to be an alternative to Nokogiri and other Ruby
XML parsers in generic XML parsing and as an alternative to Marshal for Object
serialization.

Unlike some other Ruby XML parsers, Ox is self-contained. Ox uses nothing
other than standard C libraries, so version conflicts with libxml are not a
concern.

Marshal uses a binary format for serializing Objects. That binary format
changes with releases, making Marshal-dumped Objects incompatible between some
versions. The use of a binary format makes debugging message streams or file
contents next to impossible unless the same version of Ruby, and only Ruby, is
used for inspecting the serialized Object. Ox, on the other hand, uses
human-readable XML. Ox also offers strict and tolerant modes as well as a mode
that automatically defines missing classes.

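To make the point about Marshal's binary format concrete, here is a minimal sketch using only the Ruby standard library. `Point` is a hypothetical class made up for this illustration. The sketch shows that the dump is a binary String whose first two bytes hold the format version.

```ruby
# Point is a hypothetical class used only for this illustration.
Point = Struct.new(:x, :y)

dumped = Marshal.dump(Point.new(1, 2))

# Marshal output is a binary String, not text meant for reading.
puts dumped.encoding
puts dumped.inspect

# The first two bytes carry the Marshal format version.
major, minor = dumped.bytes.first(2)
puts "format version #{major}.#{minor}"

# Loading restores an equal object when the loader accepts that version.
restored = Marshal.load(dumped)
puts restored == Point.new(1, 2)
```
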
It is possible to write an XML serialization gem with Nokogiri or other XML
parsers, but writing such a package in Ruby results in a module significantly
slower than Marshal. This is what triggered the start of Ox development.

Ox handles XML documents in three ways. It is a generic XML parser and writer,
a fast Object / XML marshaller, and a stream SAX parser. Ox was written for
speed as a replacement for Nokogiri, Ruby LibXML, and Marshal.

As an XML parser it is 2 or more times faster than Nokogiri and as a generic
XML writer it is as much as 20 times faster than Nokogiri. Of course different
files may result in slightly different times.

As an Object serializer Ox is up to 6 times faster than the standard Ruby
Marshal.dump() and up to 3 times faster than Marshal.load().

The SAX-like stream parser is 40 times faster than Nokogiri and more than 13
times faster than LibXML when validating a file with minimal Ruby
callbacks. Unlike Nokogiri and LibXML, Ox can be tuned to use only the SAX
callbacks that are of interest to the caller. (See the perf_sax.rb file for an
example.)

Ox is compatible with Ruby 1.8.7, 1.9.3, 2.1.2, 2.2.0 and RBX.

### Object Dump Sample:

```ruby
require 'ox'

class Sample
  attr_accessor :a, :b, :c

  def initialize(a, b, c)
    @a = a
    @b = b
    @c = c
  end
end

# Create Object
obj = Sample.new(1, "bee", ['x', :y, 7.0])
# Now dump the Object to an XML String.
xml = Ox.dump(obj)
# Convert the object back into a Sample Object.
obj2 = Ox.parse_obj(xml)
```

### Generic XML Writing and Parsing:

```ruby
require 'ox'

doc = Ox::Document.new(:version => '1.0')

top = Ox::Element.new('top')
top[:name] = 'sample'
doc << top

mid = Ox::Element.new('middle')
mid[:name] = 'second'
top << mid

bot = Ox::Element.new('bottom')
bot[:name] = 'third'
mid << bot

xml = Ox.dump(doc)

# xml =
# <top name="sample">
#   <middle name="second">
#     <bottom name="third"/>
#   </middle>
# </top>

doc2 = Ox.parse(xml)
puts "Same? #{doc == doc2}"
# true
```

### SAX XML Parsing:

```ruby
require 'stringio'
require 'ox'

class Sample < ::Ox::Sax
  def start_element(name); puts "start: #{name}"; end
  def end_element(name); puts "end: #{name}"; end
  def attr(name, value); puts "  #{name} => #{value}"; end
  def text(value); puts "text #{value}"; end
end

io = StringIO.new(%{
<top name="sample">
  <middle name="second">
    <bottom name="third"/>
  </middle>
</top>
})

handler = Sample.new()
Ox.sax_parse(handler, io)
# outputs
# start: top
#   name => sample
# start: middle
#   name => second
# start: bottom
#   name => third
# end: bottom
# end: middle
# end: top
```

### Yielding results immediately while SAX XML Parsing:

```ruby
require 'stringio'
require 'ox'

class Yielder < ::Ox::Sax
  def initialize(block); @yield_to = block; end
  def start_element(name); @yield_to.call(name); end
end

io = StringIO.new(%{
<top name="sample">
  <middle name="second">
    <bottom name="third"/>
  </middle>
</top>
})

proc = Proc.new { |name| puts name }
handler = Yielder.new(proc)
puts "before parse"
Ox.sax_parse(handler, io)
puts "after parse"
# outputs
# before parse
# top
# middle
# bottom
# after parse
```

### Object XML format

The XML format used for Object encoding follows the structure of the
Object. Each XML element is encoded so that the XML element name is a type
indicator. Attributes of the element provide additional information such as
the Class if relevant, the Object attribute name, and Object ID if
necessary.

The type indicator map is:

- **a** => `Array`
- **b** => `Base64`
- **c** => `Class`
- **f** => `Float`
- **g** => `Regexp`
- **h** => `Hash`
- **i** => `Fixnum`
- **j** => `Bignum`
- **l** => `Rational`
- **m** => `Symbol`
- **n** => `FalseClass`
- **o** => `Object`
- **p** => `Ref`
- **r** => `Range`
- **s** => `String`
- **t** => `Time`
- **u** => `Struct`
- **v** => `Complex`
- **x** => `Raw`
- **y** => `TrueClass`
- **z** => `NilClass`

If the type is an Object, type 'o', then an attribute named 'c' should be set
with the full Class name including the Module names. If the XML element
represents an Object then a sub-element is included for each attribute of
the Object. An XML element attribute 'a' is set with a value that is the
name of the Ruby Object attribute. In all cases, except for the Exception
attribute hack, the attribute names begin with an @ character. (Exceptions are
strange in that the attributes of the Exception Class are not named with an @
prefix. Handling them is a hack since it has to be done in C and cannot be done
through the interpreter.)

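The element layout described above can be sketched in plain Ruby. This is an illustration of the format as this README describes it, not Ox's implementation. The helpers `xml_escape`, `type_indicator`, and `object_to_xml` are hypothetical names, only a few type indicators are handled, and real Ox output may differ in whitespace and detail.

```ruby
# Sample mirrors the class from the Object Dump Sample, trimmed to two attributes.
class Sample
  def initialize(a, b)
    @a = a
    @b = b
  end
end

# Hypothetical helper: escape the characters that are special in XML text.
def xml_escape(text)
  text.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
end

# Hypothetical helper covering only a few entries of the type indicator map.
def type_indicator(value)
  case value
  when Integer then 'i' # listed as Fixnum in the map above
  when Float   then 'f'
  when String  then 's'
  when Symbol  then 'm'
  else raise ArgumentError, "unsupported type: #{value.class}"
  end
end

# Build an 'o' element: 'c' holds the class name, and each sub-element
# carries the instance variable name (with its @) in the 'a' attribute.
def object_to_xml(obj)
  children = obj.instance_variables.map do |ivar|
    value = obj.instance_variable_get(ivar)
    tag = type_indicator(value)
    %(<#{tag} a="#{ivar}">#{xml_escape(value.to_s)}</#{tag}>)
  end
  %(<o c="#{obj.class.name}">#{children.join}</o>)
end

puts object_to_xml(Sample.new(1, 'bee'))
# prints: <o c="Sample"><i a="@a">1</i><s a="@b">bee</s></o>
```
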
Values are encoded as the text portion of an element or in the sub-elements
of the principal element. For example, a Fixnum is encoded as:
```xml
<i>123</i>
```
An Array has sub-elements and is encoded similarly to this example:
```xml
<a>
  <i>1</i>
  <s>abc</s>
</a>
```
A Hash is encoded with an even number of elements where the first element is
the key and the second is the value. This is repeated for each entry in the
Hash. For example, { 1 => 'one', 2 => 'two' } is encoded as:
```xml
<h>
  <i>1</i>
  <s>one</s>
  <i>2</i>
  <s>two</s>
</h>
```
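The nesting rules for Arrays and Hashes can be sketched as a small recursive encoder. This is a toy written for illustration, not Ox's code: `encode` and `xml_escape` are hypothetical names, only four types are handled, and the output is compact rather than indented. The Hash branch emits a key element followed by a value element for each entry, which is why a Hash always has an even number of children.

```ruby
# Hypothetical helper: escape the characters that are special in XML text.
def xml_escape(text)
  text.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
end

# Hypothetical encoder for four types only; containers recurse into their contents.
def encode(value)
  case value
  when Integer then "<i>#{value}</i>"
  when String  then "<s>#{xml_escape(value)}</s>"
  when Array   then "<a>#{value.map { |v| encode(v) }.join}</a>"
  when Hash    then "<h>#{value.map { |k, v| encode(k) + encode(v) }.join}</h>"
  else raise ArgumentError, "unsupported type: #{value.class}"
  end
end

puts encode(123)
# prints: <i>123</i>
puts encode([1, 'abc'])
# prints: <a><i>1</i><s>abc</s></a>
puts encode({ 1 => 'one', 2 => 'two' })
# prints: <h><i>1</i><s>one</s><i>2</i><s>two</s></h>
```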
Strings with characters not allowed in XML are base64 encoded and will be
converted back into a String when loaded.

Ox supports circular references where attributes of one Object can refer to
an Object that refers back to the first Object. When this option is used an
Object ID is added to each XML Object element as the value of the 'a'
attribute.
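
Why an ID helps with circular references can be shown with a small plain-Ruby sketch. It does not use Ox and does not reproduce Ox's XML. `Node` and `describe` are hypothetical names. The walker assigns an ID the first time it meets an object and emits a reference to that ID when it meets the object again, so the walk terminates even though the objects point at each other.

```ruby
# Node is a hypothetical class; peer may point back at an earlier Node.
Node = Struct.new(:name, :peer)

# Walk a chain of Nodes, giving each object an ID the first time it is seen
# and emitting a reference to that ID whenever it is seen again.
def describe(node, seen = {})
  return 'nil' if node.nil?

  key = node.object_id
  return "ref(#{seen[key]})" if seen.key?(key)

  seen[key] = seen.size + 1
  "#{node.name}(id=#{seen[key]}) -> #{describe(node.peer, seen)}"
end

first = Node.new('first')
second = Node.new('second', first)
first.peer = second # close the loop

puts describe(first)
# prints: first(id=1) -> second(id=2) -> ref(1)
```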