Skip to content

sdhull/xml-object

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

XMLObject

(This is inspired by Python’s xml_objectify)

XMLObject attempts to make the accessing of small, well-formed XML structures convenient, by providing a syntax that fits well in most Ruby programs.

Dependencies

None outside of Ruby, though some optional gems add additional features and speedups. See more below.

Compatibility

I’ve tested this release under:

Rubinius: 1.0.1 (REXML only, libxml-ruby might work if you can build it)
MRI:      1.9.2, 1.8.7, 1.8.6
REE:      1.8.7
JRuby:    1.5.2, 1.4.0 (both in 1.8.7 mode)

And with the following optionals:

version 1.1.4 of libxml-ruby   (for faster execution)
version 2.3.8 of activesupport (for inflected plurals)

Installation instructions

gem install xml-object

Example usage

recipe.xml is as follows:

<recipe name="bread" prep_time="5 mins" cook_time="3 hours">
  <title>Basic bread</title>
  <ingredient amount="8" unit="dL">Flour</ingredient>
  <ingredient amount="10" unit="grams">Yeast</ingredient>
  <ingredient amount="4" unit="dL" state="warm">Water</ingredient>
  <ingredient amount="1" unit="teaspoon">Salt</ingredient>
  <instructions easy="yes" hard="false">
    <step>Mix all ingredients together.</step>
    <step>Knead thoroughly.</step>
    <step>Cover with a cloth, and leave for one hour in warm room.</step>
    <step>Knead again.</step>
    <step>Place in a bread baking tin.</step>
    <step>Cover with a cloth, and leave for one hour in warm room.</step>
    <step>Bake in the oven at 180(degrees)C for 30 minutes.</step>
  </instructions>
</recipe>

require 'xml-object'
recipe = XMLObject.new(File.open('recipe.xml'))

recipe.name                      => "bread"
recipe.title                     => "Basic bread"

recipe.ingredients.is_a?(Array)  => true
recipe.ingredients.first.amount  => "8" # Not a Fixnum. Too hard. :(

recipe.instructions.easy?        => true

recipe.instructions.first.upcase => "MIX ALL INGREDIENTS TOGETHER."
recipe.instructions.steps.size   => 7

Motivation

XML is an extensible markup language. It is extensible because it is meant to define markup languages for any type of document, so new tags are needed depending on the problem domain.

Sometimes, however, XML ends up being used to solve a much simpler problem: the issue of passing a data-structure over the network, and/or between two different languages. Tools like JSON or YAML are a much better fit for this kind of job, but one doesn’t always have that luxury.

Features & Problems

Adapters

XMLObject supports different adapters to do the actual XML parsing. It ships with REXML, and LibXML adapters. By default, the REXML adapter is used.

To use a different adapter than the REXML default:

require 'xml-object'                  # Require XMLObject first
require 'xml-object/adapters/libxml'

Access to elements and attributes

XMLObject uses dot notation (foo.bar) for both elements and attributes, with a few rules, and with an array notation fallback for invalid method names or other tricky situations. For example, with the given file:

<outer object_id="root" name="foo">
  <name>Outer Element</name>
</outer>

outer.name is the name element. Child elements are always looked up first, then attributes. To access the attribute in case of ambiguity, use outer[:attr => ‘name’].

outer.object_id is really Object#object_id, because all of the object methods are preserved (this is on purpose). To access the attribute object_id, use outer[:attr => ‘object_id’] (or just outer, since there’s no element/attribute ambiguity there).

Question notation

Elements or attributes that look like booleans are “booleanized” if called by their question names (such as enabled?)

Collection auto-folding

Like XmlSimple, XMLObject folds same-named elements found at the same level, like so:

<student>
  <name>Bob</name>
  <course>Math</course>     |
  <course>Biology</course>  | => 'course' becomes an Array
</student>

student = XMLObject.new(xml_file)

student.course.is_a?(Array)      => true
student.course.first == 'Math'   => true
student.course.last  == 'Biology => true

Collection pluralization

With the same file as in the example above:

student.courses.first == student.course.first => true

Note that the pluralization algorithm is just tacking an ‘s’ at the end of the singular, unless ActiveSupport is required, in which case you get inflected plurals.

Collection proxy

Sometimes, collections are expressed with a container element in XML:

<author>
  <name>John</name>
  <publications>
    <book>Math 101</book>
    <book>Biology 101</book>
  </publications>
</author>

In this case, since the container element courses has no text element of its own (attributes are ok), and it only has elements of one name under it, it delegates all methods to the collection below, so you get:

author.publications == author.publications.books => true

author.publications.map { |b| b.downcase } => ['math 101', 'biology 101']

Recursive

The design of the adapters assumes parsing of the objects recursively. Deep files are bound to throw SystemStackError, but for the kinds of files I need to read, things are working fine so far. In any case, stream parsing is on the TODO list.

Incomplete

It most likely doesn’t work with a ton of features of complex XML files (see the caveats section). I’ll always try to accommodate those, as long as they don’t make the basic usage more complex. As usual, patches welcome.

Caveats

Adapter specific

LibXML adapter

The LibXML adapter will not return the ‘xmlns’ attribute.

Copyright © 2008, 2009 Jordi Bunster, released under the MIT license

About

The Rubyista's way to do quick XML sit-ups

Resources

License

Stars

Watchers

Forks

Packages

No packages published