Need to read or modify a few values from XML? Xen makes that easy and painless. It supports
- "Groovy-like" syntax with dots separating levels (see latitude part of the demo code below)
- a similar XPath-like syntax using slashes as separators (see longitude code below)
- direct programmatic navigation with methods like children(), parent(), etc.
This example mimics an example from the book Making Java Groovy. See [GeocoderDemo.java](https://github.com/MorganConrad/xen/blob/master/test/com/flyingspaniel/xen/GeocoderDemo.java) for the complete code.

    // Get the latitude and longitude of the San Francisco Giants' stadium.
    String url = "http://maps.googleapis.com/maps/api/geocode/xml?sensor=false&address=" +
                 URLEncoder.encode("24 Willie Mays Plaza San Francisco CA");
    Xen response = new XenParser().parse(url);

    // Show a couple of options for getting at the data.
    double latitude = response.toDouble(".result.geometry.location.lat");
    double longitude = response.one("result/geometry/location/lng").toDouble();

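For comparison, here is a rough sketch of a similar lookup using only the JDK's DOM and javax.xml.xpath APIs. It runs against a small inline sample rather than the live service; the sample XML, its coordinates, and the names `JdkGeocodeSketch` and `number` are made up for illustration and are not part of Xen.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class JdkGeocodeSketch {

    // Made-up sample shaped like a geocoder response; the coordinates are placeholders.
    static final String SAMPLE =
          "<GeocodeResponse><result><geometry><location>"
        + "<lat>37.5</lat><lng>-122.25</lng>"
        + "</location></geometry></result></GeocodeResponse>";

    /** Parses the XML text and evaluates a standard XPath expression as a number. */
    static double number(String xml, String path) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return (Double) XPathFactory.newInstance().newXPath()
                    .evaluate(path, doc, XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + path, e);
        }
    }

    public static void main(String[] args) {
        double latitude = number(SAMPLE, "/GeocodeResponse/result/geometry/location/lat");
        double longitude = number(SAMPLE, "/GeocodeResponse/result/geometry/location/lng");
        System.out.println(latitude + ", " + longitude);
    }
}
```

The standard route needs a DocumentBuilder, an XPath object, and a cast for each value, which is the ceremony the two Xen one-liners avoid.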
A Xen object supports basic navigation via children(String), parent(), and getRootElement().
It also supports a convenience API for "XPath-like" search: get(), one(), and all(), plus getText(), oneText(), and allText().
- get...() returns the single match, or null if there is none; it throws a DOMException if there are multiple matches
- one...() is like get...(), except that it also throws a DOMException if no match is found
- all...() returns a list of matches, possibly empty
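The three contracts can be sketched over a plain list. This is only an illustration of the semantics described in the list, not Xen's implementation: the class `MatchContractSketch`, the predicate-based signatures, and the particular DOMException error codes are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import org.w3c.dom.DOMException;

class MatchContractSketch {

    /** all: every match, possibly an empty list. */
    static <T> List<T> all(List<T> candidates, Predicate<T> test) {
        List<T> matches = new ArrayList<>();
        for (T candidate : candidates) {
            if (test.test(candidate)) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    /** get: null when nothing matches, the match when exactly one does, an exception otherwise. */
    static <T> T get(List<T> candidates, Predicate<T> test) {
        List<T> matches = all(candidates, test);
        if (matches.size() > 1) {
            // The error code is an arbitrary choice for this sketch.
            throw new DOMException(DOMException.INVALID_ACCESS_ERR,
                    "expected at most one match, found " + matches.size());
        }
        return matches.isEmpty() ? null : matches.get(0);
    }

    /** one: like get, but an empty result is also an error. */
    static <T> T one(List<T> candidates, Predicate<T> test) {
        T match = get(candidates, test);
        if (match == null) {
            throw new DOMException(DOMException.NOT_FOUND_ERR, "expected exactly one match, found none");
        }
        return match;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("lat", "lng", "lng");
        System.out.println(get(names, n -> n.equals("lat")));  // prints "lat"
        System.out.println(get(names, n -> n.equals("alt")));  // prints "null"
        System.out.println(all(names, n -> n.equals("lng")));  // prints "[lng, lng]"
    }
}
```

A practical reading: use one...() when a missing element means the input is malformed, get...() when the element is optional, and all...() when repeats are expected.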
You can also explicitly create an XPath to search from a Xen. Details below.
This class implements an "XPath-like" search syntax.
All Selectors except "//" are supported
- / if at the start, move to the root Xen, else used as a delimiter
- . move to current Xen. (not very useful since there is an implied "." at the start of any path)
- .. move up to parent Xen.
- x select all children named x
- * select all children
- @x select attributes named x (only allowed at the end)
- // is not supported. Each step matches direct children only, not deeper descendants.
Predicates supported (most are as-per W3C)
- [N] and [last()-N] work as per W3C, with 1-based indexing. Note: the last() is optional, e.g. [-2] is the same as [last()-2]
- [@a] selects all elements having an attribute named a
- [@a='val'] selects all elements having an attribute a with value val. Note: unlike W3C, the single quotes are optional, but they are highly recommended
- [.='val'] or [text()='val'] selects elements whose text equals val.
- Use ~ instead of = for regular expressions (non-W3C standard) e.g. [.~'.*end'] selects all elements whose text ends with "end"
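Since most of these predicates follow W3C, their behavior can be checked against the JDK's own javax.xml.xpath engine. The sketch below does that on a made-up document and emulates the non-standard "~" predicate by filtering with String.matches, assuming "~" means a full-string regular expression match. The class and method names are hypothetical, and the optional-last() and optional-quote shortcuts are Xen extensions that the standard engine does not accept.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PredicateSketch {

    // Made-up sample data.
    static final String XML =
          "<records>"
        + "<car id='a' year='1990'>first</car>"
        + "<car id='b'>second</car>"
        + "<car id='c'>the end</car>"
        + "</records>";

    /** Returns the text of every node selected by a standard XPath expression. */
    static List<String> texts(String path) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(path, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + path, e);
        }
    }

    /** Emulates a "~" predicate: keeps only texts that fully match the regular expression. */
    static List<String> textsMatching(String path, String regex) {
        List<String> out = new ArrayList<>();
        for (String text : texts(path)) {
            if (text.matches(regex)) {
                out.add(text);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(texts("/records/car[1]"));               // [first]  (1-based)
        System.out.println(texts("/records/car[last()-1]"));        // [second]
        System.out.println(texts("/records/car[@year]"));           // [first]
        System.out.println(texts("/records/car[@id='b']"));         // [second]
        System.out.println(texts("/records/car[.='second']"));      // [second]
        System.out.println(textsMatching("/records/car", ".*end")); // [the end]
    }
}
```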
If the path starts with a dot and a letter, it will be treated as a "Groovy Dot Style" path to access elements.
You lose a few options ("/", ".", and ".." are not supported) but the notation matches what you'd type in Groovy, including 0 based indexing.
How does this compare to Groovy?
If you use "Groovy Dot Style", things are nearly identical; you just need to add a method call like get() or getText(). If you use "W3C XPath Style", replace the "." with "/" and adjust your indices by +1. Important: unlike Groovy, W3C XPath indexing is 1-based. The first element is [1], not [0].

    records.car.make.@model.text();             // Groovy
    records.get(".car.make.@model").text();     // Xen "Groovy style" with 0-based indexes
    records.get("car/make/@model").text();      // Xen "XPath style", note 1-based indexing!

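The "replace the dots and add one to each index" rule is mechanical enough to sketch as a small helper. `PathStyleSketch` and `dotToSlash` are hypothetical names, not part of Xen, and the helper assumes the path contains no dots inside predicate values.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PathStyleSketch {

    // Matches a numeric index such as [0] or [12].
    private static final Pattern INDEX = Pattern.compile("\\[(\\d+)\\]");

    /** Rewrites a dot-style path with 0-based indices as a slash-style path with 1-based indices. */
    static String dotToSlash(String dotPath) {
        // Drop the leading dot, then turn the remaining dots into slashes.
        String body = dotPath.startsWith(".") ? dotPath.substring(1) : dotPath;
        String slashed = body.replace('.', '/');

        // Shift every numeric index up by one.
        Matcher matcher = INDEX.matcher(slashed);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            int oneBased = Integer.parseInt(matcher.group(1)) + 1;
            matcher.appendReplacement(out, "[" + oneBased + "]");
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(dotToSlash(".car[0].make.@model")); // prints "car[1]/make/@model"
    }
}
```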
Note: For more Groovy compatibility, Xen also has a breadthFirst() method.
Converters - convert to or from an Xen
GXmlParser inspired by groovy.util.XmlParser
For reading from general input types, use GXmlParser, which is an implementation of an org.xml.sax.ext.DefaultHandler2 that creates a tree of Xens using a SAXParser, e.g.

    SAXParserFactory factory = SAXParserFactory.newInstance();
    // play with the factory settings...
    SAXParser saxParser = factory.newSAXParser();
    GXmlParser gxmlParser = new GXmlParser(saxParser);
    Xen root = gxmlParser.parse(someKindOfInput);

or, more simply, accept the defaults and go Groovy Style.
Xen root = new GXmlParser().parse(someKindOfInput);
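To illustrate the general technique (a DefaultHandler2 that grows a tree while a SAXParser reads the input), here is a minimal sketch built only on the JDK. It is not GXmlParser's source: the `TreeBuilderSketch` class and its nested `Node` type are hypothetical stand-ins, with one merged text buffer and one attribute map per element.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

class TreeBuilderSketch extends DefaultHandler2 {

    /** Hypothetical stand-in for a Xen: name, attribute map, merged text, child list. */
    static class Node {
        final String name;
        final Map<String, String> attributes = new LinkedHashMap<>();
        final StringBuilder text = new StringBuilder();
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    private final Deque<Node> open = new ArrayDeque<>(); // elements currently being read
    private Node root;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Node node = new Node(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            node.attributes.put(atts.getQName(i), atts.getValue(i));
        }
        if (open.isEmpty()) {
            root = node;                     // first element seen is the root
        } else {
            open.peek().children.add(node);  // attach to the innermost open element
        }
        open.push(node);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // All character data of an element is appended to a single buffer.
        open.peek().text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    /** Parses XML text with the default SAX settings and returns the root node. */
    static Node parse(String xml) {
        try {
            TreeBuilderSketch handler = new TreeBuilderSketch();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.root;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        Node root = parse("<car make='Acme'><model>One</model><model>Two</model></car>");
        System.out.println(root.name + " " + root.attributes + " " + root.children.size());
    }
}
```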
If you already have an existing org.w3c.dom.Document (say, from a DOM parser), use Converter.FromDocument to convert it to a tree of Xens.

    DocumentBuilder dBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = dBuilder.parse(someKindOfInput);
    Converter.FromDocument converter = new Converter.FromDocument();
    Xen rootXen = converter.convert(doc);

Often you can simplify the last two lines with
Xen rootXen = Converter.FromDocument.DEFAULT.convert(doc);
Converter.ToDocument converts a Xen (usually the root, but not necessarily) into an org.w3c.dom.Document. You must provide a blank Document of your preferred type, e.g.

    Converter.ToDocument converter =
        new Converter.ToDocument(someDocumentBuilder.newDocument()); // or new CoreDocumentImpl()
    Document doc = converter.convert(xelent);

Converter.ToXML provides a reasonable conversion to XML text. (If you want something fancier, use Converter.ToDocument and apply your preferred Transformer or whatever to the Document.) Usage:

    Converter.ToXML converter = new Converter.ToXML(initialIndent, indentPerLevel);
    String niceXML = converter.convert(xelent); // usually rootXen but not necessarily

Note: Xen.toString() uses Converter.ToXML.DEFAULT, where indent and indentPerLevel are both two spaces.
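For the "something fancier" route, the Document can be serialized with the JDK's javax.xml.transform API. The sketch below is one way that might look; `TransformerSketch` and its methods are hypothetical names, and it starts from parsed text instead of a Converter.ToDocument result so that it runs without Xen on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class TransformerSketch {

    /** Builds a Document from XML text; stands in for a Document obtained some other way. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    /** Serializes a Document with an identity Transformer, indented and without the XML declaration. */
    static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(parse("<car><model>One</model></car>")));
    }
}
```

The exact indentation width depends on the Transformer implementation, so treat the layout as implementation-defined.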
Xen was inspired by XPath and Groovy's XML handling, e.g. XmlParser.
Let's face it, the standard Java org.w3c.dom.* XML interfaces and implementations are huge, way overly complex for most users, and do not match up well with modern Java. If one were writing them for Java today, the APIs and implementations could be greatly simplified by using Collections, Generics, and Varargs, as well as numerous other Java syntax goodies.
For example, Node.getChildNodes() would likely return a generic List rather than a NodeList, and Node.getAttributes() would return a Map rather than a NamedNodeMap.
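To make the mismatch with modern Java concrete: a NodeList is not a Collection, so it cannot be used in a for-each loop without an adapter. A small read-only wrapper like the following sketch bridges the gap; `NodeListSketch`, `asList`, and `childNames` are hypothetical names.

```java
import java.io.StringReader;
import java.util.AbstractList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeListSketch {

    /** Read-only List view over a NodeList, so for-each loops and streams work. */
    static List<Node> asList(final NodeList nodes) {
        return new AbstractList<Node>() {
            @Override
            public Node get(int index) {
                if (index < 0 || index >= nodes.getLength()) {
                    throw new IndexOutOfBoundsException("index " + index);
                }
                return nodes.item(index);
            }

            @Override
            public int size() {
                return nodes.getLength();
            }
        };
    }

    /** Returns the space-separated names of the root element's child nodes. */
    static String childNames(String xml) {
        try {
            Node root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            StringBuilder names = new StringBuilder();
            for (Node child : asList(root.getChildNodes())) {
                names.append(child.getNodeName()).append(' ');
            }
            return names.toString().trim();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(childNames("<a><b/><c/></a>")); // prints "b c"
    }
}
```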
Anybody who has used (or even just read about) Groovy's XmlParser and associated classes knows how simple things could be. For example, Making Java Groovy has example code with just a few lines to parse Google Geocoder data.
This package is an attempt to rewrite XML structures as they "should be written today", with some major simplifications that hopefully make sense to 90% of users and match up with much of the Groovy capabilities. If the simplifications don't make sense for you, don't use this module! It's interesting that my design came out pretty close to Groovy's. (I then moved even closer to their design).
This package is fairly new (v0.1.0), and may well contain bugs and design flaws. The API is still subject to change.
- All text associated with a node is grouped into a single String. There are no org.w3c.dom.Text Nodes.
- Attributes are kept in a simple Map<String,String> under their element. So there are no org.w3c.dom.Attr nodes.
- Since I have never cared about a CData section (and standard parsers just add it to your text), nor comments, nor processing instructions, nor an Entity or a Notation, those subclasses of org.w3c.dom.Node are also ignored.
- As noted in the very similar Groovy implementation, "This simple model is sufficient for most simple use cases of processing XML."
- Other than searching via getElementsBy..., the only thing I have ever used Document for is to create Nodes and getDocumentElement(). Since you can search perfectly well from an Element, and there is only one type of Node (Element) to create, there is no need for an org.w3c.dom.Document. If you really need a "root" reference, there is a single rootElement.
- This pretty much eliminates all Nodes other than Element. The Xen class corresponds roughly to org.w3c.dom.Element and a groovy.util.Node. That's why the name is Xen: "XML Element Node".
- To allow for some expansion, each Xen does have a protected Map<String, Object> props. So, if you really need to track CDATA, comments, etc., you can probably tuck them away in there. Xen.getProperty(String name) works its way through any parents, so you can store "global for this Document" information in the props of the rootElement.

JavaDocs are here
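The parent-walking lookup can be sketched in a few lines. This is an illustration of the idea only: `PropsSketch.Element` is a hypothetical stand-in for Xen, and the "nearest ancestor wins" rule is an assumption about how overrides behave.

```java
import java.util.HashMap;
import java.util.Map;

class PropsSketch {

    /** Hypothetical stand-in for a Xen with a props map and a parent link. */
    static class Element {
        final Element parent;
        final Map<String, Object> props = new HashMap<>();

        Element(Element parent) {
            this.parent = parent;
        }

        /** Looks in this element's props first, then in each ancestor's, nearest first. */
        Object getProperty(String name) {
            for (Element e = this; e != null; e = e.parent) {
                if (e.props.containsKey(name)) {
                    return e.props.get(name);
                }
            }
            return null; // not set anywhere on the path to the root
        }
    }

    public static void main(String[] args) {
        Element root = new Element(null);
        Element leaf = new Element(new Element(root));
        root.props.put("encoding", "UTF-8");              // document-wide setting on the root
        System.out.println(leaf.getProperty("encoding")); // prints "UTF-8"
    }
}
```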