philcali edited this page Nov 21, 2011 · 3 revisions

This section covers using the library inside an application to parse LMXML text into nodes.

Using the Built-in Parsers

You can either instantiate a PlainLmxmlParser with an increment value of your choosing, or use the DefaultLmxmlParser, whose increment value is 2. Both parsers extend LmxmlParsers.

The increment value is the indentation step the parser uses to determine whether the next node is a child or a sibling of the current one.

import lmxml._

val contents = ... // Some lmxml

val parser = new PlainLmxmlParser(increment = 2)

val nodes = parser.parseNodes(contents)

In the above example, nodes will be a recursive data structure made up of ParsedNodes. After a successful parse, the only node types that remain are LmxmlNode and TextNode.
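
To make the role of the increment value more concrete, here is a minimal, self-contained sketch of how an indentation step can be turned into nesting depth. It is not the lmxml implementation: IndentSketch, depth, depths and relation are hypothetical names made up for illustration, and the sketch assumes indentation is written with spaces.

```scala
// Hypothetical illustration only; lmxml's own parser is not shown here.
object IndentSketch {
  // Nesting depth of a line: leading spaces divided by the increment.
  def depth(line: String, increment: Int): Int =
    line.takeWhile(_ == ' ').length / increment

  // Pairs each non-blank line's trimmed text with its nesting depth.
  def depths(text: String, increment: Int): List[(String, Int)] =
    text.split("\n").toList
      .filter(_.trim.nonEmpty)
      .map(line => (line.trim, depth(line, increment)))

  // How a line relates to the line before it, judged by depth alone.
  def relation(previous: Int, next: Int): String =
    if (next > previous) "child"
    else if (next == previous) "sibling"
    else "sibling of an ancestor"
}
```

With an increment of 2, a line indented two spaces deeper than the previous one is treated as its child, and a line at the same indentation is treated as its sibling:

```scala
val sample = "html\n  head\n  body\n    div"
IndentSketch.depths(sample, 2)
// List((html,0), (head,1), (body,1), (div,2))
```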

Parse Safely

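Every LmxmlParsers instance has a method called safeParseNodes, which returns an Either[LmxmlParsers.ParseResult, Seq[ParsedNode]]. The Left side holds the result of a failed parse, and the Right side holds the parsed nodes.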

This is particularly useful if you are parsing LMXML from an untrusted source.

import lmxml._

val untrusted = ... // Some lmxml from a user

DefaultLmxmlParser.safeParseNodes(untrusted).fold(println, { nodes =>
  // Work with nodes in here
})
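
If you prefer pattern matching over fold, an Either can be handled as in the sketch below. To stay self-contained it does not call lmxml at all: Failure, Node and safeParse are hypothetical stand-ins for LmxmlParsers.ParseResult, ParsedNode and safeParseNodes, and the "parsing" is a placeholder that only splits lines.

```scala
// Hypothetical stand-ins for illustration; the real types live in lmxml.
object SafeParseSketch {
  final case class Failure(message: String)
  final case class Node(name: String)

  // Placeholder for safeParseNodes: rejects empty input, otherwise
  // produces one Node per line.
  def safeParse(text: String): Either[Failure, Seq[Node]] =
    if (text.trim.isEmpty) Left(Failure("empty input"))
    else Right(text.split("\n").toSeq.map(line => Node(line.trim)))

  // Handles both outcomes explicitly instead of using fold.
  def describe(text: String): String =
    safeParse(text) match {
      case Left(err)    => s"parse failed: ${err.message}"
      case Right(nodes) => s"parsed ${nodes.size} node(s)"
    }
}
```

Either way, the failure case has to be handled before the nodes can be used, which is the point of parsing untrusted input this way.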

Extending the parser is quite simple to do.