JDOM2 Feature End Of Line Termination

rolfl edited this page Apr 8, 2012 · 2 revisions

XMLOutputter has always used a Format instance to dictate the format of its output, and that Format instance specifies how XMLOutputter handles the end-of-line termination sequence. With 'raw' output no sequence has ever been added, but the normalized and pretty formats have, by default, used a carriage-return/line-feed sequence ("\r\n", CRLF or CRNL), which is the standard sequence for DOS and later Microsoft systems. Almost all platforms handle this sequence correctly, simply because its use is so pervasive.

On the other hand, many systems use the plain line-feed sequence ("\n", NL), which is the standard UNIX sequence (and the Apple Mac sequence since OS X). The plain newline is also the 'normalized' form used for XML: all XML parsers are required to report line ends as a single newline, regardless of the line-terminating sequence in the original input files. Many (perhaps most) XML handling libraries, apart from JDOM, produce newline-only XML.
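The parser-side normalization described above can be observed with the XML parser bundled with the JDK. The following is only a minimal sketch: the class name `NewlineNormalizationDemo` and the helper `rootText` are made up for this illustration and are not part of JDOM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class (not part of JDOM): shows that a conforming
// parser reports CRLF in the input as a single newline.
class NewlineNormalizationDemo {

    /** Parses the XML text and returns the text content of the root element. */
    static String rootText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The input uses a CRLF line end inside the element text.
        String text = rootText("<root>one\r\ntwo</root>");
        System.out.println("contains CR: " + text.contains("\r"));
        System.out.println("equals one\\ntwo: " + text.equals("one\ntwo"));
    }
}
```

Because the parser never reports the carriage return, the original line-ending style of a file cannot be recovered from the parsed document; the output side has to choose one.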

During the building of the JUnit tests it became necessary to produce simple newline output in order to compare the JDOM output with other system outputs. This ignited a debate about whether JDOM should convert to using a plain newline termination sequence by default.

There is some resistance to the change, and, given that:

  1. it is 'easy' to configure JDOM to output using any specified sequence
  2. keeping the default leaves JDOM2 compatible with JDOM 1.x
  3. using the CRLF sequence makes it easy to edit files in basic Windows tools like Notepad

it is sensible to keep the existing default end-of-line sequence.

At the same time it makes sense to make it easier to change the default end-of-line sequence, instead of having to do it for every output Format instance.

With all the above being said, it is time to introduce the new JDOM2 End-Of-Line features:

  1. it is now much easier to specify the common mechanisms for the end-of-line sequence.
  2. it is now possible to change the 'default' end-of-line sequence that a new Format instance will use.

The first feature is implemented with the org.jdom2.output.LineSeparator enumeration which itemizes a number of common sequences:

  1. DOS and CRNL are aliases (both are the CRLF sequence).
  2. UNIX and NL are aliases.
  3. NONE for no sequence at all.
  4. SYSTEM for the sequence that is 'normal' for the current platform (CRLF on Windows, NL on UNIX, etc.).
  5. DEFAULT for the JDOM default (normally CRLF, but it may be overridden using a System property).

The Format class now supports setting the line termination sequence with the above LineSeparator enum.
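As a rough illustration of the enum in use, the sketch below pretty-prints a tiny document with two different separators. It assumes JDOM2 is on the classpath; the class name `LineSeparatorDemo` and the helper `render` are invented for this example.

```java
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.output.Format;
import org.jdom2.output.LineSeparator;
import org.jdom2.output.XMLOutputter;

// Hypothetical demo class; requires the JDOM2 jar on the classpath.
class LineSeparatorDemo {

    /** Pretty-prints a small document using the given line separator. */
    static String render(LineSeparator separator) {
        Document doc = new Document(
                new Element("root").addContent(new Element("child")));
        // Start from the pretty format and override only the line separator.
        Format format = Format.getPrettyFormat().setLineSeparator(separator);
        return new XMLOutputter(format).outputString(doc);
    }

    public static void main(String[] args) {
        String unix = render(LineSeparator.UNIX);
        String dos = render(LineSeparator.DOS);
        System.out.println("UNIX output contains CR: " + unix.contains("\r"));
        System.out.println("DOS output contains CRLF: " + dos.contains("\r\n"));
    }
}
```

Only the Format instance passed to this XMLOutputter is affected; other Format instances keep whatever default applies to them.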

The second feature is implemented with the System property org.jdom2.output.LineSeparator, which can be set on the command line:

java -Dorg.jdom2.output.LineSeparator=SYSTEM ...
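To make the idea concrete, here is a standalone sketch of how a property value such as SYSTEM could be turned into an actual character sequence. It is not JDOM's implementation: the class `EolPropertySketch`, the method `resolve`, the empty string for NONE, and the rejection of unknown names are all assumptions made for this illustration. Only the property name and the symbolic names come from this page.

```java
// Hypothetical illustration only; JDOM's real lookup lives in
// org.jdom2.output.LineSeparator and may behave differently.
class EolPropertySketch {

    /** The System property named on this page. */
    static final String PROPERTY = "org.jdom2.output.LineSeparator";

    /** Maps a symbolic name to a line-ending sequence (assumed mapping). */
    static String resolve(String name) {
        if (name == null) {
            return "\r\n"; // property not set: fall back to CRLF
        }
        switch (name) {
            case "DOS":
            case "DEFAULT":
                return "\r\n";
            case "UNIX":
            case "NL":
                return "\n";
            case "NONE":
                return ""; // assumption: no sequence is modelled as empty
            case "SYSTEM":
                return System.lineSeparator();
            default:
                // Assumption: this sketch rejects names it does not know.
                throw new IllegalArgumentException(
                        "Unknown line separator name: " + name);
        }
    }

    public static void main(String[] args) {
        // A value passed as -Dorg.jdom2.output.LineSeparator=... shows up here.
        String eol = resolve(System.getProperty(PROPERTY));
        System.out.println("resolved sequence length: " + eol.length());
    }
}
```

Running the sketch with `java -Dorg.jdom2.output.LineSeparator=UNIX EolPropertySketch` reads the value through `System.getProperty`, which is the same mechanism any library uses to see `-D` settings.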

See the JavaDoc for LineSeparator to get more information.