Attribute Namespace is lost during DOM Build #69

Closed
holgergrote opened this Issue Mar 14, 2012 · 1 comment

If you build a JDOM document from a W3C DOM, the attribute namespace is lost (if you use the default Xerces DOM builder).

import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.jdom.Document;
import org.jdom.input.SAXBuilder;
import org.jdom.output.XMLOutputter;
import org.xml.sax.InputSource;

public class JdomTest {

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root xmlns:test=\"http://test.de\" test:att1=\"123456\"/>";

        final Document jdocument = new SAXBuilder().build(new StringReader(xml));

        final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        final DocumentBuilder builder = factory.newDocumentBuilder();
        final Document jdocument_withoutNamespace = new org.jdom.input.DOMBuilder().build(builder
                .parse(new InputSource(new StringReader(xml))));

        new XMLOutputter().output(jdocument, System.out);
        new XMLOutputter().output(jdocument_withoutNamespace, System.out);
    }
}

Output:

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:test="http://test.de" test:att1="123456" />
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:test="http://test.de" att1="123456" />

Here the test namespace is lost.
The reason is in DOMBuilder.buildTree(), line 265 (String attURI = att.getNamespaceURI();).
Here the Xerces implementation always returns null, and because of that the prefix is ignored in the next if statement.

This happens with JDOM 1.1.3. The JDOM2 source is the same here, so the behaviour should be the same too.
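The null return can be reproduced without JDOM at all. The following is a minimal sketch that uses only the JDK's JAXP and W3C DOM classes; the class name DefaultDomNamespaceCheck and the helper parseWithDefaults are made up for illustration, and it assumes the JDK's built-in parser is the one JAXP picks up.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of JDOM or of the original report.
public class DefaultDomNamespaceCheck {

    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<root xmlns:test=\"http://test.de\" test:att1=\"123456\"/>";

    // Parses XML with an untouched DocumentBuilderFactory (not namespace-aware)
    // and returns the attribute looked up by its qualified name.
    static Attr parseWithDefaults() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return doc.getDocumentElement().getAttributeNode("test:att1");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Attr att = parseWithDefaults();
        // The qualified name survives, but no namespace URI is recorded.
        System.out.println("name      = " + att.getName());
        System.out.println("namespace = " + att.getNamespaceURI());
    }
}
```

With the default factory settings the second line prints null for the namespace, which is the value DOMBuilder then sees.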

Collaborator

rolfl commented Mar 14, 2012

Hi there.

This is an issue with the default settings of DocumentBuilderFactory. What is happening here is that the DocumentBuilder is ignoring the namespace declarations in the input document, and it is not recording the namespaces in the resulting DOM tree. JDOM reads the DOM tree and reproduces it accurately (which in this case means it ignores the namespaces too).

The fix for this is to parse the document into a DOM tree correctly first, ensuring that the namespace details are included.

I was initially going to say 'this is a well-known problem', but after searching around a bit, I can't find an easy explanation or a well-documented fix. As it happens, the fix really is easy, and it is not a JDOM problem but a DocumentBuilderFactory default problem.

By default, DocumentBuilderFactory is not namespace-aware, so the parsers it creates ignore all namespaces in XML documents. You need to change that by adding the line

        factory.setNamespaceAware(true);

to your code, so that it looks like:

        final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        final DocumentBuilder builder = factory.newDocumentBuilder();
        final Document jdocument_withoutNamespace = new org.jdom.input.DOMBuilder().build(builder
                .parse(new InputSource(new StringReader(xml))));
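To confirm that the setting took effect before handing the tree to JDOM, you can inspect the W3C DOM directly. This is a rough sketch using only JDK classes; the class name NamespaceAwareDomCheck and the helper parseNamespaceAware are hypothetical names chosen for this example.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of JDOM.
public class NamespaceAwareDomCheck {

    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<root xmlns:test=\"http://test.de\" test:att1=\"123456\"/>";

    // Parses XML with a namespace-aware factory and looks the attribute up
    // by namespace URI and local name instead of by qualified name.
    static Attr parseNamespaceAware() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return doc.getDocumentElement().getAttributeNodeNS("http://test.de", "att1");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Attr att = parseNamespaceAware();
        System.out.println("namespace = " + att.getNamespaceURI()); // http://test.de
        System.out.println("prefix    = " + att.getPrefix());       // test
        System.out.println("localName = " + att.getLocalName());    // att1
    }
}
```

Once getNamespaceURI() returns the URI on the DOM side, the check in DOMBuilder has what it needs to keep the prefix.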

I will update the JDOM2 documentation for DOMBuilder to make this more obvious, so for the moment I will leave this open until I commit that change.

@rolfl rolfl closed this in 8e0d730 Mar 14, 2012
