# Using `minidom`

## General

`minidom` can be used to create well-formed XML output with namespaces. It can also create output that is not well formed, especially with respect to namespaces, because it is only partially namespace-aware. This matters a lot if you are going to explore a document that you construct with `minidom`. If you are going to serialize the document as soon as you’ve built it, all that matters is that the serialization can be parsed correctly, even if the document before serialization is a `minidom` mess.

## Read this first

Import just `Document`. Elements and `text()` nodes are created through methods of the `Document` instance, and not directly from the `Element` or `Text` classes.

* Create a document node with `d = Document()`
* Create an element (any element) with `d.createElement('gi')`, where `'gi'` is the element name
* Create a `text()` node with `d.createTextNode('content')`, where `'content'` is a string that contains the content of the text node
* Create an attribute on an element with `parent.setAttribute('name', 'value')`, where both `'name'` and `'value'` are strings (even if the attribute represents a numerical value)
* Place a new node (element or `text()`) in the tree with `parent.appendChild(child)`, where `parent` is the parent node and `child` is the new child that is being added

## Example without namespaces

In [1]:
from xml.dom.minidom import Document

d = Document() # Create a document node
root = d.createElement('root') # Create a root element (same as creating any element)
d.appendChild(root) # Make the new element a child of the document node
new = d.createElement('new') # Create another element
root.appendChild(new) # Make the new element a child of the root
new.setAttribute('type', 'experimental') # Set an attribute on the new element
new.setAttribute('number', '1') # Set another attribute, with a string value even though it’s a number
text = d.createTextNode('hi, mom!') # Create a text() node
new.appendChild(text) # Make the text() node a child of one of the elements
print(d.toxml()) # Serialize
print("---") # Visual separator, just for appearances
print(d.toprettyxml()) # Or pretty-print the serialization

<?xml version="1.0" ?><root><new number="1" type="experimental">hi, mom!</new></root>
---
<?xml version="1.0" ?>
<root>
	<new number="1" type="experimental">hi, mom!</new>
</root>
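
If all that matters is that the serialization can be parsed correctly, one way to check is to parse it straight back in. The following is a minimal sketch, not part of the original example: it rebuilds a reduced version of the document above and uses `parseString()` from the same module, which raises an exception when its input is not well formed.

```python
from xml.dom.minidom import Document, parseString

# Rebuild a reduced version of the document from the example above
d = Document()
root = d.createElement('root')
d.appendChild(root)
new = d.createElement('new')
new.setAttribute('type', 'experimental')
new.appendChild(d.createTextNode('hi, mom!'))
root.appendChild(new)

# Parse the serialization back in; this raises an exception if it is not well formed
reparsed = parseString(d.toxml())
element = reparsed.getElementsByTagName('new')[0]
print(element.getAttribute('type'))  # the attribute value survives the round trip
print(element.firstChild.data)       # so does the text() node
```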



### Working with namespaces

* Create an element in a namespace with `new = d.createElementNS('http://www.tei-c.org/ns/1.0', 'TEI')`. The first argument is the namespace URI and the second is the qualified element name. A prefix on the qualified name is optional. `minidom` writes the prefix when you specify it and not when you don’t. It’s up to the user to specify the prefix when it’s needed.
* Create an attribute in a namespace with `parent.setAttributeNS('http://www.obdurodon.org', 'djb:note', 'hi, mom!')`. `parent` is the parent of the new attribute; the arguments to the method are 1) namespace URI, 2) qualified attribute name, 3) attribute value (as a string). Attribute creation has the same namespace oddity about prefixes as element creation.
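* `minidom` doesn’t write namespace declarations on serialization! As a result, declaring namespaces and binding them to prefixes requires a hack: set the declaration as an ordinary attribute, e.g. `root.setAttribute("xmlns", "http://example.net/ns")` for a default namespace or `root.setAttribute("xmlns:tei", "http://www.tei-c.org/ns/1.0")` to bind a prefix. If you don’t do this, you risk using prefixes without binding them, which is not well formed.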
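
The missing declaration is easy to see in a small experiment. This sketch assumes nothing beyond the standard library; the constant name `TEI_NS` and the variable names are chosen here for illustration. It serializes a prefixed element before and after the declaration is added by hand, and tries to parse each result.

```python
from xml.dom.minidom import Document, parseString
from xml.parsers.expat import ExpatError

TEI_NS = 'http://www.tei-c.org/ns/1.0'  # illustrative constant name

d = Document()
root = d.createElementNS(TEI_NS, 'tei:TEI')  # really in the namespace, with a prefix
d.appendChild(root)

undeclared = d.toxml()
print(undeclared)  # the tei: prefix is written, but no xmlns:tei declaration

try:
    parseString(undeclared)
    parsed_ok = True
except ExpatError as err:
    parsed_ok = False
    print('Not well formed:', err)

# The workaround: write the declaration yourself as an ordinary attribute
root.setAttribute('xmlns:tei', TEI_NS)
declared = d.toxml()
print(declared)
reparsed = parseString(declared)  # parses now that the prefix is bound
print(reparsed.documentElement.namespaceURI)
```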

### Two perspectives on creating XML with `minidom`

If your only goal in creating an XML document with `minidom` is to serialize it, the only strict requirements are that the serialization be well formed and that it express your intention, so that it will be parsed correctly downstream. This means, for example, that you can create an element with a namespace prefix without actually creating the element in the namespace, e.g. by using `imposter = d.createElement('tei:imposter')` instead of `imposter = d.createElementNS('http://www.tei-c.org/ns/1.0', 'tei:imposter')`. Doing this will let you serialize your document correctly (as long as the prefix is declared), but you won’t be able to query the in-memory document by namespace, because its nodes will be lying about their markup.

If you need to interact with the document you are creating using `getElementsByTagNameNS()` or something similar, you must create elements (and attributes) with namespaces correctly. If you do this, you will be able to interact with your document correctly (see the examples of `getElementsByTagNameNS()`, below), but it will not be serialized correctly unless you have also taken care to write your own namespace declarations, and to use prefixes where needed.
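
The difference between the two perspectives shows up in the `namespaceURI` property of each node. As a small sketch (the variable names `real` and `imposter` are illustrative), both elements below serialize with the same prefix, but only the one created with `createElementNS()` knows which namespace it belongs to.

```python
from xml.dom.minidom import Document

TEI_NS = 'http://www.tei-c.org/ns/1.0'  # illustrative constant name

d = Document()
real = d.createElementNS(TEI_NS, 'tei:fileDesc')  # created in the namespace
imposter = d.createElement('tei:imposter')        # prefix only, no namespace

# Both serialize with the tei: prefix
print(real.toxml())
print(imposter.toxml())

# Only one of them is really in the TEI namespace
print(real.namespaceURI)
print(imposter.namespaceURI)  # None
```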

### Example with namespaces

In [2]:
from xml.dom.minidom import Document

d = Document() # Create a document node
root = d.createElementNS('http://www.tei-c.org/ns/1.0', 'TEI') 
root.setAttribute("xmlns", "http://www.tei-c.org/ns/1.0") # minidom won’t write the namespace declaration otherwise
root.setAttribute("xmlns:tei", "http://www.tei-c.org/ns/1.0") # ditto
root.setAttribute("xmlns:djb", "http://www.obdurodon.org") # ditto
d.appendChild(root)
teiHeader = d.createElementNS('http://www.tei-c.org/ns/1.0','teiHeader') # no prefix, for no particular reason
root.appendChild(teiHeader)
fileDesc = d.createElementNS('http://www.tei-c.org/ns/1.0','tei:fileDesc') # with prefix, for no particular reason
fileDesc.setAttributeNS('http://www.obdurodon.org', 'djb:note', 'hi, mom!')
teiHeader.appendChild(fileDesc)
interloper = d.createElement('interloper') # not in a namespace
fileDesc.appendChild(interloper)
imposter = d.createElement('tei:imposter') # prefix, but not in a namespace
fileDesc.appendChild(imposter)
print(d.toprettyxml())

<?xml version="1.0" ?>
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:djb="http://www.obdurodon.org" xmlns:tei="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<tei:fileDesc djb:note="hi, mom!">
			<interloper/>
			<tei:imposter/>
		</tei:fileDesc>
	</teiHeader>
</TEI>



To verify that the document is namespace-aware internally, run the following five queries. The first two should return the matching element; the last three should return empty lists.

The first two find the namespaced elements because they really are in the TEI namespace. #3 finds nothing because, although we serialize the document with a declaration that makes the TEI namespace the default for elements without namespace prefixes, `<interloper>` nonetheless is not in a namespace because we didn’t create it in one. #4 finds nothing because we didn’t create `<tei:imposter>` in a namespace either. We (ab)used the `tei:` prefix, and that gets serialized, but the element nonetheless isn’t really in the TEI namespace. #5 finds nothing because the second argument of `getElementsByTagNameNS()` is the local name, without a prefix, regardless of whether a prefix was used when the element was created.

In [3]:
print(d.getElementsByTagNameNS('http://www.tei-c.org/ns/1.0', 'teiHeader')) # no prefix in doc, but in a namespace
print(d.getElementsByTagNameNS('http://www.tei-c.org/ns/1.0', 'fileDesc')) # prefix in doc, and in a namespace
print(d.getElementsByTagNameNS('http://www.tei-c.org/ns/1.0', 'interloper')) # knows it isn't in the TEI namespace
print(d.getElementsByTagNameNS('http://www.tei-c.org/ns/1.0', 'imposter')) # knows it isn't in the TEI namespace, despite fake prefix in doc
print(d.getElementsByTagNameNS('http://www.tei-c.org/ns/1.0', 'tei:fileDesc')) # doesn’t accept the prefix here

[<DOM Element: teiHeader at 0x102439f20>]
[<DOM Element: tei:fileDesc at 0x102439df0>]
[]
[]
[]
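
The in-memory document and its serialization can disagree about the same element. The sketch below rebuilds only the relevant part of the example (a TEI root with a default namespace declaration and an `<interloper>` created without a namespace) and runs the same query before and after a round trip through `parseString()`. The names `before` and `after` are illustrative.

```python
from xml.dom.minidom import Document, parseString

TEI_NS = 'http://www.tei-c.org/ns/1.0'  # illustrative constant name

d = Document()
root = d.createElementNS(TEI_NS, 'TEI')
root.setAttribute('xmlns', TEI_NS)               # default namespace declaration, written by hand
d.appendChild(root)
root.appendChild(d.createElement('interloper'))  # created without a namespace

# In memory, the element is not in the TEI namespace
before = d.getElementsByTagNameNS(TEI_NS, 'interloper')

# After serializing and parsing, the default declaration applies to it
after = parseString(d.toxml()).getElementsByTagNameNS(TEI_NS, 'interloper')

print(len(before), len(after))
```

The same query finds nothing in the document as built and one element in the reparsed copy, which is why a document that will be explored in memory needs its elements created with `createElementNS()`.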


### Namespace summary

`minidom` is only partially namespace-aware. It will create broken documents with namespace errors unless the user takes steps to prevent it. All of the peculiarities listed below can result in creating documents that are not well formed.

* Whether `minidom` writes the prefix into a serialization depends only on whether the prefix was specified when the element was created. It has nothing to do with whether the element is really in a namespace. It is possible to create and serialize an element with a namespace prefix that is not really in the namespace. It is also possible to create an element in a namespace without a prefix. When you serialize it, it will emerge without a prefix, which means that in the serialization it will be in whatever the default namespace is for its context (or in no namespace, if no default is declared).
* `minidom` will not write namespace declarations (default or prefix-binding) unless you create them explicitly as attributes. This means that you can create an element with a prefix and serialize output where the prefix is not actually declared, which would not be well-formed.
* `minidom` will let you create and serialize an element with a namespace prefix with `createElement()` even though the element is not really in a namespace. An element is in a namespace only if it is created with `createElementNS()`.
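
One way to keep these peculiarities from causing trouble is to route element creation and namespace declaration through a pair of small helpers. This is only a sketch under stated assumptions: `declare_namespace()` and `create_in_namespace()` are hypothetical functions written for this illustration, not part of `minidom`, and they use only the methods described above.

```python
from xml.dom.minidom import Document, parseString

def declare_namespace(element, uri, prefix=None):
    """Hypothetical helper: write a namespace declaration as a plain attribute."""
    element.setAttribute(f'xmlns:{prefix}' if prefix else 'xmlns', uri)

def create_in_namespace(doc, uri, local_name, prefix=None):
    """Hypothetical helper: create an element that is really in `uri`,
    with a qualified name that carries the prefix when one is given."""
    qname = f'{prefix}:{local_name}' if prefix else local_name
    return doc.createElementNS(uri, qname)

TEI_NS = 'http://www.tei-c.org/ns/1.0'  # illustrative constant name

d = Document()
root = create_in_namespace(d, TEI_NS, 'TEI')
declare_namespace(root, TEI_NS)         # default namespace
declare_namespace(root, TEI_NS, 'tei')  # same namespace, bound to the tei: prefix
d.appendChild(root)
root.appendChild(create_in_namespace(d, TEI_NS, 'teiHeader', 'tei'))

# The element is found in the TEI namespace both in memory and after a round trip
in_memory = d.getElementsByTagNameNS(TEI_NS, 'teiHeader')
reparsed = parseString(d.toxml()).getElementsByTagNameNS(TEI_NS, 'teiHeader')
print(len(in_memory), len(reparsed))
```

The helpers do not check that a prefix has been declared before it is used; that remains the user’s responsibility.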