/
FoX_dom.html
779 lines (586 loc) · 37.3 KB
/
FoX_dom.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>FoX_dom</title>
<link rel="stylesheet" type="text/css" href="DoX.css"/>
</head>
<body>
<div class="DoX">
<h1>DOM</h1>
<h2>Overview</h2>
<p>The FoX DOM interface exposes an API as specified by the W3C DOM Working group.</p>
<p>FoX implements essentially all of DOM Core Levels 1 and 2, (there are a number of minor exceptions which are listed below) and a substantial portion of DOM Core Level 3.</p>
<ul>
<li><a href="#DomQuickOverview">Quick overview of how to map the DOM interface to Fortran</a> </li>
<li><a href="#DomDetailedInterface">More detailed explanation of Fortran interface</a> </li>
<li><a href="#DomUtilityFunctions">Additional (non-DOM) utility functions</a> </li>
<li><a href="#DomString">String handling</a> </li>
<li><a href="#DomException">Exception handling</a> </li>
<li><a href="#DomLiveNodelists">Live nodelists</a> </li>
<li><a href="#DomConfiguration">DOM Configuration</a> </li>
<li><a href="#DomMiscellanea">Miscellanea</a></li>
</ul>
<h2>Interface Mapping</h2>
<p><a name="DomQuickOverview"/></p>
<p>FoX implements all objects and methods mandated in DOM Core Level 1 and 2. (A listing of supported DOM Core Level 3 interfaces is given below.)</p>
<p>In all cases, the mapping from DOM interface to Fortran implementation is as follows:</p>
<ol>
<li>All DOM objects are available as Fortran types, and should be referenced only as pointers (though see 7 and 8 below). Thus, to use a Node, it must be declared first as: <br />
<code>type(Node), pointer :: aNode</code></li>
<li>A flat (non-inheriting) object hierarchy is used. All DOM objects which inherit from Node are represented as Node types. </li>
<li>All object method calls are modelled as functions or subroutines with the same name, whose first argument is the object. Thus: <br />
<code>aNodelist = aNode.getElementsByTagName(tagName)</code> <br />
should be converted to Fortran as: <br />
<code>aNodelist => getElementsByTagName(aNode, tagName)</code> </li>
<li>All object method calls whose return type is void are modelled as subroutines. Thus: <br />
<code>aNode.normalize()</code> <br />
becomes
<code>call normalize(aNode)</code> </li>
<li>All object attributes are modelled as a pair of get/set calls (or only get where the attribute is readonly), with the naming convention being merely to prepend get or set to the attribute name. Thus: <br />
<code>name = node.nodeName</code> <br />
<code>node.nodeValue = string</code> <br />
should be converted to Fortran as <br />
<code>name = getnodeName(node)</code> <br />
<code>call setnodeValue(string)</code> </li>
<li>Where an object method or attribute getter returns a DOM object, the relevant Fortran function must always be used as a pointer function. Thus: <br />
<code>aNodelist => getElementsByTagName(aNode, tagName)</code> </li>
<li>No special DOMString object is used - all string operations are done on the standard Fortran character strings, and all functions that return DOMStrings return Fortran character strings.</li>
<li>Exceptions are modelled by every DOM subroutine/function allowing an optional additional argument, of type DOMException. For further information see (DOM Exceptions.html) below. </li>
</ol>
<h3>String handling</h3>
<p><a name="DomString"/></p>
<p>The W3C DOM requires that a <code>DOMString</code> object exist, capable of holding Unicode strings; and that all DOM functions accept and emit DOMString objects when string data is to be transferred.</p>
<p>FoX does not follow this model. Since (as mentioned elsewhere) it is impossible to perform Unicode I/O in standard Fortran, it would be obtuse to require users to manipulate additional objects merely to transfer strings. Therefore, wherever the DOM mandates use of a <code>DOMString</code>, FoX merely uses standard Fortran character strings.</p>
<p>All functions or subroutines which expect DOMString input arguments should be used with normal character strings. <br />
All functions which should return DOMString objects will return Fortran character strings.</p>
<h3>Using the FoX DOM library.</h3>
<p>All functions are exposed through the module <code>FoX_DOM</code>. <code>USE</code> this in your program:</p>
<pre><code>program dom_example
use FoX_DOM
type(Node) :: myDoc
myDoc => parseFile("fileIn.xml")
call serialize(myDoc, "fileOut.xml")
end program dom_example
</code></pre>
<h2>Documenting DOM functions</h2>
<p><a name="DomDetailedInterface"/></p>
<p>This manual will not exhaustively document the functions available through the <code>Fox_DOM</code> interface. Primary documentation may be found in the W3C DOM specifications:`</p>
<ul>
<li><a href="http://www.w3.org/TR/REC-DOM-Level-1/">DOM Core Level 1</a></li>
<li><a href="http://www.w3.org/TR/DOM-Level-2-Core/">DOM Core Level 2</a></li>
<li><a href="http://www.w3.org/TR/DOM-Level-3-Core/">DOM Core Level 3</a></li>
</ul>
<p>The systematic rules for translating the DOM interfaces to Fortran are given in the previous section. For completeness, though, there is a list here. The W3C specifications should be consulted for the use of each.</p>
<p>DOMImplementation: <br />
<code>type(DOMImplementation), pointer</code></p>
<ul>
<li><code>hasFeature(impl, feature, version)</code> </li>
<li><code>createDocumentType(impl, qualifiedName, publicId, systemId)</code> </li>
<li><code>createDocument(impl, qualifiedName, publicId, systemId)</code> </li>
</ul>
<p>Document:
<code>type(Node), pointer</code></p>
<ul>
<li><code>getDocType(doc)</code></li>
<li><code>getImplementation(doc)</code> </li>
<li><code>getDocumentElement(doc)</code> </li>
<li><code>createElement(doc, tagname)</code> </li>
<li><code>createDocumentFragment(doc)</code> </li>
<li><code>createTextNode(doc, data)</code> </li>
<li><code>createComment(doc, data)</code> </li>
<li><code>createCDataSection(doc, data)</code> </li>
<li><code>createProcessingInstruction(doc, target, data)</code> </li>
<li><code>createAttribute(doc, name)</code> </li>
<li><code>createEntityReference(doc, name)</code> </li>
<li><code>getElementsByTagName(doc, tagname)</code> </li>
<li><code>importNode(doc, importedNode, deep)</code> </li>
<li><code>createElementNS(doc, namespaceURI, qualifiedName)</code> </li>
<li><code>createAttributeNS(doc, namespaceURI, qualifiedName)</code> </li>
<li><code>getElementsByTagNameNS(doc, namespaceURI, qualifiedName)</code> </li>
<li><code>getElementById(doc, elementId)</code></li></li>
</ul>
<p>Node: <br />
<code>type(Node), pointer</code></p>
<ul>
<li><code>getNodeName(arg)</code> </li>
<li><code>getNodeValue(arg)</code> </li>
<li><code>setNodeValue(arg, value)</code> </li>
<li><code>getNodeType(arg)</code> </li>
<li><code>getParentNode(arg)</code> </li>
<li><code>getChildNodes(arg)</code> </li>
<li><code>getFirstChild(arg)</code> </li>
<li><code>getLastChild(arg)</code> </li>
<li><code>getPreviousSibling(arg)</code> </li>
<li><code>getNextSibling(arg)</code> </li>
<li><code>getAttributes(arg)</code> </li>
<li><code>getOwnerDocument(arg)</code> </li>
<li><code>insertBefore(arg, newChild, refChild)</code> </li>
<li><code>replaceChild(arg, newChild, refChild)</code> </li>
<li><code>removeChild(arg, oldChild)</code> </li>
<li><code>appendChild(arg, newChild)</code> </li>
<li><code>hasChildNodes(arg)</code> </li>
<li><code>cloneNode(arg, deep)</code> </li>
<li><code>normalize</code> </li>
<li><code>isSupported(arg, feature, version)</code> </li>
<li><code>getNamespaceURI(arg)</code> </li>
<li><code>getPrefix(arg)</code> </li>
<li><code>setPrefix(arg, prefix)</code> </li>
<li><code>getLocalName(arg)</code> </li>
<li><code>hasAttributes(arg)</code> <br />
</ul></li>
</ul>
<p>NodeList: <br />
<code>type(NodeList), pointer</code></p>
<ul>
<li><code>item(arg, index)</code> </li>
<li><code>getLength(arg)</code> </li>
</ul>
<p>NamedNodeMap: <br />
<code>type(NamedNodeMap), pointer</code></p>
<ul>
<li><code>getNamedItem(map, name)</code> </li>
<li><code>setNamedItem(map, arg)</code> </li>
<li><code>removeNamedItem(map, name)</code> </li>
<li><code>item(map, index)</code> </li>
<li><code>getLength(map)</code> </li>
<li><code>getNamedItemNS(map, namespaceURI, qualifiedName)</code> </li>
<li><code>setNamedItemNS(map, arg)</code> </li>
<li><code>removeNamedItemNS(map, namespaceURI, qualifiedName)</code> </li>
</ul>
<p>CharacterData: <br />
<code>type(Node), pointer</code></p>
<ul>
<li><code>getData(np)</code> </li>
<li><code>setData(np, data)</code> </li>
<li><code>getLength(np)</code> </li>
<li><code>substringData(np, offset, count)</code> </li>
<li><code>appendData(np, arg)</code> </li>
<li><code>deleteData(np, offset, count)</code> </li>
<li><code>replaceData(np, offset, count, arg)</code></li>
</ul>
<p>Attr: <br />
<code>type(Node), pointer</code></p>
<ul>
<li><code>getName(np)</code> </li>
<li><code>getSpecified(np)</code> </li>
<li><code>getValue(np)</code> </li>
<li><code>setValue(np, value)</code> </li>
<li><code>getOwnerElement(np)</code></li>
</ul>
<p>Element: <br />
<code>type(Node), pointer</code></p>
<ul>
<li><code>getTagName(np)</code> </li>
<li><code>getAttribute(np, name)</code> </li>
<li><code>setAttribute(np, name, value)</code> </li>
<li><code>removeAttribute(np, name)</code> </li>
<li><code>getAttributeNode(np, name)</code> </li>
<li><code>setAttributeNode(np, newAttr)</code> </li>
<li><code>removeAttributeNode(np, oldAttr)</code> </li>
<li><code>getElementsByTagName(np, name)</code> </li>
<li><code>getAttributeNS(np, namespaceURI, qualifiedName)</code> </li>
<li><code>setAttributeNS(np, namespaceURI, qualifiedName, value)</code> </li>
<li><code>removeAttributeNS(np, namespaceURI, qualifiedName)</code> </li>
<li><code>getAttributeNode(np, namespaceURI, qualifiedName)</code> </li>
<li><code>setAttributeNode(np, newAttr)</code> </li>
<li><code>removeAttributeNode(np, oldAttr)</code> </li>
<li><code>getElementsByTagNameNS(np, namespaceURI, qualifiedName)</code> </li>
<li><code>hasAttribute(np, name)</code> </li>
<li><code>hasAttributeNS(np, namespaceURI, qualifiedName)</code> </li>
</ul>
<p>Text: <br />
<code>type(Node), pointer</code></p>
<ul>
<li><code>splitText(np, offset)</code> </li>
</ul>
<p>DocumentType: <br />
<code>type(Node), pointer</code></p>
<ul>
<li><code>getName(np)</code> </li>
<li><code>getEntites(np)</code> </li>
<li><code>getNotations(np)</code> </li>
<li><code>getPublicId(np)</code> </li>
<li><code>getSystemId(np)</code> </li>
<li><code>getInternalSubset(np)</code> </li>
</ul>
<p>Notation: <br />
<code>type(Node), pointer</code> </p>
<ul>
<li><code>getPublicId(np)</code> </li>
<li><code>getSystemId(np)</code> </li>
</ul>
<p>Entity: <br />
<code>type(Node), pointer</code></p>
<ul>
<li><code>getPublicId(np)</code> </li>
<li><code>getSystemId(np)</code> </li>
<li><code>getNotationName(np)</code> </li>
</ul>
<p>ProcessingInstruction: <br />
<code>type(Node), pointer</code></p>
<ul>
<li><code>getTarget(np)</code> </li>
<li><code>getData(np)</code> </li>
<li><code>setData(np, data)</code> </li>
</ul>
<p>In addition, the following DOM Core Level 3 functions are available:</p>
<p>Document:</p>
<ul>
<li><code>getDocumentURI(np)</code> </li>
<li><code>setDocumentURI(np, documentURI)</code> </li>
<li><code>getDomConfig(np)</code> </li>
<li><code>getInputEncoding(np)</code> </li>
<li><code>getStrictErrorChecking(np)</code> </li>
<li><code>setStrictErrorChecking(np, strictErrorChecking)</code> </li>
<li><code>getXmlEncoding(np)</code> </li>
<li><code>getXmlStandalone(np)</code> </li>
<li><code>setXmlStandalone(np, xmlStandalone)</code> </li>
<li><code>getXmlVersion(np)</code> </li>
<li><code>setXmlVersion(np, xmlVersion)</code> </li>
<li><code>adoptNode(np, source)</code> </li>
<li><code>normalizeDocument(np)</code> </li>
<li><code>renameNode(np, namespaceURI, qualifiedName)</code> </li>
</ul>
<p>Node:</p>
<ul>
<li><code>getBaseURI(np)</code> </li>
<li><code>getTextContent(np)</code> </li>
<li><code>setTextContent(np, textContent)</code> </li>
<li><code>isEqualNode(np, other)</code></li>
<li><code>isSameNode(np)</code> </li>
<li><code>isDefaultNamespace(np, namespaceURI)</code> </li>
<li><code>lookupPrefix(np, namespaceURI)</code> </li>
<li><code>lookupNamespaceURI(np, prefix)</code> </li>
</ul>
<p>Attr:</p>
<ul>
<li><code>getIsId(np)</code></li>
</ul>
<p>Entity: </p>
<ul>
<li><code>getInputEncoding(np)</code> </li>
<li><code>getXmlVersion(np)</code> </li>
<li><code>getXmlEncoding(np)</code> </li>
</ul>
<p>Text:</p>
<ul>
<li><code>getIsElementContentWhitespace(np)</code> </li>
</ul>
<p>DOMConfiguration: <br />
<code>type(DOMConfiguration)</code></p>
<ul>
<li><code>canSetParameter(arg, name, value)</code> </li>
<li><code>getParameter(arg, name)</code> </li>
<li><code>getParameterNames(arg)</code> </li>
<li><code>setParameter(arg, name)</code> </li>
</ul>
<p><em>NB For details on DOMConfiguration, see <a href="#DomConfiguration">below</a></em></p>
<h3>Object Model</h3>
<p>The DOM is written in terms of an object model involving inheritance, but also permits a flattened model. FoX implements this flattened model - all objects descending from the Node are of the opaque type <code>Node</code>. Nodes carry their own type, and attempts to call functions defined on the wrong nodetype (for example, getting the <code>target</code> of a node which is not a PI) will result in a <code>FoX_INVALID_NODE</code> exception.</p>
<p>The other types available through the FoX DOM are:</p>
<ul>
<li><code>DOMConfiguration</code> </li>
<li><code>DOMException</code> </li>
<li><code>DOMImplementation</code> </li>
<li><code>NodeList</code> </li>
<li><code>NamedNodeMap</code></li>
</ul>
<h3>FoX DOM and pointers</h3>
<p>All DOM objects exposed to the user may only be manipulated through pointers. Attempts to access them directly will result in compile-time or run-time failures according to your environment.</p>
<p>This should have little effect on the structure of your programs, except that you must always remember, when calling a DOM function, to perform pointer assignment, not direct assignment, thus: <br />
<code>child => getFirstChild(parent)</code> <br />
and <em>not</em> <br />
<code>child = getFirstChild(parent)</code> </p>
<h3>Memory handling</h3>
<p>Fortran offers no garbage collection facility, so unfortunately a small degree of memory
handling is necessarily exposed to the user.</p>
<p>However, this has been kept to a minimum. FoX keeps track of all memory allocated and used when calling DOM routines, and keeps references to all DOM objects created.</p>
<p>The only memory handling that the user needs to take care of is destroying any
DOM Documents (whether created manually, or by the <code>parse()</code> routine.) All other nodes or node structures created will be destroyed automatically by the relevant <code>destroy()</code> call.</p>
<p>As a consequence of this, all DOM objects which are part of a given document will become inaccessible after the document object is destroyed.</p>
<h2>Additional functions.</h2>
<p><a name="DomUtilityFunctions"/></p>
<p>Several additional utility functions are provided by FoX.</p>
<h3>Input and output of XML data</h3>
<p>Firstly, to construct a DOM tree, from either a file or a string containing XML data.</p>
<ul>
<li><code>parseFile</code> <br />
<strong>filename</strong>: <em>string</em> <br />
(<strong>configuration</strong>): <em>DOMConfiguration</em> <br />
(<strong>ex</strong>): <em>DOMException</em> </li>
</ul>
<p><strong>filename</strong> should be an XML document. It will be opened and parsed into a DOM tree. The parsing is performed by the FoX SAX parser; if the XML document is not well-formed, a <code>PARSE_ERR</code> exception will be raised. <strong>configuration</strong> is an optional argument - see <a href="#DomConfiguration">DOMConfiguration</a> for its meaning.</p>
<ul>
<li><code>parseString</code> <br />
<strong>XMLstring</strong>: <em>string</em> <br />
(<strong>configuration</strong>): <em>DOMConfiguration</em> <br />
(<strong>ex</strong>): <em>DOMException</em></li>
</ul>
<p><strong>XMLstring</strong> should be a string containing XML data. It will be parsed into a DOM tree. The parsing is performed by the FoX SAX parser; if the XML document is not well-formed, a <code>PARSE_ERR</code> exception will be raised. <strong>configuration</strong> is an optional argument - see <a href="#DomConfiguration">DOMConfiguration</a> for its meaning.</p>
<p>Both <code>parseFile</code> and <code>parseString</code> return a pointer to a <code>Node</code> object containing the Document Node.`</p>
<p>Secondly, to output an XML document:</p>
<ul>
<li><code>serialize</code> <br />
<strong>arg</strong>: <em>Node, pointer</em>
<strong>fileName</strong>: <em>string</em> </li>
</ul>
<p>This will open <code>fileName</code> and serialize the DOM tree by writing into the file. If <code>fileName</code> already exists, it will be overwritten. If an problem arises in serializing the document, then a fatal error will result.</p>
<p>(Control over serialization options is done through the configuration of the <strong>arg</strong>'s ownerDocument, see <a href="#DomConfiguration">below</a>.)</p>
<p>Finally, to clean up all memory associated with the DOM, it is necessary to call:</p>
<ul>
<li><code>destroy</code> <br />
<strong>np</strong>: <em>Node, pointer</em></li>
</ul>
<p>This will clear up all memory usage associated with the document (or documentType) node passed in.</p>
<h3>Extraction of data from an XML file.</h3>
<p><a name="dataExtraction"/></p>
<p>The standard DOM functions only deal with string data. When dealing with numerical (or logical) data,
the following functions may be of use.</p>
<ul>
<li><code>extractDataContent</code> </li>
<li><code>extractDataAttribute</code> </li>
<li><code>extractDataAttributeNS</code></li>
</ul>
<p>These extract data from, respectively, the text content of an element, from one of its attributes, or from one of its namespaced attributes.
They are used like so:</p>
<p>(where <code>p</code> is an element which has been selected by means of the other DOM functions)</p>
<pre><code>call extractDataContent(p, data)
</code></pre>
<p>The subroutine will look at the text contents of the element, and interpret according to the type of <code>data</code>. That is, if <code>data</code> has been declared as an <code>integer</code>, then the contents of <code>p</code> will be read as such an placed into <code>data</code>.</p>
<p><code>data</code> may be a <code>string</code>, <code>logical</code>, <code>integer</code>, <code>real</code>, <code>double precision</code>, <code>complex</code> or <code>double complex</code> variable.</p>
<p>In addition, if <code>data</code> is supplied as a rank-1 or rank-2 variable (ie an array or a matrix) then the data will be read in assuming it to be a space- or comma-separated list of such data items.</p>
<p>Thus, the array of integers within the XML document:</p>
<pre><code><element> 1 2 3 4 5 6 </element>
</code></pre>
<p>could be extracted by the following Fortran program:</p>
<pre><code>type(Node), pointer :: doc, p
integer :: i_array(6)
doc => parseFile(filename)
p => item(getElementsByTagName(doc, "element"), 0)
call extractDataContent(p, i_array)
</code></pre>
<h4>Contents and Attributes</h4>
<p>For extracting data from text content, the example above suffices. For data in a non-namespaced attribute (in this case, a 2x2 matrix of real numbers)</p>
<pre><code><element att="0.1, 2.3 7.56e23, 93"> Some uninteresting text </element>
</code></pre>
<p>then use a Fortran program like:</p>
<pre><code>type(Node), pointer :: doc, p
real :: r_matrix(2,2)
doc => parseFile(filename)
p => item(getElementsByTagName(doc, "element"), 0)
call extractDataAttribute(p, "att", r_matrix)
</code></pre>
<p>or for extracting from a namespaced attribute (in this case, a length-2 array of complex numbers):</p>
<pre><code><myml xmlns:ns="http://www.example.org">
<element ns:att="0.1,2.3 3.4e2,5.34"> Some uninteresting text </element>
</myml>
</code></pre>
<p>then use a Fortran program like:</p>
<pre><code>type(Node), pointer :: doc, p
complex :: c_array(2)
doc => parseFile(filename)
p => item(getElementsByTagName(doc, "element"), 0)
call extractDataAttributeNS(p, &
namespaceURI="http://www.example.org", localName="att", &
data=c_array)
</code></pre>
<h4>Error handling</h4>
<p>The extraction may fail of course, if the data is not of the sort specified, or if there are not enough elements to fill the array or matrix. In such a case, this can be detected by the optional arguments <code>num</code> and <code>iostat</code>. </p>
<p><code>num</code> will hold the number of items successfully read. Hopefully this should be equal to the expected number of items; but it may be less if reading failed for some reason, or if there were less items than expected in the element.</p>
<p><code>iostat</code> will hold an integer - this will be <code>0</code> if the extraction went ok; <code>-1</code> if too few elements were found, <code>1</code> if although the read went ok, there were still some elements left over, or <code>2</code> if the extraction failed due to either a badly formatted number, or due to the wrong data type being found.</p>
<h4>String arrays</h4>
<p>For all data types apart from strings, arrays and matrices are specified by space- or comma-separated lists. For strings, some additional options are available. By default, arrays will be extracted assuming that separators are spaces (and multiple spaces are ignored). So:</p>
<pre><code><element> one two three </element>
</code></pre>
<p>will result in the string array <code>(/"one", "two", "three"/)</code>.</p>
<p>However, you may specify an optional argument <code>separator</code>, which specifies another single-character separator to use (and does not ignore multiple spaces). So:</p>
<pre><code><element>one, two, three </element>
</code></pre>
<p>will result in the string array <code>(/"one", " two", " three "/)</code>. (note the leading and trailing spaces).</p>
<p>Finally, you can also specify an optional logical argument, <code>csv</code>. In this case, the <code>separator</code> is ignored, and the extraction proceeds assuming that the data is a list of comma-separated values. (see: <a href="http://en.wikipedia.org/wiki/Comma-separated_values">CSV</a>)</p>
<h3>Other utility functions</h3>
<ul>
<li><code>setFoX_checks</code> <br />
<strong>FoX_checks</strong>: <em>logical</em></li>
</ul>
<p>This affects whether additional FoX-only checks are made (see <a href="#DomException">DomExceptions</a> below). </p>
<ul>
<li><code>getFoX_checks</code> <br />
<strong>arg</strong>: <em>DOMImplementation, pointer</em> </li>
</ul>
<p>Retrieves the current setting of FoX_checks.</p>
<p>Note that FoX_checks can only be turned on and off globally, not on a per-document basis.</p>
<ul>
<li><code>setLiveNodeLists</code> <br />
<strong>arg</strong>: <em>Node, pointer</em> <br />
<strong>liveNodeLists</strong>: <em>logical</em></li>
</ul>
<p><strong>arg</strong> must be a Document Node. Calling this function affects whether any nodelists active on the document are treated as live - ie whether updates to the documents are reflected in the contents of nodelists (see <a href="#DomLiveNodelists">DomLiveNodelists</a> below).</p>
<ul>
<li><code>getLiveNodeLists</code> <br />
<strong>arg</strong>: <em>Node, pointer</em></li>
</ul>
<p>Retrieves the current setting of liveNodeLists.</p>
<p>Note that the live-ness of nodelists is a per-document setting.</p>
<h3>Exception handling</h3>
<p><a name="DomException"/></p>
<p>Exception handling is important to the DOM. The W3C DOM standards provide not only interfaces to the DOM, but also specify the error handling that should take place when invalid calls are made.</p>
<p>The DOM specifies these in terms of a <code>DOMException</code> object, which carries a numeric code whose value reports the kind of error generated. Depending upon the features available in a particular computer language, this DOMException object should be generated and thrown, to be caught by the end-user application.</p>
<p>Fortran of course has no mechanism for throwing and catching exceptions. However, the behaviour of an exception can be modelled using Fortran features.</p>
<p>FoX defines an opaque <code>DOMException</code> object.
Every DOM subroutine and function implemented by FoX will take an optional argument, 'ex', of type <code>DOMException</code>. </p>
<p>If the optional argument is not supplied, any errors within the DOM will cause an immediate abort, with a suitable error message. However, if the optional argument <em>is</em> supplied, then the error will be captured within the <code>DOMException</code> object, and returned to the caller for inspection. It is then up to the application to decide how to proceed.</p>
<p>Functions for inspecting and manipulating the <code>DOMException</code> object are described below:</p>
<ul>
<li><code>inException</code>: <br />
<strong>ex</strong>: <em>DOMException</em></li>
</ul>
<p>A function returning a logical value, according to whether <code>ex</code> is in exception - that is, whether the last DOM function or subroutine, from which <code>ex</code> returned, caused an error. Note that this will not change the status of the exception.</p>
<ul>
<li><code>getExceptionCode</code> <br />
<strong>ex</strong>: <em>DOMException</em></li>
</ul>
<p>A function returning an integer value, describing the nature of the exception reported in <code>ex</code>. If the integer is 0, then <code>ex</code> does not hold an exception. If the integer is less than 200, then the error encountered was of a type specified by the DOM standard; for a full list, see below, and for explanations, see the various DOM standards. If the integer is 200 or greater, then the code represents a FoX-specific error. See the list below.</p>
<p>Note that calling <code>getExceptionCode</code> will clean up all memory associated with the DOMException object, and reset the object such that it is no longer in exception.</p>
<h4>Exception handling and memory usage.</h4>
<p>Note that when an Exception is thrown, memory is allocated within the DOMException object. Calling <code>getExceptionCode</code> on a DOMEXception will clean up this memory. If you use the exception-handling interfaces of FoX, then you must check every exception, and ensure you check its code, otherwise your program will leak memory.</p>
<h4>FoX exceptions.</h4>
<p>The W3C DOM interface allows the creation of unserializable XML document in various ways. For example, it permits characters to be added to a text node which would be invalid XML. FoX performs multiple additional checks on all DOM calls to prevent the creation of unserializable trees. These are reported through the DOMException mechanisms noted above, using additional exception codes. However, if for some reason, you want to create such trees, then it is possible to switch off all FoX-only checks. (DOM-mandated checks may not be disabled.) To do this, use the <code>setFoX_checks</code> function described in <a href="#DomUtilityFunctions">DomUtilityFunctions</a>.</p>
<p>Note that FoX does not yet currently check for all ways that a tree may be made non-serializable.</p>
<h4>List of exceptions.</h4>
<p>The following is the list of all exception codes (both specified in the W3C DOM and those related to FoX-only checks) that can be generated by FoX:</p>
<ul>
<li>INDEX_SIZE_ERR = 1</li>
<li>DOMSTRING_SIZE_ERR = 2</li>
<li>HIERARCHY_REQUEST_ERR = 3</li>
<li>WRONG_DOCUMENT_ERR = 4</li>
<li>INVALID_CHARACTER_ERR = 5</li>
<li>NO_DATA_ALLOWED_ERR = 6</li>
<li>NO_MODIFICATION_ALLOWED_ERR = 7</li>
<li>NOT_FOUND_ERR = 8</li>
<li>NOT_SUPPORTED_ERR = 9</li>
<li>INUSE_ATTRIBUTE_ERR = 10</li>
<li>INVALID_STATE_ERR = 11</li>
<li>SYNTAX_ERR = 12</li>
<li>INVALID_MODIFICATION_ERR = 13</li>
<li>NAMESPACE_ERR = 14</li>
<li>INVALID_ACCESS_ERR = 15</li>
<li>VALIDATION_ERR = 16</li>
<li>TYPE_MISMATCH_ERR = 17</li>
<li>INVALID_EXPRESSION_ERR = 51</li>
<li>TYPE_ERR = 52</li>
<li>PARSE_ERR = 81</li>
<li>SERIALIZE_ERR = 82</li>
<li>FoX_INVALID_NODE = 201</li>
<li>FoX_INVALID_CHARACTER = 202</li>
<li>FoX_NO_SUCH_ENTITY = 203</li>
<li>FoX_INVALID_PI_DATA = 204</li>
<li>FoX_INVALID_CDATA_SECTION = 205</li>
<li>FoX_HIERARCHY_REQUEST_ERR = 206</li>
<li>FoX_INVALID_PUBLIC_ID = 207</li>
<li>FoX_INVALID_SYSTEM_ID = 208</li>
<li>FoX_INVALID_COMMENT = 209</li>
<li>FoX_NODE_IS_NULL = 210</li>
<li>FoX_INVALID_ENTITY = 211</li>
<li>FoX_INVALID_URI = 212</li>
<li>FoX_IMPL_IS_NULL = 213</li>
<li>FoX_MAP_IS_NULL = 214</li>
<li>FoX_LIST_IS_NULL = 215</li>
<li>FoX_INTERNAL_ERROR = 999</li>
</ul>
<h3>Live nodelists</h3>
<p><a name="DomLiveNodelists"/></p>
<p>The DOM specification requires that all NodeList objects are <em>live</em> - that is, that any change in the document structure is immediately reflected in the contents of any nodelists.</p>
<p>For example, any nodelists returned by getElementsByTagName or getElementsByTagNameNS must be updated whenever nodes are added to or removed from the document; and the order of nodes in the nodelists must be changed if the document structure changes.</p>
<p>Though FoX does keep all nodelists live, this can impose a significant performance penalty when manipulating large documents. Therefore, FoX can be instructed to inly use 'dead' nodelists - that is, nodelists which reflect a snapshot of the document structure at the point they were created. To do this, call <code>setLiveNodeLists</code> (see API documentation).</p>
<p>However, note that the nodes within the nodelist remain live - any changes made to the nodes will be reflected in accessing them through the nodelist.</p>
<p>Furthermore, since the nodelists are still associated with the document, they and their contents will be rendered inaccessible when the document is destroyed.</p>
<h2>DOM Configuration</h2>
<p><a name="DomConfiguration"/></p>
<p>Multiple valid DOM trees may be produced from a single document. When parsing input, some of these choices are made available to the user.</p>
<p>By default, the DOM tree presented to the user will be produced according to the following criteria:</p>
<ul>
<li>there will be no adjacent text nodes </li>
<li>Cdata nodes will appear as such in the DOM tree </li>
<li>EntityReference nodes will appear in the DOM tree.</li>
</ul>
<p>However, if another tree is desired, the user may change this. For example, very often you would rather be working with the fully canonicalized tree, with all cdata sections replaced by text nodes and merged, and all entity references replaced with their contents.</p>
<p>The mechanism for doing this is the optional <code>configuration</code> argument to <code>parseFile</code> and <code>parseString</code>. <code>configuration</code> is a <code>DOMConfiguration</code> object, which may be manipulated by <code>setParameter</code> calls.</p>
<p>Note that FoX's implementation of <code>DOMConfiguration</code> does not follow the specification precisely. One <code>DOMConfiguration</code> object controls all of parsing, normalization and serialization. It can be used like so:</p>
<pre><code>use FoX_dom
implicit none
type(Node), pointer :: doc
! Declare a new configuration object
type(DOMConfiguration), pointer :: config
! Request full canonicalization
! ie convert CDATA sections to text sections, remove all entity references etc.
config => newDOMConfig()
call setParameter(config, "canonical-form", .true.)
! Turn on validation
call setParameter(config, "validate", .true.)
! parse the document
doc => parseFile("doc.xml", config)
! Do a whole lot of DOM processing ...
! change the configuration to allow cdata-sections to be preserved.
call setParameter(getDomConfig(doc), "cdata-sections", .true.)
! normalize the document again
call normalizeDocument(doc)
! change the configuration to influence the output - make sure there is an XML declaration
call setParameter(getDomConfig(doc), "xml-declaration", .true.)
! and write the document out.
call serialize(doc)
! once everything is done, destroy the doc and config
call destroy(doc)
call destroy(config)
</code></pre>
<p>The available configuration options are fully explained in:</p>
<ul>
<li><a href="http://www.w3.org/TR/DOM-Level-3-Core/core.html#DOMConfiguration">DOM Core 3</a> </li>
<li><a href="http://www.w3.org/TR/2004/REC-DOM-Level-3-LS-20040407/load-save.html#LS-LSParser">DOM Core LSParser</a> </li>
<li><a href="http://www.w3.org/TR/2004/REC-DOM-Level-3-LS-20040407/load-save.html#LS-LSSerializer">DOM Core LSSerializer</a> </li>
</ul>
<p>and are all implemented, with the exceptions of: <code>error-handler</code>, <code>schema-location</code>, and <code>schema-type</code>. <br />
In total there are 24 implemented configuration options (<code>schema-location</code> and <code>schema-type</code> are not
implemented). The options known by FoX are as follows:</p>
<ul>
<li><code>canonical-form</code> default: false, can be set to true. See note below.</li>
<li><code>cdata-sections</code> default: true, can be changed.</li>
<li><code>check-character-normalization</code> default: false, cannot be changed.</li>
<li><code>comments</code> default: true, can be changed.</li>
<li><code>datatype-normalization</code> default: false, cannot be changed.</li>
<li><code>element-content-whitespace</code> default: true, can be changed.</li>
<li><code>entities</code> default: true, can be changed.</li>
<li><code>error-handler</code> default: false, cannot be changed. This is a breach of the DOM specification.</li>
<li><code>namespaces</code> default: true, can be changed.</li>
<li><code>namespace-declarations</code> default: true, can be changed.</li>
<li><code>normalize-characters</code> default: false, cannot be changed.</li>
<li><code>split-cdata-sections</code> default: true, can be changed.</li>
<li><code>validate</code> default: false, can be changed. See note below.</li>
<li><code>validate-if-schema</code> default: false, can be changed.</li>
<li><code>well-formed</code> default true, cannot be changed.</li>
<li><code>charset-overrides-xml-encoding</code> default false, cannot be changed.</li>
<li><code>disallow-doctype</code> default false, cannot be changed.</li>
<li><code>ignore-unknown-character-denormalizations</code> default true, cannot be changed.</li>
<li><code>resource-resolver</code> default false, cannot be changed.</li>
<li><code>supported-media-types-only</code> default false, cannot be changed.</li>
<li><code>discard-default-content</code> default: true, can be changed.</li>
<li><code>format-pretty-print</code> default: false, cannot be changed.</li>
<li><code>xml-declaration</code> default: true, can be changed.</li>
<li><code>invalid-pretty-print</code> default: false, can be changed. This is a FoX specific extension which works like <code>format-pretty-print</code> but does not preseve the validity of the document.</li>
</ul>
<p>Setting <code>canonical-form</code> changes the value of <code>entities</code>, <code>cdata-sections</code>, <code>discard-default-content</code>, <code>invalid-pretty-print</code>, and <code>xml-declaration</code>to false and changes <code>namespaces</code>, <code>namespace-declarations</code>, and <code>element-content-whitespace</code> to true. Unsetting <code>canonical-form</code> causes these options to revert to the defalt settings. Changing the values of any of these options has the side effect of unsetting <code>canonical-form</code> (but does not cause the other options to be reset). Setting <code>validate</code> unsets <code>validate-if-schema</code> and vica versa.</p>
<h2>DOM Miscellanea</h2>
<p><a name="DomMiscellanea"/></p>
<p>Other issues</p>
<ul>
<li>As mentioned in the documentation for WXML, it is impossible within Fortran to reliably output lines longer than 1024 characters. While text nodes containing such lines may be created in the DOM, on serialization newlines will be inserted as described in the documentation for WXML.</li>
<li>All caveats with regard to the FoX SAX processor apply to reading documents through the DOM interface. In particular, note that documents containing characters beyond the US-ASCII set will not be readable.</li>
</ul>
<p>It was decided to implement W3C DOM interfaces primarily because they are specified in a language-agnostic fashion, and thus made Fortran implementation possible. A number of criticisms have been levelled at the W3C DOM, but many apply only from the perspective of Java developers. However, more importantly, the W3C DOM suffers from a lack of sufficient error checking so it is very easy to create a DOM tree, or manipulate an existing DOM tree into a state, that cannot be serialized into a legal XML document.</p>
<p>(Although the Level 3 DOM specifications finally addressed this issue, they did so in a fashion that was neither very useful, nor easily translatable into a Fortran API.)</p>
<p>Therefore, FoX will by default produce errors about many attempts to manipulate the DOM in such a way as would result in invalid XML. These errors can be switched off if standards-compliant behaviour is wanted. Although extensive, these checks are not complete.
In particular, the way the W3C DOM mandates namespace handling makes it trivial to produce namespace non-well-formed document trees, and very difficult for the processor to automatically detect the non-well-formedness. Thus a fully well-formed tree is only guaranteed after a suitable <code>normalizeDocument</code> call.</p>
</div>
</body>
</html>