<?xml version="1.0" encoding="UTF-8"?>
<!-- -*- nxml -*- -->
<book lang="en-us">
<title>XML::LibXML</title>
<bookinfo>
<authorgroup>
<author>
<firstname>Matt</firstname>
<surname>Sergeant</surname>
</author>
<author>
<firstname>Christian</firstname>
<surname>Glahn</surname>
</author>
<author>
<firstname>Petr</firstname>
<surname>Pajas</surname>
</author>
</authorgroup>
<edition>1.70</edition>
<copyright>
<year>2001-2007</year>
<holder>AxKit.com Ltd</holder>
</copyright>
<copyright>
<year>2002-2006</year>
<holder>Christian Glahn</holder>
</copyright>
<copyright>
<year>2006-2009</year>
<holder>Petr Pajas</holder>
</copyright>
</bookinfo>
<chapter id="README">
<title>Introduction</title>
<titleabbrev>README</titleabbrev>
<para>This module implements a Perl interface to the Gnome
libxml2 library which provides
interfaces for parsing and manipulating XML files. This
module allows Perl programmers to make use of the highly
capable validating XML parser and the high performance DOM
implementation.</para>
<sect1>
<title>Important Notes</title>
<para>XML::LibXML was almost entirely reimplemented between versions 1.40 and 1.49. This may cause problems on some production machines. With
version 1.50 a lot of compatibility fixes were applied, so programs written for XML::LibXML 1.40 or earlier should run with version 1.50 again.</para>
<para>In 1.59, a new callback API was introduced. This new API is not compatible with the previous one.
See the XML::LibXML::InputCallback manual page for details.</para>
<para>In 1.61 the XML::LibXML::XPathContext module, previously distributed separately, was merged in.</para>
<para>The experimental support for Perl threads introduced in 1.66 was replaced in 1.67.</para>
</sect1>
<sect1>
<title>Dependencies</title>
<para>Prior to installation you MUST have installed the libxml2 library. You can get the latest libxml2 version from</para>
<para>http://xmlsoft.org/</para>
<para>Without libxml2 installed this module will neither build nor run.</para>
<para>Also XML::LibXML requires the following packages:</para>
<itemizedlist>
<listitem>
<para>XML::SAX - base class for SAX parsers</para>
</listitem>
<listitem>
<para>XML::NamespaceSupport - namespace support for SAX parsers</para>
</listitem>
</itemizedlist>
<para>These packages are required. If one is missing some tests will fail.</para>
<para>Again, libxml2 is required to make XML::LibXML work. The library is not just required to build XML::LibXML, it has to be accessible during
run-time as well. Because of this you need to make sure libxml2 is installed properly. To test this, run the xmllint program on your system. xmllint
is shipped with libxml2 and therefore should be available.
To build the module you will also need the libxml2 header files, which in binary
distributions (.rpm, .deb, etc.) usually reside in a package named libxml2-devel or similar.</para>
</sect1>
<sect1>
<title>Installation</title>
<para>(These instructions are for UNIX and GNU/Linux systems. For MSWin32,
see Notes for Microsoft Windows below.)</para>
<para>To install XML::LibXML just follow the standard installation routine for Perl modules:</para>
<orderedlist>
<listitem>
<para>perl Makefile.PL</para>
</listitem>
<listitem>
<para>make</para>
</listitem>
<listitem>
<para>make test</para>
</listitem>
<listitem>
<para>make install # as superuser</para>
</listitem>
</orderedlist>
<para>Note that XML::LibXML is an XS based Perl extension and you need a C compiler
to build it.</para>
<para>Note also that you should rebuild XML::LibXML if you upgrade libxml2
in order to avoid problems with possible binary incompatibilities between releases of the library.</para>
<sect2>
<title>Notes on libxml2 versions</title>
<para>XML::LibXML requires at least
libxml2 2.6.16 to compile and pass all tests and
at least 2.6.21 is required for XML::LibXML::Reader.
For some older OS versions this means that an
update of the pre-built packages is required.</para>
<para>Although libxml2 claims binary compatibility between
its patch levels, it is a good idea to recompile XML::LibXML
and run its tests after an upgrade of libxml2.
</para>
<para>If your libxml2 installation is not within your $PATH,
you can pass the XMLPREFIX=$YOURLIBXMLPREFIX parameter to Makefile.PL
so that it can determine the correct libxml2 version in use, e.g.
</para>
<programlisting> perl Makefile.PL XMLPREFIX=/usr/brand-new </programlisting>
<para>will ask '/usr/brand-new/bin/xml2-config' about your real libxml2 configuration.</para>
<para>Try to avoid setting INC and LIBS directly on the
command-line, for if used, Makefile.PL does not check
the libxml2 version for compatibility with XML::LibXML.</para>
</sect2>
<sect2>
<title>Which version of libxml2 should be used?</title>
<para>XML::LibXML is tested against a couple of versions of
libxml2 before it is released. Thus there are versions
of libxml2 that are known not to work properly with
XML::LibXML. The Makefile.PL keeps a blacklist of
the incompatible libxml2 versions.</para>
<para>If Makefile.PL detects one of the incompatible versions,
it notifies the user. It may still happen that
XML::LibXML builds and passes its tests with such
a version, but that does not mean everything
is OK. There will be no support at all for blacklisted versions!</para>
<para>As of XML::LibXML 1.61, only versions 2.6.16 and higher are supported.
XML::LibXML will probably not compile with libxml2 versions earlier than
2.5.6. Versions prior to 2.6.8 are known to be broken for various reasons;
versions prior to 2.1.16 exhibit problems with namespaced attributes
and therefore do not pass the XML::LibXML regression tests.
</para>
<para>It may happen that an unsupported version of libxml2
passes all tests under certain conditions. This is no
reason to assume that it shall work without problems.
If Makefile.PL marks a version of libxml2 as incompatible or broken
it is done for a good reason.</para>
</sect2>
<sect2>
<title>Notes for Microsoft Windows</title>
<para>Thanks to Randy Kobes there is a pre-compiled PPM package available on</para>
<para>http://theoryx5.uwinnipeg.ca/ppmpackages/</para>
<para>Usually it takes a little time to build the package for the latest release.</para>
<para>If you want to build XML::LibXML on Windows from source, you can use
the following instructions contributed by Christopher J. Madsen:</para>
<para>These instructions assume that you already have your system set up to
compile modules that use C components.
</para>
<para>
First, get the libxml2 binaries from http://xmlsoft.org/sources/win32/
(currently also available at http://www.zlatkovic.com/pub/libxml/).
</para>
<para>
You need:
</para>
<programlisting> iconv-VERSION.win32.zip
libxml2-VERSION.win32.zip
zlib-VERSION.win32.zip</programlisting>
<para>Download the latest version of each. (Each package will probably have
a different version.) When you extract them, you'll get directories
named iconv-VERSION.win32, libxml2-VERSION.win32, and
zlib-VERSION.win32, each containing bin, lib, and include directories.</para>
<para>Combine all the bin, include, and lib directories under c:\Prog\LibXML.
(You can use any directory you prefer; just adjust the instructions
accordingly.)</para>
<para>Get the latest version of XML-LibXML from CPAN.
Extract them.</para>
<para>Issue these commands in the XML-LibXML-Common-VERSION directory:</para>
<programlisting> perl Makefile.PL INC=-Ic:\Prog\LibXML\include LIBS=-Lc:\Prog\LibXML\lib
nmake
copy c:\Prog\LibXML\bin\*.dll blib\arch\auto\XML\LibXML
nmake test
nmake install</programlisting>
<para>(Note: Some systems use dmake instead of nmake.)</para>
<para>By copying the libxml2 DLLs to the arch directory, you help avoid
conflicts with other programs you may have installed that use other
(possibly incompatible) versions of those DLLs.</para>
<para>Issue these commands in the XML-LibXML-VERSION directory:</para>
<programlisting> perl Makefile.PL INC=-Ic:\Prog\LibXML\include LIBS=-Lc:\Prog\LibXML\lib
nmake
nmake test
nmake install</programlisting>
</sect2>
<sect2>
<title>Notes for Mac OS X</title>
<para>Due to the refactoring of the module, XML::LibXML will not
run with some earlier versions of Mac OS X. It appears that this is related
to special linker options for that OS prior to version
10.2.2. Since the developers do not have full access to this OS,
help and patches from OS X gurus are highly
appreciated.</para>
<para>It is confirmed that XML::LibXML builds and runs
without problems on Mac OS X 10.2.6 and later.</para>
</sect2>
<sect2>
<title>Notes for HPUX</title>
<para>XML::LibXML requires libxml2 2.6.16 or
later. There may not exist a usable binary
libxml2 package for HPUX and XML::LibXML. If
HPUX cc does not compile libxml2
correctly, you will be forced to recompile perl with
gcc (unless you have already done that).</para>
<para>Additionally, I received the following note from Rozi Kovesdi:</para>
<programlisting>Here is my report if someone else runs into the same problem:
Finally I am done with installing all the libraries and XML Perl
modules
The combination that worked best for me was:
gcc
GNU make
Most importantly - before trying to install Perl modules that depend on
libxml2:
must set SHLIB_PATH to include the path to libxml2 shared library
assuming that you used the default:
export SHLIB=/usr/local/lib
also, make sure that the config files have execute permission:
/usr/local/bin/xml2-config
/usr/local/bin/xslt-config
they did not have +x after they were installed by 'make install'
and it took me a while to realize that this was my problem
or one can use:
perl Makefile.PL LIBS='-L/path/to/lib' INC='-I/path/to/include'</programlisting>
</sect2>
</sect1>
<sect1>
<title>Contact</title>
<para>For bug reports, please use the CPAN request tracker on http://rt.cpan.org/NoAuth/Bugs.html?Dist=XML-LibXML</para>
<para>For suggestions etc. you may contact the maintainer directly at "pajas at ufal dot mff dot cuni dot cz", but in general, it is recommended to use the mailing list given below.
</para>
<para>For suggestions etc., and other issues
related to XML::LibXML you may use the perl XML mailing list
(<email>perl-xml@listserv.ActiveState.com</email>),
where most XML-related Perl modules are discussed.
In case of problems you should check the archives of that
list first. Many problems are already discussed there. You
can find the list's archives and subscription options at
http://aspn.activestate.com/ASPN/Mail/Browse/Threaded/perl-xml</para>
</sect1>
<sect1>
<title>Package History</title>
<para>Versions &lt; 0.98 were maintained by Matt Sergeant</para>
<para>Versions &gt;= 0.98 and &lt; 1.49 were maintained by Matt Sergeant and Christian Glahn</para>
<para>Versions &gt;= 1.49 are maintained by Christian Glahn</para>
<para>Versions &gt; 1.56 are co-maintained by Petr Pajas</para>
<para>Versions &gt;= 1.59 are provisionally maintained by Petr Pajas</para>
</sect1>
<sect1>
<title>Patches and Developer Version</title>
<para>As XML::LibXML is open source software, help and
patches are appreciated. If you find a bug in the current
release, make sure this bug still exists in the developer
version of XML::LibXML. This version can be downloaded
from its Subversion repository, e.g. via</para>
<para>svn co svn://axkit.org/XML-LibXML/trunk</para>
<para>Note that this account does not allow direct commits.</para>
<para>Please consider all regression tests as correct. If
any test fails it is most certainly related to a
bug.</para>
<para>If you find documentation bugs, please fix them in
the libxml.dbk file, stored in the docs directory.</para>
</sect1>
<sect1>
<title>Known Issues</title>
<para>The push-parser implementation causes memory leaks.</para>
</sect1>
</chapter>
<chapter id="LICENSE">
<title>License</title>
<titleabbrev>LICENSE</titleabbrev>
<para>This is free software, you may use it and distribute it under the same terms as Perl itself.</para>
<para>Copyright 2001-2003 AxKit.com Ltd., 2002-2006 Christian Glahn, 2006-2009 Petr Pajas</para>
<sect1>
<title>Disclaimer</title>
<para>THIS PROGRAM IS DISTRIBUTED IN THE HOPE THAT IT WILL
BE USEFUL, BUT WITHOUT ANY WARRANTY; WITHOUT EVEN THE
IMPLIED WARRANTY OF MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE.</para>
</sect1>
</chapter>
<chapter id="XML-LibXML">
<title>Perl Binding for libxml2</title>
<titleabbrev>XML::LibXML</titleabbrev>
<sect1>
<title>Synopsis</title>
<programlisting>use XML::LibXML;
my $dom = XML::LibXML->load_xml(string => &lt;&lt;'EOT');
&lt;some-xml/&gt;
EOT</programlisting>
</sect1>
<sect1>
<title>Description</title>
<para>This module is an interface to libxml2, providing
XML and HTML parsers with DOM, SAX and XMLReader interfaces,
a large subset of the DOM Level 3 interface and
an XML::XPath-like interface to the XPath API of libxml2.
The module is split into several packages which are not described in this section;
unless stated otherwise, you only need to <literal>use XML::LibXML;</literal>
in your programs.</para>
<para>For further information, please check the following documentation:</para>
<variablelist>
<varlistentry>
<term><xref linkend="XML-LibXML-Parser"/></term>
<listitem>
<para>Parsing XML files with XML::LibXML</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-DOM"/></term>
<listitem>
<para>XML::LibXML Document Object Model (DOM) Implementation</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-SAX"/></term>
<listitem>
<para>XML::LibXML direct SAX parser</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-Reader"/></term>
<listitem>
<para>Reading XML with a pull-parser</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-Dtd"/></term>
<listitem>
<para>XML::LibXML frontend for DTD validation</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-RelaxNG"/></term>
<listitem>
<para>XML::LibXML frontend for RelaxNG schema validation</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-Schema"/></term>
<listitem>
<para>XML::LibXML frontend for W3C Schema schema validation</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-XPathContext"/></term>
<listitem>
<para>API for evaluating XPath expressions with enhanced support
for the evaluation context</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-InputCallback"/></term>
<listitem>
<para>Implementing custom URI Resolver and input callbacks</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-Common"/></term>
<listitem>
<para>Common functions for XML::LibXML related Classes</para>
</listitem>
</varlistentry>
</variablelist>
<para>The nodes in the Document Object Model (DOM) are represented by the following classes
(most of which "inherit" from <xref linkend="XML-LibXML-Node"/>):</para>
<variablelist>
<varlistentry>
<term><xref linkend="XML-LibXML-Document"/></term>
<listitem>
<para>XML::LibXML class for DOM document nodes</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-Node"/></term>
<listitem>
<para>Abstract base class for XML::LibXML DOM nodes</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-Element"/></term>
<listitem>
<para>XML::LibXML class for DOM element nodes</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-Text"/></term>
<listitem>
<para>XML::LibXML class for DOM text nodes</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-Comment"/></term>
<listitem>
<para>XML::LibXML class for comment DOM nodes</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-CDATASection"/></term>
<listitem>
<para>XML::LibXML class for DOM CDATA sections</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-Attr"/></term>
<listitem>
<para>XML::LibXML DOM attribute class</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-DocumentFragment"/></term>
<listitem>
<para>XML::LibXML's DOM L2 Document Fragment implementation</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-Namespace"/></term>
<listitem>
<para>XML::LibXML DOM namespace nodes</para>
</listitem>
</varlistentry>
<varlistentry>
<term><xref linkend="XML-LibXML-PI"/></term>
<listitem>
<para>XML::LibXML DOM processing instruction nodes</para>
</listitem>
</varlistentry>
</variablelist>
</sect1>
<sect1>
<title>Encodings support in XML::LibXML</title>
<para>Recall that since version 5.6.1, Perl distinguishes between
character strings (internally encoded in UTF-8) and so
called binary data and, accordingly, applies either
character or byte semantics to them. A scalar
representing a character string is distinguished from
a byte string by a special flag (UTF8). Please refer to <emphasis>perlunicode</emphasis> for details.
</para>
<para>
XML::LibXML's API is designed to deal with many
encodings of XML documents completely transparently, so
that the application using XML::LibXML can be completely
ignorant about the encoding of the XML documents it works with.
On the other hand, functions like <function>XML::LibXML::Document->setEncoding</function>
give the user control over the document encoding.
</para>
<para>
To ensure the aforementioned transparency and
uniformity, most functions of XML::LibXML that work with
in-memory trees accept and return data as character
strings (i.e. UTF-8 encoded with the UTF8 flag on)
regardless of the original document encoding; however,
the functions related to I/O operations (i.e. parsing
and saving) operate with binary data (in the original
document encoding) obeying the encoding declaration of
the XML documents.</para>
<para>Below we summarize basic rules and principles
regarding encoding:
</para>
<orderedlist>
<listitem><para>Do NOT apply any encoding-related PerlIO layers
(<literal>:utf8</literal> or <literal>:encoding(...)</literal>)
to file handles that are an input for the parser
or an output for a serializer of (full) XML documents.
This is because the conversion of the data to/from the internal character representation
is provided by libxml2 itself, which must be able to enforce the encoding
specified by the <literal>&lt;?xml version="1.0" encoding="..."?&gt;</literal>
declaration. Here is an example to follow:
<programlisting>use XML::LibXML;
# load
open my $fh, '&lt;', 'file.xml' or die "Cannot open file.xml: $!";
binmode $fh; # drop all PerlIO layers possibly created by a <literal>use open</literal> pragma
my $doc = XML::LibXML->load_xml(IO => $fh);
# save (note the explicit write mode)
open my $out, '&gt;', 'out.xml' or die "Cannot open out.xml: $!";
binmode $out; # as above
$doc->toFh($out);
# or
print {$out} $doc->toString();</programlisting>
</para>
</listitem>
<listitem>
<para>All functions working with DOM accept and return
character strings (UTF-8 encoded with UTF8 flag on). E.g.
<programlisting><![CDATA[
my $doc = XML::LibXML::Document->new('1.0',$some_encoding);
my $element = $doc->createElement($name);
$element->appendText($text);
$xml_fragment = $element->toString(); # returns a character string
$xml_document = $doc->toString(); # returns a byte string
]]>
</programlisting>
where
<literal>$some_encoding</literal> is the document encoding
that will be used when saving the document,
and <literal>$name</literal> and <literal>$text</literal>
contain character strings (UTF-8 encoded with UTF8 flag on).
Note that the method <function>toString</function>
returns XML as a character string if applied to
a node other than the Document node, and
a byte string containing the appropriate
<programlisting>&lt;?xml version="1.0" encoding="..."?&gt;</programlisting>
declaration if applied to a <xref linkend="XML-LibXML-Document"/>.
</para>
</listitem>
<listitem>
<para>DOM methods also accept binary strings in the original encoding of the
document to which the node belongs (UTF-8 is assumed if the node is not
attached to any document). Exploiting this feature is NOT RECOMMENDED
since it is considered bad practice.
</para>
<programlisting><![CDATA[
my $doc = XML::LibXML::Document->new('1.0','iso-8859-2');
my $text = $doc->createTextNode($some_latin2_encoded_byte_string);
# WORKS, BUT NOT RECOMMENDED!
]]>
</programlisting>
</listitem>
</orderedlist>
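<para>The difference between character strings and byte strings described in the rules above
can be checked with a small script. This is only a sketch under stated assumptions: the
encoding <literal>iso-8859-2</literal>, the element name <literal>root</literal> and the
sample character are chosen purely for illustration, and the core function
<function>utf8::is_utf8</function> is used merely to inspect Perl's UTF8 flag.</para>
<programlisting><![CDATA[
use strict;
use warnings;
use XML::LibXML;

# Assumed sample data: a document declared as iso-8859-2
# holding a single non-ASCII character.
my $doc  = XML::LibXML::Document->new('1.0', 'iso-8859-2');
my $root = $doc->createElement('root');
$doc->setDocumentElement($root);
$root->appendText("\x{159}");        # a character string (U+0159)

my $fragment = $root->toString();    # serialized element
my $document = $doc->toString();     # serialized document with XML declaration

# utf8::is_utf8 reports whether the UTF8 flag is set on a scalar.
printf "element:  %s\n",
    utf8::is_utf8($fragment) ? 'character string' : 'byte string';
printf "document: %s\n",
    utf8::is_utf8($document) ? 'character string' : 'byte string';
]]>
</programlisting>
<para>Because the serialized document is already encoded, it should be written to a
file handle without any encoding layer, as recommended in the first rule.</para>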
<para><emphasis>NOTE:</emphasis> libxml2 support for many
encodings is based on the iconv library. The actual list
of supported encodings may vary from platform to
platform. To test if your platform works correctly with
your language encoding, build a simple document in the
particular encoding and try to parse it with XML::LibXML
to see if the parser produces any errors. Occasional
crashes were reported on rare platforms that ship with a broken
version of iconv.</para>
</sect1>
<sect1>
<title>Thread Support</title>
<para>
XML::LibXML since 1.67 partially supports Perl threads
in Perl >= 5.8.8. XML::LibXML can be used with threads
in two ways:
</para>
<para>
By default, all
XML::LibXML classes use the CLONE_SKIP class method
to prevent Perl from copying XML::LibXML::* objects
when a new thread is spawned.
In this mode, all XML::LibXML::* objects are thread specific.
This is the safest way
to work with XML::LibXML in threads.
</para>
<para>
Alternatively, one may use
</para>
<programlisting>use threads;
use XML::LibXML qw(:threads_shared);</programlisting>
<para>
to indicate that
all XML::LibXML node and parser objects
should be shared between the main thread
and any thread spawned from it.
For example, in
</para>
<programlisting>my $doc = XML::LibXML->load_xml(location => $filename);
my $thr = threads->new(sub{
# code working with $doc
1;
});
$thr->join;
</programlisting>
<para>
the variable <literal>$doc</literal>
refers to the exact same XML::LibXML::Document
in the spawned thread as in the main thread.
</para>
<para>
Without using mutex locks,
parallel threads may read the same document
(i.e. any node that belongs to the document),
parse files, and modify different documents.
</para>
<para>
However, if there is a chance that
some of the threads will attempt to modify a document
(or even create
new nodes based on that document,
e.g. with <literal>$doc->createElement</literal>)
that other threads may be reading at the same time,
the user is responsible for creating a mutex lock
and using it in <emphasis>both</emphasis>
the thread that modifies and
the thread that reads:
</para>
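<programlisting>use threads;
use threads::shared;   # provides the ": shared" attribute
use XML::LibXML qw(:threads_shared);

my $doc = XML::LibXML->load_xml(location => $filename);
my $mutex : shared;
my $thr = threads->new(sub{
  lock $mutex;         # held while the document is modified
  my $el = $doc->createElement('foo');
  # ...
  1;
});
{
  lock $mutex;         # held while the document is read
  my $root = $doc->documentElement;
  print $root->nodeName, "\n";
}
$thr->join;
</programlisting>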
<para>Note that libxml2 uses dictionaries to store short strings and
these dictionaries are kept on a document node. Without mutex locks, it
could happen in the previous example that the thread modifies the
dictionary while other threads attempt to read from it, which could
easily lead to a crash.</para>
</sect1>
<sect1>
<title>Version Information</title>
<para>Sometimes it is useful to find out which version of libxml2
XML::LibXML was compiled for. In most cases this
is for debugging or to check whether a given installation provides
all the functionality the package needs. The functions
XML::LibXML::LIBXML_DOTTED_VERSION and
XML::LibXML::LIBXML_VERSION provide this version
information. Both functions simply pass through the values
of the similarly named macros of libxml2.
Similarly, XML::LibXML::LIBXML_RUNTIME_VERSION returns
the version of the (usually dynamically) linked libxml2.
</para>
<variablelist>
<varlistentry>
<term>XML::LibXML::LIBXML_DOTTED_VERSION</term>
<listitem>
<funcsynopsis>
<funcsynopsisinfo>$Version_String = XML::LibXML::LIBXML_DOTTED_VERSION;</funcsynopsisinfo>
</funcsynopsis>
<para>Returns the version string of the
libxml2 version XML::LibXML was compiled
for. This will be "2.6.2" for "libxml2
2.6.2".</para>
</listitem>
</varlistentry>
<varlistentry>
<term>XML::LibXML::LIBXML_VERSION</term>
<listitem>
<funcsynopsis>
<funcsynopsisinfo>$Version_ID = XML::LibXML::LIBXML_VERSION;</funcsynopsisinfo>
</funcsynopsis>
<para>Returns the version id of the libxml2
version XML::LibXML was compiled for. This
will be "20602" for "libxml2 2.6.2". Don't confuse
this version id with
$XML::LibXML::VERSION. The latter contains the
version of XML::LibXML itself while the former
contains the version of libxml2 XML::LibXML
was compiled for.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>XML::LibXML::LIBXML_RUNTIME_VERSION</term>
<listitem>
<funcsynopsis>
<funcsynopsisinfo>$DLL_Version = XML::LibXML::LIBXML_RUNTIME_VERSION;</funcsynopsisinfo>
</funcsynopsis>
<para>Returns a version string of the libxml2
which is (usually dynamically) linked by
XML::LibXML. This will be "20602" for libxml2
released as "2.6.2" and something like
"20602-CVS2032" for a CVS build of
libxml2.</para>
<para>XML::LibXML issues a warning if the version
of libxml2 dynamically linked to it is less than the version of libxml2
which it was compiled against.
</para>
</listitem>
</varlistentry>
</variablelist>
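<para>As a minimal sketch, the three functions above can be combined to report the
libxml2 versions in use. The variable names and the comparison at the end are
illustrative assumptions; only the three <literal>LIBXML_*</literal> functions come
from XML::LibXML itself.</para>
<programlisting><![CDATA[
use strict;
use warnings;
use XML::LibXML;

my $dotted   = XML::LibXML::LIBXML_DOTTED_VERSION();   # compile-time, dotted form
my $compiled = XML::LibXML::LIBXML_VERSION();          # compile-time, numeric id
my $runtime  = XML::LibXML::LIBXML_RUNTIME_VERSION();  # version actually linked

print "XML::LibXML $XML::LibXML::VERSION\n";
print "compiled for libxml2 $dotted (id $compiled)\n";
print "running with libxml2 id $runtime\n";

# The runtime string may carry a suffix such as "-CVS2032",
# so only its leading digits are compared numerically.
my ($runtime_id) = $runtime =~ /^(\d+)/;
if (defined $runtime_id && $runtime_id < $compiled) {
    print "note: the linked libxml2 is older than the one compiled against\n";
}
]]>
</programlisting>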
</sect1>
<sect1>
<title>EXPORTS</title>
<para>
By default the module exports all constants and functions
listed in the :all tag, described below.
</para>
</sect1>
<sect1>
<title>EXPORT TAGS</title>
<variablelist>
<varlistentry>
<term><literal>:all</literal></term>
<listitem>
<para>Includes the tags <literal>:libxml</literal>, <literal>:encoding</literal>, and
<literal>:ns</literal> described below.</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>:libxml</literal></term>
<listitem>
<para>Exports integer constants for DOM node types.</para>
<programlisting>XML_ELEMENT_NODE => 1
XML_ATTRIBUTE_NODE => 2
XML_TEXT_NODE => 3
XML_CDATA_SECTION_NODE => 4
XML_ENTITY_REF_NODE => 5
XML_ENTITY_NODE => 6
XML_PI_NODE => 7
XML_COMMENT_NODE => 8
XML_DOCUMENT_NODE => 9
XML_DOCUMENT_TYPE_NODE => 10
XML_DOCUMENT_FRAG_NODE => 11
XML_NOTATION_NODE => 12
XML_HTML_DOCUMENT_NODE => 13
XML_DTD_NODE => 14
XML_ELEMENT_DECL => 15
XML_ATTRIBUTE_DECL => 16
XML_ENTITY_DECL => 17
XML_NAMESPACE_DECL => 18
XML_XINCLUDE_START => 19
XML_XINCLUDE_END => 20</programlisting>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>:encoding</literal></term>
<listitem>
<para>Exports two encoding conversion functions from XML::LibXML::Common.</para>
<programlisting>
encodeToUTF8()
decodeFromUTF8()
</programlisting>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>:ns</literal></term>
<listitem>
<para>Exports two convenience constants: the implicit namespace of the
reserved <literal>xml:</literal> prefix,
and the implicit namespace for the reserved <literal>xmlns:</literal> prefix.</para>
<programlisting>
XML_XML_NS => 'http://www.w3.org/XML/1998/namespace'
XML_XMLNS_NS => 'http://www.w3.org/2000/xmlns/'
</programlisting>
</listitem>
</varlistentry>
</variablelist>
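<para>The following sketch shows how the constants from the <literal>:libxml</literal>
and <literal>:ns</literal> tags might be used. The sample document and the
<literal>%count</literal> hash are assumptions made for illustration only.</para>
<programlisting><![CDATA[
use strict;
use warnings;
use XML::LibXML qw(:libxml :ns);

# Assumed sample document with one text, one comment and one element child.
my $doc = XML::LibXML->load_xml(
    string => '<root xml:lang="en">text<!-- note --><child/></root>'
);
my $root = $doc->documentElement;

# Classify the child nodes by comparing nodeType with the exported constants.
my %count;
for my $node ($root->childNodes) {
    my $type = $node->nodeType;
    if    ($type == XML_ELEMENT_NODE) { $count{element}++ }
    elsif ($type == XML_TEXT_NODE)    { $count{text}++ }
    elsif ($type == XML_COMMENT_NODE) { $count{comment}++ }
}
print "$_: $count{$_}\n" for sort keys %count;

# XML_XML_NS is the namespace URI bound to the reserved xml: prefix.
my $lang = $root->getAttributeNS(XML_XML_NS, 'lang');
print "xml:lang = $lang\n";
]]>
</programlisting>
<para>Importing only the tags that are needed keeps the caller's namespace small;
<literal>:all</literal> would import the same names together with the encoding functions.</para>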
</sect1>
<sect1>
<title>Related Modules</title>
<para>The modules described in this section are not part of the XML::LibXML package itself. As they support some additional features, they are
mentioned here.</para>
<variablelist>
<varlistentry>
<term><olink targetdoc="XML::LibXSLT">XML::LibXSLT</olink></term>
<listitem>
<para>XSLT 1.0 Processor using libxslt and XML::LibXML</para>
</listitem>
</varlistentry>
<varlistentry>
<term><olink targetdoc="XML::LibXML::Iterator">XML::LibXML::Iterator</olink></term>
<listitem>
<para>XML::LibXML Implementation of the DOM Traversal Specification</para>
</listitem>
</varlistentry>
<varlistentry>
<term><olink targetdoc="XML::CompactTree::XS">XML::CompactTree::XS</olink></term>
<listitem>
<para>Uses XML::LibXML::Reader to very efficiently parse an XML document
or element into native Perl data structures, which are less flexible but
significantly faster to process than DOM.</para>
</listitem>
</varlistentry>
</variablelist>
</sect1>
<sect1>
<title>XML::LibXML and XML::GDOME</title>
<para>Note: <emphasis>THE FUNCTIONS DESCRIBED HERE ARE STILL EXPERIMENTAL</emphasis></para>
<para>Although both modules make use of libxml2's XML capabilities, the DOM implementations of the two modules are not compatible. It is nevertheless
possible to exchange nodes from one DOM to the other. The concept of this exchange is similar to the function cloneNode(): the particular
node is copied at a low level to the opposite DOM implementation.</para>
<para>Since the DOM implementations cannot coexist within one document, one is forced to copy each node that should be used. Because you are always
keeping two nodes, this may have a considerable impact on a machine's memory usage.</para>
<para>XML::LibXML provides two functions to export or import GDOME nodes: import_GDOME() and export_GDOME(). Both functions take two parameters: the
node and a flag for recursive import. The flag works as in cloneNode().</para>
<para>The two functions allow XML::GDOME nodes to be exported and imported explicitly; however, XML::LibXML also allows the transparent import of
XML::GDOME nodes in functions such as appendChild(), insertAfter() and so on. While native nodes are automatically adopted in most functions,
XML::GDOME nodes are always cloned in advance. Thus if the original node is modified after the operation, the node in the XML::LibXML document will
not reflect the change.</para>
<variablelist>
<varlistentry>
<term>import_GDOME</term>
<listitem>
<funcsynopsis>
<funcsynopsisinfo>$libxmlnode = XML::LibXML->import_GDOME( $node, $deep );</funcsynopsisinfo>
</funcsynopsis>
<para>Explicitly clones an XML::GDOME node into an XML::LibXML node.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>export_GDOME</term>
<listitem>
<funcsynopsis>
<funcsynopsisinfo>$gdomenode = XML::LibXML->export_GDOME( $node, $deep );</funcsynopsisinfo>
</funcsynopsis>
<para>Explicitly clones an XML::LibXML node into an XML::GDOME node.</para>
</listitem>
</varlistentry>
</variablelist>
</sect1>
<sect1>
<title>CONTACTS</title>
<para>For bug reports, please use the CPAN request tracker on http://rt.cpan.org/NoAuth/Bugs.html?Dist=XML-LibXML</para>
<para>For suggestions etc., and other issues
related to XML::LibXML you may use the perl XML mailing list
(<email>perl-xml@listserv.ActiveState.com</email>),
where most XML-related Perl modules are discussed.
In case of problems you should check the archives of that
list first. Many problems are already discussed there. You
can find the list's archives and subscription options at
<ulink url="http://aspn.activestate.com/ASPN/Mail/Browse/Threaded/perl-xml">http://aspn.activestate.com/ASPN/Mail/Browse/Threaded/perl-xml</ulink>.
</para>
</sect1>
</chapter>
<chapter id="XML-LibXML-Parser">
<title>Parsing XML Data with XML::LibXML</title>
<titleabbrev>XML::LibXML::Parser</titleabbrev>
<sect1>
<title>Synopsis</title>
<programlisting>use XML::LibXML 1.70;
<!--
my $dom = XML::LibXML->load_xml(
location => $file_or_url,
# or string => $xml_string,
# or IO => $perl_file_handle,
# ...parser options...
);
my $html_dom = XML::LibXML->load_html(
location => $file_or_url,
# or string => $html_string,
# or IO => $perl_file_handle,
# ...parser options...
);
my $parser = XML::LibXML->new(
# ... parser options ...
);