See 2001-06-08 ChangeLog
smh committed Jun 8, 2001
1 parent 1b2c067 commit 1270b3e
Showing 2 changed files with 143 additions and 130 deletions.
5 changes: 5 additions & 0 deletions ChangeLog
@@ -1,3 +1,8 @@
2001-06-08 Steve Haflich <smh@romeo>

* pxml.htm: Added mention that it is necessary to load or require
the module. Cleaned up a little html formatting.

2001-05-30 John Foderaro <jkf@tiger.franz.com>

* phtml.cl - add :,_,- and . to valid attribute name characters.
268 changes: 138 additions & 130 deletions pxml.htm
@@ -43,20 +43,20 @@
<br>
a. a string containing text content, such as &quot;Here is some text with a &quot;<br>
<br>
b. a list representing a XML tag with associated attributes and/or content,<br>
b. a list representing a XML tag with associated attributes and/or content,
such as ('item1 &quot;text&quot;) or (('item1 :att1 &quot;help.html&quot;)
&quot;link&quot;). If the XML tag<br>
does not have associated attributes, then the first list member will be a<br>
symbol representing the XML tag, and the other elements will <br>
represent the content, which can be a string (text content), a symbol (XML<br>
tag with no attributes or content), or list (nested XML tag with<br>
associated attributes and/or content). If there are associated attributes,<br>
then the first list member will be a list containing a symbol<br>
followed by two list members for each associated attribute; the first member is a<br>
symbol representing the attribute, and the next member is a string corresponding<br>
&quot;link&quot;). If the XML tag
does not have associated attributes, then the first list member will be a
symbol representing the XML tag, and the other elements will
represent the content, which can be a string (text content), a symbol (XML
tag with no attributes or content), or list (nested XML tag with
associated attributes and/or content). If there are associated attributes,
then the first list member will be a list containing a symbol
followed by two list members for each associated attribute; the first member is a
symbol representing the attribute, and the next member is a string corresponding
to the attribute value.<br>
<br>
c. XML comments and or processing instructions - see the more detailed example below for<br>
c. XML comments and or processing instructions - see the more detailed example below for
further information.</p>

<p><a name="props"></a><strong>Non Validating Parser Properties</strong></p>
@@ -139,27 +139,27 @@
<br>
<strong><big>Usage Notes</big></strong><br>
<br>
<a name="modern"></a>1. The parse-xml function has been primarily compiled and tested in a
<ol>
<li><a name="modern"></a>The parse-xml function has been primarily compiled and tested in a
modern ACL. However, in an ANSI Lisp with wide character support, it DOES pass the valid
component of the conformance suite in the same manner as it does in a Modern Lisp. The
parser's successful operation in all potential situations depends on wide character
support. .<br>
<br>
<a name="keyword"></a>2. The parser uses the keyword package for DTD tokens and other<br>
special XML tokens. Since element and attribute token symbols are usually interned<br>
in the current package, it is not recommended to execute parse-xml<br>
when the current package is the keyword package.<br>
<br>
<a name="namespace"></a>3. The XML parser supports the XML Namespaces specification. The
parser<br>
recognizes a &quot;xmlns&quot; attribute and attribute names starting with
&quot;xmlns:&quot;.<br>
As per the specification, the parser expects that the associated value<br>
is an URI string. The parser then associates XML Namespace prefixes with a<br>
Lisp package provided via the parse-xml :uri-to-package option or, if <br>
necessary, a package created on the fly. The following example demonstrates<br>
parser's successful operation in all potential situations depends on wide character support.
<br><br>
</li>
<li><a name="keyword"></a>The parser uses the keyword package for DTD tokens and other
special XML tokens. Since element and attribute token symbols are usually interned
in the current package, it is not recommended to execute parse-xml
when the current package is the keyword package.
<br><br>
</li>
<li><a name="namespace"></a>The XML parser supports the XML Namespaces specification. The
parser recognizes a &quot;xmlns&quot; attribute and attribute names starting with
&quot;xmlns:&quot;.
As per the specification, the parser expects that the associated value
is an URI string. The parser then associates XML Namespace prefixes with a
Lisp package provided via the parse-xml :uri-to-package option or, if
necessary, a package created on the fly. The following example demonstrates
this behavior:<br>
</p>

<p>(setf *xml-example-string4*<br>
&nbsp;&nbsp; &quot;&lt;bibliography<br>
@@ -224,37 +224,44 @@
&nbsp; (#&lt;uri urn:com:books-r-us&gt; . #&lt;The royal package&gt;)<br>
&nbsp; (#&lt;uri http://www.bibliography.org/XML/bib.ns&gt; . #&lt;The bib package&gt;)))<br>
<br>
In the absence of XML Namespace attributes, element and attribute symbols are interned<br>
in the current package. Note that this implies that attributes and elements referenced<br>
in DTD content will be interned in the current package.<br>
<br>
6. The parse-xml function has been tested using the OASIS conformance test suite (see<br>
details below). The test suite has wide coverage across possible XML and DTD syntax,<br>
but there may be some syntax paths that have not yet been tested or completely<br>
supported. Here is a list of currently known syntax parsing issues:<br>
<br>
<a name="unicode-scalar"></a>a. ACL does not support 4 byte Unicode scalar values, so
input containing such data<br>
will not be processed correctly. (Note, however, that parse-xml does correctly detect<br>
and process wide Unicode input.)<br>
<br>
<a name="big-endian"></a>b. The OASIS tests that contain wide Unicode all use a
little-endian encoded Unicode.<br>
Changes to the unicode-check function are required to also support big-endian encoded<br>
Unicode. (Note also that this issue may be resolved by an ACL 6.0 final release change.)<br>
<br>
c. An initial &lt;?xml declaration in external entity files is skipped without a check<br>
being made to see if the &lt;?xml declaration is itself incorrect.<br>
<br>
<a name="debug"></a>7. When investigating possible parser errors or examining more closely
where the parser<br>
determined that the input was non-well-formed, the net.xml.parser internal symbols <br>
*debug-xml* and *debug-dtd* are useful. When not bound to nil, these variables cause<br>
lexical analysis and intermediate parsing results to be output to *standard-output*.<br>
<br>
</li>
<li>In the absence of XML Namespace attributes, element and attribute symbols are interned
in the current package. Note that this implies that attributes and elements referenced
in DTD content will be interned in the current package.
</li>
<li>The parse-xml function has been tested using the OASIS conformance test suite (see
details below). The test suite has wide coverage across possible XML and DTD syntax,
but there may be some syntax paths that have not yet been tested or completely
supported. Here is a list of currently known syntax parsing issues:
<ul>
<li><a name="unicode-scalar"></a>ACL does not support 4 byte Unicode scalar values, so
input containing such data
will not be processed correctly. (Note, however, that parse-xml does correctly detect
and process wide Unicode input.)
</li>
<li><a name="big-endian"></a>The OASIS tests that contain wide Unicode all use a
little-endian encoded Unicode.
Changes to the unicode-check function are required to also support big-endian encoded
Unicode. (Note also that this issue may be resolved by an ACL 6.0 final release change.)
</li>
<li>An initial &lt;?xml declaration in external entity files is skipped without a check
being made to see if the &lt;?xml declaration is itself incorrect.
</li>
</ul>
</li>
<li><a name="debug"></a>When investigating possible parser errors or examining more closely
where the parser
determined that the input was non-well-formed, the net.xml.parser internal symbols
*debug-xml* and *debug-dtd* are useful. When not bound to nil, these variables cause
lexical analysis and intermediate parsing results to be output to *standard-output*.
</li>
<li><a name="loading"></a>It is necessary to load the <b>pxml</b> module before using it.
Typically this can be done by evaluating <b>(require&nbsp;:pxml)</b>.
</li>
</ol>
<a name="conformance"></a><strong>XML Conformance Test Suite</strong><br>
<br>
Using the OASIS test suite <a href="http://www.oasis-open.org">(http://www.oasis-open.org)</a>,<br>
Using the OASIS test suite <a href="http://www.oasis-open.org">(http://www.oasis-open.org)</a>,
here are the current parse-xml results:<br>
<br>
xmltest/invalid:&nbsp;&nbsp;&nbsp; Not tested, since parse-xml is a non-validating parser<br>
@@ -298,82 +305,83 @@
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; uri-to-package<br>
<br>
Returns multiple values:<br>
<br>
1) LXML and parsed DTD output, as described above.<br>
2) An association list containing the uri-to-package argument conses (if any)<br>
and conses associated with any XML Namespace packages created during the<br>
parse (see uri-to-package argument description, below).<br>
<br>
The external-callback argument, if specified, is a function object or symbol<br>
that parse-xml will execute when encountering an external DTD subset<br>
or external entity DTD declaration. Here is an example which shows that<br>
arguments the function should expect, and the value it should return:<br>
<br>
(defun file-callback (uri-object token &amp;optional public)<br>
;; the uri-object is an ACL URI object created from<br>
;; the XML input. In this example, this function<br>
;; assumes that all uri's will be file specifications.<br>
;;<br>
;; the token argument identifies what token is associated<br>
;; with the external parse (for example :DOCTYPE for external<br>
;; DTD subset<br>
;;<br>
;; the public argument contains the associated PUBLIC string,<br>
;; when present<br>
;;<br>
(declare (ignorable token public))<br>
;; an open stream is returned on success<br>
;; a nil return value indicates that the external<br>
;; parse should not occur<br>
;; Note that parse-xml will close the open stream before<br>
;; exiting<br>
(ignore-errors (open (uri-path uri-object))))<br>
<br>
The general-entities argument is an association list containing general entity symbol <br>
and replacement text pairs. The entity symbols should be in the keyword package.<br>
Note that this option may be useful in generating desirable parse results in <br>
situations where you do not wish to parse external entities or the external DTD subset.<br>
<br>
The parameter-entities argument is an association list containing parameter entity symbol <br>
and replacement text pairs. The entity symbols should be in the keyword package.<br>
Note that this option may be useful in generating desirable parse results in <br>
situations where you do not wish to parse external entities or the external DTD subset.<br>
<br>
The uri-to-package argument is an association list containing uri objects and package<br>
objects. Typically, the uri objects correspond to XML Namespace attribute values, and<br>
the package objects correspond to the desired package for interning symbols associated<br>
with the uri namespace. If the parser encounters an uri object not contained in this list,<br>
<ol>
<li>LXML and parsed DTD output, as described above.</li>
<li>An association list containing the uri-to-package argument conses (if any)
and conses associated with any XML Namespace packages created during the
parse (see uri-to-package argument description, below).</li>
</ol>
The external-callback argument, if specified, is a function object or symbol
that parse-xml will execute when encountering an external DTD subset
or external entity DTD declaration. Here is an example which shows that
arguments the function should expect, and the value it should return:
<br><pre>
(defun file-callback (uri-object token &amp;optional public)
;; The uri-object is an ACL URI object created from
;; the XML input. In this example, this function
;; assumes that all uri's will be file specifications.
;;
;; The token argument identifies what token is associated
;; with the external parse (for example :DOCTYPE for external
;; DTD subset
;;
;; The public argument contains the associated PUBLIC string,
;; when present
;;
(declare (ignorable token public))
;; An open stream is returned on success,
;; a nil return value indicates that the external
;; parse should not occur.
;; Note that parse-xml will close the open stream before exiting.
(ignore-errors (open (uri-path uri-object))))
</pre>
<p>
The general-entities argument is an association list containing general entity symbol
and replacement text pairs. The entity symbols should be in the keyword package.
Note that this option may be useful in generating desirable parse results in
situations where you do not wish to parse external entities or the external DTD subset.
<p>
The parameter-entities argument is an association list containing parameter entity symbol
and replacement text pairs. The entity symbols should be in the keyword package.
Note that this option may be useful in generating desirable parse results in
situations where you do not wish to parse external entities or the external DTD subset.
<p>
The uri-to-package argument is an association list containing uri objects and package
objects. Typically, the uri objects correspond to XML Namespace attribute values, and
the package objects correspond to the desired package for interning symbols associated
with the uri namespace. If the parser encounters an uri object not contained in this list,
it will generate a new package. The first generated package will be named
net.xml.namespace.0,<br>
the second will be named net.xml.namespace.1, and so on.<br>
<br>
parse-xml Methods<br>
<br>
(parse-xml (p stream) &amp;key external-callback content-only <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; general-entities
parameter-entities<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; uri-to-package)<br>
<br>
(parse-xml (str string) &amp;key external-callback content-only <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; general-entities
parameter-entities<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; uri-to-package)<br>
<br>
An easy way to parse a file containing XML input:<br>
<br>
(with-open-file (p &quot;example.xml&quot;)<br>
(parse-xml p :content-only p))<br>
<br>
<strong>net.xml.parser unexported special variables:</strong><br>
<br>
net.xml.namespace.0,
the second will be named net.xml.namespace.1, and so on.
<h3>parse-xml methods</h3>
<pre>
(parse-xml (p stream) &amp;key
external-callback content-only
general-entities
parameter-entities
uri-to-package)

(parse-xml (str string) &amp;key
external-callback content-only
general-entities
parameter-entities
uri-to-package)
</pre>
An easy way to parse a file containing XML input:
<pre>
(with-open-file (p &quot;example.xml&quot;)
(parse-xml p :content-only p))
</pre>
<h3>net.xml.parser unexported special variables:</h3>
<p>
*debug-xml*<br>
<br>
When not bound to nil, generates XML lexical state and intermediary<br>
parse result debugging output.<br>
<br>
When true, parse-xml generates XML lexical state and intermediary
parse result debugging output.
<p>
*debug-dtd*<br>
<br>
When not bound to nil, generates DTD lexical state and intermediary<br>
parse result debugging output.</p>
When true, parse-xml generates DTD lexical state and intermediary
parse result debugging output.
</body>
</html>
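
The first pxml.htm hunk describes the LXML shape: an element is a string, a bare symbol, or a list whose first member is either the tag symbol or a list holding the tag symbol followed by attribute/value pairs. The sketch below shows one way to take such a form apart in portable Common Lisp. The names lxml-tag, lxml-attributes and lxml-content are hypothetical helpers written for illustration; they are not part of the pxml module, and the sketch assumes comments and processing instructions are handled elsewhere.

```lisp
;; Hypothetical helpers (NOT part of pxml) for inspecting an LXML element
;; as described in pxml.htm. Plain Common Lisp; no module needs loading.

(defun lxml-tag (element)
  "Return the tag symbol of ELEMENT, or NIL when ELEMENT is text content."
  (cond ((stringp element) nil)                 ; text content has no tag
        ((symbolp element) element)             ; tag with no attributes or content
        ((consp (first element))                ; ((tag :att "val" ...) content ...)
         (first (first element)))
        (t (first element))))                   ; (tag content ...)

(defun lxml-attributes (element)
  "Return the attribute property list of ELEMENT, or NIL when it has none."
  (if (and (consp element) (consp (first element)))
      (rest (first element))
      nil))

(defun lxml-content (element)
  "Return the list of content items of ELEMENT (strings, symbols, or lists)."
  (if (consp element)
      (rest element)
      nil))

;; Usage: pull the pieces out of the attribute-carrying form from the text.
(let ((element '((item1 :att1 "help.html") "link")))
  (list (lxml-tag element)
        (getf (lxml-attributes element) :att1)
        (lxml-content element)))
```

Because the attribute portion is laid out as alternating symbol and string members, it can be read directly as a property list with getf, which is why lxml-attributes returns it unchanged.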
