ocaml sax xml parser
OCaml
Switch branches/tags
Nothing to show
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
doc
src
.gitignore
COPYING
README

README

oxml is currently a simple non-conforming XML parser with
a SAX API for applications written in OCaml.

The parser was written in order to parse simple XML-based network
protocols in asynchronous client-server applications.  The programming
interface follows that of Expat/FXP, except for the use of OCaml
objects.

Most of the complexity-inducing portions of the XML specification have
not been implemented, primarily support for DTD parsing and
validation, parsed external entities and Unicode, since these
features are not used in the applications the parser was written for.

In particular, with reference to the XML 1.1 specification at
http://www.w3.org/TR/2004/REC-xml11-20040204:

- Since there is no support for DTD whatsoever, features
  that rely on DTD declarations like internal DTD sections,
  conditional sections, entity and notation declarations are not
  supported.

- No entities other than the predefined entities of Section 4.6 are
  supported.

- Attributes are always treated as of type CDATA, and attribute values
  are normalized as specified in Section 3.3.3.  Charset encoding
  declarations are basically ignored, and only ISO 8859-1 encoding is
  supported.

- xml:space is not implemented; all whitespace in element content is
  treated as significant.

The following well-formedness constraints of the XML 1.1 specification
are not implemented:

http://www.w3.org/TR/2004/REC-xml11-20040204/#ExtSubset
http://www.w3.org/TR/2004/REC-xml11-20040204/#PE-between-Decls
http://www.w3.org/TR/2004/REC-xml11-20040204/#wfc-PEinInternalSubset
http://www.w3.org/TR/2004/REC-xml11-20040204/#wf-entdeclared

The following well-formedness constraints are implemented:

http://www.w3.org/TR/2004/REC-xml11-20040204/#GIMatch
http://www.w3.org/TR/2004/REC-xml11-20040204/#uniqattspec
http://www.w3.org/TR/2004/REC-xml11-20040204/#NoExternalRefs
http://www.w3.org/TR/2004/REC-xml11-20040204/#CleanAttrVals
http://www.w3.org/TR/2004/REC-xml11-20040204/#uniqattspec

(the implementation of the next constraint is restricted to the
 ISO 8859-1 charset)
http://www.w3.org/TR/2004/REC-xml11-20040204/#wf-Legalchar 

(the next two constraints fall out trivially, since entity
 declarations are not supported)
http://www.w3.org/TR/2004/REC-xml11-20040204/#textent 
http://www.w3.org/TR/2004/REC-xml11-20040204/#norecursion

(the next constraint is also trivial, since internal DTD subsets are
 not supported)
http://www.w3.org/TR/2004/REC-xml11-20040204/#indtd

End-of-line normalization (Section 2.11) is implemented.