MParser

Welcome to the MParser!

MParser is short for Mini (XML) Parser. It combines several important benefits:

A simplified but still powerful grammar (reduced from 89 to 25 rules), which makes it more intuitive and improves performance.
High performance and low memory usage, thanks to its streaming nature (similar to SAX).
Automatic node recognition: roughly speaking, it is similar to a basic XPath and allows unique identification of elements in the XML tree.
An efficient and small POJO implementation that has no dependencies on third-party libraries.
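
For readers unfamiliar with the SAX model that the list above refers to, the following sketch shows event-driven parsing with the SAX parser bundled in the JDK. It is not MParser's API, which this README does not show. The class name `SaxEventDemo` and the sample input are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Illustration only: uses the JDK SAX parser, not MParser. */
final class SaxEventDemo {

    /** Collects the parser events in document order, e.g. "start:job" or "end:job". */
    static List<String> events(String xml) {
        List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                out.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver character data in several chunks; each chunk is recorded.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    out.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                out.add("end:" + qName);
            }
        };
        try {
            // The document is consumed as a stream; no tree is kept in memory.
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return out;
    }
}
```

Calling `SaxEventDemo.events("<hudson><job><name>demo</name></job></hudson>")` yields a list that starts with `start:hudson` and ends with `end:hudson`. Only the event list is retained, which is why streaming parsers need so little memory.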

MParser's grammar is based on one fundamental observation: most IT people neither know nor take advantage of the full XML grammar. The simpler grammar is therefore a balanced compromise that still parses most XML documents. For example, MParser works with Jenkins */api/xml files.

MParser characteristics

The fundamental assumptions for MParser are:

POJO: use plain Java objects for maximum compatibility.
Implements its own token and grammar engine from scratch, avoiding dependencies on external libraries.
High-performance, low-memory model with event messages similar to SAX.
Powerful built-in engine that globally identifies and distinguishes XML elements while traversing an XML document.
Very simple language for explicitly defining the structure of an XML document (its elements), like a very simple "XPath".
Implements a subset of the XML grammar that is sufficient to parse most XML files.
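
To make the idea of globally identifying elements during a streaming traversal more concrete, here is a minimal sketch of one way to do it: keep a stack of open elements and number siblings of the same name. This is an assumption about the general technique, not MParser's actual engine or path syntax. `ElementPathTracker` and the `name[index]` notation are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of MParser: derives a unique path for each element. */
final class ElementPathTracker {

    /** One frame per open element: its full path plus per-name counters for its children. */
    private static final class Frame {
        final String path;
        final Map<String, Integer> childCounts = new HashMap<>();

        Frame(String path) {
            this.path = path;
        }
    }

    private final Deque<Frame> open = new ArrayDeque<>();
    private final Map<String, Integer> rootCounts = new HashMap<>();

    /** Call on every start tag; returns a path such as /hudson[1]/job[2]. */
    String startElement(String name) {
        Frame parent = open.peek();
        Map<String, Integer> counts = (parent == null) ? rootCounts : parent.childCounts;
        // Position of this element among same-named siblings, starting at 1.
        int index = counts.merge(name, 1, Integer::sum);
        String prefix = (parent == null) ? "" : parent.path;
        String path = prefix + "/" + name + "[" + index + "]";
        open.push(new Frame(path));
        return path;
    }

    /** Call on every matching end tag. */
    void endElement() {
        open.pop();
    }
}
```

Fed with start and end events from any streaming parser, the tracker needs memory proportional to the nesting depth only, yet every element gets a path that distinguishes it from its siblings.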

Altogether, this gives a parser that:

Can parse most XML files (as long as they use only the selected rules).
Ships as a jar measured in kilobytes that requires no external classes.
Combines the advantages of both SAX and DOM: it uses minimal resources and allows unambiguous traversal of XML elements.

Very loosely, the MParser grammar can be expressed as:

An XML element is either an empty-tag element or a sequence of start tag, content, and end tag.
A tag includes a name. A start tag may also include attributes in addition to the name.
Content can be empty or can include other elements, comments, or a stream of character data.
Escaped special characters are supported. See table 2 for details.
The '' sequence is accepted, but the parser is not expected to act on it.
Works with ASCII and UTF-8 data.
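
The loosely stated rules above can be sketched as a small recursive-descent recognizer. This is an illustration of the element, tag, and content rules only, written from the description in this README. It is not MParser's token and grammar engine, and `MiniXmlRecognizer` is a hypothetical name. The sketch further assumes double-quoted attribute values with no spaces around `=`, allows attributes on empty tags, and treats escaped characters as ordinary character data.

```java
/** Hypothetical sketch of the loosely described grammar; not MParser's own engine. */
final class MiniXmlRecognizer {
    private final String s;
    private int pos;

    private MiniXmlRecognizer(String s) {
        this.s = s;
    }

    /** Returns true if the input is exactly one element under the simplified rules. */
    static boolean accepts(String xml) {
        MiniXmlRecognizer r = new MiniXmlRecognizer(xml.trim());
        try {
            r.element();
            return r.pos == r.s.length();
        } catch (IllegalStateException e) {
            return false;
        }
    }

    // element := '<' name attributes ( '/>' | '>' content '</' name '>' )
    private void element() {
        expect("<");
        String name = name();
        attributes();
        if (tryConsume("/>")) {
            return; // empty-tag element
        }
        expect(">");
        content();
        expect("</");
        if (!name().equals(name)) {
            throw new IllegalStateException("mismatched end tag at " + pos);
        }
        skipSpaces();
        expect(">");
    }

    // content := ( comment | element | charData )*
    private void content() {
        while (pos < s.length() && !s.startsWith("</", pos)) {
            if (s.startsWith("<!--", pos)) {
                int end = s.indexOf("-->", pos + 4);
                if (end < 0) {
                    throw new IllegalStateException("unterminated comment");
                }
                pos = end + 3;
            } else if (s.charAt(pos) == '<') {
                element();
            } else {
                pos++; // one character of char data
            }
        }
    }

    // attributes := ( spaces name '=' '"' value '"' )*
    private void attributes() {
        skipSpaces();
        while (pos < s.length() && isNameChar(s.charAt(pos))) {
            name();
            expect("=");
            expect("\"");
            int end = s.indexOf('"', pos);
            if (end < 0) {
                throw new IllegalStateException("unterminated attribute value");
            }
            pos = end + 1;
            skipSpaces();
        }
    }

    private String name() {
        int start = pos;
        while (pos < s.length() && isNameChar(s.charAt(pos))) {
            pos++;
        }
        if (pos == start) {
            throw new IllegalStateException("name expected at " + start);
        }
        return s.substring(start, pos);
    }

    private static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '-' || c == '.' || c == ':';
    }

    private void skipSpaces() {
        while (pos < s.length() && Character.isWhitespace(s.charAt(pos))) {
            pos++;
        }
    }

    private boolean tryConsume(String token) {
        if (s.startsWith(token, pos)) {
            pos += token.length();
            return true;
        }
        return false;
    }

    private void expect(String token) {
        if (!tryConsume(token)) {
            throw new IllegalStateException("expected " + token + " at " + pos);
        }
    }
}
```

With this sketch, `MiniXmlRecognizer.accepts("<a x=\"1\"><!-- c --><b/>text</a>")` returns true, while a mismatched end tag such as `<a><b></a>` is rejected. Input containing a CDATA section is rejected as well, because `<!` does not start any construct in these rules.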

Please note: MParser does not parse XML that includes CDATA sections.

More resources:

See MParserExample for a working example with a how-to.
See Tokens and grammar
See MParser Wiki for more information.

Please send a message to aftrwork.tea@gmail.com if you are interested in a license to use the parser.

Thank you for visiting!

Robert
