Merged
14 changes: 10 additions & 4 deletions README.md
@@ -4,13 +4,13 @@

[API Documentation][API documentation]

-html5ever is an HTML parser developed as part of the [Servo](https://github.com/servo/servo) project.
+html5ever is an HTML parser developed as part of the [Servo][] project.

-It can parse and serialize HTML according to the [WHATWG](https://whatwg.org/) specs (aka "HTML5"). There are some omissions at present, most of which are documented [in the bug tracker](https://github.com/servo/html5ever/issues?q=is%3Aopen+is%3Aissue+label%3Aweb-compat). html5ever passes all tokenizer tests from [html5lib-tests](https://github.com/html5lib/html5lib-tests), and most tree builder tests outside of the unimplemented features. The goal is to pass all html5lib tests, and also provide all hooks needed by a production web browser, e.g. `document.write`.
+It can parse and serialize HTML according to the [WHATWG](https://whatwg.org/) specs (aka "HTML5"). There are some omissions at present, most of which are documented [in the bug tracker][]. html5ever passes all tokenizer tests from [html5lib-tests][], and most tree builder tests outside of the unimplemented features. The goal is to pass all html5lib tests, and also provide all hooks needed by a production web browser, e.g. `document.write`.

Note that the HTML syntax is a language almost, but not quite, entirely unlike XML. For correct parsing of XHTML, use an XML parser. (That said, many XHTML documents in the wild are serialized in an HTML-compatible form.)

-html5ever is written in [Rust](http://www.rust-lang.org/), so it avoids the most notorious security problems from C, but has performance similar to a parser written in C. You can call html5ever as if it were a C library, without pulling in a garbage collector or other heavy runtime requirements.
+html5ever is written in [Rust][], so it avoids the most notorious security problems from C, but has performance similar to a parser written in C. You can call html5ever as if it were a C library, without pulling in a garbage collector or other heavy runtime requirements.


## Getting started in Rust
@@ -22,7 +22,7 @@ Add html5ever as a dependency in your [`Cargo.toml`](http://crates.io/) file:
html5ever = "*"
```

-Then take a look at [`examples/html2html.rs`](https://github.com/servo/html5ever/blob/master/examples/html2html.rs) and [`examples/print-rcdom.rs`](https://github.com/servo/html5ever/blob/master/examples/print-rcdom.rs) and the [API documentation][].
+Then take a look at [`examples/html2html.rs`] and [`examples/print-rcdom.rs`] and the [API documentation][].

## Getting started in other languages

@@ -51,3 +51,9 @@ The code is cross-referenced with the WHATWG syntax spec, and eventually we will
html5ever builds against the official stable releases of Rust, though some optimizations are only supported on nightly releases.

[API documentation]: http://doc.servo.org/html5ever/index.html
+[Servo]: https://github.com/servo/servo
+[Rust]: http://www.rust-lang.org/
+[in the bug tracker]: https://github.com/servo/html5ever/issues?q=is%3Aopen+is%3Aissue+label%3Aweb-compat
+[html5lib-tests]: https://github.com/html5lib/html5lib-tests
+[`examples/html2html.rs`]: https://github.com/servo/html5ever/blob/master/html5ever/examples/html2html.rs
+[`examples/print-rcdom.rs`]: https://github.com/servo/html5ever/blob/master/html5ever/examples/print-rcdom.rs