C# port of the Validator.nu HTML Parser (http://about.validator.nu/htmlparser/)

HtmlParserSharp

This is a manual C# port of the Validator.nu HTML Parser, an HTML5 parser originally written in Java and (translated to C++) used by Mozilla's Gecko rendering engine. The port uses the DOM implemented in System.Xml.

Status

PLEASE SEE https://github.com/jamietre/HtmlParserSharp FOR AN ACTIVELY MAINTAINED VERSION OF THIS PROJECT.

The port is currently based on Validator.nu 1.3.1 and works as far as I have tested it. However, since there are no unit tests, I can't be sure that every detail works correctly. In my tests it was quite fast: about 3-6 times slower than parsing XML with .NET's XDocument API. I think XML parsing is easier to implement than HTML5 parsing, so I consider that acceptable, and it's still fast.

What's missing

If you want to contribute, maybe you can start here:

  • Support for character encodings other than UTF-8
  • More C#-ish coding style
  • Unit tests
  • Look for TODOs in the code