SGMLReader - Convert any HTML to valid XML

SGMLReader is a versatile C# .NET library written by Chris Lovett for parsing HTML/SGML files. The original community around SGMLReader was hosted on GotDotNet, but that site was phased out (update: the code appears to have resurfaced on MSDN Code Gallery, but without any updates). MindTouch Dream and MindTouch Core use the SGMLReader library extensively. Over the last few years we have made many improvements to the code, making us the de facto maintainers of the library. In the spirit of the original author, we are contributing these changes back on the MindTouch Developer Center site.

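using System.IO;   // TextReader
using System.Xml;  // XmlDocument, WhitespaceHandling

XmlDocument FromHtml(TextReader reader) {

    // set up SgmlReader to parse the input as HTML
    Sgml.SgmlReader sgmlReader = new Sgml.SgmlReader();
    sgmlReader.DocType = "HTML";
    sgmlReader.WhitespaceHandling = WhitespaceHandling.All;  // keep all whitespace
    sgmlReader.CaseFolding = Sgml.CaseFolding.ToLower;       // fold names to lower case
    sgmlReader.InputStream = reader;

    // create the document and load it through the SgmlReader
    XmlDocument doc = new XmlDocument();
    doc.PreserveWhitespace = true;
    doc.XmlResolver = null;  // do not resolve external resources
    doc.Load(sgmlReader);
    return doc;
}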
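
As a rough usage sketch, the method above could be called on a small piece of non-well-formed HTML. The class name `HtmlToXmlExample`, the `Main` entry point, and the sample input string are hypothetical and only serve the illustration. The sketch assumes the SgmlReader assembly is referenced by the project, and it prints whatever XML the reader produces without asserting a particular shape.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical wrapper class, not part of SGMLReader itself.
public static class HtmlToXmlExample {

    // Same configuration as the FromHtml method shown above.
    public static XmlDocument FromHtml(TextReader reader) {
        Sgml.SgmlReader sgmlReader = new Sgml.SgmlReader();
        sgmlReader.DocType = "HTML";
        sgmlReader.WhitespaceHandling = WhitespaceHandling.All;
        sgmlReader.CaseFolding = Sgml.CaseFolding.ToLower;
        sgmlReader.InputStream = reader;

        XmlDocument doc = new XmlDocument();
        doc.PreserveWhitespace = true;
        doc.XmlResolver = null;
        doc.Load(sgmlReader);
        return doc;
    }

    public static void Main() {
        // Sample input (an assumption): upper-case tags, no closing tags.
        const string html = "<P>Hello<BR>world";

        using (StringReader reader = new StringReader(html)) {
            XmlDocument doc = FromHtml(reader);

            // Print the XML that SgmlReader produced from the HTML fragment.
            Console.WriteLine(doc.OuterXml);
        }
    }
}
```

Because the result is an ordinary `XmlDocument`, the usual DOM members such as `SelectNodes` or `GetElementsByTagName` can then be used on the converted markup.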

Sample Output

Visit the HTML-to-XML Conversion Examples page to see how SGMLReader converts HTML source into valid XML.

Community

If you find or fix issues in SGMLReader, please post them in the SGMLReader forum.

Release History

Visit the SGMLReader wiki page for a complete release history.

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
