Textorizer


Sanitize and clean HTML for safe consumption as plain text.

  var plainText = Textorize.HtmlToPlainText("<span>I contain html</span><p>convert me</p>");
  //  plainText = "I contain html\nconvert me\n"

Converts HTML input to a safe plain text representation with no HTML markup. Content in style and script tags is removed completely, and HTML entities are converted to their Unicode characters. Invalid HTML is handled on a best-effort basis to produce a reasonable plain text equivalent.

Keep in mind the following equivalence:

Textorize(input) == Textorize(HtmlEncode(Textorize(input)))
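The equivalence above can be checked with a small sketch. It assumes that the `Textorize` class lives in a `Textorizer` namespace (an assumption, not confirmed by this README) and uses `System.Net.WebUtility.HtmlEncode` from the .NET base library as the `HtmlEncode` step. The sample input is made up for illustration.

```csharp
using System;
using System.Net;
using Textorizer; // assumed namespace; adjust if the package uses a different one

// Hypothetical sample input containing entities and markup.
var input = "<p>5 &lt; 6 &amp; 7 &gt; 3</p><script>alert('x')</script>";

// First pass: HTML to plain text.
var once = Textorize.HtmlToPlainText(input);

// Re-encode the plain text as HTML, then convert it again.
var roundTripped = Textorize.HtmlToPlainText(WebUtility.HtmlEncode(once));

// If the stated equivalence holds for this input, both strings are equal.
Console.WriteLine(once == roundTripped);
```

In practice this means plain text produced by the library can be HTML-encoded for display and later converted back without changing it.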

For more examples, see the test suite.

Install

Package Manager Console

PM> Install-Package Textorizer

.NET CLI Console

> dotnet add package Textorizer

License

Dual licensed

MIT

https://opensource.org/licenses/MIT

Unlicense

https://opensource.org/licenses/Unlicense