NOTE: This fork of htmlcleaner has been merged back into the http://htmlcleaner.sourceforge.net/ project as of version 2.4.

2.4 is officially released!

This fork is kept only to help with patch submission to the official version.

==========================================================================


* omitHtmlEnvelope behavior change:
 * outputs all the html contained in the body, not just the contents of the first TagNode ( useful for cleaning html fragments ). A new blank TagNode is created to hold the nodes to be output.
 * omitHtmlEnvelope also triggers omitDoctype

* Some TagNodes can be reopened after their parent is closed ( e.g. <b><i></b> would result in <b><i></i></b><i> ). If the reopened tag ( <i> in this example ) is immediately closed, the reopened tag is pruned. This is accomplished by checking the autoGenerated boolean on TagNode.

* refactoring template methods from Utils to TagTransformer.

* CleanerTransformations changes:
 * Utils.updateTagTransformations is now a member function.
 * Handles the transformation work so that multiple TagTransformations can be applied to a given tag ( sets up for regex matching ).
 * now owns responsibility for determining the transformed tagname.
 * concept of global AttributeTransformations -- used, for example, to strip all attributes that start with "on" ( e.g. "onclick", "onblur" )
 * plus added regular expression matching on attribute names and values
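
The global attribute rule described above (dropping every attribute whose name starts with "on") can be illustrated with a small standalone sketch. It uses only the JDK and does not call the htmlcleaner API; the class name, the method name and the pattern are hypothetical and chosen just for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is not htmlcleaner's CleanerTransformations class.
class GlobalAttributeStripSketch {
    // Assumed rule: any attribute name beginning with "on" (onclick, onblur, ...), case-insensitive.
    static final Pattern EVENT_HANDLER = Pattern.compile("^on.*", Pattern.CASE_INSENSITIVE);

    /** Returns a copy of the attributes without the names that fully match the pattern. */
    static Map<String, String> stripMatching(Map<String, String> attributes, Pattern namePattern) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            if (!namePattern.matcher(e.getKey()).matches()) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("href", "/home");
        attrs.put("onclick", "track()");
        attrs.put("onBlur", "leave()");
        attrs.put("class", "nav");
        System.out.println(stripMatching(attrs, EVENT_HANDLER)); // prints {href=/home, class=nav}
    }
}
```

A pattern this broad removes any attribute name that begins with "on", not only known event handlers, so a real rule set may want a narrower expression.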

* XmlSerializer/HtmlCleaner -- IOException is no longer thrown when reading from strings.

* work on spotting "tricky" encoding -- unencode normal ascii characters.

 * get Default Output charset from CleanerProperties

 * handle badly encoded numbers better; for example, &x0fx and &0A; were parsed badly before

 * added a bunch of html special entities

 * convert &apos; in html context to &#39; 
 * added regex attribute/value matching

 * random spelling corrections
 * additional documentation
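
The &apos; conversion listed above matters because &apos; is defined for XML but is not among the HTML 4 named entities, while the numeric reference &#39; works in both. A minimal JDK-only sketch of the idea follows; the class and method names are hypothetical and this is not htmlcleaner's own code.

```java
// Hypothetical illustration only; htmlcleaner's real entity handling is more involved.
class AposEntitySketch {
    /** Replaces the XML-only entity &apos; with the numeric reference &#39;. */
    static String aposToNumeric(String html) {
        return html.replace("&apos;", "&#39;");
    }

    public static void main(String[] args) {
        // Other entities such as &amp; are left untouched.
        System.out.println(aposToNumeric("<p>It&apos;s &amp; done</p>"));
        // prints <p>It&#39;s &amp; done</p>
    }
}
```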
 
* add greek and math symbols

* cleanup change - if a tag was closed due to an improperly placed child, it will be reopened after the child.
  See ClosedTagReopenTest.java for examples
  
* added audit code - it is now possible to hook in code that will be notified about the changes that htmlcleaner makes.
  See CleanerProperties#addHtmlModificationListener.
  
* Added unit tests for escapeXml function from Utils

* JDom generation updated so that it no longer fails on attributes whose names start with 'xml'.

* Unit tests TODOs added  