I've added support for hrefs starting with '#', so that links to internal anchors on a page aren't stripped out by the Cleaner. I need this for my day job and thought it might be useful to you too.

The W3C syntax for anchors simply requires that they start with a # and contain no spaces.
I've added two tests in the CleanerTest class that document this behaviour.
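
To illustrate the rule described above, here is a minimal standalone sketch of the check. It is not the code from this patch and does not use jsoup's API; the class name `AnchorHrefCheck` and the method `isInternalAnchor` are hypothetical names chosen for this example, and it assumes that "no spaces" means no whitespace of any kind.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup: decides whether an href is an
// internal anchor under the rule "starts with '#' and contains no spaces".
final class AnchorHrefCheck {
    // '#' followed by zero or more non-whitespace characters.
    // Assumption: any whitespace (space, tab, newline) disqualifies the href.
    private static final Pattern INTERNAL_ANCHOR = Pattern.compile("#\\S*");

    private AnchorHrefCheck() {}

    static boolean isInternalAnchor(String href) {
        // matches() requires the whole string to fit the pattern.
        return href != null && INTERNAL_ANCHOR.matcher(href).matches();
    }
}
```

With this sketch, `AnchorHrefCheck.isInternalAnchor("#section-2")` returns `true`, while `"#my anchor"` and `"http://example.com/#top"` both return `false`, since the first contains a space and the second does not start with '#'.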

If you could merge this in I'd be grateful; currently we're using the forked jar, but it'd be nice to stay on trunk.

ishults commented Jul 23, 2014

I noticed this was never merged -- is this fix not wanted, or were there issues with the implementation? Would it be worth submitting a new pull request for this issue?

jhy commented Oct 2, 2014

Merged with #441, thanks

jhy closed this Oct 2, 2014
