[FEATURE] Add a HtmlScrubber class #679
This class can remove parts of the HTML. (In the `Emogrifier` class, this was solved with switches for the emogrification process.) Fixes #651
I'll also appreciate feedback on the name of the class.
The changes look mostly fine, though I've commented on the assert in the test - you might want to consider a change there.
Regarding the name, I'm not sure. One name it should not have is `HtmlMinifier`, as that would imply something different (removal of whitespace, renaming classes to one- or two-letter names, etc.) and may cause confusion. Other possible nouns that spring to mind (or found with a thesaurus): tidier, cleaner, garbage-collector, reducer, squeezer, squasher, declutterer, streamliner, purifier, depolluter, though none of them strikes me as being necessarily any better. I'll have a think...
$subject->removeInvisibleNodes();

self::assertNotRegExp('/display:\\s*none;?/', $subject->render());
Whilst it's OK as it is, the `;?` isn't necessary, and adding a possessive quantifier (`\\s*` -> `\\s*+`) would improve performance. That said, we could simply assert that the HTML doesn't contain `'<div'`, which would be a stronger assertion...
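As a rough illustration of the point about `;?`, the sketch below checks that the pattern from the test and the simplified variant agree on whether a match exists. The helper `patternMatches()` and the sample markup are hypothetical stand-ins for `$subject->render()`, not code from this PR.

```php
<?php
// Hypothetical helper, not part of the PR: true if $pattern matches anywhere in $html.
function patternMatches(string $pattern, string $html): bool
{
    return preg_match($pattern, $html) === 1;
}

// Made-up sample markup standing in for the rendered output.
$withHiddenDiv = '<p>Hi</p><div style="display:  none">secret</div>';
$withoutHiddenDiv = '<p>Hi</p>';

$original = '/display:\\s*none;?/';   // pattern currently used in the test
$simplified = '/display:\\s*+none/';  // no `;?`, possessive `\s*+`

foreach ([$withHiddenDiv, $withoutHiddenDiv] as $html) {
    // Both patterns agree on whether a match exists.
    var_dump(patternMatches($original, $html) === patternMatches($simplified, $html));
}
```

The trailing `;?` is optional, so it cannot change whether the pattern matches; the possessive quantifier only stops the engine from backtracking into the whitespace run.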
Oh, that's right. I completely missed that. I'll change that in a minute.
A few more: de-redundantifier; pruner; eliminator.
I am thinking now that
Repushed.
Looks fine except that `'<div>'` needs to be `'<div'` in the test...
$subject->removeInvisibleNodes();

self::assertNotRegExp('/display:\\s*none;?/', $subject->render());
self::assertNotContains('<div>', $subject->render());
This needs to omit the closing `>`, otherwise it will miss the tag with attributes.
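To make the difference concrete, here is a minimal sketch using plain `strpos()` on a made-up string (the variable names and markup are assumptions, not the PR's fixture). It shows why a "does not contain `'<div>'`" check could pass even if the hidden element were still in the output.

```php
<?php
// Made-up output in which the invisible node was NOT removed.
$rendered = '<html><body><div style="display: none;">hidden</div></body></html>';

// Searching for '<div>' finds nothing, because the tag carries an attribute.
$foundWithBracket = strpos($rendered, '<div>') !== false;   // false

// Searching for '<div' finds the opening tag regardless of attributes.
$foundWithoutBracket = strpos($rendered, '<div') !== false; // true

var_dump($foundWithBracket, $foundWithoutBracket);
```

With `'<div>'` as the needle, the negative assertion would succeed on this output and hide the failure; with `'<div'` it would fail as intended.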
Darn, you're right. This happens when I code before having had a decent coffee.