Email Cloaking plugin corrupts HTML #11353

Merged
merged 4 commits into from Aug 4, 2016

Projects

None yet
@demis-palma
Contributor
demis-palma commented Jul 29, 2016 edited

The plugin “Content - Email Cloaking”, under some circumstances, corrupts the HTML of the processed content.

The use of email addresses within attributes of HTML tags is legitimate in the HTML code of a Joomla article.
Chances are high this actually happens when users include any kind of modules within articles, using the “Content - Load Modules” plugin (now used more than ever, due to the new editor button “Module”).

An article containing this HTML code
<img src="/envelope.png" title="email@address.com">

Should appear like that:

img-expected

But the cloak plugin corrupts the output

<img src="/envelope.png" title="<span id="cloak84493">This email address is being protected ...</span><script type='text/javascript'>...</script>

Which produce that result:

img-got

The same goes for an input element

<input type="text" value="" placeholder="email@address.com">

Which should appear like that:

input-expected

but it becomes like that instead:

<input type="text" value="" placeholder="<span id="cloak77416">This email address is being protected ...</span><script type='text/javascript'>...</script>

input-got

The cause

HTML attributes (like title=”...”) don't allow any further nested HTML tags, nor JavaScripts inside them.

Summary of Changes

  1. The negative lookahead '(?![^<]*>)' added to the regular expression is used to exclude occurrences within HTML tags.
  2. Added exhaustive comments to explain the reason behind the lookahead.

Testing Instructions

I have tested the new regular expression with a lot of sample text, and it behaves good to me, but everyone interested is encouraged to test further (we know how tricky regular expressions can be).

@wojsmol
Contributor
wojsmol commented Jul 29, 2016

@demis-palma Please see CS PR demis-palma#1

demis-palma added some commits Jul 29, 2016
@demis-palma demis-palma Merge pull request #1 from wojsmol/11353-cs
Comment CS fix
cc71247
@demis-palma demis-palma Changed comment stype as per Wojciech's suggestion
8feba8d
@andrepereiradasilva
Contributor
andrepereiradasilva commented Jul 29, 2016 edited

I have tested this item successfully on 8feba8d

Confirmed issue and works as described.
Did some tests and all worked fine.

But again...

(we know how tricky regular expressions can be)



This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/11353.

@truptikagathara

I have tested this item successfully on 8feba8d


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/11353.

@zero-24
Contributor
zero-24 commented Jul 30, 2016

RTC on testing.


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/11353.

@jeckodevelopment
Member

RTC

@zero-24
Contributor
zero-24 commented Jul 30, 2016

@joomla-cms-bot please add the RTC label!


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/11353.

@zero-24 zero-24 referenced this pull request in joomla/jissues Jul 30, 2016
Closed

The Bot don't run for the CMS Issue #11353 #850

@brianteeman
Contributor

Now lets try to make the bot work

@jeckodevelopment
Member

it seems that the bot doesn't work :)
RTC please

@brianteeman brianteeman added the RTC label Jul 30, 2016
@brianteeman
Contributor

Adding it manually seems to have worked

@wilsonge wilsonge added this to the Joomla 3.6.2 milestone Aug 1, 2016
@PhilETaylor
Contributor

I have tested this item successfully on 8feba8d

Installed 3.6.1 - added this html to article

<img title="phil@phil-taylor.com" src="/images/powered_by.png" alt="phil@phil-taylor.com">

It was broken when rendered on frontend.

applied PR #11353 - then the rendering was fine, and rendered the html as above


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/11353.

@PhilETaylor
Contributor

@wilsonge any chance this can go to 3.6.2 ?

@jeckodevelopment
Member

It was originally scheduled for 3.6.2, but it has been moved to 3.6.3.
As far as i know, no new features will be added to the new 3.6.2.

@brianteeman
Contributor
@PhilETaylor
Contributor

its a bug fix - not a new feature

@jeckodevelopment
Member

You're right both! :D
Let's wait for @wilsonge

@Sandra97
Sandra97 commented Aug 4, 2016

I have tested this item 🔴 unsuccessfully on 8feba8d

I applied the patch but as long as the Email Cloaking Plugin is enable, my display is not correct as it removes my CSS class:
If I have for example this: <a href="mailto:toto@toto.com?subject=Subject" class="nameofmyclass">email</a>
It becomes in the source code: <span id="cloak9b08867df87471a610f686815c73ade8"><a href="mailto:toto@toto.com?subject=subject">email</a></span>



This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/11353.
@PhilETaylor
Contributor
PhilETaylor commented Aug 4, 2016 edited

@Sandra97 to be clear - your input to the test is the HTML

<a href="mailto:toto@toto.com?subject=Subject" class="nameofmyclass">email</a>

right?

With that input I get this as an output:

<span id="cloakf157cce915dfb5ec51f818d7e55aee9e"><a href="mailto:toto@toto.com?subject=Subject">email</a></span>

which is like you say _missing_ the CSS class...

@PhilETaylor
Contributor

This is exactly the type of bug that would benefit from 100 unit tests on it!!!

@mbabker
Member
mbabker commented Aug 4, 2016

We almost had unit tests for this specific plugin. #4071

Sadly poor ol' George inherited my workload and it was closed out due to inactivity.

@brianteeman
Contributor
@PhilETaylor
Contributor

lol - I dont know where you think I get so much free time from... tomorrow I have to sit for 2 hours to watch 3-9year olds sing Chitty Chitty Bang Bang in a amateur play... give me unit testing anytime over that!!! #sendhelp

@brianteeman
Contributor
@demis-palma
Contributor

@Sandra97 Thanks for testing, and I confirm your result: the class is removed, while it shouldn't.
However, this is not caused or fixed by this PR. It is simply a different bug which was present before and is still present after this patch. You'll find that your current version of Joomla is affected as well.

If you open a separate issue, I can fix it.
But consider that the current implementation of Email Cloak is based on regular expressions.
Regular expressions are definitely not good for parse HTML.
As much as I can fix something, someone will be always able to find another rare code sequence that cause problems.

But as I said, I would open a different issue, because this is a different bug, it was already present before this PR, and this PR is not intended to fix it.

@PhilETaylor
Contributor

As much as I can fix something, someone will be always able to find another rare code sequence that cause problems.

Which is why a set of reproducible unit tests on this code would be invaluable!

@demis-palma
Contributor

@PhilETaylor I agree. However, you know, parsing HTML using regexp is fighting against windmills.

@Sandra97
Sandra97 commented Aug 4, 2016

@demis-palma, thanks. I'm gonna open a new issue

@Sandra97
Sandra97 commented Aug 4, 2016

@demis-palma Please see #11456.
Thank you.

@demis-palma
Contributor

@Sandra97 would you remove the flag "unsuccessfully" from this PR?
Now its state is "Human Test Results: 3 Successful 1 Failed."

@wilsonge wilsonge merged commit 98b7c98 into joomla:staging Aug 4, 2016

1 of 2 checks passed

JTracker/HumanTestResults Human Test Results: 3 Successful 1 Failed.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@wilsonge wilsonge modified the milestone: Joomla 3.6.2, Joomla 3.6.3 Aug 4, 2016
@brianteeman
Contributor

Removed RTC as it has been merged

@brianteeman brianteeman removed the RTC label Aug 4, 2016
@Sandra97
Sandra97 commented Aug 4, 2016

I have tested this item successfully on 8feba8d


This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/11353.

@Edu20 Edu20 pushed a commit to Edu20/joomla-cms that referenced this pull request Aug 11, 2016
@demis-palma demis-palma + Eduard Ellert Email Cloaking plugin corrupts HTML (#11353)
* Fixed the regular expression

* Comment CS fix

* Changed comment stype as per Wojciech's suggestion
41221cb
@roland-d roland-d added a commit to roland-d/joomla-cms that referenced this pull request Sep 11, 2016
@demis-palma @roland-d demis-palma + roland-d Email Cloaking plugin corrupts HTML (#11353)
* Fixed the regular expression

* Comment CS fix

* Changed comment stype as per Wojciech's suggestion
27eebb7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment