use HTML::Entities in escapeHTML call #157

leejo · 2014-11-28T09:26:47Z

Because that almost certainly makes sense.

MichaelDaum · 2015-03-24T08:28:29Z

This change causes considerable issues when handing down un-decoded values to CGI::textfield and friends. These now either require a Encode::decode(), or CGI::autoEscape to be switched off. Both have significant issues: auto-escaping is generally required, i.e. for quotes. use of decode() is not backwards compatible for those that use CGI < 4.11. Only users of CGI>=4.11 will require strings to be decoded. Not before.

leejo · 2015-03-24T17:43:37Z

I'll probably make this a documentation change only. I think escapeHTML should be all or nothing.

Although if you can put together a test case I'll see if a reasonable fix is possible.

Lee.

Envoyé par myMail pour iOS

mardi, 24 mars 2015 09:28 +0100 de Michael Daum notifications@github.com:

This change causes considerable issues when handing down un-decoded values to CGI::textfield and friends. These now either require a Encode::decode(), or CGI::autoEscape to be switched off. Both have significant issues: auto-escaping is generally required, i.e. for quotes. use of decode() is not backwards compatible for those that use CGI < 4.11. Only users of CGI>=4.11 will require strings to be decoded. Not before.
—
Reply to this email directly or view it on GitHub .

WRT 9348e51 - that change made *all* chars encode, so to keep it back compatible we set the default list of chars to encode to: &<>"\x8b\x9b' this can be controlled by setting the value of $CGI::ENCODE_ENTITIES and if this var is set to undef then *all* cars will be encoded as per HTML::Entities documentation

leejo · 2015-03-25T16:21:21Z

@MichaelDaum - does the above commit (e8def3f) solve this problem for you?

MichaelDaum · 2015-03-25T17:16:50Z

This looks good. I think that this is going to save some pain. Have to test. Thanks a lot for adding flexibility to control escaping.

leejo · 2015-03-25T17:56:35Z

@MichaelDaum - OK, i've pushed DEV release 4.13_05 to CPAN: https://metacpan.org/release/LEEJO/CGI-4.13_05

If all is well i will release v4.14 next week (this will include #162 amongst others)

Thanks!

MichaelDaum · 2015-03-25T18:02:02Z

@leejo thanks again.

MichaelDaum · 2015-03-26T13:29:44Z

Btw @leejo why do you encode \x8b\x9b by default and what are these code points?

leejo · 2015-03-26T14:06:45Z

Btw @leejo why do you encode \x8b\x9b by default and what are these code points?

Backwards compatibility, although probably not relevant anymore. The commit that introduced these seems to be 0640e44. According to the perldoc that accompanies that change:

In addition, the hexadecimal 0x8b and 0x9b characters, which many windows-based
browsers interpret as the left and right angle-bracket characters, are replaced
by their numeric HTML entities ("&#139" and "&#155;")

It was then tweaked in v2.84 (Changes file entry):

1192   Version 2.84
1193
1194     1. Fix for failed file uploads on Cygwin platforms.
1195     2. HTML escaping code now replaced 0x8b and 0x9b with unicode
1196        references < and *#8250;

These can probably be removed TBH. I've found a reference on perlmonks here although there doesn't seem to be any explanation why these changes were made. In fact, i'm going to remove them. I can deal with legacy stuff when it's described or documented (in either the comments or commit logs), but when it's a mystery it can be purged. And given these were fixes were for legacy browsers i'm not too concerned about them.

Edit: on second thoughts, i'll keep them with a comment.

MichaelDaum · 2015-03-26T14:13:37Z

:)

from the original change that introduced these: 0640e44 the hexadecimal 0x8b and 0x9b characters, which many windows-based browsers interpret as the left and right angle-bracket characters, are replaced by their numeric HTML entities ("&#139" and "") probably no longer relevant, but who knows how many antiquated web browsers are still being used out there

leejo · 2015-04-01T06:51:42Z

v4.14 on its way to CPAN.

MichaelDaum · 2015-04-01T13:23:07Z

Works great.

MichaelDaum · 2015-04-20T13:52:42Z

@leejo unfortunately leaving in those two chars 0x8b and 0x9b will break unicode strings randomly.
eg. Nechť již hříšné saxofony ďáblů rozezvučí síň úděsnými tóny waltzu, tanga a quickstepu. becomes Nechť již hříšné saxofony ďáblů rozezvučí síň úd�›snými tóny waltzu, tanga a quickstepu. There are other languages that are affected more thoroughly depending on their codepoints. Removing 0x8b and 0x9b from CGI::ENCODE_ENTITIES fixes it. Tested on 4.15

leejo · 2015-04-20T14:30:06Z

@MichaelDaum - Hmmm, were you seeing this with pre v4 versions of CGI? I may strip these out as they seem to be causing more problems than they are fixing.

MichaelDaum · 2015-04-20T14:37:22Z

I can't properly say anymore. I guess however that v3 was okay-ish. It needed different loops to hop thru with regards to setting the charset.

leejo · 2015-04-20T14:51:42Z

okay-ish

Heh. I'll get these removed for the next release unless any wants to say otherwise. I'll try to test this, i have a Windows VM somewhere.

leejo added Enhancement Refactoring labels Nov 28, 2014

leejo closed this as completed in a8a47e6 Dec 1, 2014

leejo reopened this Mar 25, 2015

leejo closed this as completed Apr 1, 2015

leejo reopened this Apr 20, 2015

leejo closed this as completed in c101bf2 May 29, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

use HTML::Entities in escapeHTML call #157

use HTML::Entities in escapeHTML call #157

leejo commented Nov 28, 2014

MichaelDaum commented Mar 24, 2015

leejo commented Mar 24, 2015

leejo commented Mar 25, 2015

MichaelDaum commented Mar 25, 2015

leejo commented Mar 25, 2015

MichaelDaum commented Mar 25, 2015

MichaelDaum commented Mar 26, 2015

leejo commented Mar 26, 2015

MichaelDaum commented Mar 26, 2015

leejo commented Apr 1, 2015

MichaelDaum commented Apr 1, 2015

MichaelDaum commented Apr 20, 2015

leejo commented Apr 20, 2015

MichaelDaum commented Apr 20, 2015

leejo commented Apr 20, 2015

use HTML::Entities in escapeHTML call #157

use HTML::Entities in escapeHTML call #157

Comments

leejo commented Nov 28, 2014

MichaelDaum commented Mar 24, 2015

leejo commented Mar 24, 2015

leejo commented Mar 25, 2015

MichaelDaum commented Mar 25, 2015

leejo commented Mar 25, 2015

MichaelDaum commented Mar 25, 2015

MichaelDaum commented Mar 26, 2015

leejo commented Mar 26, 2015

MichaelDaum commented Mar 26, 2015

leejo commented Apr 1, 2015

MichaelDaum commented Apr 1, 2015

MichaelDaum commented Apr 20, 2015

leejo commented Apr 20, 2015

MichaelDaum commented Apr 20, 2015

leejo commented Apr 20, 2015