Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use HTML::Entities in escapeHTML call #157

Closed
leejo opened this issue Nov 28, 2014 · 15 comments
Closed

use HTML::Entities in escapeHTML call #157

leejo opened this issue Nov 28, 2014 · 15 comments

Comments

@leejo
Copy link
Owner

leejo commented Nov 28, 2014

Because that almost certainly makes sense.

@MichaelDaum
Copy link

This change causes considerable issues when handing down un-decoded values to CGI::textfield and friends. These now either require a Encode::decode(), or CGI::autoEscape to be switched off. Both have significant issues: auto-escaping is generally required, i.e. for quotes. use of decode() is not backwards compatible for those that use CGI < 4.11. Only users of CGI>=4.11 will require strings to be decoded. Not before.

@leejo
Copy link
Owner Author

leejo commented Mar 24, 2015

I'll probably make this a documentation change only. I think escapeHTML should be all or nothing.

Although if you can put together a test case I'll see if a reasonable fix is possible.

Lee.

Envoyé par myMail pour iOS

mardi, 24 mars 2015 09:28 +0100 de Michael Daum notifications@github.com:

This change causes considerable issues when handing down un-decoded values to CGI::textfield and friends. These now either require a Encode::decode(), or CGI::autoEscape to be switched off. Both have significant issues: auto-escaping is generally required, i.e. for quotes. use of decode() is not backwards compatible for those that use CGI < 4.11. Only users of CGI>=4.11 will require strings to be decoded. Not before.

Reply to this email directly or view it on GitHub .

@leejo leejo reopened this Mar 25, 2015
leejo added a commit that referenced this issue Mar 25, 2015
WRT 9348e51 - that change made *all* chars encode, so to keep it
back compatible we set the default list of chars to encode to:

    &<>"\x8b\x9b'

this can be controlled by setting the value of $CGI::ENCODE_ENTITIES
and if this var is set to undef then *all* cars will be encoded as
per HTML::Entities documentation
@leejo
Copy link
Owner Author

leejo commented Mar 25, 2015

@MichaelDaum - does the above commit (e8def3f) solve this problem for you?

@MichaelDaum
Copy link

This looks good. I think that this is going to save some pain. Have to test. Thanks a lot for adding flexibility to control escaping.

@leejo
Copy link
Owner Author

leejo commented Mar 25, 2015

@MichaelDaum - OK, i've pushed DEV release 4.13_05 to CPAN: https://metacpan.org/release/LEEJO/CGI-4.13_05

If all is well i will release v4.14 next week (this will include #162 amongst others)

Thanks!

@MichaelDaum
Copy link

@leejo thanks again.

@MichaelDaum
Copy link

Btw @leejo why do you encode \x8b\x9b by default and what are these code points?

@leejo
Copy link
Owner Author

leejo commented Mar 26, 2015

Btw @leejo why do you encode \x8b\x9b by default and what are these code points?

Backwards compatibility, although probably not relevant anymore. The commit that introduced these seems to be 0640e44. According to the perldoc that accompanies that change:

In addition, the hexadecimal 0x8b and 0x9b characters, which many windows-based
browsers interpret as the left and right angle-bracket characters, are replaced
by their numeric HTML entities ("&#139" and "&#155;")

It was then tweaked in v2.84 (Changes file entry):

1192   Version 2.84
1193
1194     1. Fix for failed file uploads on Cygwin platforms.
1195     2. HTML escaping code now replaced 0x8b and 0x9b with unicode
1196        references < and *#8250;

These can probably be removed TBH. I've found a reference on perlmonks here although there doesn't seem to be any explanation why these changes were made. In fact, i'm going to remove them. I can deal with legacy stuff when it's described or documented (in either the comments or commit logs), but when it's a mystery it can be purged. And given these were fixes were for legacy browsers i'm not too concerned about them.

Edit: on second thoughts, i'll keep them with a comment.

@MichaelDaum
Copy link

:)

leejo added a commit that referenced this issue Mar 26, 2015
from the original change that introduced these:

	0640e44

the hexadecimal 0x8b and 0x9b characters, which many windows-based
browsers interpret as the left and right angle-bracket characters,
are replaced by their numeric HTML entities ("&#139" and "&#155;")

probably no longer relevant, but who knows how many antiquated web
browsers are still being used out there
@leejo
Copy link
Owner Author

leejo commented Apr 1, 2015

v4.14 on its way to CPAN.

@leejo leejo closed this as completed Apr 1, 2015
@MichaelDaum
Copy link

Works great.

@MichaelDaum
Copy link

@leejo unfortunately leaving in those two chars 0x8b and 0x9b will break unicode strings randomly.
eg. Nechť již hříšné saxofony ďáblů rozezvučí síň úděsnými tóny waltzu, tanga a quickstepu. becomes Nechť již hříšné saxofony ďáblů rozezvučí síň úd�›snými tóny waltzu, tanga a quickstepu. There are other languages that are affected more thoroughly depending on their codepoints. Removing 0x8b and 0x9b from CGI::ENCODE_ENTITIES fixes it. Tested on 4.15

@leejo
Copy link
Owner Author

leejo commented Apr 20, 2015

@MichaelDaum - Hmmm, were you seeing this with pre v4 versions of CGI? I may strip these out as they seem to be causing more problems than they are fixing.

@leejo leejo reopened this Apr 20, 2015
@MichaelDaum
Copy link

I can't properly say anymore. I guess however that v3 was okay-ish. It needed different loops to hop thru with regards to setting the charset.

@leejo
Copy link
Owner Author

leejo commented Apr 20, 2015

okay-ish

Heh. I'll get these removed for the next release unless any wants to say otherwise. I'll try to test this, i have a Windows VM somewhere.

@leejo leejo closed this as completed in c101bf2 May 29, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants