decodeForHTML returns same character for Ù and ù #11

GoogleCodeExporter · 2015-05-24T00:35:02Z

What steps will reproduce the problem?
1. decodeForHTML returns the same character for &Ugrave; and &ugrave;. This is
true for all named entities that have both upper-case and lower-case versions.

What is the expected output? What do you see instead?

&Ugrave; should return upper case U with accent, and &ugrave; should return 
lower case u with accent.

What version of the product are you using? On what operating system?

Latest version on Linux.

Please provide any additional information below.

In HTMLEntityCodec.js, you should probably not do a case insensitive look-up at 
the end of the getNamedEntity function.
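
A minimal sketch of why a case-insensitive look-up collapses the two entities. The two-entry table and the helper names lookupIgnoringCase and lookupExact are hypothetical and are not taken from HTMLEntityCodec.js; only the code points for Ù (U+00D9) and ù (U+00F9) are real.

```javascript
// Hypothetical two-entry table; the values are the Unicode code points.
var entityToCodePoint = {
    '&Ugrave': 217, // U+00D9, upper-case U with grave accent
    '&ugrave': 249  // U+00F9, lower-case u with grave accent
};

// Case-insensitive look-up: returns the first key that matches once case is ignored.
function lookupIgnoringCase(map, name) {
    var wanted = name.toLowerCase();
    for (var key in map) {
        if (map.hasOwnProperty(key) && key.toLowerCase() === wanted) {
            return map[key];
        }
    }
    return undefined;
}

// Case-sensitive look-up: only an exact key match is accepted.
function lookupExact(map, name) {
    return map.hasOwnProperty(name) ? map[name] : undefined;
}

// Both names hit the same (first) entry when case is ignored.
var collapsedUpper = String.fromCharCode(lookupIgnoringCase(entityToCodePoint, '&Ugrave'));
var collapsedLower = String.fromCharCode(lookupIgnoringCase(entityToCodePoint, '&ugrave'));

// With an exact match each name resolves to its own character.
var exactUpper = String.fromCharCode(lookupExact(entityToCodePoint, '&Ugrave'));
var exactLower = String.fromCharCode(lookupExact(entityToCodePoint, '&ugrave'));
```

In this sketch the collapsed results are identical, while the exact look-ups stay distinct, which matches the behaviour described above.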

Thanks!

Original issue reported on code.google.com by wvinc...@gmail.com on 5 Aug 2012 at 9:19


GoogleCodeExporter · 2015-05-24T00:35:02Z

Hi,

I found an issue with the decodeForHTML function. I tried the steps below:

org.owasp.esapi.ESAPI.initialize();

$ESAPI.encoder().encodeForHTML("<script>alert('123');</script>");
"<script>alert('123');</script>"

$ESAPI.encoder().decodeForHTML("<script>alert('123');</script>");
"<script>alert4039123394159<47script>"

Issue: decodeForHTML does not give back the data that I originally encoded.

Solution: In org.owasp.esapi.codecs.HTMLEntityCodec, the functions parseNumber
and parseHex return the number directly (return parseInt(out);). They should
return the corresponding character instead
(return String.fromCharCode(parseInt(out));).
Below are the functions I have modified:

var parseNumber = function(input) {
    var out = '';
    // Collect decimal digits; a ';' is consumed and ends the entity.
    while (input.hasNext()) {
        var c = input.peek();
        if (c.match(/[0-9]/)) {
            out += c;
            input.next();
        } else if (c == ';') {
            input.next();
            break;
        } else {
            break;
        }
    }

    // parseInt never throws; it returns NaN when no digits were read.
    var code = parseInt(out, 10);
    if (isNaN(code)) {
        return null;
    }
    // Changed to fix the ESAPI bug; the original line was: return parseInt(out);
    return String.fromCharCode(code);
};

var parseHex = function(input) {
    var out = '';
    // Collect hex digits; a ';' is consumed and ends the entity.
    while (input.hasNext()) {
        var c = input.peek();
        if (c.match(/[0-9A-Fa-f]/)) {
            out += c;
            input.next();
        } else if (c == ';') {
            input.next();
            break;
        } else {
            break;
        }
    }

    // parseInt never throws; it returns NaN when no digits were read.
    var code = parseInt(out, 16);
    if (isNaN(code)) {
        return null;
    }
    // Changed to fix the ESAPI bug; the original line was: return parseInt(out, 16);
    return String.fromCharCode(code);
};
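
As a rough check of the change above, here is a self-contained sketch that decodes decimal references only. The makeInput stub and decodeDecimalEntities are hypothetical stand-ins for the codec's real input object and decode loop, and the test string assumes decimal references such as &#40;, which may differ from what encodeForHTML actually emits.

```javascript
// Hypothetical stand-in for the codec's input object; only hasNext, peek and next are modelled.
function makeInput(text) {
    var pos = 0;
    return {
        hasNext: function () { return pos < text.length; },
        peek: function () { return text.charAt(pos); },
        next: function () { return text.charAt(pos++); }
    };
}

// Reads decimal digits up to an optional ';' and returns the character, or null if none were read.
function parseNumber(input) {
    var out = '';
    while (input.hasNext()) {
        var c = input.peek();
        if (/[0-9]/.test(c)) {
            out += c;
            input.next();
        } else {
            if (c === ';') { input.next(); }
            break;
        }
    }
    return out === '' ? null : String.fromCharCode(parseInt(out, 10));
}

// Minimal decoder: handles only decimal references of the form &#NNN; and copies everything else.
function decodeDecimalEntities(text) {
    var input = makeInput(text);
    var result = '';
    while (input.hasNext()) {
        var c = input.next();
        if (c === '&' && input.hasNext() && input.peek() === '#') {
            input.next(); // consume '#'
            var ch = parseNumber(input);
            result += (ch === null) ? '&#' : ch; // keep the raw prefix if no digits followed
        } else {
            result += c;
        }
    }
    return result;
}

var decoded = decodeDecimalEntities("alert&#40;&#39;123&#39;&#41;&#59;");
```

If parseNumber returned the bare number instead of the character, the digits 40, 39, 41 and 59 would be spliced into the result as text, which is the kind of output shown in the report.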

I have fixed this issue in esapi.js and am using the patched version in my project.

Thanks
Bikesh Kumar

Original comment by bikesh....@gmail.com on 19 Mar 2013 at 8:22

GoogleCodeExporter · 2015-05-24T00:35:02Z

I think all we did in HTMLEntityCodec.js was change

return String.fromCharCode(entityToCharacterMap.getCaseInsensitive('&' + entity));

to

return String.fromCharCode(entityToCharacterMap['&' + entity]);
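
One thing to watch with a plain property look-up: for a name that is not in the table the look-up yields undefined, and String.fromCharCode(undefined) returns a NUL character rather than failing. A hedged sketch of a guarded variant follows; the three-entry table and the function name getNamedEntityChar are hypothetical, and the real entityToCharacterMap may be structured differently.

```javascript
// Hypothetical excerpt of the entity table, keyed as '&' + name like in the line above.
var entityToCharacterMap = {
    '&Ugrave': 217, // U+00D9
    '&ugrave': 249, // U+00F9
    '&lt': 60       // U+003C
};

// Case-sensitive look-up that returns null for unknown names instead of a NUL character.
function getNamedEntityChar(entity) {
    var key = '&' + entity;
    if (!Object.prototype.hasOwnProperty.call(entityToCharacterMap, key)) {
        return null; // unknown or wrongly cased name: caller can leave the input undecoded
    }
    return String.fromCharCode(entityToCharacterMap[key]);
}

var upper = getNamedEntityChar('Ugrave');
var lower = getNamedEntityChar('ugrave');
var unknown = getNamedEntityChar('UGRAVE');
```

Whether returning null is the right signal depends on how the caller of getNamedEntity treats a failed look-up, which this thread does not show.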

Original comment by wvinc...@gmail.com on 19 Mar 2013 at 10:58

GoogleCodeExporter added Priority-Medium auto-migrated Type-Defect labels May 24, 2015
