This repository has been archived by the owner on Apr 6, 2021. It is now read-only.

decodeForHTML returns same character for &Ugrave; and &ugrave; #11

Open
GoogleCodeExporter opened this issue May 24, 2015 · 2 comments

@GoogleCodeExporter

What steps will reproduce the problem?
1. decodeForHTML returns the same character for &Ugrave; and &ugrave;. This is true
for all named entities with upper- and lower-case versions.

What is the expected output? What do you see instead?

&Ugrave; should return upper-case U with a grave accent (Ù), and &ugrave; should
return lower-case u with a grave accent (ù).

What version of the product are you using? On what operating system?

Latest version on Linux.

Please provide any additional information below.

In HTMLEntityCodec.js, you should probably not do a case-insensitive lookup at
the end of the getNamedEntity function.
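
A minimal sketch of why a case-insensitive lookup collapses the two entities, assuming the codec keeps a table keyed by entity name. The names ENTITY_TO_CODE, lookupCaseInsensitive and lookupExact are hypothetical and only illustrate the idea; they are not the ESAPI internals.

```javascript
// Hypothetical stand-in for the codec's entity table (not the real ESAPI map).
var ENTITY_TO_CODE = {
    '&Ugrave': 217, // U+00D9, upper-case U with grave accent
    '&ugrave': 249  // U+00F9, lower-case u with grave accent
};

// Case-insensitive lookup: returns the first key that matches once case is
// ignored, so both spellings resolve to whichever entry is enumerated first.
function lookupCaseInsensitive(map, key) {
    var wanted = key.toLowerCase();
    for (var k in map) {
        if (Object.prototype.hasOwnProperty.call(map, k) &&
                k.toLowerCase() === wanted) {
            return map[k];
        }
    }
    return undefined;
}

// Case-sensitive lookup: only an exact key match counts.
function lookupExact(map, key) {
    return Object.prototype.hasOwnProperty.call(map, key) ? map[key] : undefined;
}

// Both spellings collapse to the same code with the case-insensitive lookup.
var collided = [
    lookupCaseInsensitive(ENTITY_TO_CODE, '&Ugrave'),
    lookupCaseInsensitive(ENTITY_TO_CODE, '&ugrave')
];

// The exact lookup keeps them apart.
var distinct = [
    lookupExact(ENTITY_TO_CODE, '&Ugrave'),
    lookupExact(ENTITY_TO_CODE, '&ugrave')
];
```

With the exact lookup the two entries stay distinct, which is the behaviour the report asks for.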

Thanks!

Original issue reported on code.google.com by wvinc...@gmail.com on 5 Aug 2012 at 9:19

@GoogleCodeExporter

Hi,

I found an issue with the decodeForHTML function. I tried the steps below:

org.owasp.esapi.ESAPI.initialize();

$ESAPI.encoder().encodeForHTML("<script>alert('123');</script>");
"<script>alert('123');</script>"

$ESAPI.encoder().decodeForHTML("<script>alert('123');</script>");
"<script>alert4039123394159<47script>"

Issue: decodeForHTML does not return the data I originally encoded.

Solution: In org.owasp.esapi.codecs.HTMLEntityCodec, the functions parseNumber
and parseHex return the number directly (return parseInt(out);). They should
return the corresponding character instead (return String.fromCharCode(parseInt(out));).
Below are the functions I have modified:

var parseNumber = function(input) {
    var out = '';
    while (input.hasNext()) {
        var c = input.peek();
        if (c.match(/[0-9]/)) {
            out += c;
            input.next();
        } else if (c == ';') {
            // Consume the terminating semicolon and stop.
            input.next();
            break;
        } else {
            break;
        }
    }

    // parseInt never throws; it returns NaN when no digits were read.
    var code = parseInt(out, 10);
    if (isNaN(code)) {
        return null;
    }
    // Return the character, not the number.
    // The original code did: return parseInt(out);
    return String.fromCharCode(code);
};

var parseHex = function(input) {
    var out = '';
    while (input.hasNext()) {
        var c = input.peek();
        if (c.match(/[0-9A-Fa-f]/)) {
            out += c;
            input.next();
        } else if (c == ';') {
            // Consume the terminating semicolon and stop.
            input.next();
            break;
        } else {
            break;
        }
    }

    // parseInt never throws; it returns NaN when no hex digits were read.
    var code = parseInt(out, 16);
    if (isNaN(code)) {
        return null;
    }
    // Return the character, not the number.
    // The original code did: return parseInt(out, 16);
    return String.fromCharCode(code);
};

I have fixed this issue in esapi.js and am using it in my project.
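
To illustrate the effect of the change, here is a small self-contained sketch. The helper decodeNumericEntities is hypothetical and uses a regular expression rather than the ESAPI parser; the toChar flag only exists to compare the reported behaviour with the proposed one. The sample input uses decimal references as an assumption, since the exact encoded form is not shown in this thread.

```javascript
// Hypothetical helper, not part of ESAPI: decodes &#NN; and &#xHH; references.
// The toChar flag switches between the reported behaviour and the proposed fix.
function decodeNumericEntities(text, toChar) {
    return text.replace(/&#(x[0-9A-Fa-f]+|[0-9]+);/g, function (match, body) {
        var isHex = body.charAt(0) === 'x';
        var code = isHex ? parseInt(body.slice(1), 16) : parseInt(body, 10);
        // Reported behaviour: the number itself is concatenated into the output.
        // Proposed fix: the number is converted to the character it denotes.
        return toChar ? String.fromCharCode(code) : String(code);
    });
}

// Assumed sample input with decimal numeric references.
var encoded = "alert&#40;&#39;123&#39;&#41;&#59;";

var buggy = decodeNumericEntities(encoded, false); // "alert4039123394159"
var fixed = decodeNumericEntities(encoded, true);  // "alert('123');"
```

The first result reproduces the run of digits seen in the decoded output above; the second restores the punctuation.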

Thanks
Bikesh Kumar

Original comment by bikesh....@gmail.com on 19 Mar 2013 at 8:22

@GoogleCodeExporter

I think all we did in HTMLEntityCodec.js was change

return String.fromCharCode(entityToCharacterMap.getCaseInsensitive('&' + entity));

to

return String.fromCharCode(entityToCharacterMap['&' + entity]);
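
One thing to watch with the plain bracket lookup: if the entity is not in the map, the lookup yields undefined, and String.fromCharCode(undefined) returns "\u0000" rather than failing. A guarded variant might look like the sketch below. It assumes entityToCharacterMap is a plain object mapping entity names to character codes; the two-entry map and the function name getNamedEntityChar are hypothetical.

```javascript
// Illustrative two-entry map; the real table in HTMLEntityCodec.js is much larger.
var entityToCharacterMap = {
    '&Ugrave': 217, // U+00D9
    '&ugrave': 249  // U+00F9
};

// Hypothetical wrapper: case-sensitive lookup that returns null for unknown
// entities instead of silently producing a NUL character.
function getNamedEntityChar(entity) {
    var key = '&' + entity;
    if (!Object.prototype.hasOwnProperty.call(entityToCharacterMap, key)) {
        return null;
    }
    return String.fromCharCode(entityToCharacterMap[key]);
}

var upper = getNamedEntityChar('Ugrave');   // upper-case U with grave accent
var lower = getNamedEntityChar('ugrave');   // lower-case u with grave accent
var unknown = getNamedEntityChar('UGRAVE'); // null: no such key
```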

Original comment by wvinc...@gmail.com on 19 Mar 2013 at 10:58
