More characters to be allowed in Security::_validate_entities() #1512

Closed
code2be opened this Issue Jun 20, 2012 · 3 comments

Comments

Projects
None yet
4 participants

code2be commented Jun 20, 2012

While I was working Today on some form that expects the following input format from the user:
http://site.tld/file.php?param1={var1}&param2={var2}

The user should input the {var1} and {var2} part as they are, so I can replace them while using that URL later in the code.

I faced a problem when tried to store that URL in the DB, As I got a semicolon ( ; ) added before the equal sign ( = ) to make the URL like:
http://site.tld/file.php?param1={var1}&param2;={var2}

After long looking into this problem, I found that it's related to global_xss_filtering being enabled in config.
So, Looking into the code at security_helper.php, found that xss_clean() calls $CI->security->xss_clean($str, $is_image);
And, Security::xss_clean() by turn calls Security::_validate_entities()

Finally, Found the following regexp in the last method, at line #792:
#(&\#?[0-9a-z]{2,})([\x00-\x20])*;?#i

line before that one, the comment tells the whole story, that CI added semicolon when it is missing, That's nice job, but I think CI needs to know that more characters should be allowed in the URL, before adding a semicolon between parameters.

Fixing that, I changed the regexp to:
#(&\#?\{\}[0-9a-z]{2,})([\x00-\x20])*;?#i

In order to identify the { and } as valid characters, And it's working now...

Contributor

alexbilbie commented Jun 24, 2012

I don't think this should be changed because { and } aren't valid URI characters, the only valid characters are ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=

Therefore in your case I'd have disabled XSS filtering for that input value and run a custom regex match to test the user input is correct.

Contributor

hArpanet commented Jul 1, 2012

I agree with @alexbilbie on this one (In general URIs as defined by RFC 3986 may contain any of the following characters: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=.). @ahmedhosny can you not just change the url to wrap the vars in square brackets or something instead and solve the problem? param1=[var1]&param2=[var2]

Contributor

narfbg commented Nov 1, 2012

Agreed - if the RFC says so, we shouldn't accept those characters.

narfbg closed this Nov 1, 2012

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment