Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Don't assume input data is urlencoded when removing invisible characters #1229

wants to merge 1 commit into


None yet
5 participants

veenix commented Mar 30, 2012

When cleaning input data, set url_encode to be false when removing invisible characters. The $_POST, $_GET, and $_COOKIE variables are not urlencoded, but remove_invisible_characters assume that it is and will remove characters such as "%00" unless the url_encoded parameter is set to false.

@veenix veenix When cleaning input data, set url_encode to be false when removing in…
…visible characters. The $_POST, $_GET, and $_COOKIE variables are not urlencoded, but remove_invisible_characters assume that it is and will remove characters such as "%00" unless the url_encoded parameter is set to false.

ckdarby commented Jun 8, 2012

@veenix can you open an issue providing a working example?


narfbg commented Jun 8, 2012

No need to open another issue - just explain it in here.

Also, this would need a changelog entry and false should be FALSE.

veenix commented Jun 8, 2012

For example, if you send a post request with the key-value pair of "test" => "a%00b", vanilla PHP will have the value "a%00b" for $_POST['test']. However, when CI cleans the input data, it erroneously assumes that the input data is urlencoded when it removes control characters from the input data. As such, it decodes "%00" to its ASCII equivalent, which is a control character, and removes it when it does the input data cleaning, hence, $_POST['test'] will be "ab" instead of "a%00b".

Here is a simple controller that demonstrates this:

class Test extends CI_Controller {

function index() {
    echo '<form action="test" method="post"><input type="hidden" name="test" value="a%00b"><input type="submit" value="submit"></form>';



narfbg commented Jun 8, 2012

I'd rather say this is security protection rather than a bug. I think @wesbaker could say that for sure.

veenix commented Jun 8, 2012

Setting the $url_encoded = FALSE in the remove_invisible_characters method does not present a security issue because it will still remove invisible characters. It just won't assume that you will be decoding the data as if it was url encoded. Assuming that the parameters are url encoded violates the HTTP specification (since PHP already does any url decoding that is necessary and this essentially assumes doubly url encoded values).

This implementation makes it impossible to send several strings via GET, POST, or COOKIES input, namely "%00"-"%08", "%11", "%12", "%14", "%15", and "%16"-"%31". If a user's password or any form data contains these strings in any part of the data, then they will be erroneously removed. The only way around would be to encode the data before sending, which is counter intuitive to the developer, since vanilla PHP has no such requirement. Making this change presents no security risk since the data should not actually be url decoded, and if that was the intention of the developer, than it is up to them to ensure security. After all, url encoding is not the only way to encode data (it could be base64, etc.).

It is important to note that this only affects the most recent CI version. I'm not sure when it was changed, but older versions of CI did not have this problem.


narfbg commented Jun 8, 2012

XSS protection is (I believe) something rather new in CodeIgniter and it's being improved in each version, so it's quite normal that older version didn't behave in the same way.

We should wait on Wes or somebody else more familiar with it to say something for sure, I can't approve this on my own.


narfbg commented Jun 11, 2012

Just referencing #1136.

Hexcles commented Nov 12, 2012

Confirmed this bug as well. Hope to fix it ASAP.


narfbg commented Nov 13, 2012

A note to check the fix against vulnerabilities listed here: http://ha.ckers.org/xss.html


@alexbilbie I originally got this link from you in another thread. Have you investigated more into this?

GET data is URL-decoded by PHP (and the CI router class, if re-initialized), not sure about POST and COOKIE data at this time, but I assume they are as well. The question here is whether the current behavior is intended or just an erroneous double-filter.

Referenced issue: #148

Hexcles commented Nov 13, 2012

@narfbg I can confirm that $_POST is already decoded by PHP itself. It's just a common knowledge and I've just tested it to make sure.

It is really a serious bug. I came across a bug report that a user had a password like "blahblah%00..." and others could access this account with password "blahblah..." because of CI "removing the invisible chars of input".

But I think it might be terrible to FIX this bug as well, concerning the information previously stored in databases. Applications may need a logic to examine an input twice(first compare the input with database directly, if failed, "remove the invisible chars" as before and compare again...).

I just spent the past 5 hours trying to figure out what was wrong with my feed, and found this post.


@narfbg narfbg added a commit that referenced this pull request Jan 8, 2014

@narfbg narfbg Fix #148
CI_Input::_clean_input_data() assumed that all input data is URL-encoded while sanitizing it.
However, PHP already performs URL-decoding on it, so this is either redudant or overly
intrusive as it resulted in many, many reports of data containing '%' followed by 1 numeric
characters being essentially destroyed.

Supersedes PR #1229

@narfbg narfbg closed this Jan 8, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment