"Filter segments for malicious characters", really ? #47

Closed
bitbucket-import opened this Issue Aug 19, 2011 · 13 comments

Projects

None yet

9 participants

@bitbucket-import

When requesting a URI containing URI encoded parenthesis (I mean %28 and %29), they reach the controller maimed and unusable. Their strlen() reads wrong and they wont be matched as parenthesis by regex.

The reason for this is in system/core/URI.php at lines 238-241. Why would the parenthesis be replaced by shitty HTML entities that damage the string ?

// Convert programatic characters to entities
$bad    = array('$',        '(',        ')',        '%28',      '%29');
$good   = array('$',    '(',    ')',    '(',    ')');
return str_replace($bad, $good, $str);

Unless there's a good reason for it, I think this replacement should be removed altogether. You don't want to know how many hours it took me to pinpoint this issue.

With love,

Antoine Gersant

Contributor
narfbg commented Jun 19, 2012

This really depends on wether #388 is considered a bug or not.

Contributor
narfbg commented Oct 31, 2012

... or not - ignore my last comment.

+1 for scrapping the "conversion programatic characters" - it is pointless, has the potential to waste a lot of time 'debugging' and it is incorrect. the code essentially performs a limited "html entitizing" ... html entitizing is something you do to output not input!

Contributor

Hasn't this been address by #388? Can this be closed?

Contributor
narfbg commented Nov 11, 2014

#388 has nothing to do with this ... other issues related to this one have been fixed, but the suggestion here is to remove this filter altogether.

Contributor
Razican commented Nov 11, 2014

I think enough people are having issues with it and that it does not have a clear advantage, so in my opinion it should be removed.

Contributor

I agree that the substitution should be removed. RFC 3986 says that the dollar sign and parentheses are safe characters and do not need encoding. They are also flagged as "reserved" characters, which can be encoded and interpreted by an application, but that appears to be subsequent to any use as a URI.

@jim-parry jim-parry removed the Dead Issue? label Nov 11, 2014
Contributor

I'm voting for removing it.

Contributor

+1 for "conversion programatic characters" removal. The three presented justifications are good enough.


Edit: Justification 4: Such characters within a segment may be needed as a result of sloppy slug generation. If you use url_title() for this purpose, the segment would be clean, and then the "programatic characters" simply may not be enabled using the setting $config['permitted_uri_chars'].

url_title() may be reworked to transliterate from non-Latin languages, but this is another story.

Contributor
Razican commented Dec 2, 2014

I vote for it too.

Contributor
narfbg commented Dec 3, 2014

Well, the public opinion seems to be unanimous.

@pfote @benedmunds @druu @lonnieezell Any objections?

Contributor

No objection.

Contributor

I can't think of a reason it's really needed, but haven't scoured the code about this either.

No objection.

@narfbg narfbg closed this in bc11439 Dec 5, 2014
@narfbg narfbg added a commit that referenced this issue Dec 5, 2014
@narfbg narfbg Further changes related to issue #47, PR #3323
 - Removed a test that was created specifically for the 'convert programmatic characters to entities' feature.
 - Changed filter_uri() to accept by reference and to not return anything as its only purpose now is to trigger a show_error() call.
 - Added changelog messages and updated the upgrade instructions.
bfa233f
@ghost Unknown pushed a commit to goreilly/CodeIgniter that referenced this issue Jan 3, 2015
@Razican @goreilly Razican + goreilly Remove URI filter for parenthesis and dollar symbols, as talked in #47.
Signed-off-by: Razican <admin@razican.com>
69eb9d6
@ghost Unknown pushed a commit to goreilly/CodeIgniter that referenced this issue Jan 3, 2015
@narfbg @goreilly narfbg + goreilly Further changes related to issue #47, PR #3323
 - Removed a test that was created specifically for the 'convert programmatic characters to entities' feature.
 - Changed filter_uri() to accept by reference and to not return anything as its only purpose now is to trigger a show_error() call.
 - Added changelog messages and updated the upgrade instructions.
6e30282
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment