Umlaut in URL causes error (permitted_uri_chars are set) #2060

tobysommer opened this Issue Dec 7, 2012 · 5 comments



I added some characters and umlauts to the permitted_uri_chars config var:

$config['permitted_uri_chars'] = 'a-z 0-9~%.:_\-äöüÄÖÜß';

Unfortunately, this doesn't work: navigating to any URL that contains an umlaut results in the following error:

The URI you submitted has disallowed characters.

If I leave the permitted_uri_chars config var empty, everything works.

PS: The environment is IIS 7.5 on Windows Server 2008.
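
One possible explanation for the error above is an encoding mismatch between the config value and the incoming URI. The sketch below is only an illustration: `uri_chars_allowed()` is a hypothetical helper, not CodeIgniter code, and it assumes PHP 7+ and that the permitted characters are stored as UTF-8 while the request arrives as ISO-8859-1. Without the `u` modifier, PCRE compares single bytes, so the ISO-8859-1 byte for "ä" is not found in a class built from the UTF-8 bytes.

```php
<?php
// Hypothetical helper that mirrors the shape of the check in _filter_uri().
function uri_chars_allowed(string $permitted, string $uri): bool
{
    // No "u" modifier, so the character class is matched byte by byte.
    return preg_match('|^['.$permitted.']+$|', $uri) === 1;
}

// Byte escapes keep the example independent of this file's own encoding.
$permitted = "a-z0-9\xC3\xA4"; // "ä" stored as UTF-8 (bytes C3 A4)
$latin1    = "b\xE4r";         // "bär" as ISO-8859-1 (single byte E4)
$utf8      = "b\xC3\xA4r";     // "bär" as UTF-8

var_dump(uri_chars_allowed($permitted, $latin1)); // bool(false)
var_dump(uri_chars_allowed($permitted, $utf8));   // bool(true)
```

If this is what happens on the IIS setup, leaving permitted_uri_chars empty works simply because the check is skipped entirely.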


OK, the following changes made it work:

In system/core/URI.php, change _filter_uri() to:

public function _filter_uri($str)
{
    if ($str !== '' && $this->config->item('permitted_uri_chars') != '' && $this->config->item('enable_query_strings') === FALSE)
    {
        // preg_quote() in PHP 5.3 escapes -, so the str_replace() and addition of - to preg_quote() is to maintain backwards
        // compatibility as many are unaware of how characters in the permitted_uri_chars will be parsed as a regex pattern
        if ( ! preg_match('|^['.str_replace(array('\\-', '\-'), '-', preg_quote($this->config->item('permitted_uri_chars'), '-')).']+$|i', utf8_encode($str)))
        {
            show_error('The URI you submitted has disallowed characters.', 400);
        }
    }

    // Convert programmatic characters to entities and return
    return str_replace(
                array('$',     '(',     ')',     '%28',   '%29'), // Bad
                array('&#36;', '&#40;', '&#41;', '&#40;', '&#41;'), // Good
                $str); // the post is cut off here; this closing follows stock CodeIgniter
}

I added the two utf8_encode()s in there.
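
A caveat on this approach: utf8_encode() re-encodes unconditionally, so a URI that already arrives as UTF-8 would be double-encoded, and the function is deprecated as of PHP 8.2. A minimal sketch of a more defensive variant follows. `to_utf8()` is a hypothetical helper, not part of CodeIgniter, and it assumes the mbstring extension is loaded and that any non-UTF-8 input is ISO-8859-1.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the input is not already valid UTF-8.
function to_utf8(string $str): string
{
    if (mb_check_encoding($str, 'UTF-8')) {
        return $str; // already valid UTF-8, leave untouched
    }

    // Assumption: anything that is not valid UTF-8 is ISO-8859-1.
    return mb_convert_encoding($str, 'UTF-8', 'ISO-8859-1');
}

$from_latin1 = to_utf8("b\xE4r");     // ISO-8859-1 input gets converted
$from_utf8   = to_utf8("b\xC3\xA4r"); // UTF-8 input passes through unchanged

var_dump($from_latin1 === $from_utf8); // bool(true)
```

The validity check is a heuristic: a few ISO-8859-1 byte sequences happen to be valid UTF-8 as well, so this is not a substitute for knowing which encoding the server actually delivers.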

tobysommer closed this Dec 7, 2012

Any chance this will get into the core?

tobysommer reopened this Dec 8, 2012

I hope not :). "utf8_encode — Encodes an ISO-8859-1 string to UTF-8". PHP source files should not be assumed to be ISO-8859-1. That would penalize everyone who uses the correct character encoding for their files.

There's a charset config option, but it's only supposed to be a default for function parameters, which doesn't apply here.

If you're really stuck with ISO-8859-1 for your source files, I think you need to apply the utf8_encode() yourself.
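
As a rough sketch of what "apply it yourself" could look like, the conversion can live in the config file rather than in the core. This assumes application/config/config.php is the file in question, that it is saved as ISO-8859-1, and that the mbstring extension is available. The `$latin1_extras` variable is made up for the illustration, and the byte escapes stand in for the umlauts typed directly into such a file.

```php
<?php
// Sketch for a config file saved as ISO-8859-1 (an assumption, see above).
// "\xE4\xF6\xFC\xC4\xD6\xDC\xDF" is "äöüÄÖÜß" written as ISO-8859-1 bytes.
$latin1_extras = "\xE4\xF6\xFC\xC4\xD6\xDC\xDF";

// Convert only the non-ASCII part to UTF-8 and append it to the ASCII characters.
$config['permitted_uri_chars'] = 'a-z 0-9~%.:_\-'
    .mb_convert_encoding($latin1_extras, 'UTF-8', 'ISO-8859-1');
```

This keeps the workaround local to one installation and leaves system/core/URI.php unmodified.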


I haven't tested with your particular case, but this is as far as it would go: 58ae971

narfbg closed this Dec 10, 2012

👍 Thanks
