Fix UnicodeEncodeError in urllib if non-ascii characters are present in Request parameters #91


romke commented Sep 30, 2011

Request.to_postdata was fixed some time ago, but Request.to_url still could not build a URL when any Request parameter contained Unicode characters; it failed with a UnicodeEncodeError.

See included test.
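
For readers who want to see the failure mode in isolation, here is a minimal sketch. It is written for Python 3, where `urlencode` defaults to UTF-8, so it passes `encoding="ascii"` to imitate the kind of implicit ASCII conversion that appears to trigger the error on Python 2. The parameter name and value are made up for illustration and are not taken from the included test.

```python
from urllib.parse import urlencode

# Hypothetical request parameter with a non-ASCII value ("café").
params = {"status": "caf\u00e9"}

# Assumption: forcing ASCII stands in for the implicit conversion on Python 2.
try:
    urlencode(params, encoding="ascii")
    failed = False
except UnicodeEncodeError:
    failed = True

# Encoding keys and values to UTF-8 bytes first avoids the implicit conversion.
utf8_params = {k.encode("utf-8"): v.encode("utf-8") for k, v in params.items()}
query = urlencode(utf8_params)
```

Here `failed` ends up true, and `query` holds the percent-encoded UTF-8 form of the parameter.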

8maki commented Oct 2, 2011

I have the same problem now.
Please pull this!

atomatt commented Oct 14, 2011

Just hit this problem too, and can confirm that romke's commit fixes things.

atomatt commented Oct 14, 2011

Also, that method seems awfully complicated and can probably be replaced with the following if you prefer:

    def to_url(self):
        """Serialize as a URL for a GET request."""
        scheme, netloc, path, query, fragment = urlparse.urlsplit(self.url.encode('utf-8'))
        query = parse_qs(query)
        for k, v in self.iteritems():
            query.setdefault(k.encode('utf-8'), []).append(to_utf8_optional_iterator(v))
        query = urllib.urlencode(query, True)
        return urlparse.urlunsplit((scheme, netloc, path, query, fragment))
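
The same idea can be sketched as a standalone function for Python 3, where `urllib.parse.urlencode` percent-encodes `str` values as UTF-8 by default. This is only an illustration under stated assumptions: `build_url` is a hypothetical helper, not part of the library, and it takes a plain dict where the method above iterates over the Request itself.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def build_url(url, params):
    """Hypothetical helper: merge params into url's existing query string."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    # parse_qs maps each key to a list of values, so repeated keys survive.
    merged = parse_qs(query, keep_blank_values=True)
    for key, value in params.items():
        # Accept either a single value or a list/tuple of values.
        values = value if isinstance(value, (list, tuple)) else [value]
        merged.setdefault(key, []).extend(str(v) for v in values)
    # doseq=True expands each list into repeated key=value pairs.
    new_query = urlencode(merged, doseq=True)
    return urlunsplit((scheme, netloc, path, new_query, fragment))


result = build_url("http://example.com/a?x=1", {"q": "caf\u00e9", "tag": ["a", "b"]})
```

Existing query parameters are kept, and new ones are appended in the order the dict yields them.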

Thank you.

Thank you romke for the patch, it works like a charm.

Someone should pull this into master; the patch has been available for over a year now.

marians commented Jul 10, 2013

Is there any update on this? I seem to run into this problem.

bxm156 commented Oct 27, 2013

Thank you!

yjlou commented Jul 13, 2015

Thanks emgee. Your fix saves my code.

joestump closed this in 7d0b8d7 Jul 29, 2015


joestump commented Jul 29, 2015

The problem your patch addresses was already half-fixed in a different PR. I modified your PR to use to_utf8() instead, but kept your test as-is, and it passes. See 7d0b8d7 for the details.

joestump added this to the 2.0 milestone Jul 29, 2015
