Better handling of strange URLs (#31)

Hello,

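I am using Django with URLObject, and I am hitting a `UnicodeDecodeError` when URLObject parses some invalid URLs (coming from search engines). In the example below the query string contains `%E8`, which is `è` in Latin-1 but is not valid UTF-8 on its own.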

``` python
>>> from urlobject.urlobject import QueryString
>>> qs = QueryString(u's=glaci%E8re')
>>> qs.list
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/Eric/Python/Env/lpdc/lib/python2.7/site-packages/urlobject/query_string.py", line 35, in list
    value = qs_decode(value)
  File "/Users/Eric/Python/Env/lpdc/lib/python2.7/site-packages/urlobject/query_string.py", line 138, in _qs_decode_py2
    return urllib.unquote_plus(s).decode('utf-8')
  File "/Users/Eric/Python/Env/lpdc/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe8 in position 5: invalid continuation byte
```

A partial solution would be:

``` python
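import urllib  # Python 2: unquote_plus lives in the top-level urllib module

def _qs_decode_py2(s):
    """Unquote unicode or str using query string rules."""
    if isinstance(s, unicode):
        s = s.encode('utf-8')
    # 'replace' maps undecodable bytes (e.g. 0xe8) to U+FFFD instead of raising.
    # Passed positionally because decode() only accepts keywords from Python 2.7.
    return urllib.unquote_plus(s).decode('utf-8', 'replace')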
```

I don't know what the equivalent fix would be for Python 3.

For reference, Django also uses the `replace` error handler when decoding the query string.

