Reinstate falling back to self.text for JSON responses

A JSON response with no encoding specified is decoded with a detected UTF codec (compliant with the JSON RFC). If that decoding fails, we guessed wrong and need to fall back to charade character detection (via `self.text`). Kenneth removed this functionality (by accident?) in 1451ba0; this commit reinstates it and adds a log warning.

Fixes #1674
1 parent 0b68037, commit 5ee8b348ebab9a7c427a87355dd089c83ee74be9, mjpieters committed Feb 3, 2014
Showing with 11 additions and 2 deletions.
  1. +11 −2 requests/
@@ -725,11 +725,20 @@ def json(self, **kwargs):
if not self.encoding and len(self.content) > 3:
# No encoding set. JSON RFC 4627 section 3 states we should expect
# UTF-8, -16 or -32. Detect which one to use; If the detection or
- # decoding fails, fall back to `self.text` (using chardet to make
+ # decoding fails, fall back to `self.text` (using charade to make
# a best guess).
encoding = guess_json_utf(self.content)
if encoding is not None:
- return json.loads(self.content.decode(encoding), **kwargs)
+ try:
+ return json.loads(self.content.decode(encoding), **kwargs)
+ except UnicodeDecodeError:
+ # Wrong UTF codec detected; usually because it's not UTF-8
+ # but some other 8-bit codec. This is an RFC violation,
+ # and the server didn't bother to tell us what codec *was*
+ # used.
+ pass
+ log.warn('No encoding specified for JSON response, and no '
+ 'UTF codec detected. Falling back to charade best guess.')
return json.loads(self.text, **kwargs)
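The control flow this change restores can be sketched outside of `requests` as follows. This is only an illustration under stated assumptions, not the library's code: `sketch_guess_json_utf` is a hypothetical, simplified stand-in for `guess_json_utf`, `loads_with_fallback` is a hypothetical helper, and the `fallback_codec` parameter stands in for the charade detection that `self.text` performs in the real method.

```python
import codecs
import json
import logging

log = logging.getLogger(__name__)


def sketch_guess_json_utf(data):
    """Hypothetical, simplified stand-in for guess_json_utf.

    RFC 4627 section 3: the first two characters of a JSON text are
    ASCII, so the pattern of null bytes in the first four octets
    reveals which UTF encoding was used.
    """
    sample = data[:4]
    # Check UTF-32 BOMs first: the UTF-32-LE BOM starts with the UTF-16-LE BOM.
    if sample.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return 'utf-32'
    if sample.startswith(codecs.BOM_UTF8):
        return 'utf-8-sig'
    if sample.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return 'utf-16'
    nulls = sample.count(b'\x00')
    if nulls == 0:
        return 'utf-8'
    if nulls == 2:
        if sample[::2] == b'\x00\x00':   # nulls in 1st and 3rd octets
            return 'utf-16-be'
        if sample[1::2] == b'\x00\x00':  # nulls in 2nd and 4th octets
            return 'utf-16-le'
    if nulls == 3:
        if sample[:3] == b'\x00\x00\x00':
            return 'utf-32-be'
        if sample[1:] == b'\x00\x00\x00':
            return 'utf-32-le'
    return None


def loads_with_fallback(content, fallback_codec='latin-1', **kwargs):
    """Hypothetical helper mirroring the try/except added in this commit.

    fallback_codec is an assumption made for this sketch; the real
    method falls back to self.text, which relies on charade detection.
    """
    encoding = sketch_guess_json_utf(content)
    if encoding is not None:
        try:
            return json.loads(content.decode(encoding), **kwargs)
        except UnicodeDecodeError:
            # Wrong UTF codec detected: the body is probably in some
            # other 8-bit codec that the server did not declare.
            pass
    log.warning('No usable UTF codec detected for JSON body; '
                'falling back to %s.', fallback_codec)
    return json.loads(content.decode(fallback_codec), **kwargs)


# A Latin-1 body contains no null bytes, so it is guessed as UTF-8,
# fails to decode, and is then handled by the fallback path.
latin1_body = '{"name": "caf\u00e9"}'.encode('latin-1')
print(loads_with_fallback(latin1_body))
```

The sketch calls `log.warning` rather than `log.warn`, because `warn` is a deprecated alias in current Python.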
