use a default encoding in Response's text property #1546
Comments
Thanks for asking this question @JerryKwan! The short and pithy answer is: because it's better to be slow and correct than fast and wrong. =) If we were concerned about speed we'd simply not have the
Having a 'default' encoding is just wild optimism, because no such default exists on the web. Saying that we'll use UTF-8 whenever we don't know what the correct encoding is means that some users will find that Requests very quickly downloads gibberish. They will then conclude that Requests, while very fast, also doesn't work properly, and they'll go and use another library. =)
EDIT: A user can also simulate this behaviour by searching for a
You would find that a default of 'utf-8' would make a shockingly high number of requests fail :)
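To make the failure mode above concrete, here is a minimal sketch using only the standard library. It assumes a server that sends Latin-1 bytes without declaring a charset; the sample string is made up for illustration and nothing here calls Requests itself.

```python
# Bytes as a Latin-1 server might send them, with no charset declared.
body = "café".encode("latin-1")

# A blanket UTF-8 default fails outright under strict decoding...
try:
    body.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ...and silently loses data when errors are replaced instead.
lossy = body.decode("utf-8", errors="replace")

# Decoding with the encoding the server actually used recovers the text.
correct = body.decode("latin-1")

print(strict_ok, repr(lossy), repr(correct))
```

The lossy result contains the Unicode replacement character in place of the accented letter, which is the "gibberish" a wrong default would hand back.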
Actually, I forgot to mention something. This is implemented so that you can provide your own default if you'd like.
>>> import requests
>>> r = requests.get('http://httpbin.org/get')
>>> r.encoding = 'utf-8'
>>> r.text
...
This will fully skip encoding detection.
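If you want that default only when the server declared nothing, one possible approach is a small helper like the sketch below. `text_with_default` and `StubResponse` are hypothetical names, not part of Requests. The stub only imitates the relevant behaviour (an `encoding` attribute that is `None` when no charset is known, and a `text` property that falls back to guessing) so the example runs offline.

```python
def text_with_default(response, default_encoding="utf-8"):
    """Return response.text, using default_encoding when none is set.

    Assumes `response` behaves like requests.Response: `encoding` is None
    when no charset is known, and `text` decodes the body with `encoding`
    if set, otherwise guesses the encoding from the content.
    """
    if response.encoding is None:
        # Setting the encoding up front means .text never has to guess.
        response.encoding = default_encoding
    return response.text


class StubResponse:
    """Hypothetical stand-in for requests.Response (illustration only)."""

    def __init__(self, content, encoding=None):
        self.content = content
        self.encoding = encoding
        self.detection_ran = False

    @property
    def text(self):
        encoding = self.encoding
        if encoding is None:
            # The real library would run charset detection here; the stub
            # only records that the slow path was taken.
            self.detection_ran = True
            encoding = "ascii"
        return self.content.decode(encoding, errors="replace")


stub = StubResponse("naïve café".encode("utf-8"))
print(text_with_default(stub))
print(stub.detection_ran)
```

With a real response object the call would look the same: pass the result of `requests.get(...)` to the helper. An encoding that is already set is left untouched, so only the undeclared case pays the price of a possibly wrong guess.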
Why not use a default encoding in Response's text property? If the server does not set the Content-Type explicitly, why not use a default encoding such as UTF-8? The chardet.detect() function is time-consuming. If someone just uses response.text and is not aware of the inner mechanism, they may think the Requests library is too slow and switch to another library.