updated get_encoding_from_headers to return utf-8 if the content type is set to application/json
#5673
Conversation
…pe is set to application/json, following RFC 4627. fixes #5667
|
Awesome, thanks for the fast responses here and for merging! |
This comment has been minimized.
This comment has been minimized.
|
The |
This comment has been minimized.
This comment has been minimized.
|
As the contributor I don't know enough about the library to weigh in, but the reason for this PR was based on #5667. Can you take a look and see if there is another solution that will solve that issue? In short, the issue is that the I'm more than happy to create a different PR for a quick review based on what you would like. To quickly reproduce the issue in #5667, you can run the following code:
I would expect the 2nd print statement to be identical to the first -- i.e. |
This comment has been minimized.
This comment has been minimized.
|
Thanks for bringing this to our attention, @k0ssi! It sounds like this needs some more work. We may need to move the |
This comment has been minimized.
This comment has been minimized.
|
Unfortunately I dont have much time to take a closer look at the this at the moment. I'll have a look at it on the weekend and then get back to you next week. At the moment we first have to figure out what causes the issue @jjmaldonis tells. At first glance this looks to me as a encoding error within text() not json() @jjmaldonis by setting an charset within get_encoding_from_headers function which is used to build the response there is no guessing while using .json() since charset is already set. Since there is no guarantee everyone is using 'utf-8' we cannot simply assume it is. |
This PR fixes #5667 by adding a small bit of code to
get_encoding_from_headersthat returnsutf-8if the content type is set toapplication/jsonandcharsetis not set. This follows RFC 4627 (ctrl+f forshall be) which reads: "JSON text SHALL be encoded in Unicode. The default encoding is UTF-8."