Remove charset from WebOb JSON exceptions #237

Closed
bertjwregeer opened this Issue Mar 24, 2016 · 6 comments

Projects

None yet

2 participants

@bertjwregeer
Member

Related to #236

We need to make sure to remove the charset from any WebOb exceptions that return JSON.

@bertjwregeer bertjwregeer added this to the 1.6.1 milestone Mar 24, 2016
@mmerickel
Member

It's perfectly valid to have a charset on a json response, including utf-8. What's the goal here?

@bertjwregeer
Member

@mmerickel We just removed the charset from WebOb's responses for JSON (to fix this: Pylons/pyramid#1611), now we added JSON to the exceptions, and the exceptions are still sending a charset=utf-8. I feel that we should at least be consistent. Either we send the charset for JSON or we don't.

@mmerickel
Member

As I commented on that previous issue I still don't read the spec the way @dstufft was reading it. The spec does allow alternate encodings and while the default is utf-8 where does it say you can't specify utf-8?

@bertjwregeer
Member

http://www.iana.org/assignments/media-types/application/json

There are no optional/required parameters for application/json. charset=utf-8 should thus get ignored by UA's. Setting it is a no-op because application/json means binary encoding ("Encoding considerations: binary"), the charset doesn't get taken into account.

content-type is defined as being a media type followed by any required/optional parameters. Since application/json has neither (required/optional paramters), sending them while "allowed" is IMHO not best practice.

What spec allows alternate encodings? The JSON spec allows UTF based encodings, and knows which it is based upon the first 4 bytes of the JSON response, but this is based upon parsing the JSON content itself, and not based upon the charset sent by the remote server.

@mmerickel
Member

Okay so a compliant parser should ignore the options then, right? As per the note in that link. Thanks for linking to the media type, I wasn't aware it didn't have any optional parameters. Regardless, I guess the fact that webob has a charset is a little odd since it only applies to some media types.

@bertjwregeer
Member

Yes, a compliant parser will ignore it.

Currently the charset is automatically set on a Response when the user provides a content type of text/html. Since this also happens to be the default_content_type, the charset is always set to UTF-8.

Now when a user sets the content_type on a Response object using resp.content_type = 'application/json' the existing charset is NOT removed, and instead is carried over. This behaviour is what leads to the issue in the exceptions, a Response object is created, the default content type is set, thus a charset is set, than the content type is overridden, which doesn't remove the charset.

The default_content_type continues to be an issue, and it is difficult to solve, and I am seriously considering removing it for a 2.0 WebOb release.

I should be able to create a Response object without having to override multiple defaults before modifying it to be "my own".

@bertjwregeer bertjwregeer modified the milestone: 1.6.2, 1.6.1 May 20, 2016
@rylz rylz added a commit to rylz/webob that referenced this issue Jun 3, 2016
@rylz rylz response: improve charset defaults
Refactored the logic in Response.__init__ to handle default charset more
consistently.

Added logic in the charset setter that ignores attempts to set it on
JSON content types.

Removed explicit charset specification from exceptions since this is
handled correctly for the text types within Response.

Fixed some affected tests, and added assertions for content types in
exceptions.

Addresses Pylons#237
a8fbdee
@rylz rylz added a commit to rylz/webob that referenced this issue Jun 3, 2016
@rylz rylz response: improve charset defaults
Refactored the logic in Response.__init__ to handle default charset more
consistently.

Added logic in the charset setter that ignores attempts to set it on
JSON content types.

Removed explicit charset specification from exceptions since this is
handled correctly for the text types within Response.

Fixed some affected tests, and added assertions for content types in
exceptions.

Addresses Pylons#237
b4ebe2d
@bertjwregeer bertjwregeer modified the milestone: 1.7.0, 1.6.2 Jul 9, 2016
@bertjwregeer bertjwregeer added a commit that referenced this issue Jul 17, 2016
@rylz @bertjwregeer rylz + bertjwregeer response: improve charset defaults
Refactored the logic in Response.__init__ to handle default charset more
consistently.

Added logic in the charset setter that ignores attempts to set it on
JSON content types.

Removed explicit charset specification from exceptions since this is
handled correctly for the text types within Response.

Fixed some affected tests, and added assertions for content types in
exceptions.

Addresses #237
e810667
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment