JSON renderer uses invalid Content-Type #1611

Closed
dstufft opened this Issue Mar 17, 2015 · 5 comments


@dstufft
Contributor
dstufft commented Mar 17, 2015

When using the JSON renderer, Pyramid adds this header: Content-Type: application/json; charset=UTF-8. However, according to IANA, the application/json media type does not actually support a charset parameter. It is always UTF-8; there is no other valid encoding.

@mmerickel
Member

Note: No "charset" parameter is defined for this registration.
Adding one really has no effect on compliant recipients.

I guess I don't see what the issue is. The JSON RFC defines the content as UTF-8 by default but says that other encodings are also valid. We are just being cautious and, according to your link, still compliant, no?

@dstufft
Contributor
dstufft commented Mar 17, 2015

It's not a major issue, but no, it's not compliant. Any compliant parser will ignore the value because the application/json media type does not define any parameters (required or optional). It's as meaningless as putting application/json; frob=lob.
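To illustrate the point about parameters, here is a minimal Python sketch that splits a Content-Type value into its media type and parameters using only the standard library, then drops any parameter the media type does not register. The `REGISTERED_PARAMS` table and both function names are hypothetical, made up for this example; they are not part of Pyramid or WebOb.

```python
from email.message import Message

# Assumption for illustration: the parameters each media type registers.
# application/json registers none, per the IANA text quoted in this thread.
REGISTERED_PARAMS = {"application/json": set()}


def split_content_type(value):
    """Return (media_type, params) parsed from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = value
    params = msg.get_params()  # first entry is the media type itself
    media_type = params[0][0].lower()
    return media_type, {key.lower(): val for key, val in params[1:]}


def meaningful_params(value):
    """Keep only the parameters the media type is known to define."""
    media_type, params = split_content_type(value)
    if media_type not in REGISTERED_PARAMS:
        return params  # unknown type: make no claim, keep everything
    allowed = REGISTERED_PARAMS[media_type]
    return {key: val for key, val in params.items() if key in allowed}
```

Under these assumptions, `charset=UTF-8` and `frob=lob` are treated identically for application/json: both are parsed, and both are discarded.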

The only valid encodings for JSON are UTF-8, UTF-16, and UTF-32. The way a compliant JSON parser determines the encoding is by looking at the first four octets:

Encoding

JSON text SHALL be encoded in Unicode. The default encoding is
UTF-8.

Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
at the pattern of nulls in the first four octets.

      00 00 00 xx  UTF-32BE
      00 xx 00 xx  UTF-16BE
      xx 00 00 00  UTF-32LE
      xx 00 xx 00  UTF-16LE
      xx xx xx xx  UTF-8

and

IANA Considerations

The MIME media type for JSON text is application/json.

Type name: application

Subtype name: json

Required parameters: n/a

Optional parameters: n/a

Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32

JSON may be represented using UTF-8, UTF-16, or UTF-32. When JSON
is written in UTF-8, JSON is 8bit compatible. When JSON is
written in UTF-16 or UTF-32, the binary content-transfer-encoding
must be used.

FWIW I'm OK if this is wontfixed too. I just noticed that it was generating a meaningless (to a compliant parser) value and figured I'd open an issue in case y'all cared.
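The null-pattern detection quoted from the RFC above can be sketched in a few lines of Python. This is only an illustration of that table, not code from Pyramid or WebOb; the function names are hypothetical, and falling back to UTF-8 for inputs shorter than four octets is an assumption of this sketch.

```python
import json


def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding from the pattern of nulls in the first four octets."""
    head = data[:4]
    if len(head) < 4:
        # Assumption: too short to apply the table, so use the default.
        return "utf-8"
    nulls = tuple(octet == 0 for octet in head)
    table = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    # Any other pattern (including no nulls at all) is treated as UTF-8.
    return table.get(nulls, "utf-8")


def loads_detecting_encoding(data: bytes):
    """Decode with the detected encoding, then parse the JSON text."""
    return json.loads(data.decode(detect_json_encoding(data)))
```

Note that nothing here consults a charset parameter: the encoding is recovered from the body alone, which is why the parameter carries no information for a recipient that follows the RFC.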

@mmerickel
Member

Someone disagrees with you. The relevant code[1] is in webob (where this issue belongs). Webob is explicitly adding a charset to application/json. A simple fix if webob stays the same is to delete the charset after setting the content type.

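from webob import Response

# Option 1: pass the content type to the constructor, then drop the charset.
resp = Response(content_type='application/json')
del resp.charset

# Option 2: set the content type on an existing response, then drop the charset.
resp = Response()
resp.content_type = 'application/json'
del resp.charset

# Option 3: set the header directly.
resp = Response()
resp.headers['Content-Type'] = 'application/json'
# it looks like you can trick webob since it doesn't monitor headers for changes.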

[1] https://github.com/Pylons/webob/blob/master/webob/response.py#L124

@nnja
Contributor
nnja commented Apr 14, 2015

punt to webob

@mcdonc
Member
mcdonc commented Jun 6, 2015

Closing in this tracker, as it's now being tracked in WebOb.

@mcdonc mcdonc closed this Jun 6, 2015