Using encodings other than UTF-8 in Response #1005

Closed
barosl opened this issue Mar 19, 2014 · 6 comments

@barosl

barosl commented Mar 19, 2014

The Flask documentation states that Flask assumes the encoding of the response to be UTF-8.

the encoding for text on your website is UTF-8

From http://flask.pocoo.org/docs/unicode/

Does that mean we are discouraged from using encodings other than UTF-8 in a Flask response? I was unable to find a way to correctly change the intended encoding of either flask.wrappers.Response or werkzeug.wrappers.Response.

  1. I cannot directly pass the text to the constructor, as it calls set_data() with the UTF-8 encoding. That's because the constructor has no charset parameter, so there is no way to change its behavior. So I have to create the response object with no constructor arguments, then assign the desired encoding to response.charset, and call response.set_data().
  2. But still, as content_type is determined in the constructor, it will still be "text/html; charset=UTF-8" because the charset attribute is always 'utf-8' during the object creation process. So I'm forced to pass content_type to the constructor, which is kinda confusing because my original intention was just changing the encoding, rather than explicitly setting the Content-Type.

Do I understand the process accurately?

If I'm right, I suggest:

  1. Allow passing charset to the Response class.
  2. Or, the content_type attribute should be updated again when the user manually sets the charset attribute.
@barosl barosl changed the title Encodings other than UTF-8 Using encodings other than UTF-8 in Response Mar 19, 2014
@ThiefMaster
Member

Why would you want any other encoding for text?

@barosl
Author

barosl commented Mar 19, 2014

@ThiefMaster My original intention was to not emit the charset in the header at all, because we have many legacy documents written in encodings other than UTF-8. So until we convert them to a unified format, I wanted to let the client detect the encoding by itself.

@ThiefMaster
Member

I think converting them is a better idea. Letting the client choose the charset is a bad idea - chances are good it'll get it wrong and show it as gibberish. I guess all of your documents have the same charset? If yes it shouldn't be too hard to convert them!

@barosl
Author

barosl commented Mar 19, 2014

@ThiefMaster Currently at least two encodings (cp949 and cp932) are in use. They are so similar that I cannot write an automated converter, because text in one encoding does not raise UnicodeDecodeError when decoded with the other... The only way to determine the encoding is to use chardet, which is not a 100% solution.
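
The ambiguity described above can be reproduced with the standard library alone. This is a minimal sketch: the sample string is an arbitrary choice for illustration, not taken from the legacy documents in question.

```python
# Korean sample text, chosen only for illustration.
original = "한글"

# Bytes as a legacy cp949 document would store them.
raw = original.encode("cp949")

# Decoding with the wrong codec still succeeds: every byte of this sample
# falls in the range that cp932 treats as single-byte half-width katakana.
misread = raw.decode("cp932")

# Decoding with the right codec round-trips.
correct = raw.decode("cp949")

print(repr(misread), repr(correct))
```

Because no exception is raised, a try/except fallback chain cannot tell the two encodings apart for input like this; only a statistical guess or out-of-band metadata can.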

@remram44

remram44 commented Aug 2, 2014

To summarize:

  • You can set or remove the charset by returning Response(b'data', content_type='text/html; charset=whatever') (but you have to mention the mimetype)
  • You can set the charset by subclassing Response and setting the 'charset' attribute to something else (which will be used for all text/* or xml mimetypes) (but get_content_type() won't accept None).
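
The first option above might look roughly like the following sketch. It assumes Werkzeug is installed; the sample text and the choice of cp949 are illustrative assumptions. The body is encoded by hand so the wrapper never applies its UTF-8 default, and the full Content-Type is passed explicitly.

```python
from werkzeug.wrappers import Response

text = "한글 문서"         # sample legacy text, illustrative only
legacy_charset = "cp949"   # assumed encoding of the stored document

# Encode the body ourselves so the wrapper does not encode it as UTF-8.
body = text.encode(legacy_charset)

# Announce the charset explicitly; the mimetype has to be spelled out too.
declared = Response(body, content_type=f"text/html; charset={legacy_charset}")

# Leave the charset parameter out so the client has to guess the encoding.
undeclared = Response(body, content_type="text/html")

print(declared.headers["Content-Type"])
print(undeclared.headers["Content-Type"])
```

The second option, subclassing and overriding the charset attribute, depends on behavior of the Werkzeug releases of that time. Later releases are understood to have deprecated that attribute, so it is worth checking the installed version before relying on it.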

Maybe add a check for self.charset is None before calling get_content_type(mimetype, self.charset) (in werkzeug)? Optionally, accept the charset as a constructor parameter as well.
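
The idea of a None check can be illustrated with a standalone helper. This is a hypothetical sketch, not Werkzeug's actual get_content_type(); the name build_content_type and the rule for which mimetypes count as textual are assumptions made for the example.

```python
from typing import Optional


def build_content_type(mimetype: str, charset: Optional[str]) -> str:
    """Hypothetical helper: append a charset parameter only when one is set."""
    # Assumed rule: only text/* and XML mimetypes carry a charset parameter.
    is_textual = (
        mimetype.startswith("text/")
        or mimetype == "application/xml"
        or mimetype.endswith("+xml")
    )
    if charset is None or not is_textual:
        # No charset requested (or not applicable): emit the bare mimetype.
        return mimetype
    return f"{mimetype}; charset={charset}"


print(build_content_type("text/html", "cp949"))
print(build_content_type("text/html", None))
```

With a check like this, setting the charset to None would be a way to omit the parameter without having to restate the mimetype.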

(pushed these to override-response-charset)

@davidism
Member

davidism commented Apr 8, 2017

Going to close this in favor of the options in the previous comment.

@davidism davidism closed this as completed Apr 8, 2017
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 14, 2020