New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: content_type/charset handling #261

Merged
merged 40 commits into from Jul 30, 2016

Conversation

Projects
None yet
5 participants
@bertjwregeer
Copy link
Member

bertjwregeer commented Jul 9, 2016

  • Response.content_type removes charset unless the new content_type is a text like type that has a charset parameter
  • Reverted change that was made in #253 whereby Response.charset would not allow the user to set a charset if the content type was JSON. I am not a big fan of trying to stop the user from shooting themselves in the foot, and would rather let them add/remove as needed.
  • json.dumps/loads are now always UTF-8. Python returns a string to us, and we can encode it as we see fit.
  • Response.__init__ has been cleaned up to remove a lot of extraneous branch conditions that were only complicating the logic.
@bertjwregeer

This comment has been minimized.

Copy link
Member

bertjwregeer commented Jul 9, 2016

Closes #130 and #237

@bertjwregeer

This comment has been minimized.

Copy link
Member

bertjwregeer commented Jul 9, 2016

Closes #236

@bertjwregeer

This comment has been minimized.

Copy link
Member

bertjwregeer commented Jul 10, 2016

This change breaks Pyramid because it treats JSON as text, and attempts to get/set .text on the Response object.

@cdent

This comment has been minimized.

Copy link

cdent commented Jul 11, 2016

I reckon a common way in which this change may surprise people is if they have tests which were written to expect (the incorrect) application/json; charset=UTF-8. With these changes they'll now get (the correct) application/json. Since such tests are wrong™ I reckon that's okay.

@bertjwregeer

This comment has been minimized.

Copy link
Member

bertjwregeer commented Jul 11, 2016

Any changes made in this area were going to have this issue, I'd argue that I'd rather have WebOb do the right thing rather than continue sending an invalid content-type.

@bertjwregeer

This comment has been minimized.

Copy link
Member

bertjwregeer commented Jul 12, 2016

@cdent Wanted to add a little more of a note: in WebOb 1.6 there's already been some work to remove charset=UTF-8 from certain responses that accidentally contained it, so I don't think it will be that big of a surprise to people.

CHANGES.txt Outdated

# Will raise
try:
print(res.text)

This comment has been minimized.

@bertjwregeer
@mmerickel

This comment has been minimized.

Copy link
Member

mmerickel commented Jul 17, 2016

Is there any planned deprecation warnings for this feature? Or just ripping the bandaid off?

@bertjwregeer

This comment has been minimized.

Copy link
Member

bertjwregeer commented Jul 17, 2016

@mmerickel What part of it? The content-type/charset stuff is fixing the remainder of the items that should have been fixed in 1.6 but were inadequately completed.

The default_encoding is something I am not happy with, and is a change I am no longer planning to make.

rylz and others added some commits Jun 2, 2016

response: improve charset defaults
Refactored the logic in Response.__init__ to handle default charset more
consistently.

Added logic in the charset setter that ignores attempts to set it on
JSON content types.

Removed explicit charset specification from exceptions since this is
handled correctly for the text types within Response.

Fixed some affected tests, and added assertions for content types in
exceptions.

Addresses #237
Document why we are doing what we are doing
This way the next time I come across this I don't have to spend 20
minutes trying to figure out why status_code wasn't being used.
Set content_type in headers only if we have one
It is entirely possible that if there is no content_type passed by the
user, no headerlist that contains a content type and there is no default
content type that we don't have a content_type.

In that case we want to make sure we don't set anything in the header
dictionary since None is not a valid header to return in a Response.
Clean up Response.__init__
Overhaul a lot of the logic in Response.__init__ to be smaller, and
remove a lot of extra logic checking that didn't belong in __init__ but
instead in the properties themselves.
Remove _is_json and add _content_type_has_charset
We no longer want to specialise application/json, instead attempt to do
the right thing as best as possible all the time.
Allow charset on all content_types
Don't try to stop the user from shooting themselves in the foot, if the
user explicitly asks for a charset, even on a content type that may not
support it as a paramater who are we tell them they can't do that.
Flip assert around
Makes pytest output more sane

bertjwregeer added some commits Jul 17, 2016

Remove body_encoding
Just encode the json.dumps as UTF-8, then raise an error if the user
passes a text type. This basically undoes a bunch of changes where we
were attempting to stop the user from shooting their own foot off. Let
them shoot.
Documentation for Response()
The docs fairy came to town!

@bertjwregeer bertjwregeer force-pushed the fix/charset_handling branch from 774148a to 141d14e Jul 17, 2016


if 'charset' in params:
if not _content_type_has_charset(value):
warn_deprecation(

This comment has been minimized.

@bertjwregeer

bertjwregeer Jul 17, 2016

Member

This should maybe be turned into a RuntimeWarning instead.

@bertjwregeer

This comment has been minimized.

Copy link
Member

bertjwregeer commented Jul 17, 2016

The only backwards incompatible change is https://github.com/Pylons/webob/pull/261/files#diff-84ba6a58e29e169ddf6578bceca65dc9R768. Whereby if you currently have a content-type with a charset, and you then you replace it with one that doesn't, it is explicitly removed, but the rest of the parameters stick around.

I don't see a way to gracefully deprecate the removal of charset, since that is the crux of the issue. For the other parameters it can continue as is for now, and full removal of all of that can be done at a later point in time.

bertjwregeer and others added some commits Jul 17, 2016

improve presentation of class variables
- make sub-classing notes an HTML list
- fix grammar and punctuation
- rewrap as needed
@bertjwregeer

This comment has been minimized.

Copy link
Member

bertjwregeer commented Jul 18, 2016

Thanks @stevepiercy. I need to make a couple minor changes to this, and then this will be ready to go.

I forgot to document app_iter for the constructor so I need to do that.

bertjwregeer added some commits Jul 30, 2016

Fix content_type bug if content_type was empty
When the new Content-Type value did not contain any semicolons we
attempted to save any existing parameters from the existing Content-Type
headers. If that header was empty, it would do nothing, and no
Content-Type would be set at all.

The new branch explicitly sets the Content-Type to the value if we don't
find any paramters to save.
Change the logic for the content_type setting
We already state that if the headerlist contains a Content-Type, then we
won't accept the passed in value, nor will we set it to a default
content type. There is no reason to pull the value out of the header
because if it exists we won't set the content_type anyway.

We have also have all of these fantastic properties, why are we changing
the underlying headers list directly when we can just use the property
and have it do it's appropriate magic.
Don't set Content-Type if headerlist is provided
If the caller provides the headerlist, we shouldn't try to be too smart
and add in a default Content-Type. This is used by
Request.get_response() for example to create a Response object that
matches the Response received from a WSGI application.

We do still add a default charset even if there is none provided on
Content-Types that are known to allow for a charset (basically texty
responses). This way Response.text will function without additional
work.
Don't set charset if already set
The charset provided to the constructor should not be used if a charset
is already set on the Content-Type.

@bertjwregeer bertjwregeer merged commit e632139 into master Jul 30, 2016

0 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
continuous-integration/travis-ci/push The Travis CI build is in progress
Details

@bertjwregeer bertjwregeer deleted the fix/charset_handling branch Jul 30, 2016

@pyup-bot pyup-bot referenced this pull request Nov 3, 2017

Closed

Update webob to 1.7.3 #434

@pyup-bot pyup-bot referenced this pull request Jan 26, 2018

Closed

Update webob to 1.7.4 #407

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment