-
Notifications
You must be signed in to change notification settings - Fork 187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing decode error #405
Fixing decode error #405
Conversation
adding a try / except statement when the body.decode fails in detecting the right encoding
fixing bare except
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial triage. Still needs further review from maintainers.
try: | ||
return body.decode(decoding, self.unicode_errors) | ||
except UnicodeDecodeError: | ||
return body.decode("UTF-8", self.unicode_errors) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This causes test coverage to fail. 100% test coverage is required in PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
am not familiar with coverage test, trying to learn. I had a look at the code in test_response.py
which seems the part of coverage addressing the function I modified.
do you have any hints on what needs to be added, in order to fix the coverage issue?
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to add a test that covers that which is not covered, specifically the except
block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@epifanio to save time, you can also run tests locally before committing and pushing to this PR by installing tox
and using tox -e py27,py36,coverage
. Let me know if you need help with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got the tox -e py27,py36,coverage
working. From the test coverage codei see some clue can come from the following lines https://github.com/Pylons/webob/blob/master/tests/test_response.py#L758-L787. — it is my understanding I should add a new test method like def test_text_get_wrong_encoding()
. To try to cover the case when a wrong encoding is passed to _text_get
method. Is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're on the right path. I would narrow it down by running a specific test in debug mode, and step through, to see which lines are executed. Saves guesswork.
As far as its name, naming is hard. Maybe incorporate the method name, e.g., text_get_body_unicode_decode_error
or something? Although longer, it gives hints to what it tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not fall back to attempting to decode with UTF-8 if the remote client has sent a charset.
IDE specific instructions should go in a personal `.gitignore` file.
@@ -18,3 +18,4 @@ WebOb.egg-info/ | |||
pytest*.xml | |||
coverage*.xml | |||
.pytest_cache/ | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't update the .gitignore
, I would ask you to please rebase your changes to avoid touching this file at all.
try: | ||
return body.decode(decoding, self.unicode_errors) | ||
except UnicodeDecodeError: | ||
return body.decode("UTF-8", self.unicode_errors) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not fall back to attempting to decode with UTF-8 if the remote client has sent a charset.
I am sorry, but I am going to reject this change. If the remote client has sent a charset, we shouldn't attempt to guess and fall back to decoding UTF-8. If the remote has not set a charset, we are correctly using the |
Fixing issue with UTF-8 strings when
passes the wrong decoding option.