-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow Receiver to accept requests if both the hash and payload are missing #49
Conversation
424269f
to
933d9ea
Compare
wheel >= 0.29.0 | ||
twine >= 1.6.5 | ||
wheel == 0.29.0 | ||
twine >= 1.6.5, < 1.8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes were necessary to continue to support py26
- python: "3.7" | ||
env: TOXENV=py37 | ||
sudo: required | ||
dist: xenial |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hope this is ok
Hi @jcwilson, thanks for spending time on a patch. Just so I understand it better, can you give me an example of an exception you were running into and what the request looked like? I'm confused because I thought we already handle this case for GET requests. Were you encountering this for non-GET requests? |
Sure thing! Just to be sure, I personally wasn't encountering this for non-GET requests, but I don't think it's a requirement that non-GET requests have a payload. For example, I wouldn't expect DELETE to have one, and I can imagine scenarios where POST wouldn't necessarily provide a payload either. And the current implementation does handle GET requests, it's just that it imposes a constraint on the sender to provide a hash of a non-existent payload and to use and provide an irrelevant Here's a code sample that raises a
And here's the relevant debug output from the
The changes in this PR would allow the One might point out that we'd get the same behavior just by providing |
Is there anything else that I can do to help get this accepted? |
We have a similar problem. Hawkrest library relies on the default behavior of Mohawk, which expects all payload requests to be hashed, including GETs. This change would be greatly appreciated. |
I have started the work to add hawk (via mohawk) as a supported backend for falcon-auth, but I'm considering this a blocker for that work to proceed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delayed review and thanks for your patience.
Originally, I intended not to allow the behavior you're requesting because I didn't think it was necessary. A client that is passing an empty payload can simply hash an empty payload. I see how this may be inconvenient, though, so I'm willing to accept a patch for it.
However, we need to minimize the footguns here. The NodeJS library you linked to has a lot of footguns (in my opinion) which is also why I did not model the mohawk
interface on theirs. I have requested some changes to remove the footguns.
mohawk/base.py
Outdated
'(no hash in header)') | ||
check_hash = False | ||
content_hash = None | ||
elif (resource.content is EmptyValue): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EmptyValue
means that content
has not been given an explicit value. What you meant was to look for actual empty values, such as an empty string or None
.
We need to protect against someone defining a receiver object and forgetting to pass in the content
keyword.
For example, the following code contains a terrible typo that would allow content tampering, something that the developer did not intend.
request = {
'headers': {
'Authorization': sender.request_header,
'Content-Type': content_type
},
'url': url,
'method': method,
'content': 'Evil unhashed content'
}
# Whoops! The developer forgot to pass in the `content` keyword. This could have been a mistake.
receiver = Receiver(
lookup_credentials,
request['headers']['Authorization'],
request['url'],
request['method'],
content_type=request['headers']['Content-Type'])
mohawk/tests.py
Outdated
always_hash_content=False)) | ||
|
||
def test_expected_unhashed_empty_content(self): | ||
content = EmptyValue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When content
is EmptyValue
, MissingContent
should still be raised. In other words, you should keep this test but add @raises(MissingContent)
to the top. After that, add two more tests. One where content=''
and another where content=None
.
docs/usage.rst
Outdated
@@ -393,7 +393,10 @@ without a declared hash using ``accept_untrusted_content=True``: | |||
|
|||
This will skip checking the hash of ``content`` and ``content_type`` only if | |||
the ``Authorization`` header omits the ``hash`` attribute. If the ``hash`` | |||
attribute is present, it will be checked as normal. | |||
attribute is present, it will be checked as normal. For requests that have no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's document this explicitly. Say that content
can be an empty string or content
can be None
.
I'm still a little confused because I thought I already made a compromise to allow the feature you are requesting in #45 🤔 It's been a while so I'll have to look back at it in detail. |
I saw that, too, and was hoping that it would suffice, but I couldn't find a valid set of
Agreed, and I'm happy to help out there :) I think what would be helpful is to make sure my thinking aligns with your intent and if so, communicate it clearly to the user in the documentation in this PR. I may have misunderstood the intent of the I'll keep the discussion focused on the We have essentially three "states" for the Situations that might result in All The logic for determining the Then the hash validation logic becomes:
Thank you for being patient with me, too! :) |
Correct.
Yes.
If you are setting
Depending on these headers seems out of scope for
A
That seems right.
It is possible to hash an empty string so I think we can convert
Again, it's possible to hash an empty string, so, if a hash is present, compare it. Otherwise, skip it. |
Yeah, that's fair. I should have clarified that I'd expect the caller to do this work, not
My concern with this is that it's a departure from the reference JS implementation where it checks for payload content (even if it's just the empty string) and then errors if the hash is not present.
Yep, and empty (as opposed to non-existent) to me would mean a I think I'm ok with coercing |
ok, I see. Yes, I suppose it would make sense to also raise an error for this case.
Sounds good! No rush. |
…e missing It is an acceptable and expected use case that the sender not provide a content hash when no content is available. The most common example of this would be conventional GET requests. These changes omit the content hash validation if there is no content to hash, even if `accept_untrusted_content` is `False`.
Ok, I think I've wrapped this up. It turns out that once we treat I've replaced the unexpected unhashed content test with two tests: one for the Thanks for looking. |
bump :) |
Hi Josh. Thanks for all your hard work on this. I’ve been taking some
extended time off during the holidays and didn’t get a chance to look at
the PR before I left. I’ll be back at work on Jan 7th but I may have time
to take a look beforehand. I’ll make a note in my calendar to take a look
when I’m back, regardless.
…On Thu, Dec 27, 2018 at 5:37 PM Josh Wilson ***@***.***> wrote:
bump :)
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#49 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AADYZjmzgCTKDDJg2hYJrxE52kM4nNO_ks5u9VnNgaJpZM4Xa5w1>
.
|
No problem and no rush from my end. Thanks for the update and enjoy your vacation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for your patience. I requested some minor cleanup but I also wanted your feedback on whether or not we need to check content_type
. I'm not sure. Empty content is empty content.
mohawk/base.py
Outdated
# It is acceptable to not receive a hash if there is no content | ||
# to hash. | ||
log.debug('NOT calculating/verifiying payload hash ' | ||
'(no hash in header, but no content either)') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since all that matters here is that there's an empty body, I suggest this: "NOT calculating/verifying payload hash (request body is empty)"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with (no hash in header, request body is empty)
to be consistent with the one below. I can change it back to this suggestion if you wish, though.
mohawk/base.py
Outdated
# content_type values will be coerced to the empty string for | ||
# hashing purposes. | ||
log.debug('NOT calculating/verifiying payload hash ' | ||
'(no hash in header)') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make this super clear, how about adding: "(no hash in header, accept_untrusted_content=True)"
mohawk/tests.py
Outdated
# the payload when in fact there is literally no content. In this case, | ||
# mohawk depends on the presence of the content hash in the auth header | ||
# to determine how to treat the empty strings: no hash in the header | ||
# implies that no hashing is expected to occur on the server. | ||
self.receive(sender_kw=dict(content=EmptyValue, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test relies on default params and those values are important. I suggest setting them explicitly:
diff --git a/mohawk/tests.py b/mohawk/tests.py
index 7d117ed..a784882 100644
--- a/mohawk/tests.py
+++ b/mohawk/tests.py
@@ -642,7 +642,9 @@ class TestReceiver(Base):
# mohawk depends on the presence of the content hash in the auth header
# to determine how to treat the empty strings: no hash in the header
# implies that no hashing is expected to occur on the server.
- self.receive(sender_kw=dict(content=EmptyValue,
+ self.receive(content='',
+ content_type='',
+ sender_kw=dict(content=EmptyValue,
content_type=EmptyValue,
always_hash_content=False))
mohawk/tests.py
Outdated
@raises(MisComputedContentHash) | ||
def test_unexpected_unhashed_content(self): | ||
def test_expected_unhashed_empty_content(self): | ||
# The receiver will receive empty strings for content and content_type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment would read better to me if it was clearly about the test itself. I suggest:
This test sets up a scenario where the receiver will receive empty strings...
mohawk/tests.py
Outdated
self.receive(sender_kw=dict(content=EmptyValue, | ||
content_type=EmptyValue, | ||
always_hash_content=False)) | ||
|
||
def test_expected_unhashed_no_content(self): | ||
# The receiver will receive None for content and content_type and no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, this comment would read better to me with a preamble:
This test sets up a scenario where the receiver will receive None for content...
mohawk/base.py
Outdated
if not resource.content and not resource.content_type: | ||
# It is acceptable to not receive a hash if there is no content | ||
# to hash. | ||
log.debug('NOT calculating/verifiying payload hash ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: verifiying (I think it was in the old code)
mohawk/base.py
Outdated
# Allow the request, even if it has content. Missing content or | ||
# content_type values will be coerced to the empty string for | ||
# hashing purposes. | ||
log.debug('NOT calculating/verifiying payload hash ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: verifiying (again)
log.info('request unexpectedly did not hash its content') | ||
if 'hash' not in parsed_header: | ||
# The request did not hash its content. | ||
if not resource.content and not resource.content_type: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this really need to check content_type
? As far as I can tell, the Hawk reference implementation does not check it for this case.
What I mean is: this line could maybe just be:
if not resource.content:
log.debug('NOT calculating...')
If you think it does need to check content_type
then we should at least add a test case for it:
diff --git a/mohawk/tests.py b/mohawk/tests.py
index 7d117ed..8ab4249 100644
--- a/mohawk/tests.py
+++ b/mohawk/tests.py
@@ -658,6 +658,14 @@ class TestReceiver(Base):
content_type=EmptyValue,
always_hash_content=False))
+ @raises(MisComputedContentHash)
+ def test_cannot_receive_partially_empty_content(self):
+ self.receive(content=None,
+ content_type='text/plain',
+ sender_kw=dict(content=EmptyValue,
+ content_type=EmptyValue,
+ always_hash_content=False))
+
@raises(MissingContent)
def test_cannot_receive_empty_content_only(self):
content_type = 'text/plain'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the content_type
check is necessary for the cases where the server would expect a hash and the content == ''
and the content_type
is an actual non-empty string value. I've updated the tests to reflect this intent.
I think the only weird scenario now is test_expected_unhashed_no_content_with_content_type()
but handling that differently would require us to revisit the coercing decision, which I think is outside the scope of this work.
@@ -3,9 +3,9 @@ mock >= 1.0.1 | |||
nose >= 1.3.0 | |||
|
|||
# For documentation. | |||
Sphinx >= 1.2.1 | |||
Sphinx >= 1.2.1, < 1.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't get the docs to build but I was confused that the PR was green. I discovered that tox isn't triggering the html
builder because of a typo (you can't specify -b
more than once). Oops. Can you add this to your patch? It's working for me after this. I could put it on master but it will create a conflict.
diff --git a/requirements/dev.txt b/requirements/dev.txt
index 674106b..b34c439 100644
--- a/requirements/dev.txt
+++ b/requirements/dev.txt
@@ -3,6 +3,7 @@ mock >= 1.0.1
nose >= 1.3.0
# For documentation.
+docutils < 0.13.1
Sphinx >= 1.2.1, < 1.5
sphinx-rtd-theme >= 0.1.5
diff --git a/tox.ini b/tox.ini
index 7a40af1..d018a59 100644
--- a/tox.ini
+++ b/tox.ini
@@ -18,4 +18,5 @@ basepython=python2.7
changedir=docs
deps={[base]deps}
commands=
- sphinx-build -b html -b doctest -d {envtmpdir}/doctrees . {envtmpdir}/html
+ sphinx-build -b html -d {envtmpdir}/doctrees . {envtmpdir}/html
+ sphinx-build -b doctest -d {envtmpdir}/doctrees . {envtmpdir}/doctest
@@ -395,6 +395,15 @@ This will skip checking the hash of ``content`` and ``content_type`` only if | |||
the ``Authorization`` header omits the ``hash`` attribute. If the ``hash`` | |||
attribute is present, it will be checked as normal. | |||
|
|||
For requests whose ``content`` (and by extension ``content_type``) is ``None`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thanks. I expanded on it a bit to add more context. Can you add this?
diff --git a/docs/usage.rst b/docs/usage.rst
index 4704d2c..16a7339 100644
--- a/docs/usage.rst
+++ b/docs/usage.rst
@@ -395,10 +395,17 @@ This will skip checking the hash of ``content`` and ``content_type`` only if
the ``Authorization`` header omits the ``hash`` attribute. If the ``hash``
attribute is present, it will be checked as normal.
+Empty requests
+==============
+
For requests whose ``content`` (and by extension ``content_type``) is ``None``
or ``''``, it is acceptable for the sender to omit the declared hash,
regardless of the ``accept_untrusted_content`` value provided to the
-:class:`mohawk.Receiver`. If the ``hash`` attribute is present and
+:class:`mohawk.Receiver`.
+For example, a ``GET`` request typically has empty content and some
+libraries may or may not hash the content.
+
+If the ``hash`` attribute *is* present for an empty request and
``accept_untrusted_content`` is ``False``, a ``None`` value for either
``content`` or ``content_type`` will be coerced to ``''`` prior to hashing.
This is to account for some dependent libraries that may provide the empty
* Updated documentation with suggested clarifying edits * Updated logging debug message with suggested clarifying edits ** Did a little massaging for consistencies sake * Updated tests with further tests and better doc strings * Updated tox.ini with suggested edits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, thanks again. I'll get a release out shortly. I didn't realize that were other important changes to release, as well.
@raises(MisComputedContentHash) | ||
def test_unexpected_unhashed_content(self): | ||
self.receive(sender_kw=dict(content=EmptyValue, | ||
def test_expected_unhashed_empty_content_with_content_type(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this case. I'm not totally sure it's the right thing to do (the content is still empty even though there is a content type) but I'm fine with shipping like this. Someone will complain if it poses a problem.
Thanks for all the help and attention on this. I'd be glad to help address any potential future issues regarding these changes |
Awesome -- future help would be great! I just released this in 1.0.0 https://mohawk.readthedocs.io/en/latest/#changelog |
Fixes #51
It is an acceptable and expected use case that the sender not provide a content
hash when no content is available. The most common example of this would be
conventional GET requests.
These changes skip the receiver's content hash validation if there is no content to
hash, even if
accept_untrusted_content
isFalse
.See the js reference implementation here and here