New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate header and querystring with cornice schemas (fixes #873) #1021
Validate header and querystring with cornice schemas (fixes #873) #1021
Conversation
55d3d24
to
d2d96de
Compare
d2d96de
to
57398be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is excellent, very well executed (as usual)! Thank you for being so thorough!
I made a few comments, but none requires structural change I believe...
@@ -542,7 +542,8 @@ def patch(self): | |||
new_record[extra_field] = existing[extra_field] | |||
|
|||
# Adjust response according to ``Response-Behavior`` header | |||
body_behavior = self.request.headers.get('Response-Behavior', 'full') | |||
body_behavior = self.request.validated.get('header', | |||
{}).get('Response-Behavior', 'full') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When does it happen that validated
doesn't have a header
entry? See https://github.com/Cornices/cornice/blob/9b73c5ae8dfbebede6413a007ffac7fe28e76401/cornice/validators/__init__.py#L60-L62
return | ||
raise_invalid(self.request, **error_details) | ||
if if_none_match == '*': | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
@@ -918,9 +887,8 @@ def _raise_400_if_id_mismatch(self, new_id, record_id): | |||
def _extract_partial_fields(self): | |||
"""Extract the fields to do the projection from QueryString parameters. | |||
""" | |||
fields = self.request.GET.get('_fields', None) | |||
fields = self.request.validated.get('querystring', {}).get('_fields') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto: could this happen that querystring
is missing? If it's for the test, we could fix the test setUp instead I believe.
@@ -150,3 +151,71 @@ class BookmarkSchema(ResourceSchema): | |||
|
|||
def preparer(self, appstruct): | |||
return strip_whitespace(appstruct) | |||
|
|||
|
|||
class CSVQuerystring(colander.SchemaNode): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of FieldList
or StringList
instead ?
params = super(CSVQuerystring, self).deserialize(cstruct) | ||
if params is colander.drop: | ||
return params | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: superfluous else
if params is colander.drop: | ||
return params | ||
else: | ||
return params.split(',') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is the notion of preparer
in Colander that could split, and then the notion of Sequence
. In a second iteration, we can try to leverage that if possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can definitely be improved but I didn't understand how to use preparer
to do it. Thinking through it, maybe defining it as Sequence
we can do something as simple as:
class FieldList(colander.SchemaNode):
fields = colander.SchemaNode(colander.String(), missing=colander.drop)
def deserialize(self, cstruct=colander.null):
if isinstance(cstruct, six.string_types):
cstruct = cstruct.split(',')
return super(FieldList, self).deserialize(cstruct)
if isinstance(cstruct, six.string_types): | ||
try: | ||
cstruct = decode_header(cstruct) | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
except UnicodeDecodeError
?
name='Response-Behavior', | ||
validator=colander.OneOf( | ||
['full', 'light', 'diff']), | ||
missing=colander.drop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is decode_header
not necessary for this one? Maybe we could run decode_header
here before deserailize each sub-node so that we have it one place only? Or do it in Cornice instead...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess handling it on cornice is a good idea. I'm also not sure if it's still needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't remember why we had that, but if I remember well it's because of Webob (/cc @Natim)
|
||
class HeaderSchema(colander.MappingSchema): | ||
if_match = HeaderQuotedInteger(name='If-Match', missing=colander.drop) | ||
if_none_match = HeaderQuotedInteger(name='If-None-Match', missing=colander.drop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that you repeat missing=drop
, maybe we can remove it from each schema then!
record_schema = self.get_record_schema(resource_cls, method) | ||
record_schema.name = 'body' | ||
schema.add(record_schema) | ||
args['schema'] = schema | ||
else: | ||
args['schema'] = SimpleSchema() | ||
args['schema'] = RequestSchema() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One was Partial
and the other one Simple
. Is that ok to now give the same behaviour to both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I've got why StrictSchema
and SimpleSchema
were needed by get_record_schema
but I actually didn't understand why we needed PartialSchema
and SimpleSchema
on this one, aren't we just setting the request schema here (in contrary to the body schema)?
Also, this didn't break any tests and the API behavior looks ok, so I think it's ok to keep the same behavior to both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok then!
57398be
to
cf050ab
Compare
Note: I'm probably going to address #880 on this PR as well. |
Looks OK to me. Maybe I misunderstood what you were saying in our meeting earlier, but it sounded like you needed to fix the JSON thing to fix the build failures, but the build failures don't seem related to the JSON library validation per se. So what's the rationale for fixing the JSON thing in this PR? (Just curious...) |
Getting JSON Patch requests to be validated by cornice. We need this because now we are using colander for validating and deserializing other aspects of the request that are also needed on JSON Patch (sync headers, etc). Right now we only trust the external library for validation and ignore cornice validation for this content-type. https://github.com/Kinto/kinto/blob/master/kinto/core/resource/__init__.py#L101 The problem is that I just discovered colander doesn't accept JSON Arrays at the top level, so this may be a bit more tricky than it seems. Cornices/cornice#433 |
7debfd1
to
b45d0f7
Compare
b45d0f7
to
3f07d5a
Compare
…erialize all the filters (including the unknown ones) during the deserialize call
…date-header-with-cornice
if schema_values is colander.drop: | ||
return schema_values | ||
|
||
# Deserialize filters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not look very good, but think it's better to handle the filter deserialization here than leave it to the Resource._extract_filters
method.
Opinions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look bad by any means IMHO, and I really like having an explicit, formal schema, so I think this is an improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me neither, well done!
I have one remark though: the above comment could be more explicit about filters
(field filters?) to help understand what the code does. A couple of minimalist examples with boolean or list value could help for example (?deleted=true
-> {"deleted": True}
)
response_behaviour = HeaderField(colander.String(), | ||
name='Response-Behavior', | ||
validator=colander.OneOf(['full', 'light', 'diff']), | ||
missing=colander.drop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why couldn't we set missing
in HeaderField
class instead? (like FieldList)
if schema_values is colander.drop: | ||
return schema_values | ||
|
||
# Deserialize filters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me neither, well done!
I have one remark though: the above comment could be more explicit about filters
(field filters?) to help understand what the code does. A couple of minimalist examples with boolean or list value could help for example (?deleted=true
-> {"deleted": True}
)
|
||
op = colander.SchemaNode(colander.String(), | ||
validator=colander.OneOf( | ||
['test', 'add', 'remove', 'replace', 'move', 'copy'])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please move this list elsewhere (or whole validator) to improve indentation ;) #IndentationFreak
|
||
@staticmethod | ||
def schema_type(): | ||
return colander.Mapping(unknown='preserve') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we allow value
only instead ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that value don't have an specific type, so we can't set it on the schema (or at least I don't know how to do it). Maybe we could check this at deserialize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe define a new colander type that deserialize
returns ctruct
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just check its presence in deserialize maybe then
validator=colander.OneOf( | ||
['test', 'add', 'remove', 'replace', 'move', 'copy'])) | ||
path = colander.SchemaNode(colander.String()) | ||
from_ = colander.SchemaNode(colander.String(), name='from', missing=colander.drop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cherry on cake: you could have a regex to make sure those look valid :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be cool, but I'm not sure what we can validate here. I know they all have to start with /
, but I'm not sure if there's anything else to check here. Maybe if there is anything between two /
? IDK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah something like ^(/\w)+$
? .
|
||
|
||
class JsonPatchRequestSchema(RequestSchema): | ||
body = JsonPatchBodySchema() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically we don't support as many querystring values for a patch. But maybe we can keep it simple as it is, they don't harm.
Your call!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was something I was thinking about earlier... this doesn't apply only for JSON Patch.
Should we use the same schema for all requests and validate all parameters, even the ones we won't use it, or set individual request schemas for each method with only the expected params?
I think it's more explicit to have multiple request schemas, and it's better for documentation purposes, but that also means more code to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a step 1, what we have here is fine. If you struggle with that when working on #1006 then you can do a second round.
The remaining changes to do are nice to have. You can merge as soon as you feel good about the code :) GG ! |
(are you sure you added a changelog entry?) |
Fixes #873
Related to #1006
r? @