Keep a reference to the current document, 'rename' and 'remove_unknown' options #113

misja · 2015-06-16T09:39:16Z

This relates to #95, where no reference is kept to the document being validated, so fields cannot be changed or removed. This commit maintains references and adds a current property to the validator that points to the current document and makes it available to custom validators.

funkyfuture · 2015-06-16T10:25:43Z

cerberus/cerberus.py

@@ -701,6 +722,13 @@ def _validate_allof(self, allof, field, value):
        if not valid:
            self._error(field, errorstack)

+    def _validate_rename(self, rename, field, value):
+        if isinstance(rename, _str_type):


this should be caught upon schema-validation.

I think schema validation will check the field value type only; the rename field is just a configuration option, which should be a string since its value will replace the original field name in the document.

i don't get your point.
what hinders you from checking if the provided value in the schema is a string in validate_schema?

You're right, I'll add the check there and will change the type to collections.Hashable.
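For illustration, a minimal sketch of what such a schema-time check could look like. The names check_rename_rules and SchemaError are hypothetical and are not taken from the Cerberus code base; the only assumption carried over from the thread is that a rename target must be hashable, since it ends up as a dictionary key.

```python
from collections.abc import Hashable


class SchemaError(ValueError):
    """Hypothetical error type for an invalid schema definition."""


def check_rename_rules(schema):
    """Reject any 'rename' target that cannot be used as a dict key."""
    for field, definition in schema.items():
        if 'rename' not in definition:
            continue
        target = definition['rename']
        # Lists, dicts and sets are not Hashable and are rejected here,
        # before any document is validated.
        if not isinstance(target, Hashable):
            raise SchemaError(
                "'rename' for field %r must be hashable, got %s"
                % (field, type(target).__name__))


check_rename_rules({'bar': {'rename': 'foo'}})  # passes silently
```

Checking once at schema-validation time keeps the per-field rename code free of type checks.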

funkyfuture · 2015-06-16T10:32:38Z

there should be a remove_unknown-rule, similar to allow_unknown.

funkyfuture · 2015-06-16T10:44:43Z

cerberus/cerberus.py

@@ -298,6 +316,9 @@ def _validate_definition(self, definition, field, value):
            if validator:
                validator(definition[rule], field, value)

+        if 'rename' in definition:


shouldn't this happen before further validation like coercion?

I've purposely added it last since a rename will change a document field which could break other (custom/user) validators looking for the original field name.

remove_unknown is included, see https://github.com/misja/cerberus/blob/references/cerberus/cerberus.py#L134

i mean the possibility to override this option via a rule that can be placed in a schema definition. since allow_unknown (as a Validator property) is inherited by derived validators, there is also the allow_unknown constraint in order to override it. please see the documentation about it.

I've purposely added it last since a rename will change a document field which could break other (custom/user) validators looking for the original field name.

intuitively i'd expect that first a normalization is applied and the result is then validated.

Makes sense, I'll put the rename first. As for adding a remove_unknown rule to the schema, will do.
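A rough sketch of the "normalize first, then validate" order discussed here, using plain dictionaries rather than the Cerberus API. Both functions are hypothetical stand-ins, and validate_types only knows the 'type': 'string' rule.

```python
def normalize(document, schema, remove_unknown=False):
    """Return a new dict with renames applied and, optionally,
    fields that are missing from the schema dropped."""
    result = {}
    for field, value in document.items():
        definition = schema.get(field)
        if definition is None:
            if not remove_unknown:
                result[field] = value
            continue
        result[definition.get('rename', field)] = value
    return result


def validate_types(document, schema):
    """Tiny stand-in for validation: check only 'type': 'string'."""
    errors = {}
    for field, value in document.items():
        rule = schema.get(field, {})
        if rule.get('type') == 'string' and not isinstance(value, str):
            errors[field] = 'must be of string type'
    return errors


schema = {'name': {'type': 'string'}, 'old_id': {'rename': 'id'}}
document = {'name': 'x', 'old_id': 7, 'junk': True}

# Normalization runs as a separate pass; validation sees only its result.
normalized = normalize(document, schema, remove_unknown=True)
errors = validate_types(normalized, schema)
```

In this sketch a renamed field is looked up under its new name during validation, so the schema would need an entry for the new name for any rules to apply to it.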

funkyfuture · 2015-06-16T20:59:21Z

as @misja points out, this actually solves #107, which is a very cool coincidence, so i can leave town w/o that issue open. i'm going to investigate and review more tomorrow regarding this issue.

an extension to the schema like the following works fine to check the correct execution order and result:

'propertyschema': {'type': 'string', 'regex': '^bar$'}

imo, the form a_dict[…] is more suitable in the tests than a_dict.get(…). functionally there's nothing wrong about it, it's that cultural thing.

oh, what happens in a case like this?:

schema = {'foo': {'type': 'string'},
          'bar': {'rename': 'foo'}}
document = {'bar': 42}
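
To make the question concrete, here is a small sketch under one stated assumption: that the rename is applied as part of processing each field, one field at a time. validate_in_order is a hypothetical helper and does not reproduce the code in this pull request. Under that assumption the outcome depends on which field is visited first.

```python
def validate_in_order(document, schema, field_order):
    """Validate field by field, renaming as part of each field's
    processing. Only the 'type': 'string' rule is understood."""
    document = dict(document)  # work on a copy
    errors = {}
    for field in field_order:
        if field not in document:
            continue
        definition = schema[field]
        value = document[field]
        if definition.get('type') == 'string' and not isinstance(value, str):
            errors[field] = 'must be of string type'
        if 'rename' in definition:
            document[definition['rename']] = document.pop(field)
    return document, errors


schema = {'foo': {'type': 'string'},
          'bar': {'rename': 'foo'}}
document = {'bar': 42}

# 'foo' is visited while still absent, so the renamed value is never checked.
foo_first = validate_in_order(document, schema, ['foo', 'bar'])
# 'bar' is renamed first, so 'foo' exists when its type rule runs.
bar_first = validate_in_order(document, schema, ['bar', 'foo'])
```

Both calls end with the document {'foo': 42}, but only the second one reports a type error for foo.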

misja · 2015-06-16T21:22:49Z

Right, I just did a final commit, just a tweak really. I did leave the rename at the end, since changing the field name also requires modifying the schema during validation, and frankly I don't have the time for it anymore ... Same goes for adding remove_unknown to the schema: I tried, but I keep losing the reference to the current document there. It probably has to do with field and value being split up everywhere in order to pass them to rules, where it would probably be better to pass the current document (self.current now) and the field. It's essentially the difference between the following, where the latter will fail:

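# Mutating through a reference to the subdocument changes the original.
x = {'root': {'child': 'my value'}}
y = x['root']
y['child'] = 'other value'
assert x == {'root': {'child': 'other value'}}

# Rebinding a name that only held the value leaves the original untouched,
# so this assertion raises AssertionError.
x = {'root': {'child': 'my value'}}
y = x['root']['child']
y = 'other value'
assert x == {'root': {'child': 'other value'}}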

nicolaiarocci · 2015-06-17T07:48:44Z

Cool stuff.

One thing I'd do however, is make current a property, like errors, with proper self-documenting docstring. This way when docs are generated, devs take notice of this new feature.
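A minimal sketch of that suggestion, assuming a read-only property backed by a private attribute. SketchValidator is a hypothetical stand-in and not the Cerberus Validator class; the docstring wording is illustrative only.

```python
class SketchValidator:
    """Hypothetical stand-in; not the Cerberus Validator class."""

    def __init__(self):
        self._current = None

    @property
    def current(self):
        """The (sub)document currently being validated, or None before
        validation starts. Custom rules may read or modify it."""
        return self._current

    def validate(self, document):
        # Keep a reference, not a copy, so rules can change the document.
        self._current = document
        return True


v = SketchValidator()
doc = {'a': 1}
v.validate(doc)
```

Because the docstring lives on the property, documentation generators that read docstrings can pick it up alongside the other public attributes.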

nicolaiarocci · 2015-06-17T07:51:13Z

Also, for next time maybe, make sure you split every new feature or fix into individual pull requests. It makes things easier for the maintainer and, more importantly, for the project history and therefore for others reviewing it.

CD3 · 2015-06-17T13:19:05Z

I have been thinking about the 'reference to current document' feature recently. My thought was to keep a current "path", probably just a member named _path that contains the keys required to get from the top of the document down to the current document.

What do you think about doing it this way instead? For my own applications it would be useful to know where in the document validation is occurring, but I think it would have the added benefit of not having to keep the main and current documents in sync.

funkyfuture · 2015-06-17T13:34:02Z

i was also thinking of refactoring some stuff, especially separating out the normalization of a document as a whole before validation.

what you propose as path may well be what i intend as trail with regards to #93.

but i can look into things in July at the earliest.

CD3 · 2015-06-17T13:39:19Z

I think you are right. Perhaps the Validator class keeps a trail that is just deep copied into _errors?

funkyfuture · 2015-06-17T13:45:36Z

that's the plan.

CD3 · 2015-06-17T14:26:30Z

nice. then I think it would be better to have a __get_current_document function that gives you a reference to the current document using the main document and trail.
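One way such a lookup could be sketched, assuming the trail is a sequence of keys (or list indices) leading from the root document to the current one. The function name and the tuple representation of the trail are assumptions made for this example.

```python
from functools import reduce
from operator import getitem


def get_current_document(document, trail):
    """Follow the keys in `trail` from the root and return the nested
    container they lead to. An empty trail returns the root itself."""
    return reduce(getitem, trail, document)


root = {'root': {'child': 'my value'}}
current = get_current_document(root, ('root',))
# `current` is a reference into `root`, so this write is visible there.
current['child'] = 'other value'
```

Since the lookup returns the nested container itself rather than a copy, only the root document and the trail need to be stored.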

nicolaiarocci · 2015-06-18T07:20:15Z

I want/need to wrap up a 0.9 release soon, ideally as soon as #107 is solved.

I can merge this so it gets included with next release, which I suspect would be ideal for @misja, or keep it on hold until further development is accomplished. Given @funkyfuture and @CD3 planned developments around this, it might be advisable to hold for now. Thoughts?

misja · 2015-06-18T07:27:49Z

@funkyfuture @CD3 as I see it, adding a layer to keep a trail is routing around a more persistent problem. One can perfectly iterate through a dict and keep pointers to elements within the document, it's just the current splitting up and passing of field and value to rules which messes things up.

misja · 2015-06-18T07:31:38Z

@nicolaiarocci #107 could be solved now by cherry picking my commits related to setting current. As for future developments and reference handling, that's up to @funkyfuture and @CD3 :)

nicolaiarocci · 2015-06-18T07:39:01Z

@misja I know you're going to hate me for asking but, would you please resubmit as separate PRs? remove_unknown; rename, #107 fix and/or current property? I'm a lazy bastard.

CD3 · 2015-06-18T12:41:58Z

Having a trail would allow validators to consider things like the current field's parents or values of fields at the level above the current field, which would be useful to me, but this is probably a rare need. Perhaps having a copy of current would be more useful to more people. If it fixes #107 I like it.

funkyfuture · 2015-06-18T17:56:01Z

unfortunately stress grew more than anticipated over the last few days and i won't be in town for two weeks.

i'm not very fond of a release atm. here are some reasons:

as i understand it, something like this would not fail, or rather the behaviour is not deterministic: it depends on the order in which fields are validated, which is not intuitive imo:

schema = {'foo': {'type': 'string'},
          'bar': {'rename': 'foo'}}
document = {'bar': 42}

there's no way to change the behaviour concerning remove_unknown in subschemas.

my impression is that the code has meanwhile grown wild and some consolidating refactoring should be done before further features are added. several aspects have been spotted and noted in the recent discussions; a few i haven't publicly articulated yet. unfortunately i don't have the time now to comprehend all that.

this is still unclear to me. as the list as argument for types hasn't been released yet, i'd rather remove it as with anyof there's now a more generic solution.

~~also, @CD3 points out, that *of-rules don't work as expected in any constellation. or is this solved? if not, it seems that some refactoring is a key part in tackling that.~~

if a release is necessary soon, i'd plead to make it a 0.9-rc0.

CD3 · 2015-06-18T18:08:18Z

This issue has been fixed. The unit tests added in this commit demonstrate that both organizations will work.

funkyfuture · 2015-06-18T18:10:16Z

thanks for the feedback.

misja · 2015-06-18T18:23:32Z

@funkyfuture To be honest I'm fine if this pull request is put on hold, I'm at least glad part of it has helped squash a related issue (serendipity ftw!). While working on it, though, doubt has crept in as to whether this library is suitable for my use, so I am leaving it for now.

funkyfuture · 2015-06-18T18:51:58Z

that leaves the question about the removal of checking a list of types from my list. that's not too hard, once it's decided. we can also postpone an 'agglutinating' syntax for *of-rules onto a later release.

CD3 · 2015-06-18T19:03:02Z

I agree (but I realize I've only been contributing for about a week). You should remove the list of types feature before the release so that you/we can implement a generic solution without having to worry about breaking backward compatibility. If you don't remove it now, you won't be able to remove it for a while.

Misja Hoebe added 4 commits June 16, 2015 11:28

Keep a reference to the current document

e6601a5

Add option to remove unknown fields if allow_unknown=False

d56c4e6

Add rename as an option, allowing to change field names

3e000f1

Fix PEP8 errors

715e405

funkyfuture reviewed Jun 16, 2015

Misja Hoebe added 4 commits June 16, 2015 13:49

Add documentation for remove_unknown option and rename rule

c9d5ef1

Add some punctuation

b678445

Move rename type validation to validate_schema

07decfe

Point _validate_required_fields to the subdocument, relates to #111

643b74b

Remove the extra copy, use keys instead

66fe1e5

Merge branch 'noneof-oneof'

19b19c0

nicolaiarocci changed the title ~~Keep a reference to the current document~~ Keep a reference to the current document, 'rename' and 'remove_unknown' options Jun 17, 2015

Misja Hoebe and others added 12 commits June 17, 2015 12:45

Make current a property

1ab1204

Keep a reference to the current document

fd7f03b

Add option to remove unknown fields if allow_unknown=False

f4aa56b

Add rename as an option, allowing to change field names

161b9a7

Fix PEP8 errors

37909c0

Add documentation for remove_unknown option and rename rule

845818c

Add some punctuation

08f7457

Move rename type validation to validate_schema

7548338

Point _validate_required_fields to the subdocument, relates to #111

7bf2214

Remove the extra copy, use keys instead

f19f7b6

Make current a property

1affa47

Rebase

5ca044e

Remove redefinition of _validate_rename

9738430

nicolaiarocci closed this Jun 19, 2015
