Type coercion validator #71

brettatoms · 2015-02-24T19:44:40Z

By adding a coerce: <some callable> validator to a schema, the return value of the callable is used as the value to validate against in the document. This overwrites the value in the document, so you need to make a copy of the document before passing it to the validator if you don't want your original document changed.

This isn't foolproof, as it only catches TypeError and ValueError in the coercion validator, but it is incredibly handy for sanitizing and validating HTTP request data.
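
To make the "only catches TypeError and ValueError" caveat concrete, here is a minimal standalone sketch. It is not the code from this PR; `try_coerce` and the `lookup` dict are hypothetical names used only for illustration.

```python
def try_coerce(coerce, value):
    """Return (coerced_value, None) on success or (original_value, message) on failure.

    Only TypeError and ValueError are treated as coercion failures; any other
    exception raised by the callable propagates to the caller.
    """
    try:
        return coerce(value), None
    except (TypeError, ValueError) as exc:
        return value, str(exc)


ok = try_coerce(int, "42")          # coercion succeeds
bad = try_coerce(int, "forty-two")  # int() raises ValueError, which is caught

# A callable that fails with a different exception type is not caught.
lookup = {"yes": True, "no": False}
try:
    try_coerce(lookup.__getitem__, "maybe")
    escaped = False
except KeyError:
    escaped = True
```

A coerce callable that can fail in other ways would need to translate those failures into TypeError or ValueError itself.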

nicolaiarocci · 2015-03-13T09:11:53Z

This seems very similar to what is already achievable with function based custom validation.

brettatoms · 2015-03-13T13:08:58Z

The difference is that the coercion always runs first and the value returned by the coerce function is set in the model/document for that key. That means the other validators always check against the coerced value in the model rather than the original value. It's helpful for things like HTTP query parameters, which always come in as strings, so you can sanitize and validate them in one step.
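
The ordering described above can be sketched without Cerberus at all. The `validate` function and the single `min` rule below are simplified stand-ins chosen for illustration, not the library's implementation.

```python
def validate(document, schema):
    """Coerce first, then run the remaining rules against the coerced value."""
    result = dict(document)  # shallow copy, enough for this flat example
    errors = {}
    for field, rules in schema.items():
        if field not in result:
            continue
        if "coerce" in rules:
            try:
                result[field] = rules["coerce"](result[field])
            except (TypeError, ValueError):
                errors[field] = "could not be coerced"
                continue
        # This rule sees the coerced value, never the raw string.
        if "min" in rules and result[field] < rules["min"]:
            errors[field] = "min value is %d" % rules["min"]
    return result, errors


values, errors = validate({"page": "0"}, {"page": {"coerce": int, "min": 1}})
```

Without the coercion step, comparing the string "0" to the integer 1 would raise a TypeError on Python 3 instead of producing a validation error.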

brettatoms · 2015-03-13T13:13:26Z

I use it like this, so when I get the friend_ids from request.values it's already converted to a list.

schema = {
    'friend_ids': {'type': 'table_id_list', 'coerce': utils.parse_csv}
}
values = validate(request.values.to_dict(), schema)
friend_ids = values.get('friend_ids', [])
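
The body of utils.parse_csv is not shown in this thread. A coerce callable of that kind might look roughly like the following sketch, where the behaviour (stripping whitespace, dropping empty items, raising TypeError for non-strings) is an assumption.

```python
def parse_csv(value):
    """Split a comma separated string into a list of stripped, non-empty items.

    Hypothetical stand-in for the utils.parse_csv helper mentioned above.
    """
    if not isinstance(value, str):
        # TypeError is one of the two exception types the coercion step catches.
        raise TypeError("expected a string, got %s" % type(value).__name__)
    return [item.strip() for item in value.split(",") if item.strip()]


friend_ids = parse_csv("12, 7,,31")
```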

nicolaiarocci · 2015-03-19T08:46:13Z

I think that Cerberus should focus on validation. Transformation is a different thing, and we want to follow the "separation of concerns" principle in this case.

brettatoms · 2015-03-19T16:22:30Z

It's a pretty common feature in similar frameworks:

https://github.com/alecthomas/voluptuous#validation-functions
https://github.com/podio/valideer#adaptation
http://docs.pylonsproject.org/projects/colander/en/latest/basics.html#deserialization
https://github.com/keleshev/schema#validatables

This is extremely convenient:

document = {"num": "2"}
schema = {"num": {"type": "number"}}  # broken: "2" is a string
schema = {"num": {"type": "number", "coerce": int}}  # yay!

nicolaiarocci · 2015-03-19T16:27:53Z

Yup, someone else has proposed something like this in the past. Let's see if more people would like to add transformation features.

nicolaiarocci · 2015-03-19T16:31:01Z

FYI: https://twitter.com/nicolaiarocci/status/578594394992959488

aleksey-kutepov · 2015-05-07T12:29:11Z

I think that Cerberus should focus on validation

To my mind such type coercion has the same semantics as schema validation, and I'd like to see such errors along with schema errors. So I see no reason to separate these concerns. In addition, it is already possible to write schema = {'a_list': {'type': 'list', 'schema': {'type': 'integer'}}}, which is essentially the same but more limited.

nicolaiarocci · 2015-05-07T12:43:02Z

Alright folks. Looks like the feedback we've been waiting for has arrived, and I stand corrected 😄.

@brettatoms would you also update the documentation?

brettatoms · 2015-05-07T17:50:51Z

@nicolaiarocci Thanks! Should be good to go.

aleksey-kutepov · 2015-05-08T07:21:32Z

@nicolaiarocci @brettatoms Thank you very much

funkyfuture · 2015-05-08T14:24:25Z

some thoughts concerning the design:

schema = {'property': {'type': 'integer', 'coerce': int}}

seems redundant as it will be of type integer implicitly. this should be enough:

schema = {'property': {'coerce': int}}

~~i don't like the idea~~ what i'd like better is something like this, so doc isn't touched necessarily:

coerced_doc = v.validate(doc)  # breaks compatibility

or

coerced_doc = v.coerce_values(doc)  # which implies v.validate()
# or even
c = cerberus.Coercer(…)  # that instance will have its own Validator instance as a property
coerced_doc = c.validate(doc)

or

v.validate(doc)
coerced_doc = v.coerced_doc

what about defining coerce similar to custom type-checks? that would require a sub-class to use it, but the design is more stringent.

there should be some tests.

funkyfuture · 2015-05-08T14:27:11Z

cerberus/cerberus.py

@@ -327,6 +333,13 @@ def validate_schema(self, schema):
                            raise SchemaError(errors.ERROR_UNKNOWN_RULE % (
                                constraint, field))

+    def _validate_coerce(self, coerce, field, value):
+        try:
+            value = coerce(value)


where is coerce implemented? it should probably be a method.

The coerce argument is the callable that you define in the schema. See https://github.com/nicolaiarocci/cerberus/pull/71/files#diff-aebc53cd4926f3a579adb7ba188de369R214

ah, yeah. with a definition of methods like for checking types, these functions / then methods can be designed to consider context. i guess sooner or later someone will need that.

funkyfuture · 2015-05-08T14:28:22Z

oh, can schema-validation improve fool-proofness?

brettatoms · 2015-05-08T14:58:09Z

You can certainly omit 'type': 'integer' if you're doing a type coercion. I also use the coercion to do something like parse a string as a comma separated list and then use the type validator to verify each element in the list matches a regex.
I also don't like that the coercion is destructive, but it was way simpler to implement it this way without breaking anything else. I tend to make a copy of the dict before passing it in if I want to keep the original around: validate(request.values.to_dict(), schema)

I tend to validate my schemas like this to get back the document and avoid breaking compatibility:

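from contextlib import contextmanager

import cerberus


class ValidationError(Exception):
    """Assumed application-level error; defined here so the snippet is self-contained."""


class Validator(cerberus.Validator):
    def __call__(self, document, *args, **kwargs):
        # Validate, then hand back the (possibly coerced) document.
        self.validate(document, *args, **kwargs)
        return self.document


@contextmanager
def validator():
    v = Validator()
    try:
        yield v
    finally:
        # Raise once the with-block is done if any errors were collected.
        if len(v.errors) > 0:
            raise ValidationError(v.errors)


with validator() as validate:
    values = validate(doc, schema)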

Yes there should be tests.

funkyfuture · 2015-05-08T15:24:08Z

then i propose you add such an example to the documentation. this better shows the possibilities and makes sense. examples that include redundancy often lead to misunderstandings. in that case it can be interpreted that 'coerce': int somehow correlates directly to 'type': 'integer'.
i think it's reasonable that the library ensures a non-destructive way. i also showed three approaches to that.
with schema-validation i meant the part where Validator checks the provided schema. so it should check the callable's existence, 'callability' and signature.
~~i like the idea to implement __call__ and propose to upstream it to Validator.~~ already present.
as pointed out in the line-comment: defining coercion routines in the class offers the possibility to contextualize.

nicolaiarocci · 2015-05-11T07:28:58Z

The only concern I have with this is that it transforms the input. Of course there's a note in the docs, and that's probably enough. Also, in several scenarios this might actually be the desired behaviour, which leads me back to my original concern that with this update we are, in fact, transitioning from a pure validation tool to a validation and transformation utility.

I wonder if we should make transformation an explicit opt-in, like setting up a transformation property which is False by default. When this property is False, a coerce in the schema will raise a schema error. This property could also be allowed as an initialization argument. Thoughts?
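
A rough sketch of what such an opt-in could look like follows. The class name `OptInValidator`, the `transformation` argument, and the use of ValueError in place of a real schema error are all assumptions for illustration; nothing here is existing Cerberus API.

```python
class OptInValidator:
    """Reject schemas that use 'coerce' unless transformation is enabled."""

    def __init__(self, schema, transformation=False):
        if not transformation:
            offenders = sorted(
                field for field, rules in schema.items() if "coerce" in rules
            )
            if offenders:
                # A real implementation would raise the library's schema error.
                raise ValueError(
                    "coerce used without transformation=True: %s"
                    % ", ".join(offenders)
                )
        self.schema = schema
        self.transformation = transformation


schema = {"page": {"coerce": int}}
enabled = OptInValidator(schema, transformation=True)
```

Constructing `OptInValidator(schema)` with the default flag raises immediately, so the transformation never happens silently.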

funkyfuture · 2015-05-11T09:45:50Z

transitioning from a pure validation tool to a validation and transformation utility.

in short: i have my doubts that just calling a function is enough to achieve that properly and to fit more complex use-cases. i pointed out my concerns before. if anybody wants, i can elaborate on those that are unclear.

brettatoms · 2015-05-11T13:00:39Z

@nicolaiarocci How about doing a copy.copy() on the document before setting it to self.document in Validator._validate() so that it's not destructive to the original input?

funkyfuture · 2015-05-11T20:42:35Z

since referenced objects are potentially changed one should use copy.deepcopy.
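
The difference matters as soon as the document contains nested containers. A small standard-library demonstration:

```python
import copy

original = {"tags": ["a", "b"]}

shallow = copy.copy(original)    # new dict, but the inner list is shared
deep = copy.deepcopy(original)   # new dict and a new inner list

shallow["tags"].append("c")      # also visible through original["tags"]
deep["tags"].append("d")         # original is unaffected
```

After this runs, `original["tags"]` contains "c" but not "d", which is why a coercion that mutates nested values needs the deep copy to stay non-destructive.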

nicolaiarocci · 2015-05-12T05:56:16Z

@brettatoms yes, absolutely.

I initially ruled it out because I thought that people might actually want to apply transformations (we also had requests like that in the past). Also agree with @funkyfuture that deepcopy is better. Would you update the PR and docs?

martijnvermaat · 2015-05-12T12:02:39Z

For reference, an older similar request was #3

aleksey-kutepov · 2015-05-12T13:10:51Z

make transformation an explicit opt-in

How about subclassing cerberus.Validator with something like cerberus.CoerceValidator and do coercion related stuff there. We'd get backward compatibility along with explicit declaration.
Though this requires some refactoring of Validator._validate

funkyfuture · 2015-05-12T18:48:17Z

if there was an extra-class, wouldn't it be simpler to leave the Validator-class as it is and let a CoerceValidator use it? as far as i understood, the coercion would happen first anyway.

    class CoerceValidator():
        def __init__(self, *args, **kwargs):
            self.validator = Validator(*args, **kwargs)

        def validate(…):
            do_coercion()
            self.validator.validate(…)

but i don't see much harm in this now. however, it'd be easy to deepcopy and write changed value-objects to another class-property than document. for example, validated_document.

and i still think that it's also a good idea to implement checks in validate_schema to avoid later runtime-errors. e.g. an object without __call__-method is referenced in a schema.
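
Such a schema-time check could be as small as the sketch below. The function name `check_coerce_rules` and the use of ValueError are assumptions; an actual implementation would presumably live in validate_schema and raise the library's own schema error.

```python
def check_coerce_rules(schema):
    """Fail early if any 'coerce' rule in the schema is not callable."""
    for field, rules in schema.items():
        if "coerce" in rules and not callable(rules["coerce"]):
            raise ValueError(
                "coerce for field '%s' must be callable, got %r"
                % (field, rules["coerce"])
            )


check_coerce_rules({"page": {"coerce": int}})  # passes silently

try:
    check_coerce_rules({"page": {"coerce": "int"}})  # a string, not a callable
    rejected = False
except ValueError:
    rejected = True
```

Checking the callable's signature, as suggested earlier in the thread, would need more than `callable()` and is left out here.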

also, i can anticipate that if i wanted to use that feature (and atm there is actually one project where i would), i would have to rely on context (e.g. a basepath to resolve a path), which is hardly accessible from a function and should be available as properties of a class-instance. so i'm absolutely pro an extendibility like for type-checking. but that can be added later too.
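
As an illustration of the context argument, coercion routines defined as methods can read instance state such as a base path. Everything in this sketch (the `PathCoercer` class, the `_coerce_<name>` naming convention, the example paths) is hypothetical and only mirrors how custom type checks are looked up by name.

```python
import posixpath


class PathCoercer:
    """Resolve coercion routines by name so they can use instance context."""

    def __init__(self, base_path):
        self.base_path = base_path

    def _coerce_abspath(self, value):
        # Uses self.base_path, which a plain function could not reach.
        return posixpath.normpath(posixpath.join(self.base_path, value))

    def coerce(self, name, value):
        method = getattr(self, "_coerce_" + name, None)
        if method is None:
            raise ValueError("unknown coercion: %s" % name)
        return method(value)


resolved = PathCoercer("/srv/data").coerce("abspath", "img/../logo.png")
```

posixpath is used instead of os.path only so the result is the same on every platform.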

nicolaiarocci · 2015-05-13T06:43:40Z

Thanks, this has been merged in 8fc11a1. Feel free to update both AUTHORS and CHANGES with your full name.

Brett added 2 commits February 24, 2015 14:19

add a type coercion validator

4023d36

catch TypeError and ValueError in coercian validator

32ba5ec

nicolaiarocci closed this Mar 13, 2015

brettatoms mentioned this pull request May 6, 2015

Pre-processing and collecting valid data #77

Closed

nicolaiarocci reopened this May 7, 2015

Brett added 3 commits May 7, 2015 08:50

Merge branch 'master' of github.com:nicolaiarocci/cerberus into type-…

01c5e28

…coercion-pr

add docs

d96e2fb

flake8 fix

e89fe2a

funkyfuture reviewed May 8, 2015

Brett added 5 commits May 12, 2015 08:19

add custom error

4b43cea

type coercion is not destructive

c82aa55

add tests

4fe51c7

updates docs

59a0192

Merge branch 'master' of github.com:nicolaiarocci/cerberus into type-…

b56e527

…coercion-pr

nicolaiarocci added a commit that referenced this pull request May 13, 2015

Changelog update for #71

8555750

nicolaiarocci closed this May 13, 2015

funkyfuture mentioned this pull request Sep 18, 2015

Python 2.6 issue with deepcopy'ing complex objects #147

Closed
