New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Field validators as schema methods? #116

Closed
sloria opened this Issue Dec 27, 2014 · 37 comments

Comments

Projects
None yet
3 participants
@sloria
Member

sloria commented Dec 27, 2014

Often, validators and preprocessors only apply to a single Schema, so it may make sense to define them within the Schema--rather than as separate functions--for better cohesion.

Method names could be passed to __validators__, et. al, like so:

class UserSchema(Schema):
    __validators__ = ['validate_schema']

    def validate_schema(self, data):
        # ...
@philtay

This comment has been minimized.

Contributor

philtay commented Feb 15, 2015

+1

This would be very nice to have in 2.0

@taion

This comment has been minimized.

Contributor

taion commented Mar 29, 2015

Have you considered making this a @property instead of a class field? Then you could do e.g.

@property
def __validators__(self):
    return {
        '__SCHEMA__': self.validate_schema,
        'field_1': self.validate_field_1
    }

Instead of using the names of the validation functions.

@taion

This comment has been minimized.

Contributor

taion commented Mar 29, 2015

Actually, you ought to be able to use decorators for this. Something like

def validates(field_name):
    def wrapper(fn):
        fn._marshmallow_validates = field_name
        return fn

    return wrapper


class SchemaBase(type):
    def __new__(mcs, name, bases, dct):
        for value in dct.itervalues():
            if hasattr(value, '_marshmallow_validates'):
                print value._marshmallow_validates, value


class Schema(object):
    __metaclass__ = SchemaBase

    @validates('bob')
    def validate_bob(self):
        print self

Or if you wanted to be fancy, returning a descriptor from the decorator and doing an isinstance check or something.

I think this has better ergonomics and is more consistent with the rest of the API, though.

@sloria

This comment has been minimized.

Member

sloria commented Mar 29, 2015

Good idea, @taion .

@sloria

This comment has been minimized.

Member

sloria commented Apr 18, 2015

Now that #191 is merged, we can move forward with the decorator-based validators.

@sloria

This comment has been minimized.

Member

sloria commented May 10, 2015

The marshmallow.validator decorator is implemented in 5917299.

See the "Schema-level validation" section of the docs for example usage: https://marshmallow.readthedocs.org/en/dev/extending.html#schema-level-validation.

The actual API has a minor difference from @taion 's proposal above: there is no need to pass the field name to the validator decorator. This is because you can specify field names in the ValidationError constructor:

from marshmallow import Schema, fields, validator, ValidationError

class MySchema(Schema):
    bob = fields.Field()

    @validator
    def validate_bob(self, data):
        #...
        raise ValidationError('bob is invalid', 'bob')

The rationale for specifying field names on ValidationError is discussed in #47.

Now that we have method decorators for preprocessors, postprocessors, and validators, I'm closing the issue for now.

@sloria sloria closed this May 10, 2015

@taion

This comment has been minimized.

Contributor

taion commented May 10, 2015

Ah, that makes sense. I was a little confused with the e.g. @validates('bob') example, in that this was meant to be a field validator. That actually wouldn't have made any sense, though, since you'd just specify that validator on the field, which probably makes more sense and is more consistent.

I was thinking more like http://docs.sqlalchemy.org/en/latest/orm/mapped_attributes.html#simple-validators, but I guess in this case with Marshmallow, you'd use a custom field, right?

@sloria

This comment has been minimized.

Member

sloria commented May 10, 2015

@taion Yes, the validate_bob example is contrived: you would use a field validator in that case. I just wanted to show that this API allows for the validate_* methods proposed in #153 (comment)

@philtay

This comment has been minimized.

Contributor

philtay commented May 11, 2015

@sloria What about something like this?

class User(Schema):
    name = fields.String()
    age = fields.Integer()

    @validator('age')
    def ensure_adult(self, value):
        if value < 18:
            raise ValidationError('Too young!')

It would be a nice convenience.

@taion

This comment has been minimized.

Contributor

taion commented May 11, 2015

I think I'd prefer to write my validators like this as well. It seems a little cleaner visually than defining them within the field specification, and I don't want to make custom fields for everything that needs a custom validator.

@sloria

This comment has been minimized.

Member

sloria commented May 12, 2015

I had proposed something similar a while back in #47 (comment). We decided against it then, and I still think it is the right decision.

  1. Specifying the fields to store errors on in the ValdationError allows a single validator to store different errors on different fields.
  2. It is preferable to have one way to specify where errors are stored. Having two ways makes certain situations confusing. For example:
class MySchema(Schema):
    foo = fields.Int()
    bar = fields.Int()

    @validator('foo')
    def validate_schema(self, data):
        # ... What should be done here?
       raise ValidationError('schema invalid', 'bar')
@taion

This comment has been minimized.

Contributor

taion commented May 12, 2015

How does the issue not arise with the current field-level validators? Can't I do the same thing with raising a ValidationError pointing at the wrong field in the validation function passed into a field?

I see that @philtay was on the original issue, but I think the sample code he has for validating age is a bit cleaner than passing in a validation lambda to the age field.

@philtay

This comment has been minimized.

Contributor

philtay commented May 12, 2015

I can envision the use of dot notation in the @validator decorator.

class Person(Schema):
    age = fields.Integer()

    # I don't want to specify a validator here.
    # This schema is a "base" one. It maps
    # directly to a DB model.


class BuyJuice(Schema):
    customer = fields.Nested(Person)

    # This is the website form (or API method)
    # to buy orange juice. No need to validate
    # the customer's age.


class BuyAlcohol(Schema):
    customer = fields.Nested(Person)

    @validator('customer.age')
    def min_age(self, value):
        # 21 years to purchase

    # Here the age validation is mandatory,
    # but it has nothing to do with the
    # Person schema itself.

It also promotes reusability:

class Vehicle(Schema):
    wheels = fields.Integer()

class Motorcycle(Vehicle):
    @validator('wheels'):
    def two_wheels(self, value):
        if value != 2:
            ....

class Car(Vehicle):
    @validator('wheels'):
    def four_wheels(self, value):
        if value != 4:
            ....
@sloria

This comment has been minimized.

Member

sloria commented May 12, 2015

Dot notation could hypothetically be used with ValidationErrors as well.

raise ValidationError('Must be over 21 years of age', 'customer.age')

Validators can made reusable by using callable objects:

validate.Equals(2)
validate.Equals(4)

@validator is currently used to define schema-level validation. I don't think it makes sense to conflate its usage with field-level validation. For one, the parameters for @validator (raw and pass_original) only make sense in the context of schema-level validation. Therefore, if the above proposal does get implemented, it would probably be as a separate decorator.

@taion

This comment has been minimized.

Contributor

taion commented May 12, 2015

I think by analogy with SQLAlchemy, we'd want to call it @validates: http://docs.sqlalchemy.org/en/latest/orm/mapped_attributes.html

Not sure if the schema-validator decorator should then be called @validates_schema or something.

@taion

This comment has been minimized.

Contributor

taion commented May 12, 2015

Rather, I suppose it would be @validates_object or @validates_instance.

@philtay

This comment has been minimized.

Contributor

philtay commented May 12, 2015

Dot notation could hypothetically be used with ValidationErrors as well.

Sure. Decorators are just sugar. There will always be another way to specify the same validator, but in several circumstances it will be a much less elegant way. We could call it @elegant_sugar!

Validators can made reusable by using callable objects:

If you use a decorated validator it's because you DON'T want to make it reusable. It's mainly because it makes sense only in a particular schema, for a particular use case. In my example what you really make reusable is the field, not the validator.

Therefore, if the above proposal does get implemented, it would probably be as a separate decorator.

Ok. My preference goes to @validates.


Upon further reflection, I think we can do even better and, at the same time, solve the "error storage" problem.

class Person(Schema):
    age = fields.Integer()

    @validates('age')
    def ensure_adult(self, value):
        if value < 18:
            return 'Too young!'

The @validates decorator raises a ValidationError('Too young!', 'age') only if the decorated method returns a string. If None is returned no exception will be raised.

@taion

This comment has been minimized.

Contributor

taion commented May 12, 2015

I feel like returning a string is a bit too magical. I think if someone tries to raise an error for the wrong field in a field validator

@taion

This comment has been minimized.

Contributor

taion commented May 12, 2015

oops finger slipped -

then something surprising should happen to him, because he's doing something totally weird.

@philtay

This comment has been minimized.

Contributor

philtay commented May 12, 2015

Sure, I don't mean it as a safety net and nothing prevents you to raise a ValidationError in the method yourself. It's just redundant. The decorator already has all the info it needs. Only a string is missing. It's just more sugar after all. And not magic sugar IMO. You're in a @validates method after all. The only thing you can return is an error.

@taion

This comment has been minimized.

Contributor

taion commented May 12, 2015

I just feel that it isn't necessary.

raise ValidationError("too young")

is barely any more work than

return "too young"

and is just a lot more explicit, even in the context of already being in a validation function.

We can infer that the ValidationError raised applies to the field being validated.

@philtay

This comment has been minimized.

Contributor

philtay commented May 12, 2015

I can live with both of these solutions. As you pointed out they're very similar. Anyway, just for the sake of completeness: Colander uses the same idea (return a string) in their Function validator.

@sloria

This comment has been minimized.

Member

sloria commented May 14, 2015

Reopening this with a new title, as this is good discussion.

I am open to allowing field validator methods, and I would gladly review PRs for this. However, it isn't a high priority for 2.0.0, since this feature is a convenience rather than a breaking API change.

@sloria sloria reopened this May 14, 2015

@sloria sloria changed the title from Preprocessors, data_handlers, validators as schema methods? to Field validators as schema methods? May 14, 2015

@taion

This comment has been minimized.

Contributor

taion commented May 14, 2015

Do you think it makes sense to have both ways to declare "local" field validators not tied to the field type? I feel like that might be a bit confusing.

@philtay

This comment has been minimized.

Contributor

philtay commented May 14, 2015

Before going ahead with a PR:

  • Do you like the name @validates?
  • Do you like the "return string" feature?
  • What should happen if the field specified in the decorator doesn't exist (i.e. @validates('age') but the age field is not declared in the schema)?

I agree this should not be considered blocking for 2.0. However, IMO, there is another (unrelated) thing to do before 2.0: deprecate the Float field. #206

@taion

This comment has been minimized.

Contributor

taion commented May 14, 2015

FWIW, I'd vote:

  • @validates is a great name
  • Returning a string is unnecessary/confusing
  • No opinion
@sloria

This comment has been minimized.

Member

sloria commented May 15, 2015

  • I do like the conciseness, but I can see it could be confusing to have both @validator and @validates. I suppose @validator could be renamed to @validates_object as @taion suggested, but it's not ideal. Will have to think more about this.
  • I also think this is unnecessary.
  • I also don't feel strongly about this; perhaps throw a ValueError for now.
@taion

This comment has been minimized.

Contributor

taion commented May 16, 2015

I like @validates_object. It's the most straightforward parallel to @validates.

@philtay

This comment has been minimized.

Contributor

philtay commented May 16, 2015

@taion are you going to take a stab at this or should I? Unfortunately this weekend and next week I will be pretty busy, but it would be very nice to have this feature in 2.0. Ideally in 2.0.0.b3.

@taion

This comment has been minimized.

Contributor

taion commented May 16, 2015

I'm going to be pretty busy this weekend. I'm not sure when I'd have time to start working on it. If I do find some time, I'll update this issue, but for now it's not looking too likely over the next week either.

@sloria

This comment has been minimized.

Member

sloria commented May 24, 2015

If we're going to rename @validator, I lean towards @validates_schema rather than @validates_object or @validates_instance. In the common case, the validator will be validating a dict, so validate_object might be slightly misleading.

@taion @philtay Thank you both for offering to help out with this feature. I, too, am pretty occupied over the next two weeks so likely won't be able to work on this. I would gladly review and merge a PR, though.

@taion

This comment has been minimized.

Contributor

taion commented May 24, 2015

With the caveat that I don't actually care that much about the name, I think @validates_schema might be a little confusing because it's not actually validating the schema - it's doing schema-level validation and thus validating the object/instance.

@philtay

This comment has been minimized.

Contributor

philtay commented May 24, 2015

Bikeshedding season is now open...

+1 @validates_schema

@taion

This comment has been minimized.

Contributor

taion commented May 24, 2015

It's only bikeshedding if there's substantive disagreement. I'm okay with @validates_schema.

sloria added a commit that referenced this issue May 26, 2015

@sloria

This comment has been minimized.

Member

sloria commented May 26, 2015

Went with @validates_schema for now.

@philtay philtay referenced this issue May 26, 2015

Closed

@validates #218

@sloria

This comment has been minimized.

Member

sloria commented Jun 14, 2015

Added by #222

@sloria sloria closed this Jun 14, 2015

@taion

This comment has been minimized.

Contributor

taion commented Jun 14, 2015

Awesome! Sorry I didn't get around to coding this up. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment