Skip to content
This repository has been archived by the owner on Jul 7, 2022. It is now read-only.

additional schema keywords #17

Closed
wants to merge 3 commits into from
Closed

additional schema keywords #17

wants to merge 3 commits into from

Conversation

misja
Copy link

@misja misja commented Mar 14, 2018

I really like the parsing approach this projects takes and whilst trying things I was missing some keywords. This pull adds the format keyword to primitives plus two additional string type keywords contentEncoding and contentMediaType which were recently added. As for the format keyword, this doesn't require an implementation, I've added it in a way that it can be implemented by subclassing the validator.

@jasonwalsh
Copy link
Contributor

Hey @misja

Thanks for your PR, I appreciate it!

I like the idea of having a visit_format method. One thing I might suggest instead of raising a NotImplementedError is just to have a return statement. That way the try/except block could be removed from the visit_primitive method.

Eventually, I think there should be a Visitor "interface" that the ValidationVisitor and ResolveVisitor implement, which would raise NotImplementedErrors.

Aside from that, the PR looks good to me!

@misja
Copy link
Author

misja commented Mar 16, 2018

Hi @jasonwalsh , thanks for the review! I must admit this pull request was also to test the waters and see if the project was still active/being watched, glad it is :) I agree with dropping NotImplementedError but I'd like to run another option past you first. The format keyword is an odd one since it can serve both as annotation or assertion, so I also tinkered with implementing it as a component which also deals with its optional nature (if visitor has no implementation, skip it):

class Format(Component, str):

    def accept(self, visitor, *args):
        try:
            if self != '':
                return visitor.visit_format(self, *args)
        except AttributeError:
            pass

Any visitor implementing the method could then use its value for a check -

{
    'email': lambda instance: re.match('email regex', instance) is not None,
    'date-time': ...
}[primitive](instance)

Anyway, let me know your thoughts. And again, really like this project, I'll see what else I can do if time permits.

@jasonwalsh
Copy link
Contributor

jasonwalsh commented Mar 19, 2018

Hey @misja

I like the idea of having a separate class for Format.

Also, in reading the documentation, it states:

Implementations MAY support the "format" keyword as a validation assertion. Should they choose to do so:

  • they SHOULD implement validation for attributes defined below;
  • they SHOULD offer an option to disable validation for this keyword.

Maybe in the Format class, there's an instance attribute named validate of boolean type? Something like:

class Format(str):
    def __new__(cls, value, *args, **kwargs):
        return super().__new__(cls, value)

    def __init__(self, value, validate=False):
        self.validate = validate

    def accept(self, visitor, *args):
        if self:
            visitor.visit_format(self, *args)

And maybe in the command line API, there's a flag like --force-format, which stores a boolean value?

Let me know what you think, and thanks again for your interest in the project!

@misja
Copy link
Author

misja commented Mar 19, 2018

We should be careful or this could end up in a very long thread about spec semantics :) Looking again at my suggestion and your comments (thanks!) I think it would be better to keep a strict separation of concerns. The Format class should ideally be unaware of validation, this is the domain of the visitor, the actual validating implementation.

So I think any configuration should stay limited to the ValidationVisitor only and for now could be implemented as an empty visit_format stub until some format type validation is implemented. Passing a configuration option as a kwarg could get messy, since it will need to get passed to every instance created. Another way to approach this is configuration by implementation - if a ValidationVisitor subclass has implemented the stub then format validation is performed else not. This could even be implemented as a mixin making it easier to switch classes in the cli and for people interested using it as a library (me). Note that the same requirements also apply to the contentMediaType and contentEncoding keywords.

Another reason to keep the configuration limited to the validating visitor is because it would make no sense in a transforming visitor e.g. when creating Avro or Thrift schemas where the format keyword would be more useful as a type hint.

Will try to find some time to add commits to this pull pursuing the mixin approach so that we have some code to talk about. And perhaps some more along the way, I ran the JSON Schema test suite and err, quite a bit of work still to be done ;)

@jasonwalsh
Copy link
Contributor

Hey @misja

Thanks for your comment, that makes sense to me. I like the idea of implementing a mixin class for Format validation. I too agree that keyword arguments can lead to a messy solution and that it does not necessarily make sense to have a validate attribute for the Format class if the ValidationVisitor only uses it.

Another reason to keep the configuration limited to the validating visitor is because it would make no sense in a transforming visitor e.g. when creating Avro or Thrift schemas where the format keyword would be more useful as a type hint.

Yes, I totally agree with this.

Thanks again!

@Mdraugelis Mdraugelis closed this Aug 1, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants