Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for meta data keywords and custom meta data keywords #24

Closed
atrniv opened this issue Jan 16, 2014 · 13 comments
Closed

Support for meta data keywords and custom meta data keywords #24

atrniv opened this issue Jan 16, 2014 · 13 comments

Comments

@atrniv
Copy link

atrniv commented Jan 16, 2014

It would be really great if it was easier to do the following things with Z-Schema

  1. Actually fill in default values into a json instance using the validator.
    This is something which is a real pain especially because of javascript lack of strict type checking. It would be great to be able to ignore keys and just have the validator fill in the default values instead of doing it in a separate piece of code.
  2. Add custom meta data keywords for additional data transformations
    I believe filling in default values would be considered as a meta data operation applied to the json instance before actually performing the validation. Making this operation customizable would be a real time saver for many common scenarios.

The case for a customizable meta data value would be scenario where you may have to perform additional processing on the data before or after validation besides just putting a default value.

For example if we wish to sanitize html which is submitted to the server as a json request. I could simply implement a meta data boolean keyword called sanitize which would then sanitize the input before validating it.

It could also be used for perhaps converting date-time strings to Date or serialized mongoDB objectIDs to ObjectID instances.

@atrniv
Copy link
Author

atrniv commented Jan 16, 2014

So i was sorta impatient and I went ahead and did a basic implementation here
It supports basic synchronous transformations of object properties right now. It covers most of my use cases but I think it would be great to expand further on this functionality.

Some sample code:

{Utils} = zschema

x = { foo: undefined }

zschema.registerMetaDataKeyword 'date', (schema, property, instance, value) ->
  return unless schema.type is 'string' and schema.format is 'date-time' and value
  instance[property] = new Date(instance[property])
  return
, true

zschema.registerSchemaValidator 'date', (report, schema) ->
  report.expect(Utils.isBoolean(schema.date), 'KEYWORD_TYPE_EXPECTED', {keyword: 'date', type: 'boolean'} )
  return

zschema.validate(x, {
  type: 'object'
  properties:
    foo:
      type: 'string'
      format: 'date-time'
      default: new Date().toISOString()
      date: true
})
.then (report) ->
  # Outputs true
  console.log x.foo instanceof Date
  zschema.validate(x, {
    type: 'object'
    properties:
      foo:
        type: 'string'
        format: 'date-time'
        default: new Date().toISOString()
        date: 'Invalid Value'
  })
.catch (report) ->
  # Outputs validation error for invalid date schema value
  console.log report
.done()

@zaggino
Copy link
Owner

zaggino commented Jan 16, 2014

Thanks @atrniv , I'll take a look and incorporate it into the next version.
You can open a pull request if you want, but I might change the code a bit.

@atrniv
Copy link
Author

atrniv commented Jan 16, 2014

Your welcome, the code is missing test cases and support for asynchronous transformations, I was thinking to open a pull request after I added those missing bits.

Btw really great implementation, I was using JSV for a long while and your code was a breath of fresh air after several failed attempts to do something similar in JSV's code base. 👍

@zaggino
Copy link
Owner

zaggino commented Jan 16, 2014

Thanks, one of the reasons I wrote this was because I wasn't able to read the code of others and make the modifications that I needed too :)

Feel free to work on this as long as you want and then open a pull request. I'll review and merge then.

@ghost ghost assigned zaggino Jan 16, 2014
@zaggino
Copy link
Owner

zaggino commented Jan 16, 2014

And please be sure to write some real life usage of the features in the test cases. While I don't object to any new feature that will enable more customization of the validator, I wouldn't want to spam it with features that were never used.

This was referenced Jan 24, 2014
@zaggino
Copy link
Owner

zaggino commented Feb 5, 2014

@atrniv 1) default should be implemented, I'm not arguing that one.

but 2) has raised some concerns in me. How would you want to do the data transformations when validating an object? You can't really put something like that into the validation algorithm itself. Every subschema is launching validation algorithm on its own (recursion through arrays and objects) and if one subschema modifies an object, another schema has to work with modified object and could fail. This could be very tough to debug especially when using a large amount of schemas and references between them.

Even searching for which schema put this default value there? would be quite tough.

@atrniv
Copy link
Author

atrniv commented Feb 5, 2014

I encountered the same road-blocks when i started implementing the feature.

The approach I've taken is as follows:
There are 2 sets of transformers which can modify the data. MetaDataPreValidators and MetaDataPostValidators. The pre validators are applied to the instance before validation and the post validators are applied after the object and all its children have been validated.

The reason for this is because a meta-data keyword like default should not be able to set a value which violates the schema. So default is a pre validator.

However something like transforming a date value should be done after the original value has been validated according to the schema, so it will be a post validator.

The behavior of the default keyword becomes tricky when it is used in combination with anyOf, oneOf and allOf since it modifies the object itself.

I worked around this limitation by implementing a simple rollback mechanism.
Every pre validator is passed an array of rollback functions into which it can push its own rollback function defined in a closure. In case a schema validation fails any data transformations will be rolled back and the next schema validation on that object will behave as expected.

Here's what the default validator looks like:

'default': function (schema, property, instance, value, rollbacks) {
            if (instance[property] == null) {
                rollbacks.push((function (property, instance) {
                    return function () {
                        delete instance[property];
                    };
                })(property, instance));
                instance[property] = value;
            }
        }

This approach also makes it simple to figure out what went wrong as default values don't get permanently applied to the object unless the schema applying it successfully validates the object.

So far my implementation seems to work great, however there is one last issue which I think also exists in other scenarios where validation with anyOf, oneOf and allOf occurs. The error report that is generated is very difficult to decipher in these scenarios. I propose that we add a behavior flag on the validator which would allow it to display only those errors which are generated from the schema which best matches the instance. i.e The schema which gives rise to the least errors.

PS: I hadn't taken into account the not keyword during my implementation, will need to think about it.

@zaggino
Copy link
Owner

zaggino commented Feb 7, 2014

Rollbacks? Auch ... but makes sense. This is going to be a very complicated solution...
What will you do if you have anyOf and let's say two or more schemas that include some defaults will pass? What if defaults in one will be a and defaults in second will be b. You'll have no idea, which of those will be applied (not in async mode at least, if you want to run validations in parallel).

@atrniv
Copy link
Author

atrniv commented Feb 7, 2014

Looking at the current implementation anyOf validates the schemas in order of their declaration both in sync and async modes. I think it should remain that way and not be run in parallel since that would make applying defaults and any other pre-validation transformations impossible unless the validator considers the original instance immutable and returns a new instance after validation.

There are two scenarios I can think of possible here:

  1. Apply defaults of the first schema that validates successfully and stop any further schema checks.
  2. Validate against all schemas specified in anyOf and apply all their defaults. Once a schema has applied the default value for a key, any future schema's will not apply their default values.

In both cases if a schema fails to validate the instance it's default values will be rolled back. Each schema in the anyOf keyword should be validated against the instance in the order it is specified. In this way the user has some control over what default value is used by modifying the order in the schema declaration.

I think Z-Schema is probably the best validator suited to implement this feature using its validation behavior flags since the specification does not say much about how to handle this scenario.

@bzuillsmith
Copy link

I suppose nothing ever came of this? Some way to plug in a transformation library would be awesome. I almost always need to mark certain string fields as needing to be trimmed. It would be awesome to be able to add simple keywords like trim: true to the schema to handle that for me.

At the moment, I'm thinking I'll have to write my own library to transform objects based on certain keywords in the schema.

@atrniv
Copy link
Author

atrniv commented Apr 23, 2014

Actually I completed the implementation quite some time back. However I couldn't find the time to write tests and somehow I just lost track of this. I'll try and make a pull request for the same over the next week.

@zaggino zaggino removed their assignment Apr 23, 2014
@zaggino
Copy link
Owner

zaggino commented Aug 20, 2014

Guys, I've tried to implement this (and I sort of did) and then I failed horribly when writing complex tests. I didn't find a way to implement this for complex cases including a lot of anyOf, oneOf or allOf cases and if I published this, it wouldn't be reliable enough as is the rest of this library. For that reason, this won't make it to the 3.x release and I'm closing this long-term issue.

@zaggino zaggino closed this as completed Aug 20, 2014
@atrniv
Copy link
Author

atrniv commented Jan 31, 2015

@zaggino, @bzuillsmith I've actually been able to implement behavior for default handling including all edges cases I could find in Themis . I couldn't finish the original implementation in ZSchema 3 because there was no clean way to do it without mutating the object instance which is used the specify the default value in the schema itself. The validator would mutate the schema and cause all kinds of unpredictable behaviour.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants