Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should validate without passing schema to method validate #541

Open
estahn opened this issue Nov 5, 2018 · 10 comments
Open

Should validate without passing schema to method validate #541

estahn opened this issue Nov 5, 2018 · 10 comments

Comments

@estahn
Copy link
Contributor

estahn commented Nov 5, 2018

I would expect to receive some failures given the following JSON document:

{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#"
}

and code

$validator = new JsonSchema\Validator;
$validator->validate($data);
var_dump($validator->getErrors());

But it is reporting as valid. The JSON document itself contains the schema and I shouldn't need to pass it in.

@erayd
Copy link
Collaborator

erayd commented Nov 5, 2018

The JSON schema spec doesn't allow for any special treatment of particular keys within a document instance, not even $schema. While this key is special within a schema, you can't use it the way you are trying to here.

The behavior is correct - this issue is user error.

@estahn
Copy link
Contributor Author

estahn commented Nov 5, 2018

@erayd Thanks, so it's expected behaviour, gotcha. Then I don't understand why the schema for validate is optional. It appears counterintuitive.

As for your argument in regards to JSON Schema please see here: https://snowplowanalytics.com/blog/2014/05/15/introducing-self-describing-jsons/#sdjs

While the JSON schema spec doesn't make an explicit statement for the use of $schema in JSONs it also doesn't prohibit it.

Maybe it could be added as a feature flag ENABLE_SELF_DESCR_JSON.

Thoughts?

@erayd
Copy link
Collaborator

erayd commented Nov 5, 2018

@estahn There was a huge discussion regarding this in the actual spec group about a year ago - the outcome, if I recall correctly, was that privileging keywords in a document instance is hugely problematic, because it makes assumptions about user data that are not guaranteed to be correct. It also opens an exploit vector. The spec itself doesn't explicitly prohibit it, because it's completely out of scope. However, 'unspecified' doesn't mean 'acceptable'.

Noting that it's extremely easy to implement this yourself as a preprocessor in just a few lines of code, I'm not happy to add such a feature to this project.

I don't recall why the schema argument for validate is optional, but it's likely either for backwards compatibility with something (I can't think what that might be though), or a mistake. I'm happy for that argument to become compulsory unless someone else has an objection to that.

@estahn
Copy link
Contributor Author

estahn commented Nov 5, 2018

@erayd Thanks for your response and I do see your point.

Is a preprocessor something you see as part of the project or something that should be done outside? I do have another use-case I could use that for. phpunit-json-assertions makes use of justinrainbow/json-schema for validation and as part of that I use the SchemaStorage to map URLs to local files, e.g.

        $directory = new \RecursiveDirectoryIterator(realpath(__DIR__ . '/../../schemas/au.com.domain.events'));
        $flattened = new \RecursiveIteratorIterator($directory);
        $files = new \RegexIterator($flattened, '/^(?<schema>.*?\/schemas\/(?<namespace>(.+?))\/(?<name>.+?)\/jsonschema\/(?<version>\d+-\d+-\d+))$/i', \RecursiveRegexIterator::GET_MATCH);

        foreach ($files as $file) {
            $jsonObject = json_decode(file_get_contents($file['schema']));
            $storage->addSchema($jsonObject->id, $jsonObject);
        }

A preprocessor would allow to attach that behaviour in a nicer way.

$validator = new JsonSchema\Validator;
$validator->addPreprocessor(new JsonSchemaUriToFile());
$validator->addPreprocessor(new JsonSchemaSelfDescr());
$validator->validate($data);
var_dump($validator->getErrors());

On a second thought it may not be a concern of the validator.

@erayd
Copy link
Collaborator

erayd commented Nov 5, 2018

@estahn I see the actual preprocessor as being out of scope for this project. However, I'm open to including a general preprocessor-attachment interface of some kind.

@maximbaz @bighappyface What do you guys think?

@maximbaz
Copy link

maximbaz commented Nov 6, 2018

I think you meant to ping someone else instead of me 😄 Not familiar enough with the project to give you meaningful feedback on this matter.

@erayd
Copy link
Collaborator

erayd commented Nov 6, 2018

@maximbaz Ooops, sorry!

@shmax - That was supposed to be you...

@bighappyface
Copy link
Collaborator

Noting that it's extremely easy to implement this yourself as a preprocessor in just a few lines of code, I'm not happy to add such a feature to this project.

👏

I think working out individual needs is the burden of the individual.

@shmax
Copy link
Collaborator

shmax commented Nov 6, 2018

@shmax - That was supposed to be you...

I'll defer, as I don't fully understand the issue. Five minutes ago I didn't know the library supported this feature, but the optional $schema argument can be explained by the code I see in SchemaConstraint. Given that bit of logic, he does seem to have a point; inline schemas are apparently supported, only we expect the inline schema to be an object or an array, and aren't prepared for the possibility that it's a uri.

But it does seem like a huge hole, and I think it's kind of nuts that it's there at all.

@erayd
Copy link
Collaborator

erayd commented Nov 6, 2018

@shmax Oh man, that's horrible! Thanks for finding that.

I'll do some more digging and see if I can figure out why we're doing this at all (because history maybe?) - IMO we probably shouldn't be (or at least not by default) and v6 is a good time to address that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants