-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#1449: JSON Schema validation and verification #5486
Conversation
✅ Deploy Preview for openpolicyagent ready!
To edit notification comments on pull requests, go to your Netlify site settings. |
f2fab75
to
4be3a6d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for contributing. Some inline comments, and some tests would be required, too. See test/cases/testdata
0f50b95
to
f26c589
Compare
Signed-off-by: Yuri Kulagin <jkulvichi@gmail.com>
f26c589
to
9bd1bfb
Compare
@srenatus, I have covered the functions with tests, and added a bit more comments. Can I have your review? :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I wasn't explicit enough -- it would be preferable to have the tests in test/cases/testdata
, which is a bunch of yaml files. They are also used by other projects to ensure conformance when the builtins are implemented in other ways -- e.g. in Wasm, or in IR-based projects.
That said, I'm not completely certain about the error-returning there; I don't believe we've done that in any other place. @philipaconrad, what do you think? I suppose having some alignment between the graphql schema verification builtins we already have, and the new ones here, would make sense.
), | ||
} | ||
|
||
// JSONMatchSchema returns empty array if the document matches the JSON schema, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need both json.is_match_schema
and json.match_schema
then? It seems like this rego snippet would do to implement the former:
is_match_schema(doc, schema) if json.match_schema(doc, schema) == []
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. Do we also need jsonschema.is_valid
?
This can be implemented in the same way using jsonschema.verify
.
@srenatus You are correct-- no other builtins return error messages as part of their normal output. We've had requests in the past for something like this, for example as part of the GraphQL builtins. I believe we've avoided implementing any such feature so far, because doing so could expose a lot of unnecessary implementation details, and returning error messages does not mesh with the existing style of error-handling/validation logic for policies. 🙁 |
Signed-off-by: Yuri Kulagin <jkulvichi@gmail.com>
@philipaconrad, indeed, I have not seen previously implemented functions return error information. However, in our case, it is required to provide the user not only a binary decision of the rule, but also technical info about why the error occurred. For this reason, simplified |
Signed-off-by: Yuri Kulagin <jkulvichi@gmail.com>
asJSON, err = ast.JSON(value) | ||
if err != nil { | ||
return nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably use gojsonschema.NewGoLoader
// Returns an array of errors or empty array. | ||
// Returns an empty array if no errors are found. | ||
// In case of internal error returns empty array. | ||
func builtinJSONMatchSchema(_ BuiltinContext, operands []*ast.Term, iter func(*ast.Term) error) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to specify in the docs that the first argument is the document to match with the schema and second is the schema itself.
topdown/jsonschema.go
Outdated
// Make new schema instance to provide validations. | ||
schema, err := gojsonschema.NewSchema(schemaLoader) | ||
if err != nil { | ||
return err | ||
} | ||
|
||
// Use the schema instance to validate the document. | ||
result, err := schema.Validate(documentLoader) | ||
if err != nil { | ||
return err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also do if you like:
gojsonschema.Validate(schemaLoader, documentLoader)
Directly returning an error string from the builtin does not seem the right approach. How about returning an object from the builtin and one of the fields could be an |
@jkulvich if you like we are happy to carry forward your great work and add tests etc. These are useful builtins and if you need our help to move this along we're happy to do it. Let us know what you think. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've taken the liberty to update this PR according to review comments.
builtin_metadata.json
Outdated
@@ -8726,6 +8753,26 @@ | |||
}, | |||
"wasm": true | |||
}, | |||
"jsonschema.verify": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this perhaps be json.verify_schema
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes json.match_schema
and json.verify_schema
would be better.
586d605
to
0501498
Compare
To link this to the actual issue, this PR implements issue #1449 . |
* Merging `jsonschema.is_valid` into `jsonschema.verify`, and `json.is_match_schema` into `json.match_schema` * Adding yaml test cases in testdata Signed-off-by: Johan Fylling <johan.dev@fylling.se>
0501498
to
85685e3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good. Few comments.
ast/builtins.go
Outdated
), | ||
), | ||
}, nil)). | ||
Description("`output` is of the form `[match, errors]`. If the document is valid given the schema, then `match` is `true`, and `errors` is an empty array. Otherwise, `match` is `true` and `errors` is an array of objects describing the error(s)."), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean "Otherwise, match
is false
" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, of course 🙄 .
builtin_metadata.json
Outdated
@@ -8726,6 +8753,26 @@ | |||
}, | |||
"wasm": true | |||
}, | |||
"jsonschema.verify": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes json.match_schema
and json.verify_schema
would be better.
topdown/jsonschema.go
Outdated
|
||
// builtinJSONSchemaValidate accepts 1 argument which can be string or object and checks if it is valid JSON schema. | ||
// Returns array [false, <string>] with error string at index 1, or [true, ""] with empty string at index 1 otherwise. | ||
func builtinJSONSchemaValidate(_ BuiltinContext, operands []*ast.Term, iter func(*ast.Term) error) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: lets update the function name to builtinJSONSchemaVerify
[...]*ast.Term{ast.StringTerm("error"), ast.StringTerm(re.String())}, | ||
[...]*ast.Term{ast.StringTerm("type"), ast.StringTerm(re.Type())}, | ||
[...]*ast.Term{ast.StringTerm("field"), ast.StringTerm(re.Field())}, | ||
[...]*ast.Term{ast.StringTerm("desc"), ast.StringTerm(re.Description())}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need both re.String()
and re.Description()
? I mean do they contain different enough that can help in debug etc. or is one them enough ? https://sourcegraph.com/github.com/open-policy-agent/opa@7b8fa09ff8fb3d447abeefa5d257fbc00e4a2ee6/-/blob/internal/gojsonschema/result.go?L159
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the string representation contains field
and description
, which we include in the error object; but it also includes context
and value
. Unless the formatting of the error string makes it much more readable, perhaps we could skip including it, and instead include context
and value
(?).
Although, I don't see harm in including it, as the formatting of the string might make the data more human readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good 👍
|
||
// Check that schema is correct and parses without errors. | ||
if _, err = gojsonschema.NewSchema(loader); err != nil { | ||
return iter(newResultTerm(false, ast.StringTerm("jsonschema: "+err.Error()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of returning a string here, what if we return an object with an error
field describing the error similar to what http.send
does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now, we're returning an array with a boolean and an error string, similar (but kinda reversed) to what graphql.parse_and_verify does. E.g.
deny[err] {
[false, err] = jsonschema.verify(input.schema)
}
We could return an object instead, if that is preferable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine. My reasoning behind using an object was it gives us flexibility to add more fields if needed.
1bbc4f2
to
ee7e293
Compare
Signed-off-by: Johan Fylling <johan.dev@fylling.se>
ee7e293
to
9e8e10a
Compare
|
||
// Check that schema is correct and parses without errors. | ||
if _, err = gojsonschema.NewSchema(loader); err != nil { | ||
return iter(newResultTerm(false, ast.StringTerm("jsonschema: "+err.Error()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine. My reasoning behind using an object was it gives us flexibility to add more fields if needed.
Yes well done. This will be super helpful. |
Is there any official documentation in OPA about using this feature ? |
@hajdukda, please, follow next OPA docs sections json.match_schema, and json.verify_schema |
Added new built-in functions:
schema
anddocument
can be JSON string or object.Signed-off-by: Yuri Kulagin jkulvichi@gmail.com