Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding rule annotations with schema type checking support for OPA #3123

Merged
merged 1 commit into from
Mar 31, 2021

Conversation

aavarghese
Copy link
Contributor

@aavarghese aavarghese commented Feb 3, 2021

New support to upload a directory of JSON schema file(s) via "opa eval --schema". Directory can contain schema file(s) for policy input document(s), and schema file(s) for contextual data document(s). These schema files are used then to improve static type checking and to get more precise error reports as you develop Rego code

Also, there is new support for adding annotations on Rules to specify the schemas to be used specifically for type checking the expressions within the scope of that Rule. It helps address issues with schema overloading, and provides even more precise type error reports for a Rego developer.

For example: https://github.com/aavarghese/opa-schema-examples/blob/ruleAnnotation/acl/

Documentation for new "rule annotation" feature in https://github.com/aavarghese/opa/blob/ruleAnnotation/docs/content/schemas.md

This work is continuation of merged PR #3060.

Co-authored-by: @vazirim Mandana Vaziri mvaziri@us.ibm.com
Co-authored-by: @aavarghese Ansu Varghese avarghese@us.ibm.com
Co-authored-by: @tsandall Torin Sandall torinsandall@gmail.com

Note: addresses some points discussed in #1449

Copy link
Contributor

@srenatus srenatus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skimmed some of the code, please bear with me on reviewing what's marked "WIP" 😅

ast/check.go Outdated Show resolved Hide resolved
ast/check.go Outdated Show resolved Hide resolved
ast/check.go Outdated Show resolved Hide resolved
ast/check.go Outdated Show resolved Hide resolved
@vazirim vazirim force-pushed the ruleAnnotation branch 2 times, most recently from 91ee070 to 9555c11 Compare February 9, 2021 20:11
@aavarghese aavarghese changed the title [WIP] Adding rule annotations with schema type checking support for OPA Adding rule annotations with schema type checking support for OPA Feb 10, 2021
@aavarghese aavarghese marked this pull request as ready for review February 10, 2021 22:13
@aavarghese
Copy link
Contributor Author

@srenatus @tsandall this next PR for rule annotations is now ready for review. Please let us know your feedback.

Copy link
Contributor

@srenatus srenatus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I've given this a thorough read; but I have yet to give it a shake, too. Going to pull this down and run it tomorrow. Looks good, it's a very exciting functionality to have 🚀

ast/check.go Outdated Show resolved Hide resolved
ast/check_test.go Show resolved Hide resolved
ast/check_test.go Outdated Show resolved Hide resolved
ast/check_test.go Outdated Show resolved Hide resolved
ast/env.go Outdated Show resolved Hide resolved
cmd/eval.go Outdated Show resolved Hide resolved
docs/content/schemas.md Outdated Show resolved Hide resolved
docs/content/schemas.md Outdated Show resolved Hide resolved
docs/content/schemas.md Outdated Show resolved Hide resolved
types/types.go Outdated Show resolved Hide resolved
Copy link
Contributor

@srenatus srenatus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for piling on, but here are some more review comments 😅

docs/content/schemas.md Outdated Show resolved Hide resolved
ast/check.go Outdated Show resolved Hide resolved
ast/check.go Outdated Show resolved Hide resolved
ast/check.go Outdated Show resolved Hide resolved
cmd/eval_test.go Show resolved Hide resolved
cmd/eval_test.go Show resolved Hide resolved
docs/content/schemas.md Outdated Show resolved Hide resolved
cmd/eval.go Outdated
The -s/--schema flag provides a single JSON Schema used to validate references to the input document.
The -s/--schema flag provides one or more JSON Schemas used to validate references to the input or data documents.
Loads a single JSON file, applying it to the input document; or all the schema files under the specified directory.
When a directory is passed, the schema file for input must be named 'default-input-schema.json'.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅 Sorry I'm still not completely on board with this design. If it needs that many words to explain, it's perhaps too complicated and will lead to confusion....

Do we have some sort of precedent? Other tools that work the same?

I suppose we could make --schema repeatable, and a single file argument would be the one for input, a directory would be a schema set. 🤔 Not sure if that's better...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We did contemplate between having --schema as a repeatable flag like --data and --bundle, or only allow for a single path. At the end from our perspective, the latter seems much cleaner and less complicated by reducing the number of scenarios (especially since the path + files a user provides needs to have a simple way of representation in the rule annotation).

Our current design in simple words is just allowing for either of these cases:

  1. opa eval -i input.json -s input-schema.json. (don't have any schemas for data docs)
  2. opa eval -i input.json -d datadocs/ -s schemas/ (where schemas directory can contain any or all of the schemas for the input and data docs)

for 2., we impose a rule to have the schema file for input to be named default-input-schema.json (to differentiate it from other files (could be data schema files or additional input schema files)) and also means the input annotation doesn't need to be explicitly specified for rules.

Having --schema as repeatable means we could have many more complex scenarios, for ex:
opa eval -i input.json -s input-schema.json -s some-file.json -s schemas/ which isn't clear whether some-file schema should be associated to input or some data doc, and how we reference it in a rule annotation.

Open to further discussion though the current design of limiting the number of schema args feels simpler in our mind :). Wdyt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for providing some insights into your design tradeoff thoughts 😃

Let's roll with that, if it turns into a source of UX-related issues, we can revisit. 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a good idea. Thanks Stephan.

Copy link
Member

@tsandall tsandall Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize in advance for bringing this up again 😅

I'm hesitant about releasing with the default-input-schema convention exposed to users in the CLI as well as annotations because it'll be difficult to unwind. If we drop the convention now, we could always bring it back in the future. It's always much easier to make additive changes than it is to make destructive ones...

Can we drop the default-inputschemaconvention for now? I think this would just amount to (i) removing the special case from thereadSchemaBytesfunction below and (ii) removing any special casing required to support references todefault-input-schema` in the annotations.

@aavarghese I reopened this thread ^^

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsandall do you mean that users can either submit a single file, in which case we set the schema for input globally (to maintain backward compatibility with our previous PR), or if they submit a directory then we are not setting the input schema globally at all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vazirim yes, that's what I had in mind.

ast/check.go Outdated Show resolved Hide resolved
@vazirim vazirim force-pushed the ruleAnnotation branch 2 times, most recently from f3b3a88 to d654ab3 Compare February 18, 2021 04:26
@aavarghese aavarghese changed the title Adding rule annotations with schema type checking support for OPA [WIP] Adding rule annotations with schema type checking support for OPA Feb 19, 2021
ast/check.go Outdated Show resolved Hide resolved
ast/check.go Show resolved Hide resolved
ast/check.go Show resolved Hide resolved
ast/env.go Outdated Show resolved Hide resolved
ast/parser.go Outdated Show resolved Hide resolved
ast/parser_test.go Outdated Show resolved Hide resolved
ast/policy.go Outdated Show resolved Hide resolved
ast/policy.go Show resolved Hide resolved
@vazirim
Copy link
Contributor

vazirim commented Feb 24, 2021

@tsandall Have you considered using a notebook for interacting with Rego code? For example, zeppelin: https://zeppelin.apache.org

It can be extended to allow Rego editors, it could then have the rich features you are asking for in terms of human-readable text, links, figures, etc...

For Rego, it will be something very interactive and could be programmed to display the results nicely in tables for example.

I am not sure that yaml is the best way for us to indicate schema annotations. Maybe in separate files? Yaml is very dependent on spacing and error-prone because of that. It would be hard to edit yaml inside of comments... What do you think?

@aavarghese aavarghese changed the title [WIP] Adding rule annotations with schema type checking support for OPA Adding rule annotations with schema type checking support for OPA Feb 25, 2021
@tsandall
Copy link
Member

tsandall commented Mar 2, 2021

@vazirim I've wondered about notebook integrations in the past. I haven't seen Zeppelin before though, I'm more familiar with IPython/Jupyter notebooks.

We've discussed a bunch of options for writing metadata inside .rego files:

  • First-class syntax (e.g., something like Java annotations). Perhaps this is the best long-term solution but it requires more work up front. Also, policies written to use the new syntax would not be compatible w/ older versions of OPA (though this is probably not a huge issue.)
  • Normal Rego values (see this comment for an example). This is attractive because it enables reuse of metadata values and doesn't require a new syntax. However, it creates implementation challenges that (IMO) aren't worth it.
  • Comments...

If the metadata is encoded into comments we still have to define a syntax for it. We could invent our own (like we currently have inside this PR) or we could use something well known like YAML, JSON, etc. YAML requires minimal boilerplate compared to JSON (which is nice for authors) and would be easy for thirdparty tooling to process.

For example, if a thirdparty tool wants to catalogue/index a bunch of Rego policies, they could simply extract the comments, run them through a YAML parser, and build the index. They wouldn't necessarily have to depend on OPA code (which is particularly nice if they're not working in Go.) On the other hand, if we invent our own syntax, users need to learn it and we have to produce tooling to support it.

In short, I think the "YAML-in-comments" approach is the best in terms of the current tradeoffs (implementation, extensibility, authoring, tooling, etc.)

ast/parser.go Outdated Show resolved Hide resolved
ast/parser_ext.go Outdated Show resolved Hide resolved
@vazirim vazirim force-pushed the ruleAnnotation branch 2 times, most recently from 2f24006 to 1bb6482 Compare March 15, 2021 21:23
Copy link
Contributor

@srenatus srenatus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't wait to get this in! 🎉

Some questions on comments and a nitpicking backwards-incompatible edge cases 👇

cmd/eval.go Outdated Show resolved Hide resolved
ast/parser_ext.go Outdated Show resolved Hide resolved
ast/parser_ext.go Outdated Show resolved Hide resolved
@vazirim vazirim force-pushed the ruleAnnotation branch 2 times, most recently from 814b1a2 to a0ff801 Compare March 17, 2021 19:04
@aavarghese
Copy link
Contributor Author

@srenatus @tsandall all comments have been addressed and changes committed. Please let us know both of your feedback. New documentation at https://deploy-preview-3123--openpolicyagent.netlify.app/docs/edge/schemas/

@vazirim vazirim force-pushed the ruleAnnotation branch 3 times, most recently from 6a7fe64 to 91f36f8 Compare March 22, 2021 13:37
docs/content/schemas.md Outdated Show resolved Hide resolved
docs/content/schemas.md Outdated Show resolved Hide resolved
ast/policy.go Outdated Show resolved Hide resolved
cmd/eval.go Outdated
The -s/--schema flag provides a single JSON Schema used to validate references to the input document.
The -s/--schema flag provides one or more JSON Schemas used to validate references to the input or data documents.
Loads a single JSON file, applying it to the input document; or all the schema files under the specified directory.
When a directory is passed, the schema file for input must be named 'default-input-schema.json'.
Copy link
Member

@tsandall tsandall Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize in advance for bringing this up again 😅

I'm hesitant about releasing with the default-input-schema convention exposed to users in the CLI as well as annotations because it'll be difficult to unwind. If we drop the convention now, we could always bring it back in the future. It's always much easier to make additive changes than it is to make destructive ones...

Can we drop the default-inputschemaconvention for now? I think this would just amount to (i) removing the special case from thereadSchemaBytesfunction below and (ii) removing any special casing required to support references todefault-input-schema` in the annotations.

@aavarghese I reopened this thread ^^

@vazirim vazirim force-pushed the ruleAnnotation branch 5 times, most recently from 01bc930 to fa02bb1 Compare March 30, 2021 15:01
@aavarghese aavarghese force-pushed the ruleAnnotation branch 2 times, most recently from 02ac9a1 to e8d7a4c Compare March 30, 2021 21:47
…l --schema". Directory can contain schema file(s) for policy input document(s), and schema file(s) for contextual data document(s). These schema files are used then to improve static type checking and to get more precise error reports as you develop Rego code.

Also, there is new support for adding annotations on Rules to specify the schemas to be used specifically for type checking the expressions within the scope of that Rule. It helps address issues with schema overloading, and provides even more precise type error reports for a Rego developer.

Also, added support for annotation processing when loading via bundles.

Co-authored-by: @vazirim Mandana Vaziri mvaziri@us.ibm.com
Co-authored-by: @aavarghese Ansu Varghese avarghese@us.ibm.com
Co-authored-by: @tsandall Torin Sandall torinsandall@gmail.com
Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com>
@tsandall tsandall merged commit 3e3655b into open-policy-agent:master Mar 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants