Add metadata annotation for reusable schema references #6927

Open
tsandall opened this issue Aug 13, 2024 · 1 comment

Comments

@tsandall
Member

OPA currently allows users to set the schema for input and data paths via the schemas annotation. This example says that input.foo is an object with a property bar that has type string.

# METADATA
# schemas:
# - input.foo: {properties: {bar: {type: string}}}
package x

Schemas can be defined inline in annotations in .rego files, or loaded from disk or via the Go APIs and then referenced by path, e.g., - input.foo: schema.path.to.schema. This works well when ingesting preexisting schemas from the rest of the world. However, in some cases it would be nice to define reusable schemas and load them into OPA without having to integrate with the ast.SchemaSet API.

To do this we could add a new annotation, e.g., schema_defs, that would allow users to define arbitrary JSON schemas inline in .rego files. Then they could refer to these definitions in the existing schemas annotation via URL. For example:

schema.rego:

# METADATA
# description: This package contains schema definitions that can be referenced in other policies
# schema_defs:
#   foo: {$id: https://example.com/foo, properties: {bar: {type: string}}}
package schemas

x.rego:

# METADATA
# schemas:
# - input.foo: {$ref: https://example.com/foo}
package x

There's an open question around scoping. Specifically, should the schema_defs annotation be supported anywhere or only at the global scope? If we allow schemas to be defined anywhere then we need to deal with conflicts, e.g., suppose two files declare schemas with the same URL. Is this an error? On the other hand, we could implement the feature such that schema_defs are only visible within the scope of the annotations (e.g., package/subpackage) but then that would force users to invent an artificial root for global schemas (👎 ).

@johanfylling
Contributor

To reduce ambiguity, the safer approach is probably to not allow annotations to re-declare a schema with conflicting ID/URL.
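
One way to picture the no-redeclaration rule is a registry keyed by $id that rejects a second declaration. The Go sketch below is purely illustrative: SchemaDef, Registry, Declare, and the file name other.rego are hypothetical and not part of OPA's API, and it assumes that any repeated $id is an error, even when the two schema bodies are identical.

```go
package main

import "fmt"

// SchemaDef is a hypothetical record for one schema_defs entry.
// Nothing like it exists in OPA today.
type SchemaDef struct {
	ID     string                 // the schema's $id, e.g. https://example.com/foo
	File   string                 // file that declared it, used in error messages
	Schema map[string]interface{} // parsed JSON schema body (unused by the check)
}

// Registry indexes schema definitions by $id.
type Registry struct {
	byID map[string]SchemaDef
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{byID: map[string]SchemaDef{}}
}

// Declare records def, or returns an error if its $id was already declared.
// Assumption: every re-declaration is rejected, regardless of the body.
func (r *Registry) Declare(def SchemaDef) error {
	if prev, ok := r.byID[def.ID]; ok {
		return fmt.Errorf("schema %q in %s was already declared in %s",
			def.ID, def.File, prev.File)
	}
	r.byID[def.ID] = def
	return nil
}

func main() {
	r := NewRegistry()
	defs := []SchemaDef{
		{ID: "https://example.com/foo", File: "schema.rego"},
		{ID: "https://example.com/foo", File: "other.rego"}, // hypothetical second file
	}
	for _, d := range defs {
		if err := r.Declare(d); err != nil {
			fmt.Println("error:", err)
		}
	}
}
```

A looser variant could accept a repeated $id when the bodies are identical, at the cost of a deep comparison on each declaration.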


Applying schema_defs globally rather than according to its associated scope feels contradictory to how annotations otherwise work, and could be confusing. One way around this could be to introduce a global annotation scope, which is applied to the root of all metadata chains. To not introduce another set of ambiguities, multiple declarations of this scope should probably be a compile-time error.


An alternative to looking up schemas by ID could be to look them up by Rego path + annotation key instead. E.g.:

# METADATA
# schemas:
# - input: {$ref: data.schemas#input}
package x

# METADATA
# description: This package contains schema definitions that can be referenced in other policies
# schema_defs:
#   input: {properties: {bar: {type: string}}}
package schemas

This would still allow for ambiguity, though, which we could deal with either through compile-time errors or by only allowing schema_defs on the document and subpackages scopes.
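
To make the path + key form concrete, here is a rough Go sketch of how a reference such as data.schemas#input might be split into a package path and a schema_defs key. SchemaRef and ParseSchemaRef are hypothetical names, not OPA APIs, and the sketch assumes the package part must start with "data." and that the first "#" separates the two parts.

```go
package main

import (
	"fmt"
	"strings"
)

// SchemaRef is a hypothetical parsed form of a reference such as
// "data.schemas#input": a package path plus a schema_defs key.
type SchemaRef struct {
	Package string // e.g. "data.schemas"
	Key     string // e.g. "input"
}

// ParseSchemaRef splits s at the first '#'. It rejects references with a
// missing part or a package path that does not start with "data."
// (both checks are assumptions of this sketch).
func ParseSchemaRef(s string) (SchemaRef, error) {
	pkg, key, found := strings.Cut(s, "#")
	if !found || pkg == "" || key == "" {
		return SchemaRef{}, fmt.Errorf("invalid schema reference %q: want <package path>#<key>", s)
	}
	if !strings.HasPrefix(pkg, "data.") {
		return SchemaRef{}, fmt.Errorf("invalid schema reference %q: package path must start with data.", s)
	}
	return SchemaRef{Package: pkg, Key: key}, nil
}

func main() {
	ref, err := ParseSchemaRef("data.schemas#input")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("package=%s key=%s\n", ref.Package, ref.Key)
}
```

Resolving the pair would still need a rule for the ambiguous case where more than one annotation in the same package defines the same key.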
