v1.11.0

@koponen-styra koponen-styra released this 15 Aug 08:56

This release includes several bugfixes and a powerful new feature for data source integrations: Rego transform rules!

[New Feature] Data transformations are available for all data source integrations

Enterprise OPA now supports Rego transform rules for all data source plugins!

These transform rules allow you to reshape and modify data fetched by the data sources, before that data is stored in EOPA for use by policies.

This feature can be opted into for a data source by adding a rego_transform key to its YAML configuration block.

Example transform rule with the HTTP data source

For this example, we will assume we have an HTTP endpoint that responds with the following JSON document:

[
 {"username": "alice", "roles": ["admin"]},
 {"username": "bob", "roles": []},
 {"username": "catherine", "roles": ["viewer"]}
]

Here's what the OPA configuration might look like for a fictitious HTTP data source:

plugins:
  data:
    http:
      type: http
      url: https://internal.example.com/api/users
      method: POST # default: GET
      body: '{"count": 1000}' # default: no body
      file: /some/file # alternatively, read request body from a file on disk (default: none)
      timeout: "10s" # default: no timeout
      polling_interval: "20s" # default: 30s, minimum: 10s
      follow_redirects: false # default: true
      headers:
        Authorization: Bearer XYZ
        other-header: # multiple values are supported
          - value 1
          - value 2
      rego_transform: data.e2e.transform

The rego_transform key at the end means that we will run the data.e2e.transform Rego rule on the incoming data before that data is made available to policies on this EOPA instance.

We then need to define our data.e2e.transform rule. rego_transform rules generally take incoming messages as JSON via input.incoming and return the transformed JSON for later use by other policies.
Below is an example of what a transform rule might look like for our HTTP data source:

package e2e

import future.keywords

transform[id] := d {
    some entry in input.incoming
    id := entry.username
    d := entry.roles
}

In the above example, the transform policy will populate the data.http.users object with key-value pairs. Of note: the http key comes from the datasource plugin configuration above.

The rule iterates over the JSON list in input.incoming. For each JSON object in that list, the key is taken from the username field and the value from the roles field.

Given our earlier data source, the result stored in EOPA for data.http.users will look like:

{
 "alice": ["admin"],
 "bob": [],
 "catherine": ["viewer"]
}

This general pattern applies to all the data source integrations in Enterprise OPA, including the Kafka data source (covered below).

In addition to input.incoming, which contains the incoming information retrieved by the data source, the value of input.previous can be used to refer to all of the data currently stored in the plugin's data subtree.
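
As a minimal sketch of how input.previous might be combined with input.incoming for the HTTP example above: the rules below let entries from the latest response win, and carry over previously stored users that the latest response does not mention. This assumes input.previous has the same username-to-roles shape that the transform itself produces. The incoming_usernames helper is a name made up for this sketch.

```rego
package e2e

import future.keywords

# Entries from the latest fetch: username -> roles.
transform[id] := roles if {
    some entry in input.incoming
    id := entry.username
    roles := entry.roles
}

# Carry over stored entries that the latest fetch did not mention.
# ASSUMPTION: input.previous is an object shaped like this rule's output.
transform[id] := roles if {
    some id, roles in input.previous
    not incoming_usernames[id]
}

# Hypothetical helper: the set of usernames present in the latest fetch.
incoming_usernames contains entry.username if {
    some entry in input.incoming
}
```

If input.previous is undefined, for instance before anything has been stored, the second rule simply contributes nothing and the result matches the simpler transform shown earlier.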

[Changed Behavior] Updates to the Kafka data source's Rego transform rules

The Kafka data source now supports the new rego_transform rule system, like all of the other data source integrations. Concretely, it no longer expects the output of the transform rule to be a JSON Patch object to be applied to the existing data. Instead, it expects the output to be the full data object to be persisted.

Because Kafka messages are often incremental updates, the input.previous value should be used to refer to the rest of the data subtree.
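
A rough sketch of what a Kafka transform could look like under the new contract, returning the full object rather than a patch. The message layout is an assumption made for illustration: each message in input.incoming is taken to carry base64-encoded key and value fields, with the value decoding to a JSON document. Check the Reference documentation for the actual message shape.

```rego
package e2e

import future.keywords

# ASSUMPTION: messages expose base64-encoded `key` and `value` fields,
# and the decoded value is a JSON document.
transform[key] := payload if {
    some msg in input.incoming
    key := base64.decode(msg.key)
    payload := json.unmarshal(base64.decode(msg.value))
}

# Keep everything already stored unless the current batch overwrites that key.
transform[key] := val if {
    some key, val in input.previous
    every msg in input.incoming {
        key != base64.decode(msg.key)
    }
}
```

If a single batch contained two different values for the same key, these rules would conflict. Deduplicating within a batch is left out of the sketch.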

See the Reference documentation for more details and examples of the new transform rules.

[Changed Behavior] Updates to the dynamodb series of builtins

In this release, dynamodb.send has been split into two more specialized builtins that together cover the same functionality: dynamodb.get and dynamodb.query.

dynamodb.get

For normal key-value lookups in DynamoDB, dynamodb.get provides a straightforward solution.
Here is a brief usage example:

thread := dynamodb.get({
    "endpoint": "dynamodb.us-west-2.amazonaws.com",
    "region": "us-west-2",
    "table": "thread",
    "key": {
        "ForumName": {"S": "help"},
        "Subject": {"S": "How to write rego"}
    }
}) # => { "row": ...}

See the Reference documentation for more details.

dynamodb.query

For queries on DynamoDB, dynamodb.query allows control over the query expression and other parameters.
Here is a brief usage example:

music := dynamodb.query({
 "region": "us-west-1",
 "table": "foo",
 "key_condition_expression": "#music = :name",
 "expression_attribute_values": {":name": {"S": "Acme Band"}},
 "expression_attribute_names": {"#music": "Artist"}
}) # => { "rows": ... }

See the Reference documentation for more details.

[Changed Behavior] Removal of MongoDB plugin keys

The keys configuration for the MongoDB datasource plugin is now deprecated. Instead, MongoDB's native _id value will be used as the primary key for each document.

Any restructuring or renormalization of the data should now be done via rego_transform.
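
As an illustration only, a transform that re-keys MongoDB documents by a field other than _id might look like the sketch below. The package name mongo_example and the email field are hypothetical, and the exact shape of input.incoming for the MongoDB data source is not described in these notes. The rule uses `some doc in input.incoming`, which iterates over the values of either an array or an object.

```rego
package mongo_example

import future.keywords

# Re-key each document by a hypothetical `email` field instead of _id.
# Documents without an `email` field are skipped.
transform[key] := doc if {
    some doc in input.incoming
    key := doc.email
}
```

Under these assumptions, the data source would opt in with rego_transform: data.mongo_example.transform in its configuration block.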