regorus 0.9.1 — partial set rule with iterating body emits only one element #712

@jeromeajot

Version: regorus 0.9.1 (crates.io)
Tested on: macOS 26.4.1 (25E253) / Apple Silicon, rustc 1.95.0, library API + CLI binary

Summary

A partial set rule whose body iterates over a collection, e.g. s[x] if { some _, x in input.y }, adds only the first iteration's binding to the rule's set. Every subsequent binding is silently dropped. OPA produces all bindings.

The bug surfaces in both the library API (Engine::eval_rule) and the CLI (regorus eval). It does not surface for the equivalent set-comprehension form.

The same bug independently affects partial set rules with multiple bodies: each body emits at most one element, regardless of how many bindings would have satisfied that body.
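The difference between the expected and observed behavior can be modeled outside any Rego engine. The sketch below is plain Rust with no regorus dependency. `eval_partial_set` and `first_binding_only` are hypothetical helpers: the first models the semantics OPA exhibits (the union of every binding from every body), and the second only mimics the observed symptom. It is not a claim about regorus internals. Each body is represented as the list of values its iteration would bind.

```rust
use std::collections::BTreeSet;

/// Reference semantics: the rule's set is the union of every binding
/// produced by every body.
fn eval_partial_set(bodies: &[Vec<&str>]) -> BTreeSet<String> {
    bodies
        .iter()
        .flat_map(|body| body.iter())
        .map(|k| k.to_string())
        .collect()
}

/// Model of the observed symptom only (not regorus internals):
/// each body contributes at most its first binding.
fn first_binding_only(bodies: &[Vec<&str>]) -> BTreeSet<String> {
    bodies
        .iter()
        .filter_map(|body| body.first())
        .map(|k| k.to_string())
        .collect()
}

fn main() {
    // One iterating body over the sorted keys of input.x.
    let single = vec![vec!["BAR", "BAZ", "FOO"]];
    println!("expected: {:?}", eval_partial_set(&single));
    println!("observed: {:?}", first_binding_only(&single));

    // Two bodies; this particular split of keys is a hypothetical example.
    let multi = vec![vec!["BAR", "BAZ"], vec!["FOO"]];
    println!("expected count: {}", eval_partial_set(&multi).len());
    println!("observed count: {}", first_binding_only(&multi).len());
}
```

With the single body, the model yields three elements versus one. With the two-body split, it yields three versus two.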

Minimal reproducer

repro.rego:

package repro

import future.keywords.if
import future.keywords.in

violations[k] if {
    some k, _ in input.x
}

input.json:

{ "x": { "FOO": 1, "BAR": 2, "BAZ": 3 } }

Expected (per OPA 1.16.1)

opa eval -d repro.rego -i input.json 'data.repro.violations'
{ "BAR": true, "BAZ": true, "FOO": true }

Actual (regorus 0.9.1 CLI)

regorus eval -d repro.rego -i input.json 'data.repro.violations'
{ "BAR": true }

(Only one of the three bindings is added to the set.)

Same result via the library API

let mut engine = regorus::Engine::new();
engine.add_policy("repro.rego".to_string(), REGO.to_string())?;
engine.set_input_json(INPUT)?;
let v = engine.eval_rule("data.repro.violations".to_string())?;
// v contains a single-element set, not three.

Tested with regorus = "0.9" (default features + arc) and rustc 1.95.

Characterization — what works, what doesn't

All five variants run on the same input.json above. Output is the rule's value at data.repro.violations.

| # | Variant | regorus 0.9.1 | OPA 1.16.1 | Notes |
|---|---|---|---|---|
| 1 | violations[k] if { some k, _ in input.x } | {BAR} (1) | {BAR, BAZ, FOO} (3) | iteration in single body -> bug |
| 2 | violations[entry] if { some k, _ in input.x; k in {"FOO","BAR","BAZ"}; entry := {"k": k} } | 1 entry | 3 entries | iteration + entry construction -> bug |
| 3 | Two bodies, each some k, _ in input.x; k in <subset>; entry := {...} | 2 entries | 3 entries | multi-body iteration -> bug, off-by-N |
| 4 | Three static bodies: violations[k] if { k := "FOO" } × 3 | {BAR, BAZ, FOO} (3) | {BAR, BAZ, FOO} (3) | partial set rules with non-iterating bodies are fine |
| 5 | Comprehension: violations := {k \| some k, _ in input.x} | {BAR, BAZ, FOO} (3) | {BAR, BAZ, FOO} (3) | comprehensions are fine |

The bug is confined to partial set rules whose body produces multiple bindings via iteration. The element actually returned is the first one in iteration order; for object input that is the first key in sorted order, because object keys come back sorted.

The same shape with array input (input.x = ["FOO", "BAR", "BAZ"] and some _, v in input.x) reproduces the bug: regorus returns {FOO} only.

The single-body and multi-body variants both reproduce on regorus's own CLI binary built from the 0.9.1 crate, ruling out anything specific to library calling conventions.

Reproducer files

# repro.rego — variant #1, ultra-minimal
package repro

import future.keywords.if
import future.keywords.in

violations[k] if {
    some k, _ in input.x
}

input.json:

{ "x": { "FOO": 1, "BAR": 2, "BAZ": 3 } }

# install regorus CLI
cargo install --version 0.9.1 --example regorus regorus

# expected = three keys, observed = one key
regorus eval -d repro.rego -i input.json 'data.repro.violations'

# OPA reference
opa eval -d repro.rego -i input.json 'data.repro.violations'

Workaround

Rewrite affected partial set rules to set comprehensions:

# instead of:
violations[k] if { some k, _ in input.x; k in covered }

# write:
violations := {k | some k, _ in input.x; k in covered}

Comprehensions in regorus 0.9.1 produce the correct full set.

This workaround is not always tractable when the goal is to combine multiple distinct rule bodies into one set (e.g., one body for atomic identifiers and another for grouped identifiers, where each body has its own predicate logic). In that case the comprehension form requires expressing both bodies as branches of one comprehension's filter, which is readable for two branches but awkward for more.

Why this is not caught by the test suite of typical Rego policies

Many policy test suites assert only count(violations) > 0 or not allow, both boolean checks, rather than exact entry counts. Both pass even when regorus returns one element instead of N. We discovered the bug while integrating the matrix-org/rust-opa-wasm runtime alongside regorus: opa-wasm and OPA agreed on entry counts, and regorus disagreed. The count mismatch was the first signal.
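To make the gap concrete, the sketch below contrasts a non-emptiness check with an exact set comparison. It is plain Rust with hypothetical helper names (`weak_check`, `missing_elements`) and hard-coded sets standing in for engine output; it does not call regorus or OPA.

```rust
use std::collections::BTreeSet;

/// Mirrors an assertion like `count(violations) > 0`:
/// passes for any non-empty set, however many elements were dropped.
fn weak_check(actual: &BTreeSet<&str>) -> bool {
    !actual.is_empty()
}

/// Exact comparison: returns the expected elements absent from `actual`,
/// in ascending order. An empty result means nothing is missing.
fn missing_elements<'a>(
    actual: &BTreeSet<&'a str>,
    expected: &BTreeSet<&'a str>,
) -> Vec<&'a str> {
    expected.difference(actual).copied().collect()
}

fn main() {
    // Stand-ins for the reference output and the truncated output.
    let expected = BTreeSet::from(["BAR", "BAZ", "FOO"]);
    let truncated = BTreeSet::from(["BAR"]);

    println!("weak check passes: {}", weak_check(&truncated));
    println!("missing: {:?}", missing_elements(&truncated, &expected));
}
```

The weak check accepts the truncated set, while the exact comparison reports BAZ and FOO as missing. An equivalent exact-count or exact-set assertion in a policy's own tests would surface the discrepancy the same way.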

Environment

$ rustc --version
rustc 1.95.0 (59807616e 2026-04-14)

$ regorus --version
regorus 0.9.1     # cargo install --version 0.9.1 --example regorus regorus

$ opa version
Version: 1.16.1

Cargo dependency line that reproduces:

regorus = { version = "0.9", default-features = false, features = ["full-opa", "arc"] }

Severity

For policies built on the s[x] if { some k, _ in input.y; ... } pattern (a common shape for "for each entity in input, emit a corresponding entry" rules), regorus silently emits one entry instead of N. Downstream evaluations (count(violations), iteration over violations, etc.) produce incorrect results without any error or warning.
