-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Aggregates Collection and Usage #323
Merged
anderseknert
merged 5 commits into
StyraInc:main
from
atlassian-forks:issue-topic-282-broader-scope-for-custom-rules
Sep 23, 2023
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
9c581df
Implement aggregates feature: https://github.com/StyraInc/regal/issue…
sesponda ac297c8
Fix: field is used by `evalAndCollect` so it should not be ignored.
sesponda 611470b
Merge branch 'main' into issue-topic-282-broader-scope-for-custom-rules
sesponda 668075d
Merge remote-tracking branch 'origin/main' into issue-topic-282-broad…
sesponda ea0d793
Fix code style.
sesponda File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
package mypolicy1.public | ||
|
||
my_policy_1 := true |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
package mypolicy2.public | ||
|
||
export := [] |
20 changes: 20 additions & 0 deletions
20
e2e/testdata/aggregates/rules/custom_rules_using_aggregates.rego
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# METADATA | ||
# description: Collect data in aggregates and validate it | ||
package custom.regal.rules.testcase["aggregates"] | ||
|
||
import future.keywords | ||
import data.regal.result | ||
|
||
aggregate contains entry if { | ||
entry := { "file" : input.regal.file.name } | ||
} | ||
|
||
report contains violation if { | ||
not two_files_processed | ||
violation := result.fail(rego.metadata.chain(), {}) | ||
} | ||
|
||
two_files_processed { | ||
files := [x | x = input.aggregate[_].file] | ||
count(files) == 2 | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why this happens, but it's a bit of a mind bend to me having test fail by not running aggregate computation. I'd expect
report
rules that need aggregates to just fail out as undefined, likely by referencinginput.aggregate
which would be undefined if no aggregate rules had run?The way it's done now, there's no way fo a report rule to know if aggregates have been collected or not, or is there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is.
input.aggregate
is always defined, which allows devs to check if any aggregate is collected simply viacount(input.aggregate[_]) > 0
without worrying if it's defined or not. This also allows a kind of typed check, e.g. "check that there are N aggregates of specific type", for example:The test code linked to this comment sends a single file to the linter, and this is why the violation is reported. Again, a business rule saying "fail if we don't have exactly two Rego files" doesn't make a lot of sense, I've used this to reach complete coverage (i.e. check that violations are reported).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but that'll only tell you if any of the aggregate rules has something to report — it's not going to tell you that the aggregate rules were not evaluated at all, which is why I'd prefer it if
input.aggregate
was undefined in that case.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not fully follow this. Maybe some code example? I don't know if by "aggregate rules" you mean rules collecting aggregates (i.e.
aggregate contains entry if
) or the rules evaluating aggregates (i.e.report contains violation if { .... input.aggregate ... }
The rationale of merging
not input.aggregates
andcount(input.aggregates[_]) == 0
into the latter was explained in the preliminary PR (in summary: simplify the how one can check aggregates), but that's just my personal preference. If this is a fundamental problem blocking a merge, I don't mind tweaking the PR as needed.What do you have in mind? Would you like to default on having it
undefined
(if the array is zero), or something different?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I should have elaborated more on the why, but I was short on time :)
Consider a rule like this: In package
foo
(which could be distributed across multiple files) I need to ensure that there is exactly one rule namedbar
. Some pseudo-code below, but bear with me:If
input.aggregate
is always an array, this rule will fail even when aggregate collection hasn't happened, like when evaluating only a single file, or pehaps has been disabled by other means. In this case,count(input.aggregate)
will give me0
, which is of course not equal to 1. Ifinput.aggregate
is undefined, evluation will stop at that line, and nothing will be reported.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😮 This discussion has surfaced a bug: your example above makes sense and (regardless of the original point of the discussion), it seems valid to me to write an aggregate that I want to collect always. The whole point of aggregates was to write business rules I can enforce without concerning how I package the code in multiple
.rego
files. In the current version, a refactor merging all files into one would suddenly cause aggregates not being collected, and thus violations could be missed, violating the principle of least surprise. I think we should:I prefer #2. If you agree I'll make the change. If you prefer #1, no worries (however, I can't commit to making those extra changes soon in follow up PRs due to time constraints).
Now, coming back to the original point, i.e. should be undefined or not:
I think that if we encourage writing rules that attempt to find data more specifically, for example:
... the logic will work OK for both cases (empty array or undefined).
While I'm still (slightly) inclined to keep it as an empty array, I'm not strong on this one (and I'm here to help the project contributing code and following your lead). I'd be happy to change things. Let me know what changes you'd like and I'll make then soon this week.
Thanks 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of aggregate rules is to provide a way to lint Rego at the level of a project, which is something we can't currently do. If your project consists of a single file, there's really no point in using aggregate rules, as there's nothing to aggregate. That case is covered by regular rules. Indeed, this distinction will need to be well documented, and I'd be happy to flesh that out in follow-up PRs.
See the Regal integration that went out in the Rego Playground yesterday for an example of single file linting. This will be increasingly common as we start to look into editor integrations later.
Yes, a refactor merging all files into one would render aggregate rules moot. That's not surprising to me, although I'd be very surprised if someone did that and called it refactoring :)
Agreed! Even better than encouraging, I think, is a helper function, similar to how we provide
result.fail
. I suggested this previously, but much have been said since then, so that it got lost along the way is understandable :)Rather than:
We'd provide a
result.aggregate
helper:Since we pass
rego.metadata.chain()
, this helper would be able to find the name of the rule from the annotation, and group the aggregates accordingly, i.e.input.aggregate["my-rule"]
. It would also addinput.regal.file.name
to each aggregate, and anything else we could think might be useful. The second argument would be for the rule author to populate with whatever is useful in the context of their rule.So yeah 1. would be me preferred choice here:
And I'd be happy to help fill in some of the blanks in follow-up fixes, like docs, the above mentioned helper function, (optionally) display warnings if aggregate rules are skipped, and so on. Just let me know how you want to proceed 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, we have two changes:
result.aggregate(...)
IMHO, we could leave 2 for later. Initially, Rego authors can put anything they want in the entry. Later, the project can release the helper function to make it easier to auto-insert the rule name, etc.
I'm happy to help with the minimum change you'd accept a merge, leaving any optionals/wish-to-have for a post-merge. If you think we cannot merge this without #2, no worries... I'll commit to add it (not soon because I have too much on my plate recently, but I still want to end what I started).
The reason I'm leaning towards speed over coding optionals is that the PR has been open for a a while and I have now non-trivial conflicts to merge. I'll work on them while waiting for your answer.