catalog-backend: initial catalog ingestion rules implementation #2118

Rugvip · 2020-08-25T16:36:11Z

This implements the minimal config needed for #1853, and there's plenty of space for expanding the set of rules that can be defined. Keeping it simple for now though.

This adds support for following kind of config:

catalog:
  rules:
  - deny: [User, Group, System]

  locations:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
    allow: [User, Group]
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml
    allow: [System]

... and not much more 😁

The catalog.rules config can only contain deny rules, and catalog.locations can only contain allow rules. The rules defined in catalog.locations are applied after the ones defined in catalog.rules.

If catalog.rules is not set in config it falls back to deny: [User, Group].

Hoping that this is good enough for the foreseeable future, and will allow us to safely bring in organizational and identity data into the catalog 😁

Fox32 · 2020-08-26T07:30:11Z

Is there any kind of "deny all" planned? Then one could allow only the ones that are required, but don't has to be afraid that when new entities are added and this configuration was not updated. Kind of a allowlist vs blocklist discussion. Maybe a special semantic like:

catalog:
  rules:
  - deny: all

  locations:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
    allow: [User, Group]

Rugvip · 2020-08-26T07:39:00Z

@Fox32 yep, implementation is set up for that, just didn't hook it up to config yet. Wanted to keep things slim to begin with

freben · 2020-08-26T11:48:39Z

Should rules and locations be separated? The example could just as well be expressed as

catalog
  rules:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
    allow: [User, Group]  # alternatively: deny: [] which means "allow anything"
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml
    allow: [System]
  - deny: [User, Group, System] # implicitly: applies to any type and target

then the engine goes through the rules one by one in search of a match, picking the first one.

or even clearer if one wants to separate the filters from the action better:

catalog
  rules:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
    kind: [User, Group]
    action: accept
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml
    kind: [System]
    action: accept
  - kind: [User, Group, System] # implicitly: applies to any type and target
    action: reject
  - action: accept

freben · 2020-08-26T11:53:20Z

plugins/catalog-backend/src/ingestion/CatalogRules.test.ts

+};
+
+describe('CatalogRulesEnforcer', () => {
+  it('should allow by default', () => {


I think it would be better to either reject by default, or make it a hard error to not have a fallback rule at the bottom of the list of rules (i.e. to not find any matching rule).

freben · 2020-08-26T12:02:17Z

plugins/catalog-backend/src/ingestion/CatalogRules.ts

+ *
+ * An undefined list of matchers means match all, an empty list of matchers means match none
+ */
+type CatalogRule = {


i was thinking more along the lines of

type CatalogRule = { matcher: (entity: Entity) => boolean; action: 'allow' | 'reject'; };

since that's all that the rules engine needs to know. The constructor could take a defaultAction: 'allow' | 'reject' too I guess, but as mentioned elsewhere I would consider it an error if no matching rule was found (log harshly first time and reject, or even just explode?)

Opted for not going with this to be able to provide more informative error messages. Either way it's an internal implementation and we can switch it out later without much trouble if we need to.

Rugvip · 2020-08-27T06:21:26Z

@freben Well the example would be expressed like this, unless you're suggesting the the rules spawn actual locations to traverse?

catalog:
  locations:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml

  rules:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
    allow: [User, Group]  # alternatively: deny: [] which means "allow anything"
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml
    allow: [System]
  - deny: [User, Group, System] # implicitly: applies to any type and target

Seems really verbose. I think it's fine to have the short-cut of being able to specify some rules in the location config?

I think we could change it to this though, a bit more explicit and provides some better grouping of the fields:

catalog:
  rules:
  - deny: [User, Group, System]

  locations:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
     rules:
      allow: [User, Group]
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml
    rules:
      allow: [System]

Regarding the application of the rules, i pretty arbitrarily chose to go with having the last matching rule be the determining one. Stopping on the first matching rule does make the processing a tiny bit faster, and it might end up being a bit more intuitive. Gonna try switching to that.

freben · 2020-08-27T06:54:17Z

Oh right, I somehow entirely forgot about the static locations thing, just looked at it as a standalone rules config - yes, agreed, the locations should have rules directly on them and then the fallback rules could be their own key. Maybe name them defaultRules even, signifying how it's used? The last example makes sense, and yes, first match would be the best probably.

Possible remaining gotcha for a rules writer is that you couldn't (?) write a default rule saying allow: [A] and then a locations entry that has allow: [B], and expect that it would allow both A and B. You'd have to repeat the A, right? Which is OK to me, just thinking out loud.

freben · 2020-08-27T06:57:03Z

But is there any benefit to breaking out matchers as a separate thing from action?

catalog:
  rules:
  - kind: [User, Group, System]
    action: deny

  locations:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
    rules:
    - kind: [User, Group]
      action: allow
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml
    rules:
    - kind: [System]
      action: allow

That way you open up to more kinds (hah!) of matchers later on.

And also it would make it possible to say just - action: deny with no matchers to satisfy what @Fox32 mentioned.

Rugvip · 2020-08-31T10:39:37Z

Updated to deny all entities that aren't covered by a rule, and added a default rule to allow: [Component, API].

Tried out a couple of models for a more generic match + action, but tbh I couldn't find anything that I'm happy with that felt forwards compatible enough. The current config format is now a simple allow: EntityMatchers[], so there's no location configuration and so on. That should make it possible for us to add more rules and matchers without too much trouble.

The main issue with making the entire thing generic is that we need more use-cases for catalog rules. Right now I have no idea if we always want to match on location + entity tuples, it could for example be that we want to add rules for relationships, annotations on entities, locations themselves, etc. Basically the extension of the API needs to be driven by some actual use-cases, and I think the current format is a good middle ground between forwards compatibility and a compact easy to read and write format.

freben · 2020-08-31T11:09:57Z

docs/features/software-catalog/configuration.md

+    - allow: [Component, API]
+    - allow: [System]
+      locations:
+        type: github


this is perhaps slightly confusing, looking at rules.locations vs location with implicit rule :) am tempted to add a target here

still humbly wondering if this shouldn't have been

rules: - kind: [Component, API] action: allow - kind: [System] locationType: github action: allow

or turned a bit inside out:

rules: - allow: kind: [Component, API] - allow: kind: [System] locationType: github

both for clarity and extensibility, but not gonna be insisting on it

or to be able to reuse the same structure in two places:

rules: - action: allow match: kind: ... locations: - type: ... target: ... match: kind: ...

Ah this totally shouldn't have been here at all, it's not implemented :p

I was thinking we might do something like

rules: allow: - entities: - kind: Component locations: - type: github

Basically the value of allow would be some kind of list of entity + location matchers. Tbh I'm not sure that's ever gonna be needed though, and hopefully we have some more use-cases to base that off of.

Realized I didn't address your second comment 😁

I tried that out, but was running into trouble with the matcher being specific to the action type, and that it also makes the assumption that we'd be matching on things at all.

The second example kinda solves the above problem, and is why I basically when with that. The current format is just a shorthand, and it should be simple enough to expand the configuration if needed. As mentioned in my previous comment I think we should wait with that though.

…s + test

Rugvip requested a review from a team as a code owner August 25, 2020 16:36

freben reviewed Aug 26, 2020

View reviewed changes

Rugvip added 4 commits August 31, 2020 12:30

catalog-backend: initial catalog ingestion rules implementation

6f3b0b8

catalog-backend: switch catalog rules to deny by default

8d07b54

catalog-backend: add tests for config rules

903cbdc

docs: added catalog configuration docs

f546bb3

Rugvip force-pushed the rugvip/rules branch from f0324f9 to f546bb3 Compare August 31, 2020 10:30

freben approved these changes Aug 31, 2020

View reviewed changes

catalog-backend: updated docs to not include location in catalog.rule…

74e55ef

…s + test

marcuseide merged commit 9086099 into master Aug 31, 2020

marcuseide deleted the rugvip/rules branch August 31, 2020 13:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

catalog-backend: initial catalog ingestion rules implementation #2118

catalog-backend: initial catalog ingestion rules implementation #2118

Rugvip commented Aug 25, 2020

Fox32 commented Aug 26, 2020

Rugvip commented Aug 26, 2020

freben commented Aug 26, 2020

freben Aug 26, 2020

freben Aug 26, 2020

Rugvip Aug 27, 2020

Rugvip commented Aug 27, 2020

freben commented Aug 27, 2020

freben commented Aug 27, 2020 •

edited

Loading

Rugvip commented Aug 31, 2020

freben Aug 31, 2020

freben Aug 31, 2020

freben Aug 31, 2020 •

edited

Loading

Rugvip Aug 31, 2020

Rugvip Aug 31, 2020

catalog-backend: initial catalog ingestion rules implementation #2118

catalog-backend: initial catalog ingestion rules implementation #2118

Conversation

Rugvip commented Aug 25, 2020

Fox32 commented Aug 26, 2020

Rugvip commented Aug 26, 2020

freben commented Aug 26, 2020

freben Aug 26, 2020

Choose a reason for hiding this comment

freben Aug 26, 2020

Choose a reason for hiding this comment

Rugvip Aug 27, 2020

Choose a reason for hiding this comment

Rugvip commented Aug 27, 2020

freben commented Aug 27, 2020

freben commented Aug 27, 2020 • edited Loading

Rugvip commented Aug 31, 2020

freben Aug 31, 2020

Choose a reason for hiding this comment

freben Aug 31, 2020

Choose a reason for hiding this comment

freben Aug 31, 2020 • edited Loading

Choose a reason for hiding this comment

Rugvip Aug 31, 2020

Choose a reason for hiding this comment

Rugvip Aug 31, 2020

Choose a reason for hiding this comment

freben commented Aug 27, 2020 •

edited

Loading

freben Aug 31, 2020 •

edited

Loading