Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

catalog-backend: initial catalog ingestion rules implementation #2118

Merged
merged 5 commits into from
Aug 31, 2020

Conversation

Rugvip
Copy link
Member

@Rugvip Rugvip commented Aug 25, 2020

Fixes #1853

This implements the minimal config needed for #1853, and there's plenty of space for expanding the set of rules that can be defined. Keeping it simple for now though.

This adds support for following kind of config:

catalog:
  rules:
  - deny: [User, Group, System]

  locations:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
    allow: [User, Group]
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml
    allow: [System]

... and not much more 😁

The catalog.rules config can only contain deny rules, and catalog.locations can only contain allow rules. The rules defined in catalog.locations are applied after the ones defined in catalog.rules.

If catalog.rules is not set in config it falls back to deny: [User, Group].

Hoping that this is good enough for the foreseeable future, and will allow us to safely bring in organizational and identity data into the catalog 😁

@Rugvip Rugvip requested a review from a team as a code owner August 25, 2020 16:36
@Fox32
Copy link
Contributor

Fox32 commented Aug 26, 2020

Is there any kind of "deny all" planned? Then one could allow only the ones that are required, but don't has to be afraid that when new entities are added and this configuration was not updated. Kind of a allowlist vs blocklist discussion. Maybe a special semantic like:

catalog:
  rules:
  - deny: all

  locations:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
    allow: [User, Group]

@Rugvip
Copy link
Member Author

Rugvip commented Aug 26, 2020

@Fox32 yep, implementation is set up for that, just didn't hook it up to config yet. Wanted to keep things slim to begin with

@freben
Copy link
Member

freben commented Aug 26, 2020

Should rules and locations be separated? The example could just as well be expressed as

catalog
  rules:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
    allow: [User, Group]  # alternatively: deny: [] which means "allow anything"
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml
    allow: [System]
  - deny: [User, Group, System] # implicitly: applies to any type and target

then the engine goes through the rules one by one in search of a match, picking the first one.

or even clearer if one wants to separate the filters from the action better:

catalog
  rules:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
    kind: [User, Group]
    action: accept
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml
    kind: [System]
    action: accept
  - kind: [User, Group, System] # implicitly: applies to any type and target
    action: reject
  - action: accept

};

describe('CatalogRulesEnforcer', () => {
it('should allow by default', () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to either reject by default, or make it a hard error to not have a fallback rule at the bottom of the list of rules (i.e. to not find any matching rule).

*
* An undefined list of matchers means match all, an empty list of matchers means match none
*/
type CatalogRule = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i was thinking more along the lines of

type CatalogRule = {
  matcher: (entity: Entity) => boolean;
  action: 'allow' | 'reject';
};

since that's all that the rules engine needs to know. The constructor could take a defaultAction: 'allow' | 'reject' too I guess, but as mentioned elsewhere I would consider it an error if no matching rule was found (log harshly first time and reject, or even just explode?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opted for not going with this to be able to provide more informative error messages. Either way it's an internal implementation and we can switch it out later without much trouble if we need to.

@Rugvip
Copy link
Member Author

Rugvip commented Aug 27, 2020

@freben Well the example would be expressed like this, unless you're suggesting the the rules spawn actual locations to traverse?

catalog:
  locations:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml

  rules:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
    allow: [User, Group]  # alternatively: deny: [] which means "allow anything"
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml
    allow: [System]
  - deny: [User, Group, System] # implicitly: applies to any type and target

Seems really verbose. I think it's fine to have the short-cut of being able to specify some rules in the location config?

I think we could change it to this though, a bit more explicit and provides some better grouping of the fields:

catalog:
  rules:
  - deny: [User, Group, System]

  locations:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
     rules:
      allow: [User, Group]
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml
    rules:
      allow: [System]

Regarding the application of the rules, i pretty arbitrarily chose to go with having the last matching rule be the determining one. Stopping on the first matching rule does make the processing a tiny bit faster, and it might end up being a bit more intuitive. Gonna try switching to that.

@freben
Copy link
Member

freben commented Aug 27, 2020

Oh right, I somehow entirely forgot about the static locations thing, just looked at it as a standalone rules config - yes, agreed, the locations should have rules directly on them and then the fallback rules could be their own key. Maybe name them defaultRules even, signifying how it's used? The last example makes sense, and yes, first match would be the best probably.

Possible remaining gotcha for a rules writer is that you couldn't (?) write a default rule saying allow: [A] and then a locations entry that has allow: [B], and expect that it would allow both A and B. You'd have to repeat the A, right? Which is OK to me, just thinking out loud.

@freben
Copy link
Member

freben commented Aug 27, 2020

But is there any benefit to breaking out matchers as a separate thing from action?

catalog:
  rules:
  - kind: [User, Group, System]
    action: deny

  locations:
  - type: github
    target: https://github.com/org/repo/blob/master/users.yaml
    rules:
    - kind: [User, Group]
      action: allow
  - type: github
    target: https://github.com/org/repo/blob/master/systems.yaml
    rules:
    - kind: [System]
      action: allow

That way you open up to more kinds (hah!) of matchers later on.

And also it would make it possible to say just - action: deny with no matchers to satisfy what @Fox32 mentioned.

@Rugvip
Copy link
Member Author

Rugvip commented Aug 31, 2020

Updated to deny all entities that aren't covered by a rule, and added a default rule to allow: [Component, API].

Tried out a couple of models for a more generic match + action, but tbh I couldn't find anything that I'm happy with that felt forwards compatible enough. The current config format is now a simple allow: EntityMatchers[], so there's no location configuration and so on. That should make it possible for us to add more rules and matchers without too much trouble.

The main issue with making the entire thing generic is that we need more use-cases for catalog rules. Right now I have no idea if we always want to match on location + entity tuples, it could for example be that we want to add rules for relationships, annotations on entities, locations themselves, etc. Basically the extension of the API needs to be driven by some actual use-cases, and I think the current format is a good middle ground between forwards compatibility and a compact easy to read and write format.

- allow: [Component, API]
- allow: [System]
locations:
type: github
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is perhaps slightly confusing, looking at rules.locations vs location with implicit rule :) am tempted to add a target here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still humbly wondering if this shouldn't have been

rules:
  - kind: [Component, API]
    action: allow
  - kind: [System] 
    locationType: github
    action: allow

or turned a bit inside out:

rules:
  - allow:
      kind: [Component, API]
  - allow:
      kind: [System] 
      locationType: github

both for clarity and extensibility, but not gonna be insisting on it

Copy link
Member

@freben freben Aug 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or to be able to reuse the same structure in two places:

rules:
  - action: allow
    match:
      kind: ...
locations:
  - type: ...
    target: ...
    match:
      kind: ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah this totally shouldn't have been here at all, it's not implemented :p

I was thinking we might do something like

rules:
  allow:
  - entities:
    - kind: Component
    locations:
    - type: github

Basically the value of allow would be some kind of list of entity + location matchers. Tbh I'm not sure that's ever gonna be needed though, and hopefully we have some more use-cases to base that off of.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realized I didn't address your second comment 😁

I tried that out, but was running into trouble with the matcher being specific to the action type, and that it also makes the assumption that we'd be matching on things at all.

The second example kinda solves the above problem, and is why I basically when with that. The current format is just a shorthand, and it should be simple enough to expand the configuration if needed. As mentioned in my previous comment I think we should wait with that though.

@marcuseide marcuseide merged commit 9086099 into master Aug 31, 2020
@marcuseide marcuseide deleted the rugvip/rules branch August 31, 2020 13:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Catalog: support mapping of allowed entity kinds per location type
4 participants