WIP: Providers as a fixed interface per type #725
@@ -7,41 +7,52 @@ context:
group: Root Group
provider: github
repository:
- context: github
- context: repo
what would the context mean in this case?
This is basically a mapping to a provider information call.
If we're going to have different provider implementations, we might want to have a format similar to this:
context:
  provider: GitHub
  object: repo
The provider is on line 8; I'd like the repository rules to be able to apply to multiple providers. This means that repo maps to the output of GetRepository from any provider.
(actually, I think I'd prefer that the provider not be part of the policy at all, or that it was a property-match situation when we have that in the future)
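To make the "repo maps to the output of GetRepository from any provider" idea concrete, here is a rough Go sketch. Only the GetRepository name and the repo context come from this thread; the Repository fields, the RepoProvider interface shape, staticProvider, and ResolveContext are assumptions made up for illustration, not the PR's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// Repository is a hypothetical provider-neutral shape for GetRepository output.
type Repository struct {
	FullName  string
	IsPrivate bool
}

// RepoProvider is an assumed interface; only the GetRepository name comes
// from the discussion above.
type RepoProvider interface {
	GetRepository(fullName string) (Repository, error)
}

// staticProvider is a stand-in for a real GitHub or GitLab client.
type staticProvider struct {
	repos map[string]Repository
}

func (s staticProvider) GetRepository(fullName string) (Repository, error) {
	repo, ok := s.repos[fullName]
	if !ok {
		return Repository{}, errors.New("repository not found: " + fullName)
	}
	return repo, nil
}

// ResolveContext maps a policy context name to a provider call.
// Only "repo" is wired up in this sketch.
func ResolveContext(p RepoProvider, context, fullName string) (Repository, error) {
	switch context {
	case "repo":
		return p.GetRepository(fullName)
	default:
		return Repository{}, fmt.Errorf("unknown context %q", context)
	}
}

func main() {
	// The same policy context is resolved against two different providers.
	providers := map[string]RepoProvider{
		"github": staticProvider{repos: map[string]Repository{"acme/api": {FullName: "acme/api", IsPrivate: true}}},
		"gitlab": staticProvider{repos: map[string]Repository{"acme/api": {FullName: "acme/api"}}},
	}
	for _, name := range []string{"github", "gitlab"} {
		repo, err := ResolveContext(providers[name], "repo", "acme/api")
		fmt.Println(name, repo.IsPrivate, err)
	}
}
```

The policy only names the context; which provider answers the call is decided outside the rule, which is what would let one repository rule apply to several providers.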
Overall I can see the benefits of this proposal. It moves a lot of complexity out of Mediator and into each provider implementation. I understand where this is going in terms of transforming rule types into RPC implementations and making policies apply to those instead. I also like that this would make providers easier to mock, and that it would move signaling away from Mediator and into the providers themselves. We could have an RPC in Mediator that a provider could call when a signal is received and data needs to be reconciled.
Sooner rather than later, we'll need to evaluate policies on git repository contents (e.g. checking specific files). I can see a way to do it in the current model by creating a new upstream retrieval type. How do you envision this working in this contractual model?
For systems that do not fully rely on GitHub (e.g. using an alternative CI system) we could start checking that specific steps exist in their pipelines. This requires some dynamism if we want to cover more CI systems that may not have a proper API contract in place (e.g. Buildkite).
That's a good question -- I could imagine a different type of rule than
Is Buildkite substantially different in that respect from GitHub Actions or CircleCI?
From previous experience, most of the time you'd want to check the contents of files. Comparisons would need to be done. Here's a use case: As an organization I want to check that my teams have adopted my approved SAST tool and have it enabled in their pipelines.
It isn't substantially different. I just wanted to give an example of a tool that we couldn't simply check with an API call as we could with GitHub.
One example might be "make sure that the But I was actually wondering about this in the context of the vulnerability scanning work. There, we'd like to check if a file that contains the list of dependencies (e.g. Would it be too crazy to have the policy say just "check this file pattern with this plugin" where a plugin might be e.g. a WASM plugin that the policy would point to?
I'm wondering whether we want Mediator to do this directly, or to ensure that a tool is present that does this, e.g. "ensure that dependabot OR renovate is set up", rather than "flag dependencies that need updates". Then our job is not to duplicate those tools, but to help guide people to getting them set up properly (the remediation would be to set up one of the tools, and we could eventually do that automatically / PR it into a repo).
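As a rough illustration of the "ensure that dependabot OR renovate is set up" check, here is a minimal Go sketch that looks for well-known config file paths in a repository file listing. The function and variable names are hypothetical, the path list is deliberately not exhaustive, and how Mediator would actually obtain the file listing is left open.

```go
package main

import (
	"fmt"
	"sort"
)

// updateToolConfigs lists some common config locations per tool.
// Assumption: this subset is enough for a sketch; both tools accept
// additional locations that are not listed here.
var updateToolConfigs = map[string][]string{
	"dependabot": {".github/dependabot.yml", ".github/dependabot.yaml"},
	"renovate":   {"renovate.json", ".github/renovate.json", ".renovaterc"},
}

// DetectUpdateTools returns the sorted names of tools whose config file
// appears in the given list of repository file paths.
func DetectUpdateTools(files []string) []string {
	present := make(map[string]bool, len(files))
	for _, f := range files {
		present[f] = true
	}
	var found []string
	for tool, paths := range updateToolConfigs {
		for _, p := range paths {
			if present[p] {
				found = append(found, tool)
				break
			}
		}
	}
	sort.Strings(found)
	return found
}

func main() {
	files := []string{"README.md", "go.mod", ".github/dependabot.yml"}
	tools := DetectUpdateTools(files)
	if len(tools) == 0 {
		// The remediation would be to set up one of the tools.
		fmt.Println("policy violation: set up dependabot or renovate")
		return
	}
	fmt.Println("dependency update tooling found:", tools)
}
```

The rule passes if either tool is configured, so the policy checks that tooling exists rather than duplicating what the tooling does.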
My bias is that "PR a file into a repo to remediate" is probably going to be tricky to align with a policy language unless we get esoteric. Being able to extract that into some external imperative code seems like it could be helpful, e.g. "to remediate, call X to do the remediation"... and then we need to work out the trust model for the credentials needed for the pull request. To start with, we could certainly just use a Stacklok bot.
Actually, thinking about this, it might be even trickier for build tools which work across both GitHub and GitLab. Presumably, we'd want the architecture to look like:
Which suggests that the Buildkite build environment would need to link in some way to the repository configuration. Even more fun is that you can have setups (like ourselves or
I think what I'm saying is that we'll discover a really beautiful generalized architecture about 3.5 years after we get a bunch of users, at which point we'll all wail and gnash our teeth, and proclaim that "we could do it so much better the next time".
Per discussion with @JAORMX earlier today, a sketch of what policy (examples/github), providers (pkg/providers/providers.go), and our interface with providers (providers.go again and proto/mediator/providers/providers.proto) would look like in a world where the provider contract was a fixed set of RPCs rather than a chained-JSON-API-fetch-rule world.
It might be possible to implement some of these interfaces via chained-JSON-API-fetch behind the interface, but the key point would be that we could extract RepoProvider, BuildProvider, etc. into gRPC services which we could call from Mediator. This would allow users to implement their own providers and contribute to Mediator without needing Stacklok to run every provider in-core (which reduces the amount of blocking review Stacklok needs to provide for e.g. "Apache Foundation's self-hosted git provider").
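For readers who want a feel for what a fixed interface per provider type could look like, here is a small Go sketch built around BuildProvider. Only the BuildProvider name comes from the description above; the ListPipelineSteps method, the mock, and RequireStep are hypothetical and do not reflect the contents of providers.go or providers.proto. In a gRPC setup the interface would presumably be generated from the proto service rather than written by hand.

```go
package main

import "fmt"

// BuildProvider is the fixed contract for build systems in this sketch.
// Assumption: the single method below is invented for illustration.
type BuildProvider interface {
	ListPipelineSteps(repo string) ([]string, error)
}

// mockBuildProvider returns canned data, which is what makes rules easy
// to test without calling a real CI system.
type mockBuildProvider struct {
	steps map[string][]string
}

func (m mockBuildProvider) ListPipelineSteps(repo string) ([]string, error) {
	steps, ok := m.steps[repo]
	if !ok {
		return nil, fmt.Errorf("no pipeline for %s", repo)
	}
	return steps, nil
}

// Compile-time check that the mock satisfies the contract.
var _ BuildProvider = mockBuildProvider{}

// RequireStep reports whether the repo's pipeline contains the named step,
// e.g. an organization's approved SAST scan.
func RequireStep(b BuildProvider, repo, step string) (bool, error) {
	steps, err := b.ListPipelineSteps(repo)
	if err != nil {
		return false, err
	}
	for _, s := range steps {
		if s == step {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	b := mockBuildProvider{steps: map[string][]string{
		"acme/api": {"lint", "sast-scan", "test"},
	}}
	ok, err := RequireStep(b, "acme/api", "sast-scan")
	fmt.Println(ok, err)
}
```

Because the rule depends only on the interface, a Buildkite, GitHub Actions, or community-maintained implementation could be swapped in without changing the rule itself.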