feat(integrations) guacsec/guac #209

Closed
migmartri opened this issue Jun 26, 2023 · 4 comments · Fixed by #211

Comments

@migmartri
Member

migmartri commented Jun 26, 2023

Look into what an initial integration with https://github.com/guacsec/guac would look like, including its feasibility and scope.

At first glance, one way to approach this integration would be for Chainloop to populate a network-reachable location (OCI, blob storage, ...) that can then be continuously scanned by a guac collector.

[Image: Untitled-2022-12-20-1126]

The data source provided by Chainloop will contain different pieces of evidence. For now, we could push SBOMs (CycloneDX, SPDX), in-toto statements, and DSSE envelopes.

@migmartri
Member Author

migmartri commented Jun 26, 2023

I've been taking a look at the list of possible collectors, and this is my (probably non-exhaustive) reading:

OCI collector

It seems to focus on a single OCI image, for which it extracts attached files with .att and .sbom suffixes. There is an open issue to support registry-wide collection (guacsec/guac#298) that could potentially solve the discovery problem "in some registries".
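To make the suffix-based lookup concrete, here is a minimal Go sketch. It assumes the .att and .sbom suffixes follow the cosign tag naming convention, where attachments for an image digest live under tags derived from that digest. The helper name `attachmentTags` is hypothetical, and this is not guac's actual collector code.

```go
package main

import (
	"fmt"
	"strings"
)

// attachmentTags derives the tags under which cosign-style tooling
// conventionally stores attestations (.att) and SBOMs (.sbom) for an
// image digest such as "sha256:0123abcd".
// Hypothetical helper for illustration, not part of guac.
func attachmentTags(digest string) (att, sbom string, err error) {
	algo, hex, ok := strings.Cut(digest, ":")
	if !ok || algo == "" || hex == "" {
		return "", "", fmt.Errorf("invalid digest %q", digest)
	}
	// The ":" separator is not valid in a tag, so it is replaced with "-".
	base := algo + "-" + hex
	return base + ".att", base + ".sbom", nil
}

func main() {
	att, sbom, err := attachmentTags("sha256:0123abcd")
	if err != nil {
		panic(err)
	}
	fmt.Println(att)
	fmt.Println(sbom)
}
```

Because the tags are derived from a known digest, this scheme works per image but does not by itself tell a collector which images exist in a registry, which is the discovery problem mentioned above.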

Google Cloud Storage (GCS)

There seems to be an implementation of blob storage on Google Cloud, which could work with Chainloop if we push artifacts to a bucket. The issue with this collector, though, is that it does not seem to be linked or used from anywhere, so I am not sure whether it's functional or deprecated.

From my quick scan of the repo source code, it looks like guac currently expects input from the following data sources: OCI, git repositories, GitHub releases, and purl URIs. Each of these seems to represent a single data item that gets scheduled dynamically by the collector subscriber service.

type DataSources struct {
	OciDataSources []Source
	GitDataSources []Source
	GithubReleaseDataSources []Source
	// PurlDataSources encodes the list of PURLs
	PurlDataSources []Source
}

https://github.com/guacsec/guac/blob/8282449c65a3d0a79016a3ce9fb8434916f2662b/pkg/collectsub/datasource/datasource.go#L37
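As a rough illustration of the "single data items" point, the following self-contained Go sketch redeclares the struct locally and fills it in. The `Source` type with a single `Value` field is an assumption (the snippet above does not show it), the `Total` method is added only for the example, and all values are made-up placeholders.

```go
package main

import "fmt"

// Source is a local stand-in for guac's type.
// The Value field is an assumption, not taken from the snippet above.
type Source struct {
	Value string
}

// DataSources mirrors the struct quoted from datasource.go.
type DataSources struct {
	OciDataSources           []Source
	GitDataSources           []Source
	GithubReleaseDataSources []Source
	// PurlDataSources encodes the list of PURLs
	PurlDataSources []Source
}

// Total counts the individual items across all source kinds.
// Added for this example only.
func (d DataSources) Total() int {
	return len(d.OciDataSources) + len(d.GitDataSources) +
		len(d.GithubReleaseDataSources) + len(d.PurlDataSources)
}

func main() {
	// Each entry names exactly one image, repository, or package.
	ds := DataSources{
		OciDataSources:  []Source{{Value: "registry.example.com/team/app:v1.0.0"}},
		GitDataSources:  []Source{{Value: "git+https://example.com/team/app"}},
		PurlDataSources: []Source{{Value: "pkg:golang/example.com/team/app@v1.0.0"}},
	}
	fmt.Println(ds.Total())
}
```

Note that none of the four fields describes a bucket or a whole registry, which is why a blob-storage location does not fit this struct as quoted.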

Workarounds

  • OCI collector: We could push single images and leave the discovery part for later.
  • File collector: Allow Chainloop to expose a tarball of sorts that the user can untar and process, i.e. ./bin/guacone collect files ...

Next steps

At this point, we should probably engage with the guac community to get their take on the best way forward. I might be missing some important information on how to leverage blob storage or similar, since their beta 0.1 announcement explicitly mentions S3 and Google Cloud buckets.

@migmartri changed the title from "initial guac integration" to "feat(integrations) guacsec/guac" on Jun 26, 2023
@migmartri
Member Author

@lumjjb recently filed a very interesting PR, guacsec/guac#970, and I am wondering whether it has some overlap with our use case.

@lumjjb

lumjjb commented Jun 26, 2023

Hi @migmartri thanks for the tag here!

The architecture/design looks great and that's exactly how we'd hope to integrate!

I would say that the GCS bucket or the git collector is probably the better bet here. The restrictions of OCI make it a little harder to perform discovery and track updates, hence the current focus only on .att and .sbom.

We do not currently run the GCS collector as part of the compose, but it is an interface that we will actively maintain. We do not currently have active users for it, so if any issues come up, do file an issue (and/or a PR)!

On a side note, guacsec/guac#970 is a somewhat different internal sub-system (it is more for internal caching and solving the NATS queue limitation issue), so I would not rely on that right now. If we figure out that an external integration requires a similar solution, then we'd consider maintaining that as a public interface.

@migmartri
Member Author

@lumjjb makes sense, thanks for the context! As a matter of fact, I've created an initial PR, guacsec/guac#989. Feedback is welcome!

Also, on the Chainloop side, here's a sneak peek at how it's going:

[Image: guac]
