[feature] Continuous, unattended Google Cloud Storage collection #1005

Open
migmartri opened this issue Jul 2, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@migmartri
Contributor

Hi,

In PR #989 we exposed the GCS collector via the guacone CLI, which means a user can collect SBOMs and other pieces of metadata from a GCS bucket on demand.

This issue is about being able to configure that process so that it runs periodically and unattended.

Describe the solution you'd like

I want to be able to configure Guac with tuples of bucket + credentials that the system can use to periodically fetch data from those data sources.
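
To make the request concrete, here is a minimal sketch of the behavior being asked for. It is not GUAC code: `BucketSource`, `Poller`, `list_objects`, and `ingest` are hypothetical names, and the listing and ingestion steps are passed in as callables instead of calling any real GCS or GUAC API.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class BucketSource:
    """One (bucket, credentials) tuple; both field names are hypothetical."""
    bucket: str
    credentials_path: str


@dataclass
class Poller:
    sources: List[BucketSource]
    # Stand-in for a real GCS listing call (assumption, not a GUAC API).
    list_objects: Callable[[BucketSource], Iterable[str]]
    # Stand-in for handing one document to GUAC (assumption, not a GUAC API).
    ingest: Callable[[BucketSource, str], None]
    # Remembers (bucket, object name) pairs that were already ingested.
    _seen: Set[Tuple[str, str]] = field(default_factory=set, init=False)

    def poll_once(self) -> int:
        """Ingest every object not seen in an earlier pass; return how many."""
        new = 0
        for source in self.sources:
            for name in self.list_objects(source):
                key = (source.bucket, name)
                if key in self._seen:
                    continue
                self.ingest(source, name)
                self._seen.add(key)
                new += 1
        return new

    def run(self, interval_seconds: float, max_passes: Optional[int] = None) -> None:
        """Poll repeatedly, sleeping between passes; loops forever if max_passes is None."""
        passes = 0
        while max_passes is None or passes < max_passes:
            self.poll_once()
            passes += 1
            if max_passes is None or passes < max_passes:
                time.sleep(interval_seconds)
```

With something of this shape, the unattended part would just be a long-running `run(interval_seconds=...)` call over the configured tuples. The in-memory `_seen` set is only for illustration; a real collector would need to persist what it has already processed.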

Describe alternatives you've considered

I've considered using guacone itself with a cron-like daemon, but I wanted to explore whether this could become a first-class feature, since some of the foundations seem to be there (oci+git datasources).

Additional context

Our goal is to let Chainloop users send SBOMs end to end automatically.

The first leg of the journey (CI -> GCS bucket) is fully automated, but the last leg (GCS -> Guac) requires manual intervention via guacone collect #989. It is this last leg that we want to automate too.

[Image: Untitled-2022-12-20-1126]

Note: this feature might already exist and I am just not able to figure out how to configure it.

Thanks!

Refs chainloop-dev/chainloop#209

@migmartri added the enhancement label Jul 2, 2023
@lumjjb
Contributor

lumjjb commented Jul 5, 2023

Ah yes - we have collectors that can run as daemons - which I believe should do exactly what you're asking for.

We already do this for files; would something like this work?
https://github.com/guacsec/guac/blob/main/cmd/guaccollect/cmd/files.go

$ bin/guaccollect files --help
take a folder of files and create a GUAC graph utilizing Nats pubsub

Usage:
  guaccollect files [flags] file_path

Flags:
  -h, --help   help for files

Global Flags:
      --csub-addr string   address to connect to collect-sub service (default "localhost:2782")
      --nats-addr string   address to connect to NATs Server (default "nats://127.0.0.1:4222")
      --service-poll       sets the collector or certifier to polling mode (default true)
      --use-csub           use collectsub server for datasource (default true)

The one caveat (for now) is a known issue with large document files, #731, which I will be working on in the coming weeks.

@pxp928
Collaborator

pxp928 commented Jul 5, 2023

+1 to @lumjjb. The GCS collector (like all the other collectors) is already set up to poll and fetch periodically.

3 participants