Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update S3 collector to support collecting from a directory within the bucket #1871

Merged
merged 4 commits into from
May 7, 2024

Conversation

huggingpixels
Copy link
Contributor

Description of the PR

  • Implemented passing prefix argument to s3 bucket SDK function listing files to download. It is the way to find s3 file objects in directory objects.
  • The CLI argument is called path to match other collectors also I find it easier to read and understand the usage. But in the code parts working directly with AWS CDK named variables as prefix to stay consistent with AWS naming.
  • The list files function lists all s3 objects, resulting in error when the download function attempts to download directory objects as files. Added filter to skip over directory objects.

"Fixes #1795"

PR Checklist

  • All commits have a Developer Certificate of Origin (DCO) -- they are generated using -s flag to git commit.
  • All new changes are covered by tests
  • If GraphQL schema is changed, make generate has been run
  • If OpenAPI spec is changed, make generate has been run
  • If collectsub protobuf has been changed, make proto has been run
  • All CI checks are passing (tests and formatting)
  • All dependent PRs have already been merged

Copy link
Contributor

@dejanb dejanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@pxp928 pxp928 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank You!

cmd/guaccollect/cmd/s3.go Show resolved Hide resolved
Copy link
Collaborator

@mihaimaruseac mihaimaruseac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

cmd/guaccollect/cmd/s3.go Show resolved Hide resolved
@pxp928
Copy link
Collaborator

pxp928 commented May 7, 2024

@huggingpixels a rebase will auto-merge this PR! Thank you!

@huggingpixels huggingpixels force-pushed the s3-collect-from-folder branch 2 times, most recently from 66c1aa5 to 36c7195 Compare May 7, 2024 12:44
Signed-off-by: Eszter Szucs-Matyas <eszter.szucs.matyas@gmail.com>
Signed-off-by: Eszter Szucs-Matyas <eszter.szucs.matyas@gmail.com>
Signed-off-by: Eszter Szucs-Matyas <eszter.szucs.matyas@gmail.com>
Signed-off-by: Eszter Szucs-Matyas <eszter.szucs.matyas@gmail.com>
@kodiakhq kodiakhq bot merged commit 73108d1 into guacsec:main May 7, 2024
8 checks passed
arorasoham9 pushed a commit to arorasoham9/guac that referenced this pull request May 17, 2024
… bucket (guacsec#1871)

* Use prefix parameter to narrow list of s3 bucket files

Signed-off-by: Eszter Szucs-Matyas <eszter.szucs.matyas@gmail.com>

* Do not try handle s3 folders as files

Signed-off-by: Eszter Szucs-Matyas <eszter.szucs.matyas@gmail.com>

* Clean up variable naming and docs

Signed-off-by: Eszter Szucs-Matyas <eszter.szucs.matyas@gmail.com>

* add s3 path collect example command

Signed-off-by: Eszter Szucs-Matyas <eszter.szucs.matyas@gmail.com>

---------

Signed-off-by: Eszter Szucs-Matyas <eszter.szucs.matyas@gmail.com>
Signed-off-by: Soham Arora <arorasoham9@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[feature] update S3 collector to support collecting from a directory within the bucket
4 participants