-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filter scans by GitHub repository visibility #1777
base: main
Are you sure you want to change the base?
Filter scans by GitHub repository visibility #1777
Conversation
Thanks for opening this PR @0x736E, would you mind fixing the merge conflict? We'll get a review on this soon. |
Overall, this PR LGTM and I think this is a useful feature. @ahrav @0x1 @mcastorina would any of you care to give this a review too? |
Merged with main and removed whitespace |
@@ -87,6 +87,7 @@ var ( | |||
githubScanIssueComments = githubScan.Flag("issue-comments", "Include issue comments in scan.").Bool() | |||
githubScanPRComments = githubScan.Flag("pr-comments", "Include pull request comments in scan.").Bool() | |||
githubScanGistComments = githubScan.Flag("gist-comments", "Include gist comments in scan.").Bool() | |||
githubVisibility = githubScan.Flag("visibility", `Repositories to include based on their visibility state. Example: "public", "private", "shared", "unknown"`).String() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: If the only valid values are the ones listed in the help text, we could use Enum() instead of String()
.
githubVisibility = githubScan.Flag("visibility", `Repositories to include based on their visibility state. Example: "public", "private", "shared", "unknown"`).String() | |
githubVisibility = githubScan.Flag("visibility", `Repositories to include based on their visibility state. Example: "public", "private", "shared", "unknown"`).Enum("public", "private", "shared", "unknown") |
// AND the visibility of this repo is not in the allowed list | ||
if len(s.scanOptions.Visibility) > 0 { | ||
vis := s.visibilityOf(ctx, repoURL) | ||
if !slices.Contains(s.scanOptions.Visibility, vis) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Converting the slice to a set will have better performance as the number of repos scanned increases.
visibilitySet := make(map[source_metadatapb.Visibility]struct{}, len(s.scanOptions.Visibility))
for _, vis := range s.scanOptions.Visibility {
visibilitySet[vis] = struct{}{}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where using a PR as an excuse to learn a new language meets with reality. Thanks for the suggestion, I'll look into it and make the changes.
if len(s.scanOptions.Visibility) > 0 { | ||
vis := s.visibilityOf(ctx, repoURL) | ||
if !slices.Contains(s.scanOptions.Visibility, vis) { | ||
s.log.V(2).Info("Skipping repository due to visibility requirements " + repoURL + " (" + vis.String() + ")") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit (optional): Logs are more useful when the message is static
s.log.V(2).Info("Skipping repository due to visibility requirements " + repoURL + " (" + vis.String() + ")") | |
s.log.V(2).Info("Skipping repository due to visibility requirements", "repo", repoURL, "visibility", vis.String()) |
func NewScanOptions(options ...ScanOption) *ScanOptions { | ||
scanOptions := &ScanOptions{ | ||
GitScanOptions: nil, | ||
Visibility: []source_metadatapb.Visibility{}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit (optional): nil
is a valid and safe default slice value.
func getValidVisibilityScanOption(optVis string) source_metadatapb.Visibility { | ||
optVis = strings.TrimSpace(optVis) | ||
switch optVis { | ||
case "public", "private", "shared": | ||
return source_metadatapb.Visibility(source_metadatapb.Visibility_value[optVis]) | ||
|
||
default: | ||
return source_metadatapb.Visibility_invalid | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why allow unknown
in the flag if it's going to result in an invalid visibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a very good point, it shouldn't.
Unknown is more or less the result of an error in determining the visibility of a repo, for this use case it seems unwise to allow users to target only repos which cause errors.
@@ -159,6 +159,8 @@ enum Visibility { | |||
private = 1; | |||
shared = 2; | |||
unknown = 3; | |||
|
|||
_invalid = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is something we should add to the proto. Instead, the validation should be performed by the CLI parsing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't particularly happy with this in the first instance, I will review it.
Overall this looks like a great addition! I'd prefer to leave the proto untouched though, which is why I requested changes. |
I've reviewed your suggestions, I agree with them, so I'll make the appropriate changes and update this PR when ready. |
Description:
Implementation of Filter scans to only public repositories #1483.
Adds
--visibility
commandline argument which enables filtering scanning based on GitHub repository visibility status by providing a comma separated list. Acceptable visibility configuration includes "public", "private" and "shared" . For example, to only scan by repositories which are either private or shared:--visibility="private,shared"
Default behaviour is not changed as no filtering is applied in the absense of user defined visibility filter configuration.
Internally, a new ScanOptions object is created to represent GitHub specific scan options.
No new tests were created because there appears to be an absence of unit tests for credential verification, and the manner in which this feature works requires checking the user's credentials against a repository.
Checklist:
make test-community
)?make lint
this requires golangci-lint)?