Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 fix Docker remediations for unpinned GHA dependencies #4131

Merged

Conversation

raboof
Copy link
Contributor

@raboof raboof commented May 29, 2024

What kind of change does this PR introduce?

Bug fix

What is the current behavior?

As both the check for unpinned dependencies in GitHub Actions and the check for unpinned Docker dependencies contribute to d.Dependencies, the loop that created remediations for Docker dependencies would also create try to create Docker remediations for the unpinned GitHub Actions dependencies.

This could get really slow, especially when scanning a repo with many GitHub Actions such as https://github.com/apache/beam.

What is the new behavior (if this is a feature change)?**

Only create Docker remediations for Docker dependencies (and similar for GitHub workflow remediations)

  • Tests for the changes have been added (for bug fixes/features)

Which issue(s) this PR fixes

NONE

Special notes for your reviewer

Does this PR introduce a user-facing change?

For user-facing changes, please add a concise, human-readable release note to
the release-note

(In particular, describe what changes users might need to make in their
application as a result of this pull request.)

Improve Pinned-Dependencies remediation creation

Previously, as both the check for unpinned dependencies in
GitHub Actions and the check for unpinned Docker dependencies
contribute to d.Dependencies, the loop that created remediations
for Docker dependencies would also create try to create Docker
remediations for the unpinned GitHub Actions dependencies.

This could get really slow, especially when scanning a repo
with many GitHub Actions such as https://github.com/apache/beam.

Signed-off-by: Arnout Engelen <arnout@bzzt.net>
@raboof raboof force-pushed the fix-pinned-dependency-spurious-remediations branch from 789b16e to fcdfdb7 Compare May 29, 2024 12:38
@raboof raboof temporarily deployed to integration-test May 29, 2024 12:39 — with GitHub Actions Inactive
@raboof raboof temporarily deployed to integration-test May 29, 2024 17:25 — with GitHub Actions Inactive
@raboof raboof force-pushed the fix-pinned-dependency-spurious-remediations branch from fdd03b2 to fa6b0df Compare May 29, 2024 17:29
@raboof raboof temporarily deployed to integration-test May 29, 2024 17:30 — with GitHub Actions Inactive
Signed-off-by: Arnout Engelen <arnout@bzzt.net>
@raboof raboof force-pushed the fix-pinned-dependency-spurious-remediations branch from fa6b0df to 0a2dcf0 Compare May 29, 2024 17:33
@raboof raboof temporarily deployed to integration-test May 29, 2024 17:33 — with GitHub Actions Inactive
@raboof raboof marked this pull request as ready for review May 29, 2024 17:41
@raboof raboof requested a review from a team as a code owner May 29, 2024 17:41
@raboof raboof requested review from naveensrinivasan and raghavkaul and removed request for a team May 29, 2024 17:41
Copy link
Contributor

@spencerschrock spencerschrock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the context of how the code works now, I think this change seems reasonable.

How did the speed-up look on your end? Is there still need for improvement, we can try to get rid of the quadratic behavior currently exhibited as the code iterates over dependencies multiple times, which for large projects like the ASF may take a lot of time still.

These functions are all called from

func PinningDependencies(c *checker.CheckRequest) (checker.PinningDependenciesData, error) {
var results checker.PinningDependenciesData
// GitHub actions.
if err := collectGitHubActionsWorkflowPinning(c, &results); err != nil {
return checker.PinningDependenciesData{}, err
}
// // Docker files.
if err := collectDockerfilePinning(c, &results); err != nil {
return checker.PinningDependenciesData{}, err
}
// Docker downloads.
if err := collectDockerfileInsecureDownloads(c, &results); err != nil {
return checker.PinningDependenciesData{}, err
}
// Script downloads.
if err := collectShellScriptInsecureDownloads(c, &results); err != nil {
return checker.PinningDependenciesData{}, err
}
// Action script downloads.
if err := collectGitHubWorkflowScriptInsecureDownloads(c, &results); err != nil {
return checker.PinningDependenciesData{}, err
}

So collectGitHubActionsWorkflowPinning identifies workflow deps W1, W2, and W3 via validateGitHubActionWorkflow and then cycles through them.

Then collectDockerfilePinning identifies docker images D1, D2, and D3, and via validateDockerfilesPinning, and then cycles through W1, W2, W3, AND D1, D2, and D3.

This continues with collectDockerfileInsecureDownloads, collectShellScriptInsecureDownloads, and collectGitHubWorkflowScriptInsecureDownloads.

This could be fixed by changing where we append to the the final dep slice, and iterating over temporary slices, but if its not worth the effort we don't need to do it yet.

checks/raw/pinned_dependencies_test.go Outdated Show resolved Hide resolved
@spencerschrock
Copy link
Contributor

/scdiff generate Pinned-Dependencies

Copy link

@raboof
Copy link
Contributor Author

raboof commented May 30, 2024

How did the speed-up look on your end?

scorecard --repo=https://github.com/apache/beam --checks Pinned-Dependencies went from 16 minutes to 24 seconds on my machine ;)

Is there still need for improvement

Faster is always better, and I haven't done any further profiling, but I suspect the bottlenecks will be elsewhere now.

This continues with collectDockerfileInsecureDownloads, collectShellScriptInsecureDownloads, and collectGitHubWorkflowScriptInsecureDownloads.

These don't iterate over the Dependencies, though, as far as I noticed

Signed-off-by: Arnout Engelen <arnout@bzzt.net>
@raboof raboof force-pushed the fix-pinned-dependency-spurious-remediations branch from 4a1c34f to 1d01c01 Compare May 30, 2024 07:32
@raboof raboof temporarily deployed to integration-test May 30, 2024 07:32 — with GitHub Actions Inactive
@spencerschrock
Copy link
Contributor

scorecard --repo=https://github.com/apache/beam --checks Pinned-Dependencies went from 16 minutes to 24 seconds on my machine ;)

Fantastic. Thanks for catching this!

Faster is always better, and I haven't done any further profiling, but I suspect the bottlenecks will be elsewhere now.

This continues with collectDockerfileInsecureDownloads, collectShellScriptInsecureDownloads, and collectGitHubWorkflowScriptInsecureDownloads.

These don't iterate over the Dependencies, though, as far as I noticed

I did a quick and dirty test to avoid going over the GHA deps again in collectDockerfilePinning and got another 20% speedup: 24s to 20s on my machine forgithub.com/apache/beam, which is better but like you said probably no longer a bottleneck.

@spencerschrock
Copy link
Contributor

/scdiff generate Pinned-Dependencies

Copy link

@spencerschrock spencerschrock enabled auto-merge (squash) May 30, 2024 18:38
@spencerschrock spencerschrock merged commit 7e6a09e into ossf:main May 30, 2024
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants