-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Fix git commit validation #4192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to fix the code (so that it works correctly no matter what directory you run in) instead of deleting it entirely?
If we know the path to the repo directory then
|
Sure, the ticket mentioned to cleanup this logic so I focused on that part. I'll check if we can make it work. |
d564d07
to
8f19f9a
Compare
@@ -1038,3 +1029,22 @@ func printAverageDetectorTime(e *engine.Engine) { | |||
fmt.Fprintf(os.Stderr, "%s: %s\n", detectorName, duration) | |||
} | |||
} | |||
|
|||
// Function to check if the commit is valid | |||
func isValidCommit(uri, commit string) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the validation code and moved the logic into the Git command flow so that validation is now performed before creating the Git config.
For local files, the validation is working as expected. However, for remote repositories passed via a https://<> URL, we would need to temporarily clone the repo to perform validation. I'm not very inclined to go down that path, especially since it's just for printing a warning log. Cloning a repo solely for commit validation feels a bit too much to me. Open to any suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely agree that we should not clone the repo for this. I think that validating for local repositories only is a great compromise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kashifkhan0771 I think I found a middle ground for remote repos:
git clone --filter=blob:none --no-checkout --bare <repo>
That command only pulls down tree, commit and tag objects from the remote git server. The actual repo data (found in blobs) is not sent.
This should take a fraction of a second to get all commit objects and then you can run the same git cat-file
command to validate hash validity. We might want to consider adding the git clone config -c "remote.origin.fetch=+refs/*:refs/remotes/origin/*"
that we include elsewhere, depending on the exact implementation. Note that removing that config is controllable via the flag skip-additional-refs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sharing this @joeleonjr - I'll check this further
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joeleonjr @rosecodym I added the remote repo commit validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty cool trick! But I feel that its maintenance cost is not worth the value it adds. Performing a clone (even a blobless one) is a relatively complex operation from a maintenance perspective, because it involves both the network and arcane git command-line flags. As Joe pointed out, if we're going to go down this route, we should unify these clone flags with the clone flags we use elsewhere, which would entail entangling this validation code with the more business-critical cloning code. And since this is a network operation, we should also add intelligible handling for network errors (so that if problems occur we clearly surface them to users), which is even more complexity that I do not feel is justified for what is fundamentally a simple validation check.
I think the UX of local-only validation is fine - as long as we don't erroneously report a validation error for remote scans. In those cases, we should skip the validation check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense @rosecodym
main.go
Outdated
return strings.TrimSpace(string(output)) == "commit" | ||
} else if strings.HasPrefix(uri, "https") { | ||
// handle https:// urls | ||
return false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this cause the error message to get printed for every remote repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the warning message will be printed for remote repositories. I mentioned the details here in the comment.
d6dadbf
to
f0a1d4a
Compare
2fbc12e
to
7ba1c2c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking care of this, and for tolerating my incessant pestering!
Description:
The git commit validation performs check by executing
git cat-file -t <hash>
, which means that it only works if TruffleHog is being run within the git repo being scanned. If you try to run TruffleHog in a different directory (e.g. trufflehog git /src/my-repo) the check will fail. This PR fix that validation code.Checklist:
make test-community
)?make lint
this requires golangci-lint)?