Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scan commit metadata #2713

Merged
merged 1 commit into from
Apr 25, 2024
Merged

Conversation

rgmz
Copy link
Contributor

@rgmz rgmz commented Apr 18, 2024

Description:

This fixes #2683. It scans the commit author, committer (which is typically GitHub <noreply@github.com> for GitHub, but can be different), and message.

commit 8104611d6e2e3993c104974ea4c507faced7e847
Author:     Richard Gomez <32133502+rgmz@users.noreply.github.com>
AuthorDate: Tue Feb 6 10:13:53 2024 -0500
Commit:     GitHub <noreply@github.com>
CommitDate: Tue Feb 6 10:13:53 2024 -0500

    fix: case-insensitive ext check (#2383)

Admittedly, this implementation is a bit of guess-work and may not be the best way.

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint this requires golangci-lint)?

@rgmz rgmz requested review from a team as code owners April 18, 2024 00:34
Copy link
Contributor

@rosecodym rosecodym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool! I never would have guessed that secrets could leak out in the author email.

pkg/gitparse/gitparse.go Show resolved Hide resolved
pkg/gitparse/gitparse.go Outdated Show resolved Hide resolved
pkg/gitparse/gitparse.go Show resolved Hide resolved
// Scan the commit metadata.
// See https://github.com/trufflesecurity/trufflehog/issues/2683
var (
metadata = s.sourceMetadataFunc("", email, fullHash, when, remoteURL, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a magic constant filename to signal to users that the chunk came from commit metadata? I'm a little worried that a missing filename might confuse people.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do something obvious like COMMIT. I will mention that empty filename is currently how this is handled elsewhere, e.g., comments

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, if this isn't a new approach I'm less concerned about it, but it still seems like if it's low-hanging fruit we might as well grab it while we're here. I'll leave the decision on fruit height up to you.

pkg/sources/git/git.go Show resolved Hide resolved
@rgmz rgmz force-pushed the feat/scan-commit-metadata branch 3 times, most recently from 6de32aa to 76d5cb5 Compare April 23, 2024 14:40
@rgmz rgmz force-pushed the feat/scan-commit-metadata branch from 76d5cb5 to e4f4af8 Compare April 23, 2024 14:56
Copy link
Contributor Author

@rgmz rgmz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed some changes to handle Git Notes as well. However, these may not be pulled by default and would need to be added as an additional ref to checkout in #1918.

@@ -22,7 +22,7 @@ import (

const (
// defaultDateFormat is the standard date format for git.
defaultDateFormat = "Mon Jan 02 15:04:05 2006 -0700"
defaultDateFormat = "Mon Jan 2 15:04:05 2006 -0700"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is a bug as git log appears to use d and not dd formatting. It's unclear if or how frequently this failure occurred, as the logging in isDateLine/isAuthorDateLine was level 2.

pkg/gitparse/gitparse.go Show resolved Hide resolved
Copy link
Contributor

@rosecodym rosecodym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool stuff!

// Scan the commit metadata.
// See https://github.com/trufflesecurity/trufflehog/issues/2683
var (
metadata = s.sourceMetadataFunc("", email, fullHash, when, remoteURL, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, if this isn't a new approach I'm less concerned about it, but it still seems like if it's low-hanging fruit we might as well grab it while we're here. I'll leave the decision on fruit height up to you.

pkg/gitparse/gitparse.go Show resolved Hide resolved
pkg/gitparse/gitparse.go Show resolved Hide resolved
@rgmz
Copy link
Contributor Author

rgmz commented Apr 24, 2024

Linting issues seem to be carried over from #2643.

https://github.com/trufflesecurity/trufflehog/actions/runs/8802614804

@rosecodym rosecodym merged commit 81a9c81 into trufflesecurity:main Apr 25, 2024
10 of 12 checks passed
@rgmz rgmz deleted the feat/scan-commit-metadata branch April 25, 2024 14:24
@rgmz
Copy link
Contributor Author

rgmz commented Apr 25, 2024

@rosecodym Any idea what might be causing the test failure?

=== Failed
=== FAIL: pkg/sources/git TestSource_Chunks_Integration/remote_repo,_unauthenticated (0.23s)
2024-04-25T14:15:48Z	error	context	error chunking	{"error": "context canceled"}
2024-04-25T14:15:48Z	info-0	context	error scanning repository	{"repo": "../../../", "error": "context canceled"}
2024-04-25T14:15:48Z	error	context	error chunking	{"error": "context canceled"}
2024-04-25T14:15:48Z	info-0	context	error scanning repository	{"repo": "https://github.com/dustin-decker/secretsandstuff.git", "error": "context canceled"}
2024-04-25T14:15:48Z	error	context	error chunking	{"error": "context canceled"}
2024-04-25T14:15:48Z	info-0	context	error scanning repository	{"repo": "https://github.com/dustin-decker/secretsandstuff.git", "error": "context canceled"}
2024-04-25T14:15:48Z	error	context	error chunking	{"error": "context canceled"}
2024-04-25T14:15:48Z	info-0	context	error scanning repository	{"repo": "https://github.com/dustin-decker/secretsandstuff.git", "error": "context canceled"}
    git_test.go:254: A chunk exists that was not expected with key "70001020fab32b1fcf2f1f0e5c66424eae649826-"
    git_test.go:254: A chunk exists that was not expected with key "a6f8aa55736d4a85be31a0048a4607396898647a-"
    git_test.go:254: A chunk exists that was not expected with key "73ab4713057944753f1bdeb80e757380e64c6b5b-"
    git_test.go:254: A chunk exists that was not expected with key "2f251b8c1e72135a375b659951097ec7749d4af9-"
    git_test.go:254: A chunk exists that was not expected with key "84e9c75e388ae3e866e121087ea2dd45a71068f2-"
    git_test.go:254: A chunk exists that was not expected with key "e6c8bbabd8796ea3cd85bfc2e55b27e0a491747f-"
    git_test.go:254: A chunk exists that was not expected with key "735b52b0eb40610002bb1088e902bd61824eb305-"
    git_test.go:254: A chunk exists that was not expected with key "ce62d79908803153ef6e145e042d3e80488ef747-"
    git_test.go:254: A chunk exists that was not expected with key "27fbead3bf883cdb7de9d7825ed401f28f9398f1-"
    git_test.go:254: A chunk exists that was not expected with key "8afb0ecd4998b1179e428db5ebbcdc8221214432-"
    git_test.go:254: A chunk exists that was not expected with key "8fe6f04ef1839e3fc54b5147e3d0e0b7ab971bd5-"

=== FAIL: pkg/sources/git TestSource_Chunks_Integration/remote_repo,_limited (0.19s)
    git_test.go:254: A chunk exists that was not expected with key "70001020fab32b1fcf2f1f0e5c66424eae649826-"

=== FAIL: pkg/sources/git TestSource_Chunks_Integration/remote_repo,_main_ahead_of_branch (0.20s)
    git_test.go:254: A chunk exists that was not expected with key "547865c6cc0da46622306902b1b66f7e25dd0412-"

Edit: perhaps a slight mishap causing the commit to have a - at the end? Idk, I'll look into it.

@rosecodym
Copy link
Contributor

@rgmz should we revert this until we can figure out what's going on?

@rgmz
Copy link
Contributor Author

rgmz commented Apr 25, 2024

Perhaps. The specific error seems to be caused by the test logic (commit + - + file); it's unclear whether that's indicative of an actual issue.

case *source_metadatapb.MetaData_Git:
key = strings.TrimRight(meta.Git.Commit+"-"+meta.Git.File, "\n")
}

I think it's caused by the commit metadata being sent, which is introducing new chunks that aren't accounted for.

expectedChunkData: map[string]*byteCompare{
"70001020fab32b1fcf2f1f0e5c66424eae649826-aws": {B: []byte("[default]\naws_access_key_id = AKIAXYZDQCEN4B6JSJQI\naws_secret_access_key = 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\noutput = json\nregion = us-east-2\n")},
"a6f8aa55736d4a85be31a0048a4607396898647a-bump": {B: []byte("\n\nf\n")},

@rosecodym
Copy link
Contributor

Ok, I'm going to revert. Thanks for catching this so early!

rosecodym added a commit that referenced this pull request Apr 25, 2024
rosecodym added a commit that referenced this pull request Apr 25, 2024
@rgmz rgmz mentioned this pull request Apr 26, 2024
2 tasks
rosecodym pushed a commit that referenced this pull request Apr 29, 2024
This is a follow-up to #2713 that fixes the strange test error.

As suspected, the failure was caused by additional diffs not being included in the test's expected data.
haraldh added a commit to matter-labs/vault-auth-tee that referenced this pull request May 6, 2024
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[trufflesecurity/trufflehog](https://togithub.com/trufflesecurity/trufflehog)
| action | minor | `v3.74.0` -> `v3.75.0` |

---

### Release Notes

<details>
<summary>trufflesecurity/trufflehog
(trufflesecurity/trufflehog)</summary>

###
[`v3.75.0`](https://togithub.com/trufflesecurity/trufflehog/releases/tag/v3.75.0)

[Compare
Source](https://togithub.com/trufflesecurity/trufflehog/compare/v3.74.0...v3.75.0)

#### What's Changed

- \[chore] - update buffer metrics by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2737
- fix(deps): update module github.com/aws/aws-sdk-go to v1.51.28 by
[@&#8203;renovate](https://togithub.com/renovate) in
[trufflesecurity/trufflehog#2741
- chore(deps): update golangci/golangci-lint-action action to v5 by
[@&#8203;renovate](https://togithub.com/renovate) in
[trufflesecurity/trufflehog#2744
- Scan commit metadata by [@&#8203;rgmz](https://togithub.com/rgmz) in
[trufflesecurity/trufflehog#2713
- Fix SQL Server detector tests by
[@&#8203;rosecodym](https://togithub.com/rosecodym) in
[trufflesecurity/trufflehog#2716
- Revert "Scan commit metadata" by
[@&#8203;rosecodym](https://togithub.com/rosecodym) in
[trufflesecurity/trufflehog#2747
- \[bug] - Refactor newDiff constructor to avoid double initialization
of contentWriter by [@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2742
- \[chore] - update buffered file writer metric by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2740
- \[refactor] - lazy buffer retrieval by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2745
- \[chore] Remove broken test by
[@&#8203;mcastorina](https://togithub.com/mcastorina) in
[trufflesecurity/trufflehog#2748
- \[bug] - fix buffer size metric by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2749
- \[bug] - Fix the metric for buffered file writer writes by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2750
- fix(deps): update module github.com/aws/aws-sdk-go to v1.51.29 by
[@&#8203;renovate](https://togithub.com/renovate) in
[trufflesecurity/trufflehog#2751
- update integration logos by
[@&#8203;dustin-decker](https://togithub.com/dustin-decker) in
[trufflesecurity/trufflehog#2752
- fix(deps): update module github.com/aws/aws-sdk-go to v1.51.30 by
[@&#8203;renovate](https://togithub.com/renovate) in
[trufflesecurity/trufflehog#2756
- \[chore] - add additional binary extension by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2760
- pkg: fix function names in comment by
[@&#8203;mountcount](https://togithub.com/mountcount) in
[trufflesecurity/trufflehog#2761
- \[chore] - ignore pbix and vsdx files by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2762
- fix(deps): update module github.com/aws/aws-sdk-go to v1.51.31 by
[@&#8203;renovate](https://togithub.com/renovate) in
[trufflesecurity/trufflehog#2763
- Scan commit metadata by [@&#8203;rgmz](https://togithub.com/rgmz) in
[trufflesecurity/trufflehog#2754
- \[bug] - Correctly set metrics for enumerated orgs by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2757
- \[chore ] -Update ignore extensions by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2764
- \[chore] Add some happy path logs to GitLab by
[@&#8203;mcastorina](https://togithub.com/mcastorina) in
[trufflesecurity/trufflehog#2765
- Fix Git source test by [@&#8203;rgmz](https://togithub.com/rgmz) in
[trufflesecurity/trufflehog#2767
- \[feat] - buffered file reader by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2731
- \[feat] - Add ReadFrom method to BufferedFileWriter by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2759
- fix(deps): update module google.golang.org/protobuf to v1.34.0 by
[@&#8203;renovate](https://togithub.com/renovate) in
[trufflesecurity/trufflehog#2766
- \[bug] - Improve BufferedFileReader Close Behavior by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2768
- fixes calendly api key regex by
[@&#8203;ankushgoel27](https://togithub.com/ankushgoel27) in
[trufflesecurity/trufflehog#2368
- Expose detector-specific false positive logic by
[@&#8203;rosecodym](https://togithub.com/rosecodym) in
[trufflesecurity/trufflehog#2743
- Detector-Fix: Reintroduce Cloudflareglobalapikey by
[@&#8203;ankushgoel27](https://togithub.com/ankushgoel27) in
[trufflesecurity/trufflehog#2101
- Detector-Competition-Fix - fixed the alchemy detector regex by
[@&#8203;ankushgoel27](https://togithub.com/ankushgoel27) in
[trufflesecurity/trufflehog#1821
- fix(deps): update module github.com/aws/aws-sdk-go to v1.51.32 by
[@&#8203;renovate](https://togithub.com/renovate) in
[trufflesecurity/trufflehog#2769
- fix(deps): update module google.golang.org/api to v0.177.0 by
[@&#8203;renovate](https://togithub.com/renovate) in
[trufflesecurity/trufflehog#2770
- \[chore] - update imports by
[@&#8203;ahrav](https://togithub.com/ahrav) in
[trufflesecurity/trufflehog#2772
- adds build version to finished scanning log by
[@&#8203;zricethezav](https://togithub.com/zricethezav) in
[trufflesecurity/trufflehog#2773
- Update rabbitmq.go regex detect amqps protocol by
[@&#8203;NikhilPanwar](https://togithub.com/NikhilPanwar) in
[trufflesecurity/trufflehog#2609
- fix for infinite recursion in Postman var sub by
[@&#8203;zricethezav](https://togithub.com/zricethezav) in
[trufflesecurity/trufflehog#2780

#### New Contributors

- [@&#8203;mountcount](https://togithub.com/mountcount) made their first
contribution in
[trufflesecurity/trufflehog#2761

**Full Changelog**:
trufflesecurity/trufflehog@v3.74.0...v3.75.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/matter-labs/vault-auth-tee).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4zMzEuMCIsInVwZGF0ZWRJblZlciI6IjM3LjMzMS4wIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6W119-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

[Request] Scan the Commit Metadata (email, name and commit message) for secrets
2 participants