Pass additional Chunk information to detectors
#1517
Labels: enhancement, pkg/engine, pkg/sources
Description
Presently, detectors have no knowledge of the source (e.g., "Git") or metadata (e.g., "file: package-lock.json"), and only receive a stream of bytes.
trufflehog/pkg/detectors/detectors.go, lines 20 to 21 in 20b7793
While this design makes sense given TruffleHog's goal of scanning a multitude of sources (e.g., Git, Confluence, Slack), the lack of contextual information limits the power/usefulness of the detectors. For example, you cannot skip known bad filetypes like `yarn.lock` (#1460) [1], nor can you write filetype/language-specific rules like checking for JDBC credentials in .java/JVM code [2].
Problem to be Addressed
Provide more context to detectors so that it is possible to ignore known bad files/filetypes and to write file/filetype-specific rules.
Description of the Preferred Solution
A few potential solutions come to mind:
1. Replace the `FromData(ctx context.Context, verify bool, data []byte) ([]Result, error)` function with `FromChunk(ctx context.Context, chunk Chunk) ([]Result, error)`.
   https://github.com/trufflesecurity/trufflehog/blob/20b77938285b82bc80531ba176989b7f8bae8c4b/pkg/sources/sources.go#L14C1-L29
2. Update `FromData` to include `SourceType` as well as `SourceMetadata` (presumably you'd want `SourceType` to make pulling relevant metadata easier).
3. Add a check, run before `FromData`, to determine whether or not the detector should run.
Additional Context
N/A
Footnotes
[1] As far as I can tell.
[2] You can write that rule; however, it seems like it would run on every chunk, which could adversely affect performance.