Add an audit --display option #191

Closed
KevinHock opened this issue Jun 10, 2019 · 8 comments

@KevinHock
Collaborator

This will be extraordinarily helpful when developing new plugins, so I think it might be great to incorporate into detect-secrets.

--display would display, per plugin, the audited secrets grouped by label:

Plugin A:
    Positives:
        - some_real_key
    Negatives:
        - "foo"
    Unknowns:
        - "bar"
...

This would remove some of the manual effort involved in adding a new plugin and tuning its sensitivities and regexes. Right now the process is roughly: run a scan, cat the baseline, work out which plugin found what by comparing line numbers with the file you scanned, and then audit to see which plaintext group was captured as a secret.

We can probably use textwrap, or something similar, to make the output look pretty.
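
A minimal sketch of how the layout above could be rendered with textwrap. The function name render_plugin_report and the input dict shape are hypothetical, chosen only for illustration; nothing here is part of detect-secrets.

```python
import textwrap


def render_plugin_report(report):
    """Render {plugin: {"Positives": [...], ...}} as indented plain text.

    The input shape is an assumption made for this sketch.
    """
    lines = []
    for plugin, buckets in report.items():
        lines.append(f"{plugin}:")
        for label in ("Positives", "Negatives", "Unknowns"):
            lines.append(textwrap.indent(f"{label}:", " " * 4))
            for secret in buckets.get(label, []):
                # Long values are wrapped at 72 columns and the continuation
                # lines are aligned under the text that follows the bullet.
                lines.append(
                    textwrap.fill(
                        secret,
                        width=72,
                        initial_indent=" " * 8 + "- ",
                        subsequent_indent=" " * 10,
                    )
                )
    return "\n".join(lines)


example = {
    "Plugin A": {
        "Positives": ["some_real_key"],
        "Negatives": ["foo"],
        "Unknowns": ["bar"],
    }
}
print(render_plugin_report(example))
```

Note that textwrap.fill normalizes whitespace inside each value, so a secret containing tabs or newlines would not be shown byte-for-byte.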

@KevinHock KevinHock added good first issue The issue can be tackled by someone who has little to no knowledge about the project. development ease labels Jun 10, 2019
@domanchi
Contributor

To be clear, are you suggesting we determine the positives and negatives by the is_secret attribute?

If so, might I suggest making this machine-readable instead? And maybe name the flag --results?

{
    "Base64HighEntropyString": {
        "results": {
            "positive": [
                "some_real_key"
            ],
            "negative": [
                "foo"
            ],
            "unknown": [
                "bar"
            ]
        },
        "config": {
            "base64_limit": 4.5
        }
    }
}
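
As a rough sketch of how that structure could be built: the code below assumes audited findings are available as (plugin_name, plaintext, is_secret) tuples, with is_secret being True, False, or None when the secret was never labeled. The tuple shape and the names bucket_name and build_results are hypothetical.

```python
import json
from collections import defaultdict


def bucket_name(is_secret):
    """Map an audit label to the bucket it belongs in."""
    if is_secret is True:
        return "positive"
    if is_secret is False:
        return "negative"
    return "unknown"  # no label recorded (assumed to be None here)


def build_results(records, configs=None):
    """Group (plugin, plaintext, is_secret) tuples by plugin and label."""
    configs = configs or {}
    grouped = defaultdict(
        lambda: {"positive": [], "negative": [], "unknown": []}
    )
    for plugin, plaintext, is_secret in records:
        grouped[plugin][bucket_name(is_secret)].append(plaintext)
    return {
        plugin: {"results": buckets, "config": configs.get(plugin, {})}
        for plugin, buckets in grouped.items()
    }


records = [
    ("Base64HighEntropyString", "some_real_key", True),
    ("Base64HighEntropyString", "foo", False),
    ("Base64HighEntropyString", "bar", None),
]
report = build_results(
    records,
    configs={"Base64HighEntropyString": {"base64_limit": 4.5}},
)
print(json.dumps(report, indent=4))
```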

@KevinHock
Collaborator Author

re: is_secret attribute: Yup

++, that sounds great.

@KevinHock
Collaborator Author

KevinHock commented Jun 25, 2019

Maybe --display-results since results is sort of the baseline itself.

Just noting for later: we could also have this display the entropy of each secret found. That'd be nice. (The value printed by detect-secrets scan --string "foo".)

@OiCMudkips
Contributor

What about this?

New option --output-format with three values: baseline-view, file-view and plugin-view:

baseline-view and file-view are the same thing, and it is the default. It makes detect-secrets output results in the existing format. It is called baseline-view because this is what pre-commit consumes as a baseline, and file-view because filenames are literally what the hierarchy has at the upper-most level under the results key:

{
    ...existing metadata...,
    "results":{
        {filename1}: [SECRETS],
        {filename2}: [SECRETS]
    }
}

plugin-view would have a plugin at the upper-most level under the results key:

{
    ...existing metadata...,
    "results":{
        {plugin_name1}: {
            "config": {bla},
            "plugin_results": {
                "positive": [SECRETS],
                "negative": [SECRETS],
                "unknown": [SECRETS]
            }
        },
        {plugin_name2}: {
            "config": {bla},
            "plugin_results": {
                "positive": [SECRETS],
                "negative": [SECRETS],
                "unknown": [SECRETS]
            }
        }
    }
}
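
A sketch of what such an option could look like with argparse. This is a standalone parser for illustration only, not the actual detect-secrets CLI; the names OUTPUT_FORMATS, build_parser and normalize are hypothetical.

```python
import argparse

# The three values proposed above; file-view is an alias of baseline-view.
OUTPUT_FORMATS = ("baseline-view", "file-view", "plugin-view")


def build_parser():
    """Build a standalone parser that only knows the proposed option."""
    parser = argparse.ArgumentParser(prog="output-format-sketch")
    parser.add_argument(
        "--output-format",
        choices=OUTPUT_FORMATS,
        default="baseline-view",
        help="Key the results by file (default) or by plugin.",
    )
    return parser


def normalize(output_format):
    """Collapse the alias so callers only handle two cases."""
    return "baseline-view" if output_format == "file-view" else output_format


args = build_parser().parse_args(["--output-format", "plugin-view"])
print(normalize(args.output_format))
```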

@KevinHock
Collaborator Author

I like how we can all agree on argument names 😆

I think we want this to be fairly separate from, and hard to confuse with, the existing scan output. It is specifically for audited baselines, and it outputs the plaintext rather than our non-intuitive hashed secret format (which I can't think of how to improve).

There is a lot of baggage around baseline formats; this will be fresh, clean, and separate from all backwards/forwards compatibility concerns.

The format in @domanchi's comment is good. One possible improvement: the number in the following output could go next to the plaintext, e.g. ('foo', 0.918), where the number is from:

detect-secrets scan --string "foo"
...
Base64HighEntropyString: False (0.918)

This will remove the manual trial and error of running multiple times to determine where entropy drop-offs happen, etc.
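
For illustration, the 0.918 figure matches the plain Shannon entropy (in bits per character) of the string "foo". The sketch below computes it over whatever characters appear in the string; how detect-secrets itself handles characters outside a plugin's charset is not covered here, and the names shannon_entropy and annotate are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a string, in bits per character."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def annotate(secrets):
    """Pair each plaintext with its entropy, rounded to three places."""
    return [(secret, round(shannon_entropy(secret), 3)) for secret in secrets]


print(annotate(["foo"]))  # [('foo', 0.918)]
```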

@KevinHock
Collaborator Author

I'm wondering if it'd be helpful to have the repo name and the current HEAD in the output. If you were developing a plugin and wanted good posterity built in, that would make runs easier to keep track of without manually naming the output [some git hash]_[the repo name].

Since you wouldn't commit this output ever, due to the plaintext secrets, you can't really use git history to your advantage.

Thoughts?

@OiCMudkips
Contributor

I guess this would sort of introduce the requirement that detect-secrets run on a clean repo (as in, no uncommitted changes); otherwise the hash wouldn't be too useful. But since this is for plugin development only, I guess it's fine.
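
A sketch of how the repo name, the HEAD hash, and a "dirty" flag could be collected with plain git commands (git rev-parse and git status --porcelain). The helper names run_git and repo_metadata and the shape of the returned dict are assumptions made for illustration.

```python
import os
import subprocess


def run_git(*args):
    """Run a git command in the current directory and return its output."""
    return subprocess.check_output(("git",) + args, text=True).strip()


def repo_metadata(git=run_git):
    """Collect the repo name, the HEAD hash, and whether the tree is dirty.

    `git` is injectable so the function can be exercised without a real repo.
    """
    top_level = git("rev-parse", "--show-toplevel")
    return {
        "repo": os.path.basename(top_level),
        "head": git("rev-parse", "HEAD"),
        # --porcelain prints one line per changed path; empty means clean.
        "dirty": bool(git("status", "--porcelain")),
    }
```

Recording the dirty flag alongside the hash is one way to soften the clean-repo requirement: the output still says which commit it came from, and also whether that hash fully describes what was scanned.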

@OiCMudkips
Contributor

Fixed in #205

killuazhu pushed a commit to killuazhu/detect-secrets that referenced this issue Oct 30, 2019
* 1st pass cloudant tests and detector

* cleaning debugs

* whitelisting secret false positive

* correcting lint errors

* correct line break errors

* more lint

* more lint

* more lint

* more lint

* typo

* more lint

* more lint

* PR responses
3 participants