Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a KeywordDetector plugin #76

Merged
merged 6 commits into from
Sep 20, 2018
Merged

Added a KeywordDetector plugin #76

merged 6 commits into from
Sep 20, 2018

Conversation

KevinHock
Copy link
Collaborator

@KevinHock KevinHock commented Sep 17, 2018

Added _remove_password_secrets_if_line_reported_already

Because is often the case that e.g.
    SUPER_SECRET_VALUE = 'c3VwZXIgbG9uZyBzdHJ'
is reported both by the PasswordDetector and another plugin.

To minimize diff size, we will simply not report findings from
the PasswordDetector if another plugin reports a secret on the
same line.
@@ -0,0 +1,68 @@
"""
This code was extracted in part from
https://github.com/PyCQA/bandit. Using similar heuristic logic,
Copy link
Collaborator Author

@KevinHock KevinHock Sep 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greetings @ericwb and @viraptor, you both seem like the primary maintainers of Bandit.

This code is largely borrowed from plugins/general_hardcoded_password.py and I've tried my best to reference it as such per the APACHE license of Bandit. Can you please give this your blessing, or let me know if it's not to your liking?

Cheers,
Kevin

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks 👌 to me. Thank you very much for checking!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @viraptor! Feel free to checkout PyT sometime ;)

)


class PasswordDetector(BasePlugin):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KeywordDetector?



class PasswordDetector(BasePlugin):
"""This checks for private keys by determining whether the blacklisted
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops. copy pasta?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed

self.data[filename].update(results)
self.data[filename].update(file_results)

def _remove_password_secrets_if_line_reported_already(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of removing, why don't we change our baseline schema to display types in list form? This way, we can preserve all the intelligence that we gather on a particular flagged secret.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it would break all the existing baseline's we made (like when we changed the high entropy string types). I'm not sure how hard that would be to implement though, I can look into that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good point. Though, arguably a step in the right direction?

Thankfully, we built things for doing better version bumps, since the last time. We would need to update the merge_baseline function, and the pre-commit hook should get all active clients slowly upgrading.

The only gotchas I see are:

  • Server needs to be updated (and handle stale clients through detected baseline versions)
  • We need to handle this case in audit.

Perhaps the fix is merely to implement the dedup logic in auditing (as it's the only place that I can find that uses the type output from the baseline)?

e.g.

def to_list(value):
    try:
        iter(value)
        return value
    except ValueError:
        return [value]

...

plugin = initialize.from_secret_type(
    to_list(secret['type']),
    plugin_settings,
)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: Instead of removing, why don't we change our baseline schema to display types in list form? This is gonna lead to some super duper ugly code b/c a lot of the code was built around how a single plugin generates a PotentialSecret. I think it is okay to write, but I'm just mentioning b/c it'll get pretty ugly. 😁

e.g. we would still want to yield each individual plugin in _results_accumulator, and so after changing the type attribute of a PotentialSecret to be a list, I would do what I'm doing now, except in addition to removing, I would also add the type to the secret reported by another plugin:

        for secret in file_results:
            if secret.type == 'Password':
                password_secrets.append(secret)
            else:
                line_numbers_of_other_plugins.add(secret.lineno)

        for password_secret in password_secrets:
            if password_secret.lineno in line_numbers_of_other_plugins:
                del file_results[password_secret]
+               for secret in file_results:
+                   if password_secret.lineno == secret.lineno:
+                       secret.type.append('Password')

If you can give that your blessing I'm happy to make the types into a list 🙏

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you said Perhaps the fix is merely to implement the dedup logic in auditing, did you mean you don't mind having all the Password ones separate from the others in the baseline? We might double our baseline sizes but I think I'm leaning towards that a little :D

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Stating for the record, discussed in person yesterday, that we will just take the baseline size increase, and punt of de-duping in the audit functionality until later.)

@KevinHock KevinHock changed the title Added a PasswordDetector plugin Added a KeywordDetector plugin Sep 20, 2018
@OiCMudkips
Copy link
Contributor

Maybe add a note to the README that this at least attempts to block these now? Otherwise looks good to me.

@KevinHock
Copy link
Collaborator Author

Good point!

@KevinHock KevinHock merged commit c0bea0b into master Sep 20, 2018
@KevinHock KevinHock deleted the password_plugin branch March 21, 2019 22:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants